UVa Digital Initiatives Terminology
Definitions and local usage overviews
for standards, formats, protocols, and services (and their acronyms) that we
use or support in the Library's digital initiatives and daily operations.
- Cataloging and Metadata: Crosswalk, Dublin
Core, Finding Aid, FRBR,
GDMS, Header, MARC,
Metadata, METS, MODS, Virgo,
VRA Core
- Collections: Collection Master, Delivery,
Delivery Master, Derivative,
Documentary Image, Page Image
- Document Encoding: DDI, DTD,
EAD, Element, FGDC,
HTML, OAI, PDF,
QuickTime, Schema, SGML,
Tag, TEI, Unicode,
XML
- Images and Multimedia: Flash, GIF,
JPEG, JPEG 2000, MPEG,
MrSID, QuickTime, TIFF,
Wavelet
- Database Technology: Content Model, Digital Library Repository, Disseminator, Fedora, Ingestion, MySQL, FileMaker,
ODBC, SQL
- Structured Data: DTD, Element,
GIS, HTML, Schema,
SGML, Tag, XML
- Searching: Open Text, OpenURL,
SRW, Tamino, XPAT,
XPath/XQuery, Z39.50
- Scripting and Programming: CSS, CGI,
Java, JavaScript, Object-oriented,
Open Source, Perl, PHP,
XSLT
- Remote Access and Services: Applet, Authentication,
CSS, CrossRef, DOI,
Domain, FTP, Handle,
HTTP, IP Address, Metasearch,
OpenURL, Persistent Identifier,
Plugin, RSS, Search
Engine, Servlet, SOAP, Streaming,
SRU/SRW, TCP/IP, URL,
URN, UVa Anywhere, UVa
Proxy Server, Web Services, XSLT,
Z39.50
Departments, Committees, and Affiliated Centers
- DAS: Digital Access Services. Includes the Digital Library Repository
and electronic journal and database access.
- DCRT: Digital Content Review Team. Integrating selection of digital
content into the selection process.
- DLPS: Digital Library Production Services. Core digital collection
production.
- DLR&D: Digital Library Research and Development. Research and
programming for new digital initiatives.
- DML: Digital Media Lab (part of the Robertson Media Center). Faculty
and student digital production consulting and facilities for working with
images, audio, and video. http://www.lib.virginia.edu/clemons/RMC/dml.html
- Etext: Electronic Text Center. Faculty and student consulting and
facilities for the creation of electronic texts. http://etext.lib.virginia.edu/
- Geostat: Geospatial & Statistical Data Center. Access to the
map and data collections, and consulting for faculty and students on the
use and production of data resources. http://fisher.lib.virginia.edu/
- IATH: Institute for Advanced Technology in the Humanities. A Center
that is part of Arts and Sciences and housed in the Library, which provides
faculty fellows with consulting, technical support, and applications programming
for the creation of research projects in digital form. http://www.iath.virginia.edu/
- MSG: Metadata Steering Group. The Library working group that makes
decisions about metadata mapping and usage. http://www.lib.virginia.edu/digital/metadata/msg.html
- RMC: Robertson Media Center, Clemons Library. Access to the video
and audio collections, and consulting for faculty and students on the use
and production of digital media resources through the Digital Media Lab.
http://www.lib.virginia.edu/clemons/RMC/
- RMDS: Rare Materials Digital Services. Digitization of items from
Special Collections. http://www.lib.virginia.edu/speccol/services/digitalservices.html
- SEDI: Original name for the Brown Science & Engineering Library Research Computing Lab. Faculty and student digital production consulting and facilities
for working with digital resources. http://www.lib.virginia.edu/science/rescomp/
- VCDH: Virginia Center for Digital History. A Center that is part
of Arts and Sciences and housed in the Library, which is charged with creating
new forms of digital American historical scholarship and public service
and outreach through faculty fellowships. http://www.vcdh.virginia.edu/
Applet
- What it is: A small Java program that is sent
as a separate file along with a Web page. Java applets, usually intended
for running on a client, can result in such services as performing a calculation
for a user or positioning an image based on user interaction.
- Local use: One example is the "ImageViewer" applet that the Repository
uses to deliver the image collections. It allows users to zoom into an
image and move it around on screen to see details.
Authentication
- What it is: A process that verifies that an individual, computer,
or information object is who or what it purports to be, in order to provide
access to material that is restricted by license.
- Local use: All electronic journals, databases, and licensed image
sets have access restrictions built into our subscriptions, requiring that
we authenticate our users. Most often, this is by IP Address,
which identifies that their computers are located within the UVa Domain.
In some cases, access is restricted to groups (such as electronic reserves
for courses) or individuals, so authentication is handled by individual
logons. The Repository is being designed to handle authentication
for individual users and files. UVa has two processes in place - UVa
Proxy Server and UVa Anywhere.
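As a minimal sketch of authentication by IP Address, the check below tests whether a client address falls inside an institution's address range. The range 192.0.2.0/24 is a reserved documentation block used as a stand-in, not UVa's real range, and the function name is hypothetical.

```python
import ipaddress

# Hypothetical campus range (a reserved documentation block, not UVa's real range).
CAMPUS_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_on_campus(client_ip: str) -> bool:
    """Return True if client_ip falls inside any configured campus network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in CAMPUS_NETWORKS)


print(is_on_campus("192.0.2.17"))    # inside the assumed range
print(is_on_campus("198.51.100.5"))  # outside the assumed range
```

A licensed resource would grant access only when a check like this succeeds; users outside the range would need a proxy or an individual logon instead.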
CGI
- What it is: Common Gateway Interface. An
interface that Web servers use to invoke server scripts, or programs.
- Interface specified by: National Center for Supercomputing Applications,
University of Illinois. http://hoohoo.ncsa.uiuc.edu/cgi/
- Local use: All Library Web servers run CGI scripts.
Collection Master
- What it is: A term used to describe digital files that have not
been transformed or modified after their original capture by scanning.
- Local use: The Library keeps collection masters as part of its
digital preservation strategy.
Content Model
- What it is: A Fedora concept used for object management. Content models represent classes of data objects, which can be single units of content, complex data objects, or even aggregations of data objects.
- Local use: The Digital Library Repository is structured using a number of local content models that fit our collection needs, and the Library will continue to add content models as new formats are added to the collections.
CrossRef
- What it is: CrossRef is a citation linking backbone -- a collaborative
reference linking service that allows the user to click on a citation and
be taken directly to the target content. CrossRef is built upon a system
of DOIs in a knowledgebase that tracks the location of
the objects that they represent.
- Standard specified by: The Publishers International Linking Association
(PILA). http://www.crossref.org/
- Local use: Many of the Library's licensed resources have OpenURLs
that take advantage of DOIs managed through CrossRef. The Library is a member
of CrossRef.
Crosswalk
- What it is: A chart or table that represents the semantic mapping
of fields or data elements in one data standard to fields or data elements
in another standard that have a similar function or meaning. Crosswalks
make it possible to search dissimilar databases simultaneously and to
convert data effectively from one metadata standard to another.
- Local use: The Library associates a number of different metadata
standards to one another through crosswalks, including Dublin
Core, EAD, GDMS, MARC,
METS, TEI, and the VRA
Core.
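A crosswalk can be pictured as a lookup table applied to each record. The sketch below is illustrative only: the four MARC-to-Dublin Core pairs are a simplified subset chosen for the example, not the Library's actual mapping, and the function and record layout are assumptions.

```python
# Illustrative crosswalk: MARC field tag -> Dublin Core element (simplified subset).
MARC_TO_DC = {
    "100": "creator",
    "245": "title",
    "260": "publisher",
    "650": "subject",
}


def crosswalk(marc_record: dict) -> dict:
    """Convert a {MARC tag: value} record to {Dublin Core element: [values]}."""
    dc_record = {}
    for tag, value in marc_record.items():
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # no equivalent in this small table, so the field is dropped
        dc_record.setdefault(element, []).append(value)
    return dc_record


sample = {
    "100": "Jefferson, Thomas",
    "245": "Notes on the State of Virginia",
    "999": "local data",
}
print(crosswalk(sample))
```

Fields with no counterpart in the target standard are lost in the conversion, which is why crosswalks are usually described as mapping elements that are only essentially equivalent.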
CSS
- What it is: Cascading Style Sheets. A mechanism
for adding a custom style to Web documents. The CSS file specifies how text
enclosed in a particular tag should look, including font, color, and spacing.
- Standard specified by: World Wide Web Consortium (W3C). http://www.w3.org/Style/CSS/
- Local use: The Library has developed a series of CSS styles for
use by departments and Libraries to ensure consistent styles in the content
of their Web pages. CSS files are also used in combination with XSLT
files in delivering XML content on the web.
DDI
- What it is: Data Documentation Initiative.
A developing standard for the content, presentation, transport, and preservation
of metadata about datasets in the social and behavioral sciences, using
an XML DTD. The term "codebook"
is used to describe these files.
- Standard specified by: The Data Documentation Initiative, at http://www.icpsr.umich.edu/DDI/.
- Local use: A set of prototype DDI files was created in 2001 for
social science data resources in the Geostat collection; DDI is under review
for local production use.
Delivery
- What it is: The process by which digital content is presented to
the user. Delivery of content can be managed by XSLT or CSS stylesheets
or another interface mechanism for display on the web or via other means.
- Local use: All publicly-used Library digital content has at least
one delivery method.
Delivery Master
- What it is: A term used to describe data, usually images, that
has been modified for presentation purposes. This clean up might include
removing dust and scratches, improving general appearance, and increasing
the contrast of the image so that it displays better on screen. Delivery
masters do not usually display well in a web browser or are too resource
intensive for daily use.
- Local use: Delivery Masters are one step removed from Collection
Masters, and are themselves used as the source for creating Derivatives.
Derivative
- What it is: A term used to describe web deliverable files, generally
images, which are created from a Delivery Master.
Derivatives can include multiple sizes and formats as required by the delivery
applications.
- Local use: Derivatives delivered by the Library vary by collection,
but fall into three general categories: Preview (a small, thumbnail JPEG);
Screen (a screen-sized JPEG image); and Max (a larger, higher resolution
file that supports zooming; either a JPEG or a MrSID).
Digital Library Repository (aka Central Digital Repository)
- What it is: A digital library system that provides a means of uniquely
identifying each piece of digital content (and groups of related content
or collections), managing the content and the associated access rights,
supporting discovery of those collections, and delivering them to users. The UVa Digital
Repository (or just Repository or "The Repo" or "Cenrepo")
uses the Fedora system as its underlying infrastructure.
- Software developed and supported by: UVa Library.
- Local use: The first two phases of the Repository, providing cross-collection
access to a number of digital image collections, as well as to some text
collections, were tested in the 2004-2005 and 2005-2006 academic years. The production services and tools will launch in fall 2006.
Disseminator
- What it is: A Fedora concept used for object management and delivery. Fedora objects contain linkages between datastreams (internally managed or external media files), metadata (inline or external), system metadata (including a PID, a persistent identifier that is unique to the Repository), and disseminators. Disseminators bind the data objects to behavior objects managed by Fedora, which provide software processes (behaviors) that can be used with the datastreams. Behaviors encode the varying functionality that an end-user or another system would require or encounter in its use of an object in a Fedora repository. The disseminators include the programmatic mechanisms needed to execute those behaviors for the varying types of objects.
- Local use: The Digital Library Repository is structured using a number of local disseminators linked to local content models that fit our collection needs, and the Library will continue to add disseminators and content models as new functionality is needed and formats are added to the collections.
Documentary Image
- What it is: A term used to describe any image that is not a page
image: an image documenting a work of art, a building, or a place, an ethnographic
documentary image, etc.
- Local use: The Art and Architecture image collections, ethnographic
images, etc.
DOI
- What it is: Digital Object Identifier. DOIs
are names (characters and/or digits) assigned to objects of intellectual
property (physical, digital or abstract) such as electronic journal articles,
images, learning objects, ebooks, or any other kind of content. They are
used to provide current information, including where they (or information
about them) can be found on the Internet. Information about a digital object
may change over time, including where to find it, but its DOI will not change.
A DOI is a Persistent Identifier.
- Format specified by: The International DOI Foundation. http://www.doi.org/
- Local use: DOIs are not currently in use at the UVa Library, but
many of our licensed resources have DOIs assigned to them. The UVa Library
has signed on to CrossRef, a reference linking
service, which allows us to assign DOIs when and if it becomes part of our
operations.
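Because a DOI stays fixed while the object's location may change, software usually hands the DOI to a resolver rather than storing a location. The sketch below splits a DOI into its prefix and suffix and builds a resolver address. The DOI string is made up for illustration, and the use of https://doi.org/ as the resolver base is an assumption of this example.

```python
from urllib.parse import quote

RESOLVER = "https://doi.org/"  # resolver base assumed for this sketch


def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, _, suffix = doi.partition("/")
    if not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


def resolver_url(doi: str) -> str:
    """Build the address that asks the resolver for the object's current location."""
    prefix, suffix = split_doi(doi)
    return f"{RESOLVER}{prefix}/{quote(suffix, safe='/')}"


example = "10.1234/example.5678"  # hypothetical DOI, for illustration only
print(split_doi(example))
print(resolver_url(example))
```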
Domain
- What it is: The address that identifies an Internet or other network
site. On the Internet, domain names act as mnemonic aliases for IP
addresses, a hierarchical numeric addressing system that enables Internet
hosts to be uniquely identified. Domain names consist of at least two parts:
the top-level domain, which specifies host addresses at a national or broad
sector level (e.g. ".com" for businesses and ".edu"
for educational institutions), and the sub-domain that is registered to
a specific organization or individual within that domain (e.g. "virginia"
for UVa). Domain names are hierarchical, and UVa has the authority to issue
sub-domain names, such as "www.lib.virginia.edu" or "infocomm.lib.virginia.edu"
within "lib." A domain additionally reflects the range of IP addresses
contained in it.
- Local use: Due to licensing restrictions, the Library often limits
access to resources such as electronic journals and databases "by domain,"
limiting availability to computers whose addresses are within UVa's range
of addresses.
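The hierarchy of domain names can be sketched by reading a host name from right to left. The helper names below are hypothetical, and the example covers naming only; restricting access "by domain" in practice relies on address ranges, which this sketch does not check.

```python
def domain_hierarchy(hostname: str) -> list[str]:
    """List the domains containing hostname, from the top level down to the full name."""
    labels = hostname.lower().rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


def is_within_domain(hostname: str, domain: str) -> bool:
    """Return True if hostname is the domain itself or a name beneath it."""
    host = hostname.lower().rstrip(".")
    dom = domain.lower().rstrip(".")
    return host == dom or host.endswith("." + dom)


print(domain_hierarchy("www.lib.virginia.edu"))
print(is_within_domain("infocomm.lib.virginia.edu", "virginia.edu"))
```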
DTD
- What it is: Document Type Definition. A DTD
is a formal description of a particular type of document. It sets out what
names are to be used for the different types of element,
where they may occur, how they can be used, and how they all fit together.
DTDs were originally used with SGML and are now used
with XML. Elements (categories of information) and tags
(the markup that identifies chunks of content as a particular element) are
set in the DTD. Usage is also specified, such as vocabulary used to elaborate
on element types, whether elements are required, whether elements can contain
other elements, and whether they can repeat. A Schema
is a newer alternative to a DTD that serves the same purpose.
- Format specified by: The grammar or the construction of a DTD for
XML is set by the World Wide Web Consortium (W3C) at http://www.w3.org/TR/REC-xml.
Individual DTDs are defined by many organizations in the community.
- Local Use: Extensively used at the Library for many years. DTDs
that are used for the creation of resources include the EAD,
TEI, and METS. DDI
and the VRA Core are planned for use.
Dublin Core
- What it is: The Dublin Core metadata element
set is a standard for cross-domain information resource description. It
can be used in Web pages to describe their content, or as a minimal metadata
standard for cataloging digital files.
- Standard developed by: The Dublin Core Metadata Initiative. http://dublincore.org/
- Local Use: The Dublin Core plays an important role in relating
elements from many metadata standards to each other. The standards that
we use in cataloging collections are mapped (identified as essentially equivalent)
to elements in the Dublin Core, so that the Dublin Core can serve as the
center of a crosswalk between the various standards.
EAD
- What it is: Encoded Archival Description.
A markup format for describing archival finding aids.
- Standard specified by: The Library of Congress and the Society
of American Archivists. http://lcweb.loc.gov/ead/
- Local use: Special Collections prepares its finding aids using
EAD. The Library delivers those files, along with finding aids for a number
of other archives through the Virginia Heritage site at http://www.lib.virginia.edu/vhp/.
Element
- What it is: An element in HTML, SGML,
or XML is a fundamental component of the structure of
a document. In HTML, elements are used to provide formatting structure (paragraph
breaks, bold face, indentation, etc.). In SGML or XML, elements can be used
to add metadata about a work (such as the author, title, and date of an
electronic text), denote structure (such as a paragraph or a page break
as well as formatting), or categorize part of the content (proper name,
geographic location, etc.). Elements can contain plain text, other elements,
or both. To denote or mark up the various elements in a document, you use
tags.
- Standard defined by: Defined within HTML and the various DTD
standards for SGML and XML.
- Local Use: In all HTML, SGML, and XML documents.
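To illustrate how elements nest and mix with plain text, the sketch below parses a small XML fragment with Python's standard library. The fragment and its element names are made up for the example and do not come from any particular DTD.

```python
import xml.etree.ElementTree as ET

# A made-up fragment; the element names are illustrative, not from a real DTD.
SAMPLE = """<text>
  <title>Sample Letter</title>
  <p>Written at <place>Charlottesville</place> in <date>1820</date>.</p>
</text>"""

root = ET.fromstring(SAMPLE)


def outline(element, depth=0):
    """Return one indented line per element, showing which elements contain which."""
    lines = [("  " * depth) + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(root)))
print(root.find("p/place").text)            # content of a single element
print("".join(root.find("p").itertext()))   # paragraph text with the tags stripped
```

Here the paragraph element contains both plain text and other elements, while the place and date elements categorize parts of that text.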
Fedora
- What it is: Flexible and Extensible Digital
Object and Repository Architecture. A digital library
management system infrastructure that contains tracking information about
digital collections and their relationships with each other. Objects are
associated with disseminators (methods of accessing, delivering, and displaying
the content) that can provide access to the collections via Web browsers
as well as for use in software applications.
- Software developed by: Originally proposed at Cornell University
and jointly developed by Cornell and the UVa Library. http://www.fedora.info/
- Local use: Fedora is the underlying infrastructure for the Library's
Central Digital Repository.
FGDC Standards Reference Model
- What it is: Federal Geographic Data Committee.
A related group of metadata standards providing a common set of terminology
and definitions for the documentation of digital geospatial data. The standard
establishes the names of data elements and compound elements (groups of
data elements) to be used for these purposes, the definitions of these compound
elements and data elements, and information about the values that are to
be provided for the data elements.
- Standard specified by: The Federal Geographic Data Committee (FGDC).
http://www.fgdc.gov/standards/standards.html
- Local use: The Geostat Center uses the FGDC standard to describe
its GIS collections.
FileMaker
- What it is: A desktop database program known for its ease of use
that supports relationships between tables, fine-grained user privileges,
basic scripting, and the live publishing of its databases online with no
additional programs. FileMaker Pro v.6 automatically generates HTML
and can import and export XML data with an XSL
transformation. A SQL database. Supports ODBC.
- Software developed by: FileMaker, Inc. http://www.filemaker.com/
- Local use: Rare Materials Digital Services has developed in-house
digital access tracking databases and public databases such as the Holsinger
Database; the Digital Media Lab manages and delivers some course image collections
online for faculty using FileMaker; the Fine Arts Library is using a custom
FileMaker database developed at Brown University called IRIS for its image
collection management and cataloging.
Finding Aid
- What it is: An archival descriptive tool such as an inventory or
register, sometimes called a calendar. Finding aids typically take the
form of hierarchical, narrative descriptions of collections of manuscript
materials or archival records.
- Local use: Finding aids were once typed up or created using a word
processor and printed out for patron use. UVa Special Collections creates
electronic finding aids using the EAD standard, and
makes them available for searching on the web.
Flash
- What it is: A proprietary software format for interactive applications
on the web, ranging from simple animations to large-scale applications that
provide access to data.
- Format specified by: Macromedia supplies the authoring tool (Flash
MX) and the plugin (Flash Player) required for viewing
Flash animation and applications in Web browsers. http://www.macromedia.com/software/flash/
and http://www.macromedia.com/software/flashplayer/
- Local use: Used by the Digital Media Lab and IATH to develop interactive
and animated interfaces to research databases on the web.
FRBR
- What it is: Functional Requirements for Bibliographic
Records (pronounced "fur-burr"). An entity-relationship
cataloging model that is intended to be independent of any cataloging code
or implementation. The primary concepts are Work, Expression, Manifestation,
and Item. A Work is a distinct intellectual or artistic creation, realized
through an Expression, the intellectual or artistic realization of a work
in the form of alphanumeric notation, musical notation, choreographic notation,
sound, image, object, movement, etc., or any combination. A Manifestation
describes and represents physical entities, that is all the items that have
the same content and carrier. The Item is an individual copy of a Manifestation.
- Standard developed by: International Federation of Library Associations
and Institutions (IFLA). http://www.ifla.org/VII/s13/frbr/frbr.htm
- Local use: Not currently in local use at UVa.
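The four FRBR entities form a simple containment chain, which the sketch below models with Python dataclasses. The class fields and the sample data are assumptions made for illustration; FRBR itself is a conceptual model and prescribes no particular implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    barcode: str  # one individual copy of a Manifestation


@dataclass
class Manifestation:
    carrier: str  # e.g. a particular printing or file format
    items: list[Item] = field(default_factory=list)


@dataclass
class Expression:
    form: str  # e.g. a text in a given language, or a recorded performance
    manifestations: list[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: list[Expression] = field(default_factory=list)

    def all_items(self) -> list[Item]:
        """Collect every copy of every Manifestation of every Expression."""
        return [
            item
            for expression in self.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items
        ]


# Hypothetical holdings: one Work, one Expression, two Manifestations, three Items.
paperback = Manifestation("paperback edition", [Item("X001"), Item("X002")])
etext = Manifestation("electronic text", [Item("E001")])
work = Work("Hamlet", [Expression("English text", [paperback, etext])])

print([item.barcode for item in work.all_items()])
```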
FTP
- What it is: File Transfer Protocol. A method
of transferring files between computers on the Internet. The process for
uploading (moving files to) and downloading (copying files from) a server.
- Protocol specified by: Defined in RFC 959, part of the Internet standards (RFC) series. The World Wide Web Consortium (W3C) hosts a copy at http://www.w3.org/Protocols/rfc959/Overview.html
- Local use: Files are moved between desktop machines and the servers
via a secure FTP application.
GDMS
- What it is: General Descriptive Modeling Scheme,
developed by the UVa Library as the element set for descriptive, administrative,
and technical metadata.
- Standard specified by: UVa Library.
- Software using GDMS: GDMS Tool, a locally developed Java application,
for marking up GDMS XML files.
- Local use: The GDMS documentation is available at: http://www.lib.virginia.edu/dlbackstage/resndev/metadata.html
GIF
- What it is: Graphics Interchange Format. A format
for representing still and simple animated images, used widely in Web pages
and supported by all Web browsers.
- Format specified by: CompuServe, last updated in 1990. CompuServe
does not maintain a site describing GIF, but the specification can be found
in several places, including as
a text file on the W3C site.
- Software using GIF formats: Virtually all graphical Web browsers
and graphical editing programs.
- Local use: GIF and JPEG formats are both used
for Web page graphics on the Library's web sites.
- Notes on use: GIFs support frame-by-frame animation, and transparent
areas. However, no more than 256 colors can appear in a single GIF, making
the format unsuitable for color photographs or other images that require
fine color gradations.
GIS
- What it is: Geographic Information Systems.
GIS is used for storage, retrieval, mapping, and analysis of geographic
data. Spatial "features" (locations) are stored in a coordinate
system (latitude/longitude, etc.), which references a particular place on
the earth. Descriptive attributes in tabular form are associated with spatial
features. Spatial data and associated attributes in the same coordinate
system can then be layered together for mapping and analysis. Not the same
as GPS (Global Positioning System): GPS identifies real locations; GIS organizes
data about them.
- Standard specified by: The Federal Geographic Data Committee (FGDC)
developed the National Spatial Data Infrastructure standard for the formatting
of GIS data, as well as the metadata standard. http://www.fgdc.gov/
- Local use: The Geostat Center and the Science and Engineering Library
collect and develop GIS-based resources, and support the use of GIS at UVa.
Handle
- What it is: A comprehensive system for assigning, managing, and
resolving persistent identifiers, known as "handles," for digital
objects and other resources on the Internet. The protocols enable a distributed
computer system to store handles of digital resources and resolve those
handles into the information necessary to locate and access the resources.
Handles can be used as Uniform Resource Names (URNs).
- Standard specified by: The Corporation for National Research Initiatives
(CNRI). http://www.handle.net/
- Local use: Handles are not in use at the UVa Library.
Header
- What it is: The header is the first section of a file, which includes
metadata embedded by the creator of a digital information resource for description
and management purposes. While this is often used to index a file for discovery
and retrieval, it is not necessarily displayed as part of the content.
- Local use: This can include Dublin Core elements
included in the top of a Web page, the TEI header within
an electronic text file, or the technical metadata in the header of a TIFF
image file.
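As one example of header metadata, the sketch below pulls Dublin Core elements out of the head of a Web page. It assumes the common convention of meta tags whose names begin with "DC."; the page content and the class name are made up for illustration.

```python
from html.parser import HTMLParser


class DublinCoreMetaParser(HTMLParser):
    """Collect <meta> tags whose name starts with "DC." (assumed naming convention)."""

    def __init__(self):
        super().__init__()
        self.elements = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.lower().startswith("dc."):
            self.elements[name] = attributes.get("content", "")


# A made-up page used only for this example.
PAGE = """<html><head>
<title>Sample page</title>
<meta name="DC.title" content="Sample page">
<meta name="DC.creator" content="Example Library">
<meta name="keywords" content="not Dublin Core">
</head><body><p>Body text.</p></body></html>"""

parser = DublinCoreMetaParser()
parser.feed(PAGE)
print(parser.elements)
```

The extracted values never appear in the displayed page, which matches the point that header metadata supports indexing and retrieval rather than display.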
HTML
- What it is: Hypertext Markup Language.
The standard format for Web documents. There are a number of successive
versions of this standard (tags and syntax change over time), some of which
are better supported by some browsers than others. Of particular note is
XHTML, HTML that follows the same strict "well-formedness"
requirements as XML.
- Format specified by: The World Wide Web Consortium. http://www.w3.org/MarkUp/
- Software for use with HTML: Dreamweaver, Note Tab, or any ASCII
text editor can be used to create HTML files. All Web browsers of course read
and display HTML.
- Local use: Used throughout the Library's web environment for static
pages, for content delivered dynamically from databases, and web-delivered
XML content. We apply local custom styles to HTML using cascading
stylesheets. We are moving to XHTML as we can.
- Notes on use: We strongly suggest NOT using Microsoft Word to create
documents and then save them as HTML. If you must re-purpose a Word File,
save the file as HTML and then open the file in Dreamweaver and use its
tool (under the Commands menu) to "Clean up Word HTML."
HTTP
- What it is: Hypertext Transfer Protocol.
The standard protocol for requesting documents or operations via a Web browser.
- Protocol specified by: The Internet Engineering Task Force (IETF), in coordination with the World Wide Web Consortium. http://www.w3.org/Protocols/
- Software using this protocol: All Web servers and browsers.
- Notes on use: Because Web browsers and servers are ubiquitous,
HTTP has become the de facto standard protocol used to request operations
remotely using a Web browser. The exact details of HTTP are invisible to
most users of the Web, and authors of Web documents.
Ingestion
- What it is: The term used to describe the process through which objects are added into the Digital Library Repository.
- Local use: The term is generically used to describe the loading of digital content into any sort of management system. At the UVa Library the term is used to describe the process where scripts are run to prepare metadata files and add digital objects into the Repository for management and delivery.
IP Address
- What it is: Unique numerical identifier given to each computer
on the network and server on the Internet. The IP address is the address
through which you find resources and how data finds its way from a web site
back to your computer. A domain of IP addresses is
the range of IP addresses assigned to the institution reflected by that
domain. A URL is a mnemonic alias for a server's IP address
and the location of files in its directory structure.
- Local use: At UVa, all IP addresses are assigned dynamically within
the proper range each time we start our machines. Users off Grounds can
use UVa Proxy Server to assign their machine a UVa
IP Address if needed for authentication purposes.
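The relationship between a URL, a server name, and a file location can be sketched by taking a URL apart. The example below uses a URL that appears earlier in this glossary. The `resolve` helper is hypothetical and is defined but not called, because looking up an IP address requires network access.

```python
import socket
from urllib.parse import urlsplit

url = "http://www.lib.virginia.edu/clemons/RMC/dml.html"
parts = urlsplit(url)

print(parts.scheme)    # the protocol used to make the request
print(parts.hostname)  # the server name, an alias for its IP address
print(parts.path)      # the location of the file on that server


def resolve(hostname: str) -> str:
    """Look up the IP address behind a host name (requires network access)."""
    return socket.gethostbyname(hostname)
```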
Java
- What it is: An object-oriented programming language
designed to be secure, and portable across different operating systems.
- Language specified by: Sun Microsystems, with additional tools
provided by various third parties. http://java.sun.com/
- Software using this language: Web browsers will run Java programs,
unless users have turned off Java features. Java programs can also be run
as programs on any machine that can support the appropriate Java environment.
- Local use: Some digital library software is implemented in Java,
including Fedora, the GDMS Tool,
and the image re-sizer applet used in delivering images
from the Repository.
JavaScript
- What it is: A scripting language designed for use on Web pages
or servers.
- Language specified by: Netscape. Microsoft has a competing product
called JScript that includes features that may not work on non-Microsoft
browsers. http://developer.netscape.com/tech/javascript/index.html
- Software using this language: Most Web browsers will run JavaScript
programs to varying degrees, unless users have turned off JavaScript features.
- Local use: Almost all Library Web pages and services delivered
through the repository use JavaScript for menus.
JPEG
- What it is: The image format controlled by the Joint Photographic
Experts Group. Technically, the image files are actually called
JFIF (JPEG File Interchange Format), and use a gracefully stepped-down compression
scheme. Common usage describes these files as JPEG images.
- Format specified by: The Joint Photographic Experts Group. http://www.jpeg.org/
- Software using JPEG format: Most full-featured graphics editors,
and most graphical Web browsers, support JPEG's basic image format, JFIF.
Support is more limited for other JPEG formats.
- Local use: JPEG format images are used for Web page graphics on
UVa Library web sites, as well as for the delivery of documentary images
and page images from the collections.
- Notes on use: JPEG is particularly useful for displaying photographs
and other images on the Web that do not use a limited color palette or sharply
defined boundaries. It uses a compression algorithm that can be optimized
for either image quality or compactness. JPEG (the group) is working on
new standards (JPEG2000) that support lossless compression and wavelet compression
(a compression technique also used by MrSID).
JPEG 2000
- What it is: The wavelet-compressed image
format controlled by the Joint Photographic Experts
Group. Also referred to as J2.
- Format specified by: The Joint Photographic Experts Group. http://www.jpeg.org/jpeg2000/
- Software using JPEG 2000 format: Most recent full-featured graphics
editors. Web browsers require a plug-in to handle JPEG 2000 files.
- Local use: In 2005, the Library will introduce JPEG 2000 production to
partially replace MrSID files.
MARC
- What it is: Machine Readable Cataloging,
now called the MARC21 standard in its current incarnation. The standard
format for library catalog records. Related standards include MARCXML
(XML tags for the full MARC field set) and MODS
("Metadata Object Description Schema" -- XML tags for a minimal
MARC field set).
- Format specified by: The Library of Congress. http://lcweb.loc.gov/marc/
- Software using this format: Sirsi Unicorn (used to create records
describing our collections) and Sirsi WebCat, the software that runs Virgo.
- Local use: Used for all descriptions of the Library's collections
by the Cataloging department. Local documentation is available at http://www.lib.virginia.edu/cataloging/manual/.
Metadata
- What it is: Literally, "data about data," metadata includes
data associated with either an information system or an information object
for purposes of description, administration, legal requirements, technical
functionality, use and usage, and preservation. In other words, metadata
is cataloging, with the primary difference that metadata can either be external
to the item being described, such as is a MARC record,
or be contained within the item being described, such as Dublin
Core elements included in the top of a Web page or the TEI
header within an electronic text file.
- Format specified by: There are literally hundreds of metadata standards
specified by national and international organizations.
- Local use: Some days it feels like we use them all. Our primary
metadata standards in use at the Library are: DDI, Dublin
Core, EAD, FGDC, GDMS,
MARC, METS, TEI,
and the VRA Core.
Metasearch (aka Federated Search)
- What it is: A collection of distributed databases is searched
in such a way (a query is "federated" across all the selected
databases) that users see the search and the results as if they were searching
a single database; made possible by metadata crosswalks
and protocols such as Z39.50 and OpenURL.
- Local Use: A metasearch tool is planned that can send search Virgo,
the Repository, and licensed resources simultaneously.
METS
- What it is: Metadata Encoding and Transmission
Standard. A standard for encoding descriptive, administrative, and
structural metadata regarding objects within a digital library, expressed
using XML. Can be extended to use elements from other descriptive standards
as necessary.
- Format specified by: The Library of Congress. http://www.loc.gov/standards/mets/
- Software using this standard: The Fedora
system maintains control of digital library objects using METS-encoded
records.
- Local use: We have used METS to store information about our objects managed in the Repository.
MODS
- What it is: Metadata
Object Description Schema. A standard for encoding descriptive (bibliographic)
metadata regarding objects within a digital library, expressed
using XML. MODS is particularly aimed at encoding MARC data, but many profiles exist that map MODS to additional metadata formats.
- Format specified by: The Library of Congress. http://www.loc.gov/standards/mods/
- Software using this standard: MODS is becoming the common format for data sharing using the OAI protocol.
- Local use: We will add MODS records to our OAI operations in 2006.
MPEG
- What it is: Moving Picture Experts Group,
pronounced m-peg, is a family of digital video compression standards and
file formats developed by the group. MPEG generally produces better-quality
video through a high compression rate by storing only the changes from one
frame to another, instead of each entire frame. MPEG files can be decoded
by special hardware or by software. There are five major MPEG standards.
The MPEG-1 standard provides a video resolution of 352-by-240 at 30
frames per second (fps). This produces video quality slightly below the
quality of conventional VCR videos. MPEG-2 offers resolutions of
720x480 and 1280x720 at 60 fps, with full CD-quality audio. This is sufficient
for all the major TV standards. MPEG-4 is a graphics and video compression
algorithm standard that is based on MPEG-1 and MPEG-2 and Apple QuickTime
technology. Wavelet-based MPEG-4 files are smaller
than JPEG or QuickTime files, so they are designed to transmit video and
images over a narrower bandwidth and can mix video with text, graphics and
2-D and 3-D animation layers. MPEG-7 is a standard for describing
multimedia content data in a way that supports some degree of interpretation
of the information's meaning, which can be passed on to, or accessed
by, a device or computer code. MPEG-7 is not aimed at any one application
in particular; rather, the elements that MPEG-7 standardizes support as
broad a range of applications as possible. MPEG-21 is a standard
framework for networked digital multimedia that includes a Rights Expression
Language (REL) and a Rights
Data Dictionary. Unlike other MPEG standards that describe compression coding
methods, MPEG-21 describes a standard that defines the description of content
and also processes for accessing, searching, storing and protecting the
copyrights of content.
- Format specified by: The ISO MPEG. http://www.iso.org/iso/en/prods-services/popstds/mpeg.htm.
- Local use: The Digital Media Lab produces and delivers local and
licensed MPEG video files.
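The idea of storing only the changes from one frame to another can be sketched in a few lines of Python. This is a toy illustration, not the MPEG algorithm: real MPEG encoders use motion compensation and transform coding, and the function names and pixel values below are invented.

```python
def frame_delta(previous, current):
    """Record only the pixels that changed, as (index, new_value) pairs."""
    return [(i, c) for i, (p, c) in enumerate(zip(previous, current)) if p != c]


def apply_delta(previous, delta):
    """Rebuild the next frame from the previous frame plus the recorded changes."""
    frame = list(previous)
    for index, value in delta:
        frame[index] = value
    return frame


# Two invented 8-pixel "frames": a bright patch moves one pixel to the right.
frame1 = [10, 10, 10, 10, 200, 200, 10, 10]
frame2 = [10, 10, 10, 10, 10, 200, 200, 10]

delta = frame_delta(frame1, frame2)
rebuilt = apply_delta(frame1, delta)
print(delta)
```

Only two of the eight values need to be stored for the second frame, which is where the saving comes from when consecutive frames are similar.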
MrSID
- What it is: Multi-Resolution Seamless Image
Database. A proprietary format for representing highly compressed
images (using wavelet compression), and a set of
tools to display and manipulate them.
- Format specified by: LizardTech. Originally developed by Los Alamos
National Laboratory and the U. S. Geological Survey. http://www.lizardtech.com/support/faq/general_mrsid.php
- Software using this format: A number of disseminator applications
for the Repository translate MrSid files for viewing
in Web browsers.
- Local use: We create high quality MrSid files for the digital
image collection (this is especially useful for images of maps where supporting
many zoom-in steps is key), and render deliverable images for Web browsers
through our Repository. In 2006 we will switch some or all of our production
to JPEG 2000.
MySQL
- What it is: An extremely powerful, high-performance database system.
MySQL is possibly the world's most popular open source (noncommercial and
freely available) database. A relational SQL database
that is ODBC-compliant.
- Software developed by: MySQL AB. http://www.mysql.com/
- Local use: Used for all web-based delivery of database content,
including the Central Digital Repository and Virgo.
OAI
- What it is: Open Archives Initiative. Standards
for the encoding, harvesting (capture from remote sites), and the construction
of repositories of metadata records describing local or remote collections.
OAI provides standards and best practice guidelines for the creation of
OAI tools by other organizations and institutions.
- Standard and Protocol developed and maintained by: Open Archives
Initiative Steering Committee. http://www.openarchives.org/
- Local use: Fedora has the ability to "expose"
metadata as an OAI data service provider. The UVa Library has harvested
metadata from other repositories in the general area of American Studies
as a part of the Mellon-funded American Studies grant, and developed prototype
scripts to analyze the relevance of the harvested metadata records. Metadata
records for remote objects may be loaded into the Digital Library Repository.
Object-oriented
- What it is: Having to do with or making use of objects; an object
in this sense is a component containing both data and instructions for the
operations to be performed on that data, as well as relationships between
objects. In object-oriented programming, these reusable components are linked
together in various ways to create applications. One of the principal advantages
of object-oriented programming techniques is that they enable programmers
to create modules that do not need to be changed when a new type of object
is added, making object-oriented programs easier to modify. To create object-oriented
programs, one needs an object-oriented programming language. Java
is an object-oriented language.
- Local Use: The Fedora Repository
is developed using an object structure with Java to store data as objects.
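The advantage described above, that existing code need not change when a new type of object is added, can be shown with a short sketch. It is written in Python rather than Java to keep it brief, and the class names are invented; it does not reflect how Fedora is actually structured.

```python
class DigitalObject:
    """An object bundles data (identifier) with operations on it (describe)."""

    def __init__(self, identifier):
        self.identifier = identifier

    def describe(self):
        return f"{self.identifier}: generic object"


class ImageObject(DigitalObject):
    def __init__(self, identifier, width, height):
        super().__init__(identifier)
        self.width = width
        self.height = height

    def describe(self):
        return f"{self.identifier}: image, {self.width}x{self.height}"


class TextObject(DigitalObject):
    """A type added later; report() below did not have to change."""

    def __init__(self, identifier, word_count):
        super().__init__(identifier)
        self.word_count = word_count

    def describe(self):
        return f"{self.identifier}: text, {self.word_count} words"


def report(objects):
    # Works for any object that provides describe(), present or future.
    return [obj.describe() for obj in objects]


lines = report([ImageObject("img:1", 800, 600), TextObject("txt:7", 1200)])
print(lines)
```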
ODBC
- What it is: Open Database Connectivity.
The ODBC protocol makes it possible to access any data from any application,
regardless of which database is handling the data. ODBC manages this by
inserting a middle layer, called a database driver, between an application
and the database. This layer translates the application's data queries into
commands that the database understands. For this to work, both the application
and the database must be "ODBC-compliant" -- that is, the application
must be capable of issuing ODBC commands and the database must be capable
of responding to them. Queries are formatted using SQL.
- Protocol specified by: Microsoft Corporation. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/HTML/_core_odbc.asp
- Local Use: All interactive Library Web pages that provide access
to databases use the ODBC protocol behind the scenes.
Open Source
- What it is: A concept through which programming code is made available
through a license that allows users to make changes to the code. Any
changes are submitted to a group managing the open source product for possible
incorporation into the official version. Development and support are handled
cooperatively by a group of distributed programmers, usually on a volunteer
basis.
- Local Use: The Library takes advantage of open source products
such as MySQL and PHP, as well as
releasing Fedora as an open source product.
Open Text
- What it is: Open Text is an SGML-aware search
engine. Open Text provides support for word and phrase searching, indexing
of SGML elements and attributes, fast retrieval, and open systems integration.
Open Text (often referred to as OT5, or Open Text version 5) has not been
a supported product for some time, but has been licensed for redevelopment
by the University of Michigan as XPAT. Open Text is
different from an index like Google that cannot distinguish among parts
of a text or Web page because Google indexes all words without structure.
Open Text indexes all words and retains the meaningful structure and categorization
added by the SGML markup allowing both keyword searching and database-like
searching by the elements.
- Software developed by: No longer supported.
- Local use: Open Text and XPAT are used extensively by the Library in the
discovery and delivery of electronic texts and in the Repository.
OpenURL
- What it is: OpenURL is a protocol for interoperability between
a remote information resource and a local system that offers licensed access.
It is a URL that transports metadata or keys to access metadata for the
object for which the OpenURL is provided. Electronic journal and database
publishers are beginning to support OpenURL access to their resources, and
vendors such as Ex Libris and Sirsi are marketing URL Resolvers (programs
that translate requests).
- Protocol specified by: The NISO Committee for the Standardization
of OpenURL. http://www.sfxit.com/openurl/
and http://library.caltech.edu/openurl/default.htm
- Local use: The Library launched its Find@UVA OpenURL Resolver in fall 2004.
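A URL that transports metadata can be sketched as follows. The resolver address is a placeholder, not the Find@UVA address, and the citation values are invented; the key names (genre, issn, volume, spage, date) follow the original OpenURL key/value convention.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical resolver address, used only for this illustration.
RESOLVER = "http://resolver.example.edu/openurl"


def make_openurl(**citation):
    """Attach citation metadata to the resolver address as a query string."""
    return RESOLVER + "?" + urlencode(citation)


link = make_openurl(genre="article", issn="1234-5678",
                    volume="12", spage="45", date="2004")

# A resolver would do the reverse: read the metadata back out of the URL.
citation = parse_qs(urlsplit(link).query)
print(link)
```

The resolver, not the publisher, decides where the link finally leads, which is how the same citation can point to whichever copy the local institution has licensed.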
Page Image
- What it is: A scanned image of a page of text.
- Local use: Used to create PDFs for electronic reserves delivered
through Toolkit, as well as in the delivery of electronic texts from the
Etext Center and from the Repository.
PDF
- What it is: Portable Document Format. A format
for publishing documents, designed primarily for encoding their appearance
on screen and on paper. PDF/A, an archival standard for PDF, is under development.
- Format specified by: Adobe Systems. http://www.adobe.com/products/acrobat/adobepdf.html
PDF/A: http://www.aiim.org/standards.asp?ID=25013
- Software using this format: Adobe's Acrobat suite and all Adobe
software can create PDFs; the freely available Adobe Acrobat Reader displays
them in Web browsers.
- Local use and support: Used for delivery of electronic reserves
through Toolkit. Some electronic texts can be delivered as PDFs.
Persistent Identifier
- What it is: A persistent identifier is a name for a resource which
will remain the same regardless of where the resource is located. Links
to the resource will continue to work even if it is moved. Examples include
DOIs, OpenURL, Handles,
URNs (Uniform Resource Name), and PURLs (Persistent URLs).
- Local use: The Digital Library Repository assigns its own internal
persistent IDs to the objects that it manages. The Library is implementing
Sirsi's OpenURL Resolver, which also takes advantage of DOIs.
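The reason links keep working when a resource moves is that the identifier is looked up in a table at the time of use. A minimal sketch, with an invented identifier scheme, invented addresses, and a hypothetical Resolver class (not the Repository's actual mechanism):

```python
class Resolver:
    """Maps persistent identifiers to current locations (illustrative only)."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, url):
        # Registering again under the same identifier records a move.
        self._locations[identifier] = url

    def resolve(self, identifier):
        return self._locations[identifier]


resolver = Resolver()
resolver.register("example:1234", "http://old.example.edu/texts/1234.xml")
before = resolver.resolve("example:1234")

# The resource moves; only the table entry changes, not the identifier.
resolver.register("example:1234", "http://new.example.edu/repository/1234")
after = resolver.resolve("example:1234")
print(before, after)
```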
Perl
- What it is: An interpreted programming language often used for
Web scripts, text processing, and rapid prototyping. When invoked by Web
servers, Perl scripts are called via the CGI interface.
- Language specified by: Perl Mongers. http://www.perl.org/
- Local use: Perl is used to write scripts that analyze data for
delivery in Web browsers. Perl scripts are often used in combination with
CSS (to format output) or XSLT (to
translate output). Etext and Geostat particularly use Perl as part of their
delivery. Perl scripts are also often written as simple programs that can
transform data taken from one source (such as Virgo records) into other
formats (such as XML files).
PHP
- What it is: A scripting language in which PHP commands are included
in a Web page (such as commands to retrieve content from databases) and
are then executed on the web server to generate dynamic HTML pages.
This is an Open Source product.
- Language supported by: The PHP Group. http://www.php.net/
- Local use: One of the programming languages used to provide access
to databases over the web, used most often at the Library with a MySQL database.
Plugin
- What it is: A piece of software that adds extra features to a larger
piece of software. Plugins are most often used to give Web browsers access
to non-HTML content. The Flash Viewer,
the Adobe Acrobat PDF Reader, and the QuickTime
Player are plugins. Plugins for ubiquitous formats on the Web are generally
freely available.
- Software provided by: The vendors that control the proprietary
format documents to be viewed. Many standard plugins are installed on Library
machines upon setup.
- Local use: The Library does not automatically install updated plugins
when they become available because there are too many new versions appearing
too frequently, often without widespread notice.
QuickTime
- What it is: A format for encoding and delivering motion media and
audio files. Files are stored on a QuickTime server and are "streamed"
(gradually delivered on the fly without downloading) to the web browser.
- Format supported by: Apple Computer. The Server and the Player
software are freely available, but the software for creating QuickTime files
is a commercial product. http://www.apple.com/quicktime/
- Local use: QuickTime is extensively used for the creation of media
files by the Digital Media Lab.
RSS
- What it is: Short for RDF Site Summary or
Rich Site Summary, an XML format for syndicating Web
content that is often referred to as a newsfeed. A Web site that wants to
allow other sites to re-publish some of its content creates an RSS document
and registers the document with an RSS publisher. A user that can read RSS-distributed
content can use the content on a different site. Syndicated content includes
such data as events listings, news stories, headlines, project updates,
excerpts from discussion forums or even corporate information.
- Format supported by: RSS was originally developed by Netscape,
which no longer maintains the standard. Current control over the standard is unclear.
http://www.xml.com/pub/a/2002/12/18/dive-into-xml.html
or http://blogs.law.harvard.edu/tech/rss
- Local use: The University myuva Uportal uses RSS to redistribute
news, including the Cavalier Daily. The Robertson Media Center uses an RSS feed to keep users updated about new additions to the collection. RSS feeds on new collections are planned for the Repository.
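Because RSS is an XML format, a site that wants to reuse syndicated content can read a feed with any XML parser. The sketch below parses a small invented feed laid out in the RSS 2.0 style (channel, item, title, link); the addresses and headlines are placeholders, not a real Library feed.

```python
import xml.etree.ElementTree as ET

# An invented feed for illustration; not an actual Library or myuva feed.
FEED = """<rss version="2.0">
  <channel>
    <title>New additions (sample)</title>
    <link>http://www.example.edu/new/</link>
    <description>Invented feed for illustration.</description>
    <item>
      <title>First sample item</title>
      <link>http://www.example.edu/new/1</link>
    </item>
    <item>
      <title>Second sample item</title>
      <link>http://www.example.edu/new/2</link>
    </item>
  </channel>
</rss>"""

channel = ET.fromstring(FEED).find("channel")

# Each item becomes a (headline, link) pair that another site could display.
headlines = [(item.findtext("title"), item.findtext("link"))
             for item in channel.findall("item")]
print(headlines)
```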
Schema
- What it is: A DTD is for specifying the structure
(only) of an XML file: it gives the names of the elements
and their attributes that can be used, and how they fit together. DTDs are
designed for use with traditional text documents, not rectangular or tabular
data. An XML Schema provides a means of specifying formal data typing and
validation of content in terms of those data types, so that document type
designers can provide criteria for checking the data content of elements
as well as the markup itself. Schemas are themselves written as XML files.
- Format specified by: World Wide Web Consortium (W3C). http://www.w3.org/TR/xmlschema-0/
- Local Use: No local use at UVa yet, although we are planning ahead
to ensure that our Digital Library Repository and its
related systems can support Schemas in addition to DTDs.
Search Engine
- What it is: A program that indexes documents and allows users to
search the index. In the context of the Web, the term usually refers to
a facility for searching a large index of Web pages, such as Google.
- Local Use: Part of the functionality of Virgo is the WebCat search
engine that in part provides search access to the records describing the
collections. The Library web site uses Google to index and search its pages.
The Etext Center provides search access to their collections using the Open
Text search engine. The XPAT search engine has been
licensed for testing with the Repository.
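What "indexes documents and allows users to search the index" means can be sketched with a small inverted index. This is a teaching example with invented documents and function names; it is not how WebCat, Google, Open Text, or XPAT are implemented.

```python
import re
from collections import defaultdict

# Three invented documents, keyed by an id.
DOCUMENTS = {
    1: "Maps of Virginia and the Chesapeake",
    2: "Letters from Virginia",
    3: "Tibetan and Himalayan maps",
}


def words_in(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in words_in(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every word in the query."""
    words = words_in(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


index = build_index(DOCUMENTS)
print(search(index, "maps virginia"))
```

The index is built once; each search then consults the index rather than rereading the documents.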
Servlet
- What it is: A Java program that resides and
executes on a server to add functionality to the server or support processing
of data on the server. The term was coined in the context of the Java applet,
a small program that is sent as a separate file along with a Web page.
- Local use: The Repository uses Servlets for some of its functionality.
SGML
- What it is: Standard Generalized Markup Language.
An older standard format for representing structured documents and data
that was the predecessor to XML. SGML was developed originally
in the 1960s and 1970s for publishing; HTML is derived
from SGML.
- Standard defined by: ISO, but is not available online through them.
http://xml.coverpages.org/sgmlsyn/sgmlsyn.htm
and http://www.oasis-open.org/cover/general.html
- Local Use: Standards defined for SGML include TEI
and EAD. Electronic texts were marked up by the Etext
Center in TEI for SGML from 1992 to 2001. All SGML resources
in the collections are expected to be converted to XML over time, and as
allowed by license.
Sirsi Unicorn and WebCat
- What it is: Software managing our library acquisitions and cataloging
(Unicorn), and circulation and Online Public Access Catalog (OPAC -- WebCat).
- Software provided by: Sirsi Corporation. See their Unicorn page:
http://www.sirsi.com/Sirsiproducts/unicorn.html
- Local use and support: Used across all areas of the Library, for
acquisitions, cataloging, circulation, and reference. The OPAC provides
access to information about the collection to staff, faculty, students and
the public. Cataloging records are in the MARC format.
SOAP
- What it is: SOAP (Simple Object Access Protocol) is a lightweight
protocol for exchanging messages between computer software, typically in
the form of software components. SOAP is based on XML. SOAP can be run on
top of all the Internet Protocols, but HTTP is the most common. SOAP is
one of the enabling protocols for Web Services.
- Protocol specified by: World Wide Web Consortium (W3C). http://www.w3.org/2000/xp/Group/
- Local Use: Fedora and the Digital
Library Repository utilize SOAP as the protocol for their Web Services.
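A SOAP message is an XML document with an Envelope containing a Body, which in turn holds the request. The sketch below builds one such message. The envelope namespace is the SOAP 1.1 one; the service namespace, the operation name getObject, and the identifier are invented and do not describe Fedora's actual interface.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.edu/repository"           # hypothetical service namespace


def build_request(operation, parameters):
    """Wrap an operation and its parameters in a SOAP Envelope and Body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in parameters.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="unicode")


# "getObject" and "identifier" are made-up names for this illustration.
message = build_request("getObject", {"identifier": "example:1234"})
print(message)
```

In practice the message would be sent as the body of an HTTP POST to the service; that step is left out here.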
Streaming
- What it is: Streaming is a method for delivering video and audio
files over the Web. Streaming does not download an entire movie; instead,
it siphons out a thin, one-way data stream at a constant rate that plays
the broadcast in real time. A streamed one-minute movie plays in exactly
one minute. As long as the connection has enough bandwidth to handle the
data stream, the movie will play. After the data is displayed, it is discarded.
Viewers can see the broadcast again only by requesting it from the streaming
server.
- Local Use: Streaming QuickTime media files
are created extensively by the Digital Media Lab. The Library stores its
streaming media files on a server supported by ITC.
SQL
- What it is: Structured Query Language. SQL
is a standard interactive and programming language for getting information
from and updating a database. Interactions take the form of a command language
that lets you select, insert, update, and find out the location of data.
- Language defined by: A national (ANSI) and international (ISO)
standard. http://web.ansi.org/ and http://www.iso.ch/iso/en/ISOOnline.openerpage
- Local Use: All databases in use at the Library are SQL databases.
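The insert, update, and select commands mentioned above look like this in practice. The sketch uses SQLite through Python's built-in sqlite3 module purely so that it runs without a server; the table and its contents are invented, and the Library's own databases are not SQLite.

```python
import sqlite3

# An in-memory database, so the example leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT, format TEXT)")

# INSERT: add two invented rows.
conn.executemany(
    "INSERT INTO items (title, format) VALUES (?, ?)",
    [("Sample map", "TIFF"), ("Sample text", "XML")],
)

# UPDATE: change one row in place.
conn.execute("UPDATE items SET format = ? WHERE title = ?", ("JPEG 2000", "Sample map"))

# SELECT: read the rows back.
rows = conn.execute("SELECT title, format FROM items ORDER BY id").fetchall()
conn.close()
print(rows)
```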
SRU/SRW (aka Zing)
- What it is: A web services implementation of the Z39.50
protocol that specifies a client/server-based protocol for searching and
retrieving information from remote databases. It specifies procedures and
structures for a client system to search a database provided by a server,
retrieve database records identified by a search, scan a term list, and
sort a result set. Access control, resource control, extended services,
and a "help" facility are also supported. The protocol addresses
communication between corresponding information retrieval applications,
the client and server (which may reside on different computers); it does
not address interaction between the client and the end-user.
- Protocol specified by: The Library of Congress ZING (Z39.50 International:
Next Generation) group. http://lcweb.loc.gov/z3950/agency/zing/srw/
- Local Use: Many licensed electronic journals and databases are
accessed through Z39.50. The SRW implementation of Z39.50 will likely have
a role in developing a federated search tool that
can search Virgo, the Repository,
and licensed resources simultaneously.
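In the SRU form of the protocol, a search is expressed as an ordinary URL. The sketch below assembles one. The parameter names (operation, version, query, maximumRecords) are those defined by SRU; the server address and the query are invented, and version 1.1 is assumed only for the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def sru_search_url(base_url, cql_query, maximum_records=10):
    """Build a searchRetrieve request URL for a hypothetical SRU server."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(maximum_records),
    }
    return base_url + "?" + urlencode(params)


# The server address is a placeholder, not a real service.
url = sru_search_url("http://sru.example.edu/catalog", 'dc.title = "virginia"', 5)
print(url)
```

The server's reply would be an XML document containing the matching records; parsing it is omitted here.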
Tag
- What it is: An element in HTML or XML
is a fundamental component of the structure of a document. Elements can
contain plain text, other elements, or both. To denote or mark up the various
elements in a document, you use tags. Tags consist of a left angle bracket
(<), a tag name, and a right angle bracket (>). Tags are usually paired
(e.g., <H1> and </H1>) to start and end the tag instruction.
The end tag looks just like the start tag except a slash (/) precedes the
text within the brackets.
- Standard defined by: Defined within the standards for HTML and
XML.
- Local Use: In all HTML and XML documents.
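The pairing of start and end tags around content can be seen by running a short piece of HTML through a parser. This sketch uses Python's built-in html.parser; the class name TagLister is invented. Note that this parser reports tag names in lowercase, so <H1> is reported as h1.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Record each start tag, piece of text, and end tag in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))


lister = TagLister()
lister.feed("<H1>Glossary</H1><P>Plain text</P>")
lister.close()
events = lister.events
print(events)
```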
Tamino
- What it is: An XML database used to index, manage, and search
XML documents.
- Product supplied by: Software AG. http://www.softwareag.com/tamino/default.htm
- Local Use: Tamino was tested at the Library, but not continued
in production.
TCP/IP
- What it is: Transmission Control Protocol/
Internet Protocol. The standardized suite of network protocols
that enables information systems to link to other information systems on
the Internet, regardless of their computer platform. TCP and IP are two
software communication standards used to allow multiple computers to talk
to each other.
- Protocol specified by: Internet Engineering Task Force (IETF). http://www.ietf.org/
- Local use: All workstations and servers at UVa communicate with
the network and each other via TCP/IP.
TEI
- What it is: Text Encoding Initiative. A format
for representing text documents, designed primarily for encoding their logical
structure. The format and rules are expressed as a DTD
against which TEI files are checked for conformity to the rules. TEI encoded
files include a TEI Header that contains the basic metadata that describes
the content (author, title, date, publisher, etc.) in addition to the markup
that structures the text itself.
- Format specified by: The Text Encoding Initiative Consortium.
UVa is one of the founding organizations. http://www.tei-c.org/
- Software using this format: Note Tab, Word Perfect, and any ASCII
text editor can be used to mark up (create) TEI files.
- Local use: TEI has been used to encode electronic texts through
the Etext Center since 1992. The original "official" TEI is an SGML-encoded
format, and a subset known as TEI Lite was created as a minimum
standard format. TEI P4 and TEIXLite are the current standards available
for XML encoding. UVa has created its own set of rules
(an "extension") for the use of the TEI structure that we call
the "DLPS DTD."
TIFF
- What it is: A standard format for representing images, suitable
for archival use.
- Format specified by: Adobe. The specification for the latest standard
version (6.0, standardized in 1992) can be found in this
PDF document from Adobe.
- Software using this format: All full-featured image capture and
manipulation programs, such as PhotoShop, can read or write TIFF files.
Most Web browsers do not have built-in TIFF support.
- Local use: TIFF is the format used for all preservation, collection,
and delivery masters for all image collections. Some delivery applications
for electronic texts display the page images using TIFF-to-GIF, a program
developed at the University of Michigan that translates TIFF to GIF files
on-the-fly (as needed) for viewing in Web browsers.
Unicode
- What it is: The Unicode Standard is the universal character-encoding
standard used for representation of text for computer processing. Unicode
provides a unique numeric code (a code point) for every character, no matter
what the platform, no matter what the program, no matter what the language.
- Standard developed by: The Unicode Consortium. http://www.unicode.org/
- Local use: Used for encoding non-Roman-language electronic
text resources, such as the Japanese Text Initiative and the Tibetan and
Himalayan Digital Library.
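The "unique numeric code" for a character can be inspected directly. The sketch below shows the code point of a Latin, a Japanese, and a Tibetan character, together with the bytes used to store each in UTF-8, one common way of writing Unicode text to a file. The choice of sample characters is arbitrary.

```python
# Latin capital A, Hiragana letter A, Tibetan letter KA (written as escapes
# so the example does not depend on the encoding of this page).
samples = ["A", "\u3042", "\u0f40"]

# For each character: the character, its code point, and its UTF-8 bytes in hex.
table = [(ch, f"U+{ord(ch):04X}", ch.encode("utf-8").hex()) for ch in samples]

for character, code_point, utf8_bytes in table:
    print(character, code_point, utf8_bytes)
```

The code point is the same on every platform; only the byte encoding chosen for storage (UTF-8 here) can vary.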
URL
- What it is: Uniform (or Universal) Resource
Locator. The standard type of reference used in World Wide Web hyperlinks.
- Format specified by: The World Wide Web Consortium. http://www.w3.org/Addressing/
- Software using this format: All standard Web browsers, including
Netscape and Internet Explorer.
- Local use: Used for all Library Web page access.
URN
- What it is: Uniform Resource Name. A standard
for names that, backed by an institutional commitment, serve as persistent,
location-independent resource identifiers in place of a URL.
- Format specified by: Internet Engineering Task Force. http://www.ietf.org/
- Local use: URNs are not in use at the UVa Library. Some resources
licensed by the Library may employ URNs.
UVa Anywhere
- What it is: Licensed Library resources can normally be used only
from computers on Grounds; UVa Anywhere is one of the ways that you can
authenticate (prove your UVa affiliation) to allow use from home or while
traveling. ITC provides a "virtual private network" that puts
your computer temporarily on the UVa network. Register with ITC to put a
certificate on your computer, then download and install the network software.
- Provided by: ITC at UVa. Instructions are available at http://www.itc.virginia.edu/desktop/pki/vpn/.
- Local use: UVa Anywhere is a new service as of spring 2003. At
this time, it is available only for Windows PCs. A Mac version of the software
should be available later in the year. Once it is set up, UVa Anywhere provides
reliable, fast access with none of the database-specific problems seen with
proxy. It also works with non-Web resources, so that EndNote users can connect
off Grounds as easily as on. Setup is somewhat more complicated than for
proxy and requires downloading 4.2 MB of software. There can be problems
with software firewalls and with some ISPs, especially AOL and
Juno.
UVa Proxy Server
- What it is: Licensed Library resources can normally be used only
from computers on Grounds; the Proxy Server is one of the ways that you
can authenticate (prove your UVa affiliation) to allow use from home or
while traveling. ITC maintains a proxy server that provides access to selected
Web resources. Setting up to use it is a two-step process: register with
ITC for a proxy account, then configure your Web browser for proxy use.
- Provided by: ITC at UVa. Instructions are available at http://www.itc.virginia.edu/desktop/proxy/.
- Local use: If you have an Internet service provider, proxy can
provide good access to licensed Web resources. It is easy to set up and
has little effect on the speed of your Web browsing, but problems can happen
unpredictably with some versions of Netscape and IE when connecting to particular
databases.
Virgo -- also see Sirsi
- The name for the UVa OPAC (online public access catalog) and suite of
online user services.
VRA Core
- What it is: The VRA Core Categories, Version 3.0, consist of a
single metadata element set that can be applied as
many times as necessary to create records to describe works of visual culture
as well as the images that document them. The VRA Core can be used to design
a set of databases; an XML Schema
is under development in 2004-2005.
- Format specified by: The Visual Resources Association Data Standards
Committee. http://www.vraweb.org/vracore3.htm
- Local Use: The underlying standard for the IRIS collection management
system used by the Fine Arts Library for its image collections.
Web Services
- What it is: The term Web services describes a standardized way
of integrating Web-based applications using XML encoding (WSDL - Web Services
Description Language), communication protocols (SOAP),
and programming scripts over the Internet.
- Format specified by: World Wide Web Consortium (W3C). http://www.w3.org/2002/ws/
- Local Use: The Fedora system is built using
Web Services.
Wavelet
- What it is: A wavelet is a mathematical function useful in image
compression. Wavelets can compress images to a greater extent than is generally
possible with other methods. In some cases, a wavelet-compressed image can
be as small as about 25 percent the size of a similar-quality image using
JPEG.
- Local use: The Library uses MrSid, a wavelet-compressed
image format, in the delivery of its image collections. The Library will also begin
production with JPEG 2000, another wavelet-compressed
format, in 2005.
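How a wavelet helps with compression can be sketched with the simplest wavelet of all, the Haar wavelet, applied to one row of pixel values. This is only an illustration of the principle; MrSid and JPEG 2000 use more elaborate wavelets and additional coding steps, and the pixel values and function names here are invented.

```python
def haar_step(values):
    """Split a row into pairwise averages and pairwise half-differences."""
    pairs = list(zip(values[0::2], values[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def haar_inverse(averages, details):
    """Rebuild the original row from averages and details."""
    values = []
    for average, detail in zip(averages, details):
        values.extend([average + detail, average - detail])
    return values


# One invented row of eight pixel values.
row = [100, 102, 98, 96, 50, 50, 52, 48]
averages, details = haar_step(row)
restored = haar_inverse(averages, details)
print(averages, details)
```

The averages are a half-size version of the row, and the details are small numbers close to zero. Small values can be stored compactly or discarded, which is the source of the compression; repeating the step on the averages gives the multiple zoom levels that wavelet formats offer.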
XML
- What it is: Extensible Markup Language. A
set of rules for creating tagged ASCII text files representing structured
documents and data. XML files describe their own structure and content by
making a reference to a DTD or an XML Schema
that indicates the rules followed in the document and its tagset. The rules
can follow a published standard or locally defined use. XML has stringent
rules for what is called "well-formedness" -- all XML documents,
no matter what particular scheme you are using, must follow certain structural
rules. XML files should also be "valid," requiring that the files
be parsed (checked) against a DTD or Schema. A rule might be: "My document
has an element called text, and within every text there's one and only
one author, and one and only one title." A parser is a program that
checks the document against the rules and makes sure that the document follows
those rules, and is valid.
- Format specified by: The World Wide Web Consortium. The basic XML
specification is now standardized, including the implementation of XML DTDs,
XML Schema, XQuery, and XLink, XPointer, and XPath
(external and internal pointer statements). http://www.w3.org/XML/
- Software using this format: Note Tab, Word Perfect, and any ASCII
text editor can be used to mark up (create) XML files. Recent versions of
Internet Explorer can display XML; otherwise, XML must be translated by
XSLT for viewing in a browser.
- Local use: The primary method for encoding all digital library
encoding projects since 2002.
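The difference between well-formed and valid can be sketched as follows. Python's built-in XML parser checks well-formedness only and does not validate against a DTD or Schema, so the example rule (every text has exactly one author and one title) is checked here by a hand-written function standing in for a validating parser. The sample documents and function names are invented.

```python
import xml.etree.ElementTree as ET


def is_well_formed(source):
    """True if the source obeys XML's structural rules (matched, nested tags)."""
    try:
        ET.fromstring(source)
        return True
    except ET.ParseError:
        return False


def follows_rule(source):
    """Stand-in for validation: every text has exactly one author and one title."""
    root = ET.fromstring(source)
    return all(
        len(text.findall("author")) == 1 and len(text.findall("title")) == 1
        for text in root.iter("text")
    )


good = "<text><author>A. Writer</author><title>A Title</title></text>"
no_title = "<text><author>A. Writer</author></text>"
broken = "<text><author>A. Writer</title></text>"

print(is_well_formed(good), follows_rule(good))
print(is_well_formed(no_title), follows_rule(no_title))
print(is_well_formed(broken))
```

The second document is well-formed but breaks the rule; the third is not even well-formed, because its tags do not match.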
XPAT
- What it is: XPAT is an SGML/XML-aware
search engine. XPAT provides support for word and phrase searching, indexing
of SGML elements and attributes, a baseline of support for valid and well-formed
XML (including Unicode support), fast retrieval, and open systems integration.
XPAT is based on the search engine previously marketed as PAT and as Open
Text. XPAT is different from an index like Google that cannot distinguish
among parts of a text or Web page because Google indexes all words without
structure. XPAT indexes all words and retains the meaningful structure and
categorization added by the XML markup allowing both keyword searching and
database-like searching by the elements.
- Software provided by: University of Michigan. http://dlxs.org/products/xpat.html
- Local Use: XPAT has been purchased by the Library as an update
to Open Text, which is used extensively in the discovery and delivery of
electronic texts.
XPath and XQuery
- What it is: XPath is a language for querying internal parts of
an XML document, providing basic facilities for manipulation
of strings, numbers, and true-false (Boolean) conditions. XQuery is an extension
of XPath that provides a human-readable query syntax.
- Format specified by: The World Wide Web Consortium. http://www.w3.org/TR/xpath
- Local use: The search engine for the repository and XML-encoded
electronic texts must support XPath and XQuery to provide useful searching
for users.
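A few XPath expressions applied to a small document show what querying the internal parts of an XML document looks like. Python's built-in ElementTree module supports only a limited subset of XPath and no XQuery, so this is a partial illustration; the document and its contents are invented.

```python
import xml.etree.ElementTree as ET

# An invented collection of three encoded texts.
DOC = ET.fromstring("""<collection>
  <text id="t1"><author>First Author</author><title>Alpha</title></text>
  <text id="t2"><author>Second Author</author><title>Beta</title></text>
  <text id="t3"><author>First Author</author><title>Gamma</title></text>
</collection>""")

# Every title of every text.
all_titles = [title.text for title in DOC.findall("./text/title")]

# Titles of texts whose author element has a given value.
by_author = [text.findtext("title")
             for text in DOC.findall("./text[author='First Author']")]

# The title of the text with a given id attribute.
by_id = DOC.find("./text[@id='t2']/title").text

print(all_titles, by_author, by_id)
```

The second and third queries are the kind of database-like searching by element that a plain keyword index cannot provide.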
XSL and XSLT
- What it is: XSL is a language for expressing stylesheets in XML.
XSLT (XSL Transformation) is part of XSL. In addition to XSLT, XSL includes
an XML vocabulary for specifying formatting. XSL specifies the styling of
an XML document by using XSLT to describe how the document is transformed
into another XML document that uses the formatting vocabulary. XSLT can
also use CSS to augment styles.
- Format specified by: World Wide Web Consortium (W3C). http://www.w3.org/Style/XSL/
- Local Use: XSL and XSLT are used extensively for the delivery
of electronic texts and datasets, as well as for all delivery through the
Central Digital Repository.
Z39.50
- What it is: A client/server-based protocol
for searching and retrieving information from remote databases. It specifies
procedures and structures for a client system to search a database provided
by a server, retrieve database records identified by a search, scan a term
list, and sort a result set. Access control, resource control, extended
services, and a "help" facility are also supported. The protocol
addresses communication between corresponding information retrieval applications,
the client and server (which may reside on different computers); it does
not address interaction between the client and the end-user.
ZING (Z39.50 International: Next Generation) is available for review in
2004. The web services implementation of Z39.50
is SRW.
- Protocol specified by: The Library of Congress and the Z39.50 Implementers
Group (ZIG). http://www.loc.gov/z3950/agency/
- Local Use: Many licensed electronic journals and databases are
accessed through Virgo via Z39.50. Z39.50 will have a role in developing
a federated search tool that can search Virgo,
the Repository, and licensed resources simultaneously.