Central Eurasian Information
Resource Text Database
Data Dictionary
Dec. 4, 2000
Metadata
liaison: Diana Brooking
|
Field
Name |
DC
Mapping |
Searchable? |
Authority
File |
Comments |
|
Digital
Collection |
Source |
No |
none |
Contains
the name of the CONTENT database of which the digital text is a part.
(Automatically supplied by CONTENT software) Central
Eurasian Information Resource--Text Database |
|
Title |
Title |
Yes |
none |
The
title of the part of the text that has been scanned for the database. The
title should be taken from the text itself, that is, a Russian text will have
a Russian title. Use LC transliteration without the diacritics. If
there is no title available at all on the text or from another source (like a
table of contents), the inputter will have to supply a brief title that
describes the text in English, and put it in square brackets. Example: [Advertising brochure for Siberian
pickled mushrooms] (If
the scanned text is of an encyclopedia or journal article or a book chapter,
the title will be for that smaller part and the title of the fuller work that
the text came from will be recorded in the field "Text Source.") |
|
Author |
Creator |
Yes |
LCAF |
Name
of the author of the text. Input as lastname, firstname. If more than
one name, separate by <br> |
|
Language
of Text |
Language |
Yes |
The
language of the text. Use the language names spelled out in bold in the MARC21
language code list, not the codes. (The list will probably only need to be
consulted for the names of more obscure languages.) Examples:
Russian, English, Turkmen, Yakut. |
|
|
Publication
Date |
Date |
Yes |
none |
Date
the text was created or published. Generally the year only. Set Data type
to Date. Examples:
1984 1910 |
|
Notes |
Description |
Yes |
none |
May
contain a brief narrative summary of the text, its significance, etc. Any
information of importance that is not represented elsewhere, or that needs
further explanation can go in this field. |
|
Subjects |
Subject |
Yes |
TGM-1/LCSH
(LCAF
for names) |
Subject
terms that denote what the text is about. Since the image portion of the CEIR
project is using TGM-1 as its controlled subject vocabulary, terms for the
text Subjects field will also be taken first from LC's Thesaurus for Graphic
Materials I (use the terms as simple descriptors without subdivision). N.B.
Replace the ampersand ("&") in TGM with the word
"and". Since
TGM-1 was created for visual materials, it is very likely that subject terms
needed for texts will not be found in TGM-1, so the second source to consult
for terms will be LCSH. Again,
LCSH terms may be used as simple descriptors without subdivision. If
more than one subject, separate by <br> |
|
Historical
period |
Coverage |
Yes |
|
The
historical period discussed or covered by the content of the text (NOT the
publication date). Prefer
a named time period rather than a numeric identifier such as a date range if
possible. If more than one phrase is used, separate by <br> Example:
16th century |
|
Country |
Coverage |
Yes |
BGN |
Use
the following coverage fields for noting the geographic area discussed in the
text. |
|
Region |
Coverage |
Yes |
" |
Could
be used to create canned searches. This field will not always be filled in? Example:
Dalnevostochny ekonomicheskii raion |
|
Oblast/Province |
Coverage |
Yes |
" |
Will
be used to create canned searches. |
|
Rayon/District |
Coverage |
Yes |
" |
This
field will not always be filled in. |
|
City/Town |
Coverage |
Yes |
" |
|
|
Text
Source |
Source |
Yes? |
none |
This
will be a complete bibliographic citation to the source of the scanned text.
Will include the date of publication. The citation could also refer to a
journal or book that an article has been scanned from. Database
owner did not anticipate users searching this field. |
|
Object
Type |
Type |
Yes |
A
controlled list, with terms derived from AAT and other thesauri, is being
created for the use of inputters. Terms
should be used in the singular. Examples
could include article, treaty, pamphlet... |
|
|
Text
No. |
Identifier |
Yes |
none |
An accession number for the text, to help with administration of the
database; probably for staff use only. Examples: T1, T2, T3, etc. |
|
Rights |
Rights |
No |
LCAF |
Information
about the copyright holder for the text. |
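The field table above can be read as a mapping from local field names to Dublin Core elements, with `<br>` separating repeated values. The following is only a minimal sketch of that reading, assuming a record arrives as a plain dictionary of field name to string. The names `DC_MAPPING`, `split_values`, `normalize_subject`, and `to_dublin_core` are hypothetical and are not part of the CONTENT software.

```python
from collections import defaultdict

# Field name -> Dublin Core element, transcribed from the table above.
DC_MAPPING = {
    "Digital Collection": "Source",
    "Title": "Title",
    "Author": "Creator",
    "Language of Text": "Language",
    "Publication Date": "Date",
    "Notes": "Description",
    "Subjects": "Subject",
    "Historical period": "Coverage",
    "Country": "Coverage",
    "Region": "Coverage",
    "Oblast/Province": "Coverage",
    "Rayon/District": "Coverage",
    "City/Town": "Coverage",
    "Text Source": "Source",
    "Object Type": "Type",
    "Text No.": "Identifier",
    "Rights": "Rights",
}

SEPARATOR = "<br>"  # separator for repeated values, per the Comments column


def split_values(raw):
    """Split a field on <br> and drop empty or whitespace-only parts."""
    return [part.strip() for part in raw.split(SEPARATOR) if part.strip()]


def normalize_subject(term):
    """Replace the TGM ampersand with the word 'and', tidying spaces."""
    return " ".join(term.replace("&", " and ").split())


def to_dublin_core(record):
    """Group a record's values under their Dublin Core elements."""
    dc = defaultdict(list)
    for field, raw in record.items():
        element = DC_MAPPING[field]  # a KeyError flags an unknown field name
        values = split_values(raw)
        if field == "Subjects":
            values = [normalize_subject(v) for v in values]
        dc[element].extend(values)
    return dict(dc)
```

Because several local fields share one element (all of the geographic fields map to Coverage), the sketch collects values into lists rather than overwriting them.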
Issues:
Transliteration
and diacritics: CONTENT cannot support
diacritics at all, so I anticipate typing words in as they should be, just
without the diacritics. Transliteration will be an issue, since different
authority files use different transliteration systems, and consistency within
this database may not be achieved. LC transliteration will be used for personal,
corporate, and other names, as LCAF will be the authority file for these terms.
(See below for geographic names.) LC transliteration will also be used for titles
and other bibliographic information, and in the Subjects field (which uses
LCSH and LCAF as the authority files).
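The "minus diacritics" rule could be applied mechanically before text is typed or pasted into a record. The sketch below is only an illustration, not a feature of CONTENT: `strip_diacritics` is a hypothetical helper, and it assumes the source text is Unicode in which LC romanization marks appear as combining characters, with the soft and hard signs written as modifier primes.

```python
import unicodedata

# Assumption: the soft and hard signs are written as modifier primes
# (U+02B9, U+02BA). These are not combining characters, so they are
# listed explicitly.
EXTRA_MARKS = {"\u02b9", "\u02ba"}


def strip_diacritics(text: str) -> str:
    """Return text with combining marks and modifier primes removed."""
    # Decompose so that accented letters become base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    kept = [
        ch
        for ch in decomposed
        if not unicodedata.combining(ch) and ch not in EXTRA_MARKS
    ]
    return unicodedata.normalize("NFC", "".join(kept))
```

Applied to an LC-romanized form written with ligature half marks and a soft-sign prime, this yields the plain form of the kind used in the examples in this document, such as "Smolenskaia oblast".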
Controlled
vocabulary for geographic names: The GIS portion of the project will be using the
BGN/GEOnet forms of geographic names.
For consistency within the project, BGN forms should also be used in the
Coverage fields of the CONTENT text portion of the project. (But for
cross-database consistency within CONTENT, LCAF forms in our local hierarchical
format would be the choice. In addition to sometimes choosing different forms of
names, BGN and LC do not use the same transliteration system.)
Examples:
GEOnet | LC | Other CONTENT databases
Russia | Russia (Federation) |
Moskva | Moscow | Russia (Federation)--Moscow
Smolenskaya Oblast | Smolenskaia oblast |
MIG
said to use BGN as the authority file for the Coverage fields, to maintain
consistency within the project.
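The decision to prefer BGN forms in the Coverage fields can be pictured as a small lookup. The sketch below is hypothetical and uses only the three example names given in this document; `NAME_FORMS` and `coverage_form` are illustrative names, and a real authority check would consult the full BGN/GEOnet file.

```python
# (BGN/GEOnet form, LC form) pairs, taken from the examples in this document.
NAME_FORMS = [
    ("Russia", "Russia (Federation)"),
    ("Moskva", "Moscow"),
    ("Smolenskaya Oblast", "Smolenskaia oblast"),
]

# Case-insensitive indexes for both forms.
BGN_FORMS = {bgn.casefold(): bgn for bgn, _ in NAME_FORMS}
LC_TO_BGN = {lc.casefold(): bgn for bgn, lc in NAME_FORMS}


def coverage_form(name):
    """Return the BGN form to record in a Coverage field."""
    key = name.strip().casefold()
    if key in BGN_FORMS:
        return BGN_FORMS[key]  # already a BGN form; fix capitalization only
    if key in LC_TO_BGN:
        return LC_TO_BGN[key]  # LC form supplied; substitute the BGN form
    raise KeyError(f"no BGN form known for {name!r}")
```

Raising an error for an unknown name, rather than passing it through, keeps unverified forms out of the Coverage fields until someone checks them against the authority file.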