Central Eurasian Information Resource - Text Database

Data Dictionary
Dec. 4, 2000

Metadata liaison: Diana Brooking

Columns for each entry below, in order: Field Name, DC Mapping, Searchable?, Authority File, Comments

Digital Collection

Source

No

none

Contains the name of the CONTENT database of which the digital text is a part. (Automatically supplied by CONTENT software)

 

Central Eurasian Information Resource--Text Database

Title

Title

Yes

none

The title of the part of the text that has been scanned for the database. The title should be taken from the text itself, that is, a Russian text will have a Russian title. Use LC transliteration without the diacritics.

 

If no title is available on the text or from another source (such as a table of contents), the inputter will have to supply a brief title that describes the text in English, and put it in square brackets. Example: [Advertising brochure for Siberian pickled mushrooms]

 

(If the scanned text is of an encyclopedia or journal article or a book chapter, the title will be for that smaller part and the title of the fuller work that the text came from will be recorded in the field "Text Source.")

Author

Creator

Yes

LCAF

Name of the author of the text.

 

Input as lastname, firstname. If there is more than one name, separate them with <br>

Language of Text

Language

Yes

MARC21 Code List for Languages

The language of the text. Use the language names spelled out in bold from MARC21, not the codes. (The list would probably only need to be consulted for the names of more obscure languages.)

 

Examples: Russian, English, Turkmen, Yakut.

Publication Date

Date

Yes

none

Date the text was created or published. Generally the year only. Set Data type to Date.

 

Examples: 1984, 1910

 

Notes

Description

Yes

none

May contain a brief narrative summary of the text, its significance, etc. Any information of importance that is not represented elsewhere, or that needs further explanation can go in this field.

Subjects

Subject

Yes

TGM-1/LCSH (LCAF for names)

Subject terms that denote what the text is about. Since the image portion of the CEIR project is using TGM-1 as its controlled subject vocabulary, terms for the text Subjects field will also be taken first from LC's Thesaurus for Graphic Materials I (use the terms as simple descriptors without subdivision). N.B. Replace the ampersand ("&") in TGM with the word "and".

 

Since TGM-1 was created for visual materials, it is very likely that subject terms needed for texts will not be found in TGM-1, so the second source to consult for terms will be LCSH.

Again, LCSH terms may be used as simple descriptors without subdivision.

 

If there is more than one subject, separate them with <br>

Historical period

Coverage

Yes

 

The historical period discussed or covered by the content of the text (NOT the publication date).

 

Prefer a named time period rather than a numeric identifier such as a date range, if possible. If more than one phrase is used, separate them with <br>

 

Examples: 16th century

Country

Coverage

Yes

BGN

Use the following coverage fields for noting the geographic area discussed in the text.

Region

Coverage

Yes

BGN

Could be used to create canned searches. This field will not always be filled in?

 

Example: Dalnevostochny ekonomicheskii raion

Oblast/Province

Coverage

Yes

BGN

Will be used to create canned searches.

Rayon/District

Coverage

Yes

BGN

This field will not always be filled in.

City/Town

Coverage

Yes

BGN

 

Text Source

Source

Yes?

none

This will be a complete bibliographic citation to the source of the scanned text. Will include the date of publication. The citation could also refer to a journal or book that an article has been scanned from.

 

Database owner did not anticipate users searching this field.

Object Type

Type

Yes

MIG list

A controlled list, with terms derived from AAT and other thesauri, is being created for the use of inputters.

Terms should be used in the singular.

 

Examples could include article, treaty, pamphlet...

Text No.

Identifier

Yes

none

An accession number for the text to help with administration of the database; probably for staff use only. Examples: T1, T2, T3, etc.

Rights

Rights

No

LCAF

Information about the copyright holder for the text.
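The field definitions above can be read as a simple record layout. As a rough sketch only (none of this is part of the CONTENT software, and the helper names `format_author`, `join_values`, and `normalize_tgm_term` are hypothetical), the following Python collects the Dublin Core mapping from the table and illustrates three of the input conventions: "lastname, firstname" for authors, <br> between repeated values, and replacing "&" with "and" in TGM terms. The sample names are placeholders, not records from the database.

```python
# Field name -> Dublin Core element, copied from the table above.
DC_MAPPING = {
    "Digital Collection": "Source",
    "Title": "Title",
    "Author": "Creator",
    "Language of Text": "Language",
    "Publication Date": "Date",
    "Notes": "Description",
    "Subjects": "Subject",
    "Historical period": "Coverage",
    "Country": "Coverage",
    "Region": "Coverage",
    "Oblast/Province": "Coverage",
    "Rayon/District": "Coverage",
    "City/Town": "Coverage",
    "Text Source": "Source",
    "Object Type": "Type",
    "Text No.": "Identifier",
    "Rights": "Rights",
}


def format_author(last_name, first_name):
    """Return a name in the 'lastname, firstname' input form."""
    return f"{last_name.strip()}, {first_name.strip()}"


def join_values(values):
    """Join repeated values (authors, subjects, periods) with <br>."""
    return "<br>".join(v.strip() for v in values if v.strip())


def normalize_tgm_term(term):
    """Replace the ampersand in a TGM term with the word 'and'."""
    return " ".join(term.replace("&", " and ").split())


# Placeholder values for illustration only.
authors = join_values([
    format_author("Ivanov", "Ivan"),
    format_author("Petrova", "Anna"),
])
print(authors)                              # Ivanov, Ivan<br>Petrova, Anna
print(normalize_tgm_term("Arts & crafts"))  # Arts and crafts
```

Keeping the mapping in one place makes it easy to see that several fields share the Source and Coverage elements, which matters if records are ever exported as plain Dublin Core.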

 

 

 

Issues:

Transliteration and diacritics: CONTENT cannot support diacritics at all, so I anticipate typing words in as they should be, just without the diacritics. Transliteration will be an issue, since different authority files use different transliteration systems, and consistency within this database may not be achieved. LC transliteration will be used for personal, corporate, and other names, as LCAF will be the authority file for these terms. (See below for geographic names.) LC transliteration will also be used for titles and other bibliographic information, and in the Subjects field (which uses LCSH and LCAF as its authority files).
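For inputters who copy romanized text from a source that carries diacritics, the "minus diacritics" step could be automated. The sketch below is one possible approach, not a documented project procedure: it uses Python's standard `unicodedata` module to decompose characters and drop combining marks. The function name `strip_diacritics` is hypothetical, and the sketch assumes that only nonspacing combining marks (Unicode category Mn) need to be removed; standalone characters such as the prime used for the soft sign are left untouched.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining marks, leaving the base letters."""
    # NFD splits a precomposed letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop nonspacing marks (breves, ligature ties, macrons, and so on).
    kept = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # Recompose whatever is left into the usual precomposed form.
    return unicodedata.normalize("NFC", kept)


# LC romanization writes "ia" with a ligature tie (U+0361) and "i" with a breve.
print(strip_diacritics("Smolenskai\u0361a oblast"))  # Smolenskaia oblast
print(strip_diacritics("ekonomicheski\u012d"))       # ekonomicheskii
```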

Controlled vocabulary for geographic names: The GIS portion of the project will be using the BGN/GEOnet forms of geographic names. For consistency within the project, BGN forms should also be used in the Coverage fields of the CONTENT text portion of the project. (For cross-database consistency within CONTENT, however, LCAF forms in our local hierarchical format would be the choice. Besides sometimes choosing different forms of names, BGN and LC do not use the same transliteration system.)

Examples:

GEOnet               LC                    Other CONTENT databases
Russia               Russia (Federation)
Moskva               Moscow                Russia (Federation)--Moscow
Smolenskaya Oblast   Smolenskaia oblast

MIG said to use BGN as authority file for Coverage fields to maintain consistency within the project.
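To make the decision concrete, here is a minimal sketch of how an inputter's tool might map any known form of a place name to the BGN/GEOnet form used in the Coverage fields. The table `NAME_FORMS` contains only the three examples given above (cells that are blank in the examples are recorded as `None`); the function name `to_bgn_form` and the lookup approach are assumptions for illustration, not an existing project tool.

```python
# Name forms copied from the examples above; None marks a blank cell.
NAME_FORMS = [
    {"geonet": "Russia", "lc": "Russia (Federation)", "other_content": None},
    {"geonet": "Moskva", "lc": "Moscow",
     "other_content": "Russia (Federation)--Moscow"},
    {"geonet": "Smolenskaya Oblast", "lc": "Smolenskaia oblast",
     "other_content": None},
]


def to_bgn_form(name):
    """Return the GEOnet form for a known name, or None if it is not listed."""
    wanted = name.strip().casefold()
    for forms in NAME_FORMS:
        for value in forms.values():
            if value is not None and value.casefold() == wanted:
                return forms["geonet"]
    return None


print(to_bgn_form("Moscow"))                       # Moskva
print(to_bgn_form("Russia (Federation)--Moscow"))  # Moskva
print(to_bgn_form("Smolenskaia oblast"))           # Smolenskaya Oblast
```

A name that is not in the list returns `None`, which signals that the form still has to be checked against BGN by hand.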