Organization of the Standard
Numbered Sections
The standard is organized in a hierarchy of data elements and compound elements that define the
information content for metadata to document a set of digital geospatial data. The starting point is
"metadata" (section 0). The compound element "metadata" is composed of other compound elements
representing different concepts about the data set. Each of these compound elements has a numbered
section in the standard. In each numbered section, these compound elements are defined by other
compound elements and data elements. The section "contact information" is a special section that specifies
the data elements for contacting individuals and organizations. This section is used by other sections, and
is defined once for convenience.
Each section begins with the name and definition of the compound element that defines the section. The
name and definition are followed by production rules (see below) that define this compound element in
terms of data elements, either directly or by the use of intermediate compound elements. When
intermediate compound elements are used, the production rules for these elements also are provided in this
part of the section.
Additional information about the organization of the Standard follows:
- The production rules are followed by a list of names and definitions of compound elements and data
elements used in the section.
- Section and element numbers are provided for user navigation of the standard. They are neither
authoritative nor intended for use in implementation and are subject to change in future revisions of
the standard.
Compound Elements
A compound element is a group of data elements and other compound elements.
All compound elements are described by data elements, either directly or
through intermediate compound elements. Compound elements represent
higher-level concepts that cannot be represented by individual data elements.
The form for the definition of compound elements is:
Compound element name -- definition.
Type: compound
Short Name:
The type of "compound" uniquely identifies the compound elements in the lists
of terms and definitions.
Short names consisting of eight alphabetic characters or fewer are included to
assist in implementation of the standard.
Data Elements
A data element is a logically primitive item of data. The entry for a data
element includes the name of the data element, the definition of the data
element, a description of the values that can be assigned to the data element,
and a short name for the data element. The form for the definition of the
data elements is:
Data element name -- definition.
Type:
Domain:
Short Name:
The information about the values for the data elements includes a description
of the type of the value, and a description of the domain of the valid values.
The type of the data element describes the kind of value to be provided. The
choices are "integer" for integer numbers, "real" for real numbers, "text" for
ASCII characters, "date" for day of the year, and "time" for time of the day.
The domain describes valid values that can be assigned to the data element.
The domain may specify a list of valid values, references to lists of valid
values, or restrictions on the range of values that can be assigned
to a data element.
The domain also may note that the domain is free from restrictions, and any values that can be represented
by the "type" of the data element can be assigned. These unrestricted domains are represented by the use
of the word "free" followed by the type of the data element (that is, free text, free date, free real, free time,
free integer). Some domains can be partly, but not completely, specified. For example, there are several
widely used data transfer formats, but there may be many more that are less well known. To allow a
producer to describe its data in these circumstances, the convention of providing a list of values followed
by the designation of a "free" domain was used. In these cases, assignments of values shall be made from
the provided domain when possible. When not possible, providers may create and assign their own value.
A created value shall not redefine a value provided by the standard.
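The convention of a list of values followed by a "free" designation can be sketched in Python as follows. This is only an illustration: the class name Domain, its classify method, and the sample values are hypothetical and are not defined by the standard.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Domain:
    """Hypothetical model of a domain: listed values, optionally followed by "free"."""
    listed: frozenset = field(default_factory=frozenset)
    free: bool = False

    def classify(self, value: str) -> str:
        """Return 'listed', 'producer-defined', or 'invalid' for a candidate value."""
        if value in self.listed:
            # A listed value always keeps the meaning given by the standard.
            return "listed"
        # A free domain lets the producer create a value when no listed value fits.
        return "producer-defined" if self.free else "invalid"


# Illustrative values only; the actual lists appear in the numbered sections.
transfer_formats = Domain(listed=frozenset({"DLG", "DXF", "SDTS"}), free=True)
closed = Domain(listed=frozenset({"Yes", "No"}))
```

A value found in the list is always reported as "listed", which mirrors the requirement that a created value not redefine a value provided by the standard.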
Short names consisting of eight alphabetic characters or fewer are included to assist in user implementation
of the standard.
Another issue is the representation of null values (representing such concepts as "unknown") in the domain.
While this is relatively simple for textual entries (one would enter the text "Unknown"), it is not as simple
for the integer, real, date, and time types. (For example, which integer value means "unknown"?)
Because conventions for providing this information vary among implementations, the standard specifies
what concepts shall be represented, but does not mandate a means for representing them.
In addition to the values to be represented, the form of representation also is important, especially to
applications that will manipulate the data elements. The following conventions for forms of values for data
elements shall be used:
Calendar Dates (Years, Months, and Days)
- A.D. Era to December 31, 9999 A.D. -- Values for day and month of year, and for years, shall
follow the calendar date convention (general forms of YYYY for years; YYYYMM for month of a
year (with month being expressed as an integer), and YYYYMMDD for a day of the year) specified
in American National Standards Institute, 1986, Representation for calendar date and ordinal date
for information interchange (ANSI X3.30-1985): New York, American National Standards Institute
(adopted as Federal Information Processing Standard 4-1).
- B.C. Era to 9999 B.C. -- Values for day and month of year, and for years, shall follow the calendar
date convention, preceded by the lower case letters "bc" (general forms of bcYYYY for years;
bcYYYYMM for month of a year (with month being expressed as an integer), and bcYYYYMMDD
for a day of the year).
- B.C. Era before 9999 B.C. -- Values for the year shall consist of as many numeric characters as
needed to represent the number of the year B.C., preceded by the lower case letters "cc" (general form
of ccYYYYYYY...).
- A.D. Era after 9999 A.D. -- Values for the year shall consist of as many numeric characters as
needed to represent the number of the year A.D., preceded by the lower case letters "cd" (general form
of cdYYYYYYY...).
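The four calendar date forms above can be recognized with a short Python sketch. The function name parse_date is hypothetical. The sketch assumes that the "cc" and "cd" forms carry at least five digits (years beyond 9999), and it checks only the ranges of month and day, not days per month or leap years.

```python
import re
from typing import Optional, Tuple

# YYYY, YYYYMM, or YYYYMMDD, optionally preceded by "bc".
_SHORT = re.compile(r"(bc)?(\d{4})(?:(\d{2})(\d{2})?)?")
# "cc" (B.C.) or "cd" (A.D.) followed by a year of five or more digits (assumption).
_LONG = re.compile(r"(cc|cd)(\d{5,})")


def parse_date(value: str) -> Optional[Tuple[str, int, Optional[int], Optional[int]]]:
    """Return (era, year, month, day), or None if the value matches no form.

    era is "AD" or "BC"; month and day are None when they are not given.
    """
    m = _LONG.fullmatch(value)
    if m:
        era = "BC" if m.group(1) == "cc" else "AD"
        return (era, int(m.group(2)), None, None)
    m = _SHORT.fullmatch(value)
    if not m:
        return None
    era = "BC" if m.group(1) else "AD"
    month = int(m.group(3)) if m.group(3) else None
    day = int(m.group(4)) if m.group(4) else None
    # Range checks only; calendar validity is not verified here.
    if month is not None and not 1 <= month <= 12:
        return None
    if day is not None and not 1 <= day <= 31:
        return None
    return (era, int(m.group(2)), month, day)
```

For example, parse_date("19980601") yields ("AD", 1998, 6, 1), and a value with separators such as "1998-06-01" is rejected.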
Time of Day (Hours, Minutes, and Seconds)
Because some geospatial data and related applications are sensitive to time of day information, three
conventions are permitted. Only one convention shall be used for metadata for a data set. The
conventions are:
- Local Time. For producers who wish to record time in local time, values shall follow the 24-
hour timekeeping system for local time of day in the hours, minutes, seconds, and decimal
fractions of a second (to the precision desired) without separators convention (general form of
HHMMSSSS) specified in American National Standards Institute, 1986, Representations of
local time of day for information interchange (ANSI X3.43-1986): New York, American
National Standards Institute.
- Local Time with Time Differential Factor. For producers who wish to record time in local
time and the relationship to Universal Time (Greenwich Mean Time), values shall follow the
24-hour timekeeping system for local time of day in hours, minutes, seconds, and decimal
fractions of a second (to the resolution desired) without separators convention. This value
shall be followed, without separators, by the time differential factor. The time differential
factor expresses the difference in hours and minutes between local time and Universal Time.
It is represented by a four-digit number preceded by a plus sign (+) or minus sign (-),
indicating hours and minutes local time is ahead of or behind Universal Time, respectively.
The general form is HHMMSSSSshhmm, where HHMMSSSS is the local time using 24-hour
timekeeping (expressed to the precision desired), 's' is the plus or minus sign for the time
differential factor, and hhmm is the time differential factor. (This option allows producers to
record local time and time zone information. For example, Eastern Standard Time has a time
differential factor of -0500, Central Standard Time has a time differential factor of -0600,
Eastern Daylight Time has a time differential factor of -0400, and Central Daylight Time has
a time differential factor of -0500.) This option is specified in American National Standards
Institute, 1975, Representations of universal time, local time differentials, and United States
time zone reference for information interchange (ANSI X3.51-1975): New York, American
National Standards Institute.
- Universal Time (Greenwich Mean Time). For producers who wish to record time in
Universal Time (Greenwich Mean Time), values shall follow the 24-hour timekeeping system
for Universal Time of day in hours, minutes, seconds, and decimal fractions of a second
(expressed to the precision desired) without separators convention, with the upper case letter
"Z" directly following the low-order (or extreme right hand) time element of the 24-hour
clock time expression. The general form is HHMMSSSSZ, where HHMMSSSS is Universal
Time using 24-hour timekeeping, and Z is the letter "Z". This option is specified in American
National Standards Institute, 1975, Representations of universal time, local time differentials,
and United States time zone reference for information interchange (ANSI X3.51-1975): New
York, American National Standards Institute.
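The three time-of-day conventions can be told apart by their suffix, as the following Python sketch illustrates. The function names are hypothetical. The sketch assumes that hours, minutes, and seconds are always given as six digits, that any decimal fraction of a second is appended as further digits, and that simple range checks are sufficient; the cited ANSI documents are not consulted here.

```python
import re
from typing import Iterable, Optional

# HHMMSS, optional fraction digits, then nothing, "Z", or a signed hhmm differential.
_TIME = re.compile(r"(\d{2})(\d{2})(\d{2})(\d*)(Z|[+-]\d{4})?")


def classify_time(value: str) -> Optional[str]:
    """Return 'local', 'local+differential', 'universal', or None if invalid."""
    m = _TIME.fullmatch(value)
    if not m:
        return None
    hh, mm, ss, _fraction, suffix = m.groups()
    if int(hh) > 23 or int(mm) > 59 or int(ss) > 59:
        return None
    if suffix is None:
        return "local"
    if suffix == "Z":
        return "universal"
    # suffix is sign + hhmm; the minutes part must be below 60.
    if int(suffix[3:]) > 59:
        return None
    return "local+differential"


def uses_single_convention(values: Iterable[str]) -> bool:
    """Check that every time value in one metadata record follows the same convention."""
    kinds = {classify_time(v) for v in values}
    return None not in kinds and len(kinds) <= 1
```

For example, "143000-0500" is classified as local time with a time differential factor, and a record that mixes "143000" with "143000Z" fails the single-convention check.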
Latitude and Longitude
Values for latitude and longitude shall be expressed as decimal fractions of degrees. Whole degrees
of latitude shall be represented by a two-digit decimal number ranging from 0 through 90. Whole
degrees of longitude shall be represented by a three-digit decimal number ranging from 0 through
180. When a decimal fraction of a degree is specified, it shall be separated from the whole number
of degrees by a decimal point. Decimal fractions of a degree may be expressed to the precision
desired.
- Latitudes north of the Equator shall be specified by a plus sign (+), or by the absence of a
minus sign (-), preceding the two digits designating degrees. Latitudes south of the Equator
shall be designated by a minus sign (-) preceding the two digits designating degrees. A point
on the Equator shall be assigned to the Northern Hemisphere.
- Longitudes east of the prime meridian shall be specified by a plus sign (+), or by the absence
of a minus sign (-), preceding the three digits designating degrees of longitude. Longitudes
west of the prime meridian shall be designated by a minus sign (-) preceding the three digits
designating degrees. A point on the prime meridian shall be assigned to the Eastern
Hemisphere. A point on the 180th meridian shall be assigned to the Western Hemisphere.
One exception to this last convention is permitted. For the special condition of describing a
band of latitude around the earth, the East Bounding Coordinate data element shall be
assigned the value +180 (180) degrees.
- Any spatial address with a latitude of +90 (90) or -90 degrees will
specify the position at the North or South Pole, respectively. The
component for longitude may have any legal value.
With the exception of the special condition described above, this form is
specified in American National Standards Institute, 1986, Representations of
Geographic Point Locations for Information Interchange (ANSI X3.61-1986): New
York, American National Standards Institute.
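The sign and digit conventions for latitude and longitude can be sketched in Python as follows. The function names and the east_bound parameter are hypothetical. The sketch always writes the plus sign (the standard also permits omitting it) and assumes four decimal places by default.

```python
def _fmt(value: float, whole_digits: int, places: int) -> str:
    """Format a signed value with zero-padded whole degrees and a decimal fraction."""
    value = round(value, places)
    # Zero (including a value that rounds to zero) takes the plus sign, which places
    # the Equator in the Northern and the prime meridian in the Eastern Hemisphere.
    sign = "-" if value < 0 else "+"
    width = whole_digits + (places + 1 if places > 0 else 0)
    return f"{sign}{abs(value):0{width}.{places}f}"


def format_latitude(degrees: float, places: int = 4) -> str:
    """Two-digit whole degrees, -90 through +90."""
    if not -90 <= degrees <= 90:
        raise ValueError("latitude must be within -90 and 90 degrees")
    return _fmt(degrees, 2, places)


def format_longitude(degrees: float, places: int = 4, east_bound: bool = False) -> str:
    """Three-digit whole degrees, -180 through +180.

    The 180th meridian is assigned to the Western Hemisphere unless east_bound
    is set for the East Bounding Coordinate of a band of latitude.
    """
    if not -180 <= degrees <= 180:
        raise ValueError("longitude must be within -180 and 180 degrees")
    value = round(degrees, places)
    if abs(value) == 180:
        value = 180.0 if east_bound else -180.0
    return _fmt(value, 3, places)
```

Rounding is applied before the sign is chosen so that a very small southern or western value that rounds to zero is not written with a minus sign.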
Network Addresses and File Names
Values for file names, network addresses for computer systems, and related services should follow the
Uniform Resource Locator convention of the Internet when possible. See
http://www.ncsa.uiuc.edu/demoweb/url-primer.html for additional details about the Uniform Resource
Locator.
Optionality
The standard categorizes elements as being mandatory, mandatory-if-applicable,
or optional as follows:
- Mandatory elements must be provided.
- Mandatory-if-applicable elements must be provided if the data set
exhibits the defined characteristic.
- Optional elements are provided at the discretion of the metadata
producer.
The optionality of a section or compound element always takes precedence over the elements that it
contains. Once a section or compound element is recognized by the data set producer as applicable, then
the optionality of its subordinate elements is to be interpreted. See the Production Rules section for additional
interpretive guidance.
Mandatory sections in the standard have some elements that are always required for all types of geospatial
data sets. For comparison with other metadata standards, these elements are referred to as "core" elements.
Production Rules
A production rule specifies the relationship between a compound element, and data elements and other
(lower-level) compound elements. Each production rule has a left side (identifier) and a right side
(expression) connected by the symbol "=", meaning that the term on the left side is replaced by or produces
the term on the right side. Terms on the right side are either other compound elements or individual data
elements. By making substitutions using matching terms in the production rules, one can explain higher-
level concepts using data elements. The symbols used in the production rules have the following meaning:
Symbol    Meaning
=         is replaced by, produces, consists of
+         and
[|]       selection - select one term from the list of enclosed terms (exclusive or).
          Terms are separated by "|"
m{}n      iteration - the term(s) enclosed is(are) repeated from "m" to "n" times
()        optional - the term(s) enclosed is(are) optional
Examples:
a = b + c      "a consists of b and c"
a = [b | c]    "a consists of one of b or c"
a = 4{b}6      "a consists of four to six occurrences of b"
a = b + (c)    "a consists of b and optionally c"
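The meaning of the symbols can be made concrete with a small Python sketch that checks whether a sequence of element names satisfies a production rule. All function names here (term, seq, choice, repeat, optional, satisfies) are hypothetical and are not part of the standard; the sketch treats each rule as a function that yields the positions at which a match can end.

```python
from typing import Callable, Iterator, Sequence

Rule = Callable[[Sequence[str], int], Iterator[int]]


def term(name: str) -> Rule:
    """A single data element or compound element name."""
    def match(items, pos):
        if pos < len(items) and items[pos] == name:
            yield pos + 1
    return match


def seq(*rules: Rule) -> Rule:
    """b + c: every term, in order."""
    def match(items, pos):
        if not rules:
            yield pos
            return
        for mid in rules[0](items, pos):
            yield from seq(*rules[1:])(items, mid)
    return match


def choice(*rules: Rule) -> Rule:
    """[b | c]: exactly one of the enclosed terms."""
    def match(items, pos):
        for rule in rules:
            yield from rule(items, pos)
    return match


def repeat(m: int, rule: Rule, n: int) -> Rule:
    """m{b}n: from m to n occurrences of the enclosed term."""
    def match(items, pos, count=0):
        if count >= m:
            yield pos
        if count < n:
            for nxt in rule(items, pos):
                if nxt > pos:  # guard against zero-width repetition
                    yield from match(items, nxt, count + 1)
    return match


def optional(rule: Rule) -> Rule:
    """(c): structurally, zero or one occurrence of the enclosed term."""
    return repeat(0, rule, 1)


def satisfies(rule: Rule, items: Sequence[str]) -> bool:
    """True if the rule can consume the whole sequence of element names."""
    return any(end == len(items) for end in rule(list(items), 0))
```

For instance, satisfies(repeat(4, term("b"), 6), ["b"] * 5) is true, while the same rule rejects three or seven occurrences of b.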
Interpreting the production rules:
The terms bounded by parentheses, "(" and ")", are optional and are provided at the discretion
of the data producer. If a producer chooses to provide information enclosed by parentheses,
the producer shall follow the production rules for the enclosed information. For example, if
the producer decides to provide the optional information described in the term:
(a + b + c)
the producer shall provide a and b and c.
Only for terms bounded by parentheses does the producer have the discretion of deciding
whether or not to provide the information.
The variation among the ways in which geospatial data are produced and distributed, the fact
that not all geospatial data have the same characteristics, and the issue that not all details of
data sets that are in work or are planned may have been decided, caused the need to express the
concept of "mandatory if applicable." This concept means that if the data set exhibits (or, for
data sets that are in work or planned, it is known that the data set will exhibit) a defined
characteristic, then the producer shall provide the information needed to describe that
characteristic. This concept is described by the production rule:
0{term}1
Extensibility
Extended elements may be defined by a data set producer or a user community. Extended elements
are elements outside the standard, but needed by the data set producer. If extended elements are
created, they must follow the guidelines in Appendix D, Guidelines for creating extended elements
to the Content Standard for Digital Geospatial Metadata.