Data quality criteria

To ensure data usability and provenance, all entries in the ROD are required to meet the data quality criteria agreed upon by the ROD Advisory Board. Conformance to these criteria is required at the initial data deposition stage.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Deposition types

The ROD currently accepts three distinct types of depositions that differ greatly in the availability of certain data. As a result, a separate set of criteria is provided for each deposition type.

Published data

The published data deposition type is intended for data that has been published in peer-reviewed publications.

Mandatory data:

Recommended data:

Prepublication data

The prepublication data deposition type is intended for data that is in the process of being reviewed or published in a peer-reviewed publication.

Since prepublication data might contain sensitive information, only parts of it are initially accessible to the wider community. The full release date is decided by the depositor.

If the prepublication data does not result in a publication, it can instead be released to the public domain as a personal communication to ROD.

Mandatory data:

Recommended data:

Personal communication to ROD

The personal communication to ROD deposition type is intended for data that was provided to the ROD directly by volunteer depositors.

Disclaimer: by depositing your data as a personal communication you make a public disclosure of the data; this may have implications if you later decide to publish it in a journal or obtain patents on some use of the described compound. Only submit data as a personal communication if this is not an issue for you and your co-authors.

Mandatory data:

Recommended data:

Related data items

Data in the ROD is ingested and stored using the CIF2 format and strictly validated against a set of DDLm dictionaries. As such, specific data items have to be used to record the information in a human- and computer-readable way.

The following sections provide a brief description of each data group as well as links to the related data items in the dictionaries. They are intended only as a general guide. For additional information, such as measurement units, data item descriptions and usage examples, the related dictionaries should be consulted.

Raman shifts and intensities

Raman shifts and intensities describe the spectrum that was acquired during the experiment. Data items from the RAMAN_SPECTRUM category are used to record this information. The most basic set needed to describe the spectrum consists of the following data items:

Authorship

Data deposited as published or pending publication SHOULD contain the full author list as credited in the publication.

Data deposited as a personal communication MUST at least contain the real name of the depositor as it is declared in the associated ROD account.

The _publ_author_name data item is used to record this piece of information.
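The two authorship rules can be expressed as a small check that reports findings at the matching RFC 2119 level. This is only a minimal sketch: the function name, the deposition type strings and the exact string comparison against the account name are assumptions made for illustration, not part of the ROD software.

```python
def check_authorship(deposition_type, author_names, account_name=None):
    """Return a list of (level, message) findings; an empty list means no issues.

    deposition_type: hypothetical label, one of "published", "prepublication"
                     or "personal communication".
    author_names:    values recorded in _publ_author_name.
    account_name:    real name declared in the depositor's ROD account.
    """
    findings = []
    if deposition_type == "personal communication":
        # MUST: the depositor's real name has to appear among the authors.
        if account_name is None or account_name not in author_names:
            findings.append(("MUST", "depositor name is missing from the author list"))
    elif not author_names:
        # SHOULD: published and prepublication data carry the credited authors.
        findings.append(("SHOULD", "author list as credited in the publication is missing"))
    return findings
```

A "MUST" finding would block the deposition, while a "SHOULD" finding would only be reported to the depositor.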

Sufficient bibliographic information

Sufficient bibliographic information to uniquely identify the related publication MUST be provided.

If the DOI of the publication is available, it SHOULD be provided using the _journal_paper_doi data item. Otherwise, the following data items MUST be provided:

Sample state

Sample state describes the state of matter in which the sample exists. A locally defined data item _[local]_chemical_compound_state is currently used to record this piece of information. The accepted states are "gas", "liquid" and "solid".
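A check against the three accepted states could look like the sketch below. The function name is hypothetical, and the comparison is assumed to be exact and case-sensitive; the dictionary remains the authority on how the value is actually validated.

```python
# Accepted values as listed in this section.
ACCEPTED_STATES = {"gas", "liquid", "solid"}


def check_sample_state(value):
    """Return the value unchanged if it is an accepted state, else raise ValueError."""
    if value not in ACCEPTED_STATES:
        raise ValueError(
            f"unsupported sample state {value!r}; "
            f"expected one of {sorted(ACCEPTED_STATES)}"
        )
    return value
```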

Chemical formula

The chemical formula provides information about the chemical composition of the sample. Multiple data items exist that describe chemical formulae in different notations:

  • _chemical_formula_sum – formula in which all discrete bonded residues and ions are summed over the constituent elements, following the Hill ordering notation;
  • _chemical_formula_moiety – formula with each discrete bonded residue or ion shown as a separate moiety;
  • _chemical_formula_structural – formula for inorganics, organometallics, metal complexes etc., in which bonded groups are preserved as discrete entities within parentheses, with post-multipliers as required;
  • _chemical_formula_analytical – formula determined by standard chemical analysis including trace elements;
  • _chemical_formula_iupac – formula expressed in conformance with IUPAC rules for inorganic and metal-organic compounds.

Only the _chemical_formula_sum data item is required; however, the provision of additional formulae in different notations is encouraged.
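The Hill ordering used by _chemical_formula_sum places carbon first and hydrogen second when carbon is present, followed by the remaining elements alphabetically; without carbon, all elements (including hydrogen) are listed alphabetically. The sketch below formats a mapping of element counts accordingly. The function name is hypothetical, and the output style (space-separated symbols, a count of 1 omitted) follows the common CIF convention rather than anything stated in this document.

```python
def hill_formula_sum(counts):
    """Format element counts as a Hill-ordered, space-separated formula string.

    counts: mapping of element symbol to its count, e.g. {"C": 6, "H": 6}.
    """
    symbols = sorted(counts)
    if "C" in counts:
        # Carbon first, hydrogen second, remaining elements alphabetically.
        rest = [s for s in symbols if s not in ("C", "H")]
        ordered = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        # No carbon: purely alphabetical, hydrogen included.
        ordered = symbols
    # A count of 1 is left implicit.
    return " ".join(s if counts[s] == 1 else f"{s}{counts[s]}" for s in ordered)
```

For example, hill_formula_sum({"O": 4, "H": 2, "S": 1}) yields "H2 O4 S", since no carbon is present.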

Polarization

The propagation and polarization directions of the measurement device MUST be described using Porto's notation.

The _raman_measurement_device.direction_polarization data item is used to record this piece of information.
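Porto's notation has the form A(BC)D, where A and D are the propagation directions of the incident and scattered light, and B and C are the polarization directions of the incident and scattered light. The sketch below splits such a string into its four parts. It assumes a simplified ASCII spelling restricted to the x, y and z axes, with a leading minus sign standing in for the overbar (for example "z(xy)-z"); the spelling actually accepted by the data item is defined in the dictionary, and all names in the code are hypothetical.

```python
import re
from typing import NamedTuple


class PortoGeometry(NamedTuple):
    incident_direction: str
    incident_polarization: str
    scattered_polarization: str
    scattered_direction: str


# Assumed ASCII form: optional "-" marks a reversed propagation direction.
_PORTO = re.compile(r"(-?[xyz])\(([xyz])([xyz])\)(-?[xyz])")


def parse_porto(text):
    """Split a Porto's notation string such as 'z(xy)-z' into its four parts."""
    match = _PORTO.fullmatch(text.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a recognised Porto's notation string: {text!r}")
    return PortoGeometry(*match.groups())
```

With this sketch, parse_porto("z(xy)-z") describes a backscattering geometry: light incident along z and polarized along x, scattered light polarized along y and collected along -z.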

Integration time

The _raman_measurement.integration_time data item is used to record the integration time.

Sample description and provenance

A set of locally defined data items is used to record the details about the sample:

  • _[local]_chemical_compound_color

Currently, only the _chemical_compound_source data item is used to record the provenance of the sample, but an additional set of data items is under development.

Laser power density at the sample surface

The _x data item is used to record the laser power density at the sample surface. Alternatively, this value can be automatically calculated from the _x1, _x2 and _x3 data items using the following formula:

Spectrometer company and model

Data items from the RAMAN_MEASUREMENT_DEVICE category are used to record the details about the measurement device. The most basic set needed to identify the device consists of the following data items:

Device optics

The following data items are used to record the general information about the optics of the measurement device:

The microscope of the optics can be further described using the following data items:

Measurement date

The experiment initiation and termination dates are recorded using the following data items:

Only the _raman_measurement.datetime_initiated data item is required, but the provision of the _raman_measurement.datetime_terminated data item is encouraged.
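The required and optional status of the two date items can be checked as in the sketch below. It assumes the values are ISO 8601 strings that Python's datetime.fromisoformat can parse; the dictionary defines the format that is actually accepted. The function name is hypothetical, and the check that termination does not precede initiation is an added sanity check, not a rule stated in this document.

```python
from datetime import datetime


def check_measurement_dates(initiated, terminated=None):
    """Parse the measurement dates and return them as a (start, end) tuple.

    initiated:  value of _raman_measurement.datetime_initiated (required).
    terminated: value of _raman_measurement.datetime_terminated (optional).
    """
    if initiated is None:
        raise ValueError("_raman_measurement.datetime_initiated is required")
    start = datetime.fromisoformat(initiated)
    if terminated is None:
        return start, None
    end = datetime.fromisoformat(terminated)
    # Assumed sanity check: an experiment cannot end before it starts.
    if end < start:
        raise ValueError("termination date precedes initiation date")
    return start, end
```

Both values should either carry a time zone offset or omit it, since Python cannot compare time zone aware and naive datetimes.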

Incident wavelength

The _raman_measurement_device.excitation_laser_wavelength data item is used to record the incident wavelength.

Spectrometer configuration

The spectrometer configuration describes how the spectrometer is set up. No data item has been defined for it yet, since more information is needed to define one correctly. So far, only one applicable value has been provided as an example: single monochromator.