Guide to Computational MetaInfo
Overview of metadata organization for computation
NOMAD stores all processed data in a well-defined, structured, and machine-readable format, known as the archive.
The schema that defines the organization of (meta)data within the archive is known as the MetaInfo. See Explanation > Data structure for general information about data structures and schemas in NOMAD.
The following diagram is an overarching visualization of the most important archive sections for computational data:
archive
├── run
│   ├── method
│   │   ├── atom_parameters
│   │   ├── dft
│   │   ├── forcefield
│   │   └── ...
│   ├── system
│   │   ├── atoms
│   │   │   ├── positions
│   │   │   ├── lattice_vectors
│   │   │   └── ...
│   │   └── ...
│   └── calculation
│       ├── energy
│       ├── forces
│       └── ...
└── workflow2
    ├── method
    ├── inputs
    ├── tasks
    ├── outputs
    └── results
Entire subsections of NOMAD's schema can be browsed using the MetaInfo Browser:
run (base schema): MetaInfo Browser > Entry > run
runschema (full schema for run): MetaInfo Browser > runschema
workflow2 (base schema): MetaInfo Browser > Entry > workflow2
simulationworkflowschema (full computational schema for workflow2): MetaInfo Browser > simulationworkflowschema
The most important section of the archive for computational data is the run section, which is divided into three main subsections: method, system, and calculation. The method subsection stores information about the computational model used to perform the calculation. The system subsection stores attributes of the atoms involved in the calculation, e.g., atom types, positions, and lattice vectors. The calculation subsection stores the output of the calculation, e.g., energy and forces.
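To make the layout concrete, the following is a minimal sketch of how the run section could be navigated once an archive has been loaded into plain Python containers. The dictionary is a hand-written stand-in rather than real NOMAD output, the helper get_path is hypothetical, and the sketch assumes that run, method, system, and calculation are repeating subsections (lists). Only the section names shown in the diagram are taken from the schema; the key inside energy is a placeholder.

```python
# Hand-written stand-in for a processed archive; all values are illustrative placeholders.
archive = {
    "run": [
        {
            # Computational model used to perform the calculation.
            "method": [{"dft": {}}],
            # Atomic structure the calculation was performed on.
            "system": [
                {
                    "atoms": {
                        "positions": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.74]],
                        "lattice_vectors": [
                            [10.0, 0.0, 0.0],
                            [0.0, 10.0, 0.0],
                            [0.0, 0.0, 10.0],
                        ],
                    }
                }
            ],
            # Output of the calculation; "example_value" is a placeholder key.
            "calculation": [{"energy": {"example_value": -1.0}, "forces": {}}],
        }
    ]
}


def get_path(node, path):
    """Follow a slash-separated path; lists are indexed by integer, dicts by key."""
    for segment in path.split("/"):
        node = node[int(segment)] if isinstance(node, list) else node[segment]
    return node


positions = get_path(archive, "run/0/system/0/atoms/positions")
energy = get_path(archive, "run/0/calculation/0/energy")
print(len(positions), "atoms; energy section:", energy)
```

A path helper is used here only because it mirrors the slash-separated way sections are referred to in the text; direct indexing such as archive["run"][0]["system"][0] works equally well on this stand-in.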
The workflow2 section of the archive then stores information about the series of tasks performed to accumulate the (meta)data in the run section. The relevant input parameters for the workflow are stored in method, while the results section stores output from the workflow beyond observables of single configurations. For example, any ensemble-averaged quantity from a molecular dynamics simulation would be stored under workflow2/results. The inputs, outputs, and tasks sections then define the specifics of the workflow.
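The split between single-configuration observables and workflow-level results can be illustrated with a small sketch. Everything here is an assumption made for illustration: the per-frame energies are invented, each energy is flattened to a single number for brevity, and the names n_steps and mean_energy are hypothetical rather than quantities defined by the schema.

```python
from statistics import fmean

# Hypothetical per-frame observables, one calculation entry per MD frame.
run = {
    "calculation": [
        {"energy": -1.0},
        {"energy": -2.0},
        {"energy": -3.0},
    ]
}

# Single-configuration values stay under run/calculation ...
frame_energies = [calc["energy"] for calc in run["calculation"]]

# ... while the ensemble average describes the workflow as a whole.
workflow2 = {
    "method": {"n_steps": len(frame_energies)},         # hypothetical input parameter
    "results": {"mean_energy": fmean(frame_energies)},  # hypothetical quantity name
}
print(workflow2["results"])
```

The point of the sketch is only where each value lives: nothing computed across frames is written back into an individual calculation entry.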
For some standard workflows, e.g., geometry optimization and molecular dynamics, the NOMAD normalizers populate the inputs, outputs, and tasks sections automatically. For non-standard workflows, the parser (or, more appropriately, the corresponding normalizer) must populate these sections accordingly.
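As a rough illustration of what populating inputs, outputs, and tasks involves for a non-standard workflow, the sketch below pairs each system with the calculation at the same index and records the pairing as a task. This is not the NOMAD normalizer API: the function build_tasks, the task keys, the step names, and the use of path strings as references are all hypothetical, and a one-to-one correspondence between systems and calculations is assumed.

```python
def build_tasks(run, run_index=0):
    """Create one task per calculation, linking it to the system at the same index."""
    tasks = []
    for i in range(len(run["calculation"])):
        tasks.append(
            {
                "name": f"step_{i}",  # hypothetical task label
                "inputs": [f"run/{run_index}/system/{i}"],
                "outputs": [f"run/{run_index}/calculation/{i}"],
            }
        )
    return tasks


# Stand-in run section with three configurations (contents omitted).
run = {"system": [{}, {}, {}], "calculation": [{}, {}, {}]}

tasks = build_tasks(run)
workflow2 = {
    # The workflow starts from the first system and ends at the last calculation.
    "inputs": tasks[0]["inputs"],
    "tasks": tasks,
    "outputs": tasks[-1]["outputs"],
}
print(workflow2["inputs"], "->", workflow2["outputs"])
```

References are kept as path strings so that the workflow section only points at data in run instead of duplicating it.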
See Explanation > Workflows for more information about the general structure of the workflow section, and How-to Guides > Customization > Define workflows for instructions on how to upload custom workflows to link individual entries in NOMAD.
Attention
We are currently performing a complete refactoring of the computational MetaInfo schema. The new schema will be populated under the data
section of the archive: MetaInfo Browser > Entry > data. A preliminary version of the full schema can be browsed in MetaInfo Browser > nomad_simulations.
Further information can be found within the schema plugin docs: nomad-simulations Docs.