How to write a parser
NOMAD uses parsers to convert raw data (for example, output from computational software, instruments, or electronic lab notebooks) into NOMAD's common Archive format. The following pages describe how to develop such a parser and integrate it within the NOMAD software. The goal is to equip users with the knowledge required to contribute to and extend NOMAD.
Getting started
In principle, it is possible to develop a "local parser" that uses the nomad-lab package to parse raw data, without changing the NOMAD software itself. This gives a quick start and lets you focus on parsing the data itself, but it does not cover full integration of your new parser into NOMAD. Here we focus on developing parsers that will be integrated into the NOMAD software. For this, you will have to install a development version of NOMAD.
Parser organization
The NOMAD parsers can be found within your local NOMAD git repo under dependencies/parsers/. The parsers are organized into the following individual projects (dependencies/parsers/<parserproject>) with their own corresponding repositories:
- atomistic - Parsers for output from classical molecular simulations, e.g., from Gromacs, Lammps, etc.
- database - Parsers for various databases, e.g., OpenKim.
- eelsdb - Parser for the EELS database (https://eelsdb.eu/; to be integrated in the database project).
- electronic - Parsers for output from electronic structure calculations, e.g., from Vasp, Fhiaims, etc.
- nexus - Parsers for combining various instrument output formats and electronic lab notebooks.
- workflow - Parsers for output from task managers and workflow schedulers.
Within each project folder you will find a test/ directory, containing the parser tests, and a directory containing the parsers' source code, named <parserproject>parser or <parserproject>parsers, depending on whether the project contains one or several parsers, respectively. In the case of multiple parsers, the files for each individual parser are contained within a corresponding subdirectory: <parserproject>parsers/<parsername>. For example, the Quantum Espresso parser files are found in dependencies/parsers/electronic/electronicparsers/quantumespresso/.
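The directory layout described above can be sketched as a small shell helper. This is only an illustration: the function name parser_src_dir is hypothetical and not part of NOMAD, and the sketch assumes that passing a parser name means the project holds several parsers (<parserproject>parsers), while omitting it means a single-parser project (<parserproject>parser).

```shell
# Hypothetical helper (not part of NOMAD): print the source directory of a
# parser, relative to the NOMAD repo root, following the layout described above.
#   $1: parser project, e.g. "electronic"
#   $2: parser name, e.g. "quantumespresso" (omit for single-parser projects)
parser_src_dir() {
    project="${1:-}"
    parser="${2:-}"
    if [ -z "$project" ]; then
        echo "usage: parser_src_dir <parserproject> [<parsername>]" >&2
        return 1
    fi
    if [ -n "$parser" ]; then
        # Project with several parsers: <parserproject>parsers/<parsername>
        echo "dependencies/parsers/${project}/${project}parsers/${parser}"
    else
        # Project with a single parser: <parserproject>parser
        echo "dependencies/parsers/${project}/${project}parser"
    fi
}

parser_src_dir electronic quantumespresso
```

The final call prints the Quantum Espresso directory given in the example above (without the trailing slash).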
Setting up your development branches
We will first focus on the case of adding a new parser to an existing parser project. Creating a new parser project will require a few extra steps.
The existing parser projects are stored within their own git repositories and then linked to the NOMAD software. All current parser projects are available at nomad-coe (see also individual links above).
You will first need to create new branches within both the NOMAD project and also within the corresponding parser project. Ideally, this should be done following the best practices for NOMAD development. Here, we briefly outline the procedure:
Create a new issue within the NOMAD project at NOMAD gitlab.
On the page of the new issue, in the top right, click the arrow next to the Create merge request button and select Create branch. The branch name should be automatically generated with the corresponding issue number and the title of the issue (copy the branch name to the clipboard for use below), and the default source branch should be develop.
Click the Create branch button.
Now, run the following commands in your local NOMAD directory:
git fetch --all (to sync with the remote)
git checkout origin/<new_branch_name> -b <new_branch_name> (to check out the new branch and create a local copy of it)
Unless you just installed the NOMAD development version, you should rerun ./scripts/setup_dev_env.sh within the NOMAD directory to reinstall with the newest development branch.
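The fetch and checkout steps can be wrapped in a small shell function so that the branch name is typed only once. This is a minimal sketch, not part of NOMAD: the function name checkout_issue_branch is hypothetical, and it uses the argument order git checkout -b <new_branch> <start_point>, which has the same effect as the command shown above.

```shell
# Hypothetical helper (not part of NOMAD): fetch from all remotes and create a
# local branch starting from the remote branch created on the issue page.
#   $1: branch name copied from the issue page
checkout_issue_branch() {
    branch="${1:-}"
    if [ -z "$branch" ]; then
        echo "usage: checkout_issue_branch <new_branch_name>" >&2
        return 1
    fi
    # Sync with the remote, then create and switch to the local branch.
    git fetch --all &&
        git checkout -b "$branch" "origin/$branch"
}

# Usage, from within your local NOMAD directory:
#   checkout_issue_branch <new_branch_name>
```

The function stops before touching the repository if no branch name is given, and it only attempts the checkout if the fetch succeeded.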
Now we need to repeat this process for the parser project that we plan to extend. As above, create a new issue at the relevant parser project GitHub page. Using the identical issue title as you did for the NOMAD project above is ideal for clarity.
On the page of the new issue, in the right sidebar under the subsection Development, click Create a branch. As above, the branch name should be automatically generated with the corresponding issue number and the title of the issue, and the default source branch should be develop (which can be seen by clicking change source branch).
Under What's next, the default option should be Checkout locally, which is what we want in this case.
Click Create branch, then copy the provided commands to the clipboard and run them within the parser project folder within your local NOMAD repo, i.e., dependencies/parsers/<parserproject>.
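Once both branches exist, it can be useful to confirm that the NOMAD repo and the parser project are each on the branch you expect. The following is a hedged sketch: show_dev_branches is a hypothetical helper, not part of NOMAD, and it assumes the parser project lives at dependencies/parsers/<parserproject> inside the NOMAD directory, as described above.

```shell
# Hypothetical helper (not part of NOMAD): print the branch currently checked
# out in the NOMAD repo and in one parser project.
#   $1: path to the local NOMAD repo
#   $2: parser project, e.g. "electronic"
show_dev_branches() {
    nomad_dir="${1:-}"
    project="${2:-}"
    if [ -z "$nomad_dir" ] || [ -z "$project" ]; then
        echo "usage: show_dev_branches <nomad_dir> <parserproject>" >&2
        return 1
    fi
    # symbolic-ref prints the branch that HEAD points to; it reports an error
    # if HEAD is detached, i.e., if no branch is checked out.
    echo "nomad: $(git -C "$nomad_dir" symbolic-ref --short HEAD)"
    echo "${project}: $(git -C "$nomad_dir/dependencies/parsers/$project" symbolic-ref --short HEAD)"
}

# Usage, from within your local NOMAD directory:
#   show_dev_branches . <parserproject>
```

If either line shows an empty branch name together with a git error, that repository is not on a branch yet and the corresponding checkout step above should be repeated.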