How to Extend Readers

The pynxtools-spm reader package is modular, which makes it easy to add new readers for other techniques and to extend the existing readers to newer file formats. The reader suite currently hosts multiple readers for STS, STM, and AFM experiments. More readers for other SPM techniques and file formats are expected to be added to this package in the future.

Extend File Formats or Add New Techniques

This is an open-source project, and contributions are welcome. Before proceeding with the steps below, please read the reader structure to understand the modular design of the reader package.

To include a new reader for a technique or extend the reader capability by including other file formats, follow the steps below:

0. Clone and prepare the development environment for pynxtools-spm (follow the installation guide).

1. Go through the reader structure to understand the modular design of the reader package.

2. Create a new parser module in the parsers subpackage that reads the raw data files of the new SPM file format and maps their content to forward-slash (/) separated hierarchical paths (see the Raw Data File section in How to Use the Reader). To read the raw files, you can use a third-party Python package (if one is available for that file format) or write your own code. All parsers should inherit from the SPMBase class in the base_parser module (for reference, see an existing module, e.g., nanonis_sxm.py or nanonis_dat.py).

3. Create a new formatter module in the corresponding subpackage of the nxformatters subpackage. The new formatter class must inherit from a formatter base class (e.g., NanonisBase or OmicronBase in the modules nanonis_base and omicron_base, respectively). By inheriting from the base class, you can reuse existing methods or develop specific methods to curate the unstructured data coming from the raw files and the ELN YAML file, following the instructions given in the config file (please refer to one of the existing formatters, e.g., nanonis_dat_sts). Note that, during data curation, the formatter should strictly follow the application definition of the corresponding SPM technique.

If you are adding a new technique (e.g., Scanning Gate Microscopy (SGM)), the prerequisite step is to develop an application definition in the NeXus definitions repository.

4. Run the converter to test your development. If the raw data is not properly curated according to the application definition, you will encounter warning messages. These warnings indicate which data is missing or does not follow the correct conventions. Check your code, the config file, the ELN file, and the content of those files, and fix the issues one by one until all warning messages have disappeared. Please let us know if you need further assistance.

5. Write test cases for your new parser and formatter modules. This is an easy but important part of the contribution process. Add your test cases to the test_reader module in tests, and include only the necessary input files in a subdirectory of the data subdirectory of the tests directory.

6. Create a pull request (PR) to include your contributions in the main branch of the pynxtools-spm repository. You may create the PR as a draft while development is ongoing and keep us in the discussion loop.

7. We will review your code and provide feedback. Once all changes are finalized, we will merge your code into the main branch of the pynxtools-spm repository and release a new version of the package, including your contributions.
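To make the parser step (step 2) more concrete, the following is a minimal sketch of how nested raw metadata could be flattened into forward-slash separated hierarchical paths. It is not taken from pynxtools-spm: the function name flatten_to_slash_paths, the example header, and its keys are hypothetical, and a real parser would inherit from SPMBase and follow the conventions of the existing parser modules.

```python
from typing import Any, Dict, Mapping


def flatten_to_slash_paths(
    nested: Mapping[str, Any], parent: str = ""
) -> Dict[str, Any]:
    """Flatten a nested mapping into {"/group/subgroup/field": value} entries.

    Hypothetical helper for illustration only; not part of pynxtools-spm.
    """
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        path = f"{parent}/{key}"
        if isinstance(value, Mapping):
            # Descend into sub-groups and keep extending the path.
            flat.update(flatten_to_slash_paths(value, path))
        else:
            # Leaf value: store it under its full slash-separated path.
            flat[path] = value
    return flat


# Hypothetical header, as a vendor library or a custom reader might return it.
raw_header = {
    "Bias": {"value": 0.5, "unit": "V"},
    "Scan": {"pixels": [256, 256], "Range": {"x": 1e-8, "y": 1e-8}},
}

flat_header = flatten_to_slash_paths(raw_header)
for path, value in flat_header.items():
    print(path, "=", value)
```

Keeping the flattening logic in one small function also makes it straightforward to cover in the test cases mentioned in step 5, since the expected paths can be asserted directly against a small input.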