2.3.3.3.4. Atom Probe Microscopy

Introduction

For the atom probe tomography use case, the community not only contributed the NXapm application definition but also explored how several instances of NXprocess can document the many data processing steps that are typical in atom probe research. These steps investigate structural features of the material and demand reconstructions, i.e., models of the crystal and its defect network obtained by analyzing the collective position data of atoms. The following definitions summarize the status quo of how NeXus can be used to document these processing steps, to improve numerical reproducibility, and to assist researchers with documenting procedural aspects of their data analysis workflows.

Base Classes

The processing steps of ranging and reconstructing are documented as two specializations of NXprocess:

NXapm_ranging

Metadata to ranging definitions made for a dataset in atom probe microscopy.

NXapm_reconstruction

Metadata of a dataset (tomographic reconstruction) in atom probe microscopy.

Spatial and other types of filters are frequently used in atom probe to select specific atom positions or portions of the data based on isotopic identity. They are modeled as base classes for filters, which are defined in an atom-probe-agnostic way to enable reuse:

NXdelocalization

Base class to describe the delocalization of point-like objects on a grid.

NXisocontour

Computational geometry description of isocontouring/phase-fields in Euclidean space.

NXmatch_filter

Base class to filter ions based on their type or on other properties like hit multiplicity.

NXspatial_filter

Base class to filter based on position. This base class takes advantage of NXcg_ellipsoid, NXcg_cylinder, and NXcg_hexahedron, which are base classes to describe commonly used geometric primitives (not only) in atom probe. The primitives are used for defining the shape and extent of a region of interest (ROI, see NXroi_process) of material.

NXsubsampling_filter

Base class for a filter that can also be used for specifying how entries like ions can be filtered via sub-sampling.
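The following minimal Python sketch illustrates the kind of selection logic that these filter base classes document. It is not an implementation of the NeXus schema: the names Ion, subsample, match_types, and inside_box, the whitelist/blacklist keyword, and the axis-aligned box as a stand-in for a hexahedron are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ion:
    """Hypothetical stand-in for one reconstructed ion."""
    index: int
    position: tuple  # (x, y, z), e.g. in nm
    ion_type: str


def subsample(ions, start, step, stop):
    """Sub-sampling filter: keep every step-th ion with start <= index <= stop."""
    return [ion for ion in ions
            if start <= ion.index <= stop and (ion.index - start) % step == 0]


def match_types(ions, values, method="whitelist"):
    """Match filter: keep ions whose type is listed (whitelist) or not listed (blacklist)."""
    if method not in ("whitelist", "blacklist"):
        raise ValueError(f"unknown method: {method}")
    keep_if_listed = method == "whitelist"
    return [ion for ion in ions if (ion.ion_type in values) == keep_if_listed]


def inside_box(ions, lower, upper):
    """Spatial filter: keep ions inside or on the boundary of an axis-aligned box."""
    return [ion for ion in ions
            if all(lo <= c <= hi for lo, c, hi in zip(lower, ion.position, upper))]


# Ten synthetic ions along the x-axis with cycling ion types (made-up data).
ions = [Ion(i, (float(i), 0.0, 0.0), ("Al", "Cu", "Zn")[i % 3]) for i in range(10)]

# Chain the filters: every second ion, drop Zn, keep 1 <= x <= 7.
selected = inside_box(
    match_types(subsample(ions, 0, 2, 9), {"Zn"}, method="blacklist"),
    lower=(1.0, -1.0, -1.0),
    upper=(7.0, 1.0, 1.0),
)
selected_indices = [ion.index for ion in selected]
```

Because each filter takes and returns a plain list of ions, the filters can be chained in any order, which mirrors why they are defined independently of atom probe specifics.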

Tools and applications in APM

Several research software tools exist in the APM community to analyze APM data. One of these is the paraprobe-toolbox, which is developed by M. Kühbach et al.

The paraprobe-toolbox is an example of an open-source parallelized software for analyzing point cloud data, for assessing meshes in 3D continuum space, and for studying the effects of parameterization on descriptors of micro- and nanoscale structural features (crystal defects).

A set of contributed application definitions describes each computational step performed by the tools of the paraprobe-toolbox. This is a blueprint of how NeXus can also be used to document the computational steps of other software tools (including commercial ones).

Thorough documentation of the tools was motivated by two needs:

First, users of software would like to better understand, and be able to study for themselves, which individual parameters and settings exist for each tool and how configuring these affects analyses quantitatively. This is the aspect of improving documentation.

Second, scientific software like paraprobe-toolbox implements workflows with numerics and algorithms that process data from multiple input sources (like previous analysis results), and carry these data through multiple steps inside the tool. The tool then creates output as files. This provenance and workflow should be documented for reproducibility (the “R” of the FAIR principles of data stewardship).

Individual tools of the paraprobe-toolbox are developed in C/C++ or Python. Each tool follows a workflow of three steps:

1. The creation of a configuration file.

2. The actual analysis using a given Python or C/C++ tool from the toolbox, with results summarized in a results file.

3. The optional analysis/visualization of the results based on data in the NeXus/HDF5 files generated by each tool.

Data and metadata between the tools are exchanged via NeXus/HDF5 files. This means that data inside HDF5 binary containers are named, formatted, and hierarchically structured according to NeXus application definitions. These definitions are specific for each tool:

For example, the application definition NXapm_paraprobe_surfacer_config specifies the expected data formatting of a configuration file for the paraprobe-surfacer tool. The application definition defines which parameters are expected, which of these are optional, and whether specific cardinality constraints exist. Each config file defines a controlled vocabulary of terms. The config files store a SHA256 checksum for each input file, thereby implementing an uninterrupted provenance tracking chain that encodes the computational workflow.
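As a minimal sketch of the checksum idea, the following Python code computes the SHA256 digest of a file with the standard library. The function names sha256_of_file and describe_input, and the keys of the returned dictionary, are hypothetical and do not reflect the actual layout of a paraprobe-toolbox config file.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA256 digest of the file at path.

    The file is read in chunks so that large HDF5 files
    do not have to fit into memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_input(path):
    """Build a hypothetical config entry that pins an input file by checksum."""
    return {
        "path": str(path),
        "algorithm": "sha256",
        "checksum": sha256_of_file(path),
    }
```

A downstream tool can recompute the digest of the file it actually opens and compare it with the stored value; a mismatch indicates that the input is not the file the configuration refers to.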

As an example, a user may first range their reconstruction and then compute spatial correlation functions. The config file for the ranging tool stores the files which hold the reconstructed ion positions and the ranging definitions. The ranging tool generates a results file with the labels of each molecular ion. This results file is formatted according to the tool-specific results application definition. The generated results file and the reconstruction are then imported by the spatial statistics tool, which again keeps track of all files and reports its results in a spatial statistics tool results file.

This design makes it possible to rigorously trace which numerical results were achieved with specific inputs and settings using specifically-versioned tools, including Y-junctions in the workflow graph where multiple input sources are combined.
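The ranging and spatial statistics example can be sketched as a small provenance graph. This is an illustration under stated assumptions only: Artifact, run_tool, trace, and the in-memory payloads are hypothetical stand-ins for NeXus/HDF5 files and do not mirror how the paraprobe-toolbox stores or resolves checksums.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """Hypothetical stand-in for a file: a name, its content, and recorded inputs."""
    name: str
    payload: bytes
    inputs: dict = field(default_factory=dict)  # input name -> recorded SHA256

    @property
    def sha256(self):
        return hashlib.sha256(self.payload).hexdigest()


def run_tool(name, payload, *sources):
    """Create a results artifact that records the checksum of every input."""
    return Artifact(name, payload, {s.name: s.sha256 for s in sources})


def trace(artifact, catalog):
    """Return the sorted names of all upstream artifacts.

    Raises ValueError if an input no longer matches its recorded checksum.
    """
    seen = []
    stack = [artifact]
    while stack:
        current = stack.pop()
        for name, recorded in current.inputs.items():
            source = catalog[name]
            if source.sha256 != recorded:
                raise ValueError(f"{name} changed after {current.name} was created")
            if name not in seen:
                seen.append(name)
                stack.append(source)
    return sorted(seen)


# Two primary inputs, then ranging, then spatial statistics (made-up payloads).
recon = Artifact("reconstruction", b"ion positions")
ranging = Artifact("ranging_definitions", b"ranges")
ranger = run_tool("ranger_results", b"ion labels", recon, ranging)
# Y-junction: spatstat combines the ranger results with the reconstruction.
spatstat = run_tool("spatstat_results", b"correlation functions", ranger, recon)

catalog = {a.name: a for a in (recon, ranging, ranger, spatstat)}
lineage = trace(spatstat, catalog)
```

Here lineage lists every artifact that contributed to the spatial statistics results, and the reconstruction is reached along two paths but reported once. Changing the payload of any upstream artifact makes trace raise an error, which is the property that the stored checksums are meant to provide.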

Concepts that are used in multiple tools are inherited from the following tool-agnostic application definitions:

NXapm_paraprobe_tool_config, NXapm_paraprobe_tool_results:

Configuration and results respectively.

Internally, these inherit from several other tool-agnostic base classes adding atom-probe-research-specific concepts:

NXapm_paraprobe_tool_parameters, NXapm_paraprobe_tool_process, NXapm_paraprobe_tool_common:

Parameters, processing-specific data, and common parts respectively, which are useful for the application definitions of the tools of the paraprobe-toolbox.

Application Definitions

In summary, these are the proposed pairs for all tools in the paraprobe-toolbox:

NXapm_paraprobe_ranger_config, NXapm_paraprobe_ranger_results

Configuration and results respectively of the paraprobe-ranger tool. Apply ranging definitions and explore possible molecular ions. Store applied ranging definitions and combinatorial analyses of possible ion types.

NXapm_paraprobe_surfacer_config, NXapm_paraprobe_surfacer_results

Configuration and results respectively of the paraprobe-surfacer tool. Create a model for the edge of a point cloud via convex hulls, alpha shapes, or alpha-wrappings. Store triangulated surface meshes of models for the edge of a dataset.

NXapm_paraprobe_distancer_config, NXapm_paraprobe_distancer_results

Configuration and results respectively of the paraprobe-distancer tool. Compute and store analytical distances from ions to a set of triangles.

NXapm_paraprobe_tessellator_config, NXapm_paraprobe_tessellator_results

Configuration and results respectively of the paraprobe-tessellator tool. Compute and store Voronoi cells and properties of these for all ions in a dataset.

NXapm_paraprobe_selector_config, NXapm_paraprobe_selector_results

Configuration and results respectively of the paraprobe-selector tool. Define complex spatial regions-of-interest to filter reconstructed datasets. Store which points are inside or on the boundary of complex spatial regions-of-interest.

NXapm_paraprobe_spatstat_config, NXapm_paraprobe_spatstat_results

Configuration and results respectively of the paraprobe-spatstat tool. Compute spatial statistics on the entire or selected regions of the reconstructed dataset.

NXapm_paraprobe_nanochem_config, NXapm_paraprobe_nanochem_results

Configuration and results respectively of the paraprobe-nanochem tool. Compute delocalization, iso-surfaces, analyze 3D objects, composition profiles, and mesh interfaces.

NXapm_paraprobe_clusterer_config, NXapm_paraprobe_clusterer_results

Configuration and results respectively of the paraprobe-clusterer tool. Compute cluster analyses with established machine learning algorithms using CPUs or GPUs.

NXapm_paraprobe_intersector_config, NXapm_paraprobe_intersector_results

Configuration and results respectively of the paraprobe-intersector tool. Analyze volumetric intersections and proximity of 3D objects discretized as triangulated surface meshes in continuum space to study the effect of the parameterization of surface extraction algorithms on the resulting shape, spatial arrangement, and colocation of 3D objects via graph-based techniques.

Joint work of the German NFDI consortia NFDI-MatWerk and FAIRmat

Members of the FAIRmat and NFDI-MatWerk consortia of the German National Research Data Infrastructure are working together within the Infrastructure Use Case IUC09 of the NFDI-MatWerk project on examples of how software tools in both consortia can become better documented and more interoperable. Within this project, we added the CompositionSpace tool by A. Saxena et al., which has been developed at the Max Planck Institute for Sustainable Materials in Düsseldorf, using the above-mentioned approach of pairs of application definitions:

NXapm_compositionspace_config, NXapm_compositionspace_results

Configuration and results respectively of a run with Alaukik Saxena’s CompositionSpace tool.