NeXus Definition Language (NXDL), XML, and YAML
The NeXus data format, whose contents are described with the NeXus Definition Language (NXDL), is a joint effort to facilitate data exchange within scientific communities, particularly those engaged in neutron, X-ray, and muon research (see the most recent NeXus paper by Könnecke et al.). The format is also used by the materials science community through the NeXus-FAIRmat proposal, which supports the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles (see FAIR guiding principles) in materials science. The format serves as a standardized framework for both data exchange and storage.
NXDL is the core of the NeXus data format: scientists use it to define the names and the organization of the pieces of information stored in NeXus data files for a specific scientific technique. NXDL is used to define general data structures (base classes), each of which defines the set of terms that might be used in an instance of that class. These base classes are the building blocks for application definitions, which are schemas describing measurement-specific, instrument-specific, or software-specific data storage objects. In an application definition, members and definitions of individual base classes can be used as is or customized. In essence, schema development, whether for a base class or an application definition, means writing an NXDL schema definition file with the extension 'nxdl.xml' in the Extensible Markup Language (XML).
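To make the shape of such a file concrete, the following minimal sketch parses a small, simplified NXDL-like XML snippet with Python's standard library and lists the groups and fields it declares. The definition name NXexample and its contents are hypothetical, and the XML namespace and schema attributes that real NXDL files carry are omitted, so this is an illustration of the structure rather than a valid NeXus definition.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical application definition. Real NXDL files also
# declare an XML namespace and schema location, omitted here for brevity.
NXDL_TEXT = """
<definition name="NXexample" extends="NXobject" type="group" category="application">
  <doc>Hypothetical application definition used only for illustration.</doc>
  <group type="NXentry">
    <field name="title"/>
    <group type="NXinstrument">
      <field name="name"/>
    </group>
  </group>
</definition>
"""


def outline(element, depth=0):
    """Return indented lines describing the groups and fields under element."""
    lines = []
    for child in element:
        if child.tag not in ("group", "field"):
            continue  # skip <doc> and anything else not of interest here
        label = child.get("name") or child.get("type")
        lines.append("  " * depth + f"{child.tag}: {label}")
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(NXDL_TEXT)
summary = outline(root)
print(root.get("name"), root.get("category"))
print("\n".join(summary))
```

Here the nested group elements refer to base classes by type (NXentry, NXinstrument), which is how an application definition reuses base classes as building blocks.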
To expedite schema development, we have recently introduced the use of YAML (YAML Ain't Markup Language, originally Yet Another Markup Language), with a syntax tailored for defining scientific, domain-driven schemas with NXDL. One significant advantage of YAML over XML is its indentation-driven structure, which eliminates the need for start and end tags around each entity in the schema. The YAML format thus reduces repetition of NXDL keywords and offers a more intuitive structure aligned with object-oriented programming concepts such as class inheritance. These benefits are attained without compromising the integrity of the original NeXus schema, which is traditionally expressed in XML.
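The difference between the two styles can be sketched with a nested Python dict standing in for an indentation-based (YAML-style) schema, serialized to XML with the standard library. The "kind(name)" key convention, the helper function build, and the definition name NXexample are all assumptions made for this illustration; they are not the actual YAML syntax or the conversion logic used for NXDL.

```python
import xml.etree.ElementTree as ET

# Nested mapping standing in for an indentation-based (YAML-style) schema.
# The "kind(name)" key convention is invented for this sketch.
SCHEMA = {
    "group(NXentry)": {
        "field(title)": {},
        "group(NXinstrument)": {
            "field(name)": {},
        },
    },
}


def build(parent, mapping):
    """Append one XML element per key, recursing into nested mappings."""
    for key, children in mapping.items():
        kind, _, rest = key.partition("(")
        label = rest.rstrip(")")
        # In this sketch groups carry a 'type' attribute and fields a 'name'.
        attr = "type" if kind == "group" else "name"
        element = ET.SubElement(parent, kind, {attr: label})
        build(element, children)


root = ET.Element("definition", {"name": "NXexample", "category": "application"})
build(root, SCHEMA)
if hasattr(ET, "indent"):  # pretty-printing helper, available in Python 3.9+
    ET.indent(root)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

In the mapping, nesting alone expresses containment, whereas the serialized XML repeats the element name in a closing tag for every group that has children.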
While YAML offers a more readable and concise way to author NeXus definitions, a transcoding mechanism is required because the official format remains XML. The nyaml Python package is a converter designed specifically for this purpose. It converts NXDL-compliant definitions from YAML to XML, making it easier for NeXus schema developers to create and maintain definitions. The tool also converts in the opposite direction, so existing NeXus schemas in XML can be extended by moving back and forth between the two formats. Note that this document does not introduce NeXus data objects, terms, or types, which are fundamental for writing base class or application definition schemas. Readers new to NeXus are referred to the official NeXus website.