2.3.3.3.21. NXapm_paraprobe_clusterer_config

Status:

application definition, extends NXobject

Description:

Application definition for a configuration file of the paraprobe-clusterer tool.

This tool is part of the paraprobe-toolbox. Inspect NXapm_paraprobe_tool_config for details.

Symbols:

The symbols used in the schema to specify e.g. dimensions of arrays.

n_ivec_max: Maximum number of atoms per molecular ion. Should be 32 for paraprobe.

n_clust_algos: Number of clustering algorithms used.

n_ions: Number of different iontypes to distinguish during clustering.

Groups cited:

NXapm_paraprobe_tool_common, NXapm_paraprobe_tool_config, NXcg_cylinder_set, NXcg_ellipsoid_set, NXcg_face_list_data_structure, NXcg_hexahedron_set, NXcg_polyhedron_set, NXcs_filter_boolean_mask, NXcs_profiling, NXentry, NXmatch_filter, NXprocess, NXprogram, NXserialized, NXspatial_filter, NXsubsampling_filter

Structure:

ENTRY: (required) NXentry

definition: (required) NX_CHAR

Obligatory value: NXapm_paraprobe_clusterer_config

@version: (required) NX_CHAR

number_of_tasks: (required) NX_UINT {units=NX_UNITLESS}

Number of cluster_analysis tasks that the tool should execute.

cameca_to_nexus: (optional) NXapm_paraprobe_tool_config

This process maps results from a cluster analysis made with IVAS / AP Suite into an interoperable representation. IVAS / AP Suite usually exports such results as a list of reconstructed ion positions with one cluster label per position. These labels are reported via the mass-to-charge-state-ratio column of what is effectively a binary file that is formatted like a POS file but with the cluster labels written out as floating point numbers.

recover_evaporation_id: (required) NX_BOOLEAN

Specifies whether paraprobe-clusterer should use the evaporation_ids from reconstruction to recover, for each position in the NXserialized results, the closest matching position (within floating point accuracy). This can be useful when users wish to recover the original evaporation_id, which IVAS / AP Suite drops when writing the (.indexed.) POS file with the cluster results that results refers to.
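The position matching described above can be illustrated with a minimal sketch. It is not the tool's implementation: the function names are hypothetical, evaporation_ids are assumed to be consecutive integers that follow the order of the reconstructed positions (starting at first_id), and an exact comparison after rounding to single precision stands in for "closest matching position within floating point accuracy".

```python
import struct


def to_float32(value):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">f", value))[0]


def recover_evaporation_ids(recon_positions, result_positions, first_id=1):
    """Return, for each result position, the assumed evaporation_id of the
    identical reconstructed position, or None if no position matches."""
    lookup = {}
    for offset, xyz in enumerate(recon_positions):
        key = tuple(to_float32(c) for c in xyz)
        # Keep the first occurrence if two ions share the same position.
        lookup.setdefault(key, first_id + offset)
    return [
        lookup.get(tuple(to_float32(c) for c in xyz))
        for xyz in result_positions
    ]


# Hypothetical positions in nm; the results list is in a different order.
recon = [(0.1, 0.2, 0.3), (1.0, 2.0, 3.0), (4.5, 5.5, 6.5)]
results = [(4.5, 5.5, 6.5), (0.1, 0.2, 0.3), (9.0, 9.0, 9.0)]
recovered = recover_evaporation_ids(recon, results)
```

A dictionary lookup keeps the sketch short; a spatial index would be a more natural choice if a tolerance-based nearest-neighbour match were needed.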

reconstruction: (required) NXserialized

type: (required) NX_CHAR

path: (required) NX_CHAR

checksum: (required) NX_CHAR

algorithm: (required) NX_CHAR

position: (required) NX_CHAR

mass_to_charge: (required) NX_CHAR

results: (required) NXserialized

File with the results of the cluster analyses that were computed with IVAS / AP Suite (e.g. with the maximum-separation method clustering algorithm, J. Hyde et al.). The information is stored in an improper (.indexed.) POS file as a matrix of floating point quadruplets, one quadruplet for each ion. The first three values of each quadruplet encode the position of the ion. The fourth value is the integer identifier of the cluster encoded as a floating point number.
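As an illustration of the quadruplet layout described above, the following sketch decodes such a byte stream. It assumes the conventional POS encoding of big-endian 32-bit floats, which this definition does not state explicitly, and the function name read_indexed_pos is hypothetical.

```python
import struct


def read_indexed_pos(raw):
    """Decode the bytes of an (.indexed.) POS file into ion positions
    and integer cluster labels."""
    if len(raw) % 16 != 0:
        raise ValueError("expected a whole number of 16-byte quadruplets")
    positions, labels = [], []
    # Assumption: each quadruplet is four big-endian float32 values.
    for x, y, z, label in struct.iter_unpack(">4f", raw):
        positions.append((x, y, z))
        # The fourth value is an integer identifier stored as a float.
        labels.append(int(round(label)))
    return positions, labels


# Two synthetic ions, the first with cluster label 5 and the second with 0.
raw = struct.pack(">8f", 1.0, 2.0, 3.0, 5.0, 4.0, 5.0, 6.0, 0.0)
positions, labels = read_indexed_pos(raw)
```

For a real file the bytes would be obtained with open(path, "rb").read(), where path is the value of the path field of results.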

type: (required) NX_CHAR

path: (required) NX_CHAR

checksum: (required) NX_CHAR

algorithm: (required) NX_CHAR

cluster_analysisID: (optional) NXapm_paraprobe_tool_config

This process performs a cluster analysis on a reconstructed dataset or a ROI within it.

ion_type_filter: (required) NX_CHAR

How should iontypes be considered during the cluster analysis.

The value resolve_all will set an ion active in the analysis regardless of which iontype it is.

The value resolve_unknown will set an ion active when it is of the UNKNOWNTYPE.

The value resolve_ion will set an ion active if it is of the specific iontype, regardless of its nuclide structure.

The value resolve_element will set an ion active and count it as many times as the (molecular) ion contains atoms of elements in the whitelist ion_query_nuclide_vector.

The value resolve_isotope will set an ion active and count it as many times as the (molecular) ion contains nuclides in the whitelist ion_query_nuclide_vector.

In effect, ion_query_nuclide_vector acts as a whitelist to filter which ions are considered as source ions of the correlation statistics and how the multiplicity of each ion will be factorized.

This is relevant because in atom probe a molecular ion with more than one nuclide, say Ti O for example, is potentially counted several times, because at its (reconstructed) position it is assumed that there was both a Ti and an O atom. This multiplicity affects the size of the feature and its chemical composition.

Obligatory value: resolve_element
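The multiplicity that resolve_element assigns to an ion can be sketched as follows. This is an illustration only: ions are represented here as plain lists of proton numbers, which is a simplification of the nuclide vectors used by the tool, and the function name element_multiplicity is hypothetical.

```python
def element_multiplicity(ion_proton_numbers, whitelist_proton_numbers):
    """Count how many atoms of a (molecular) ion belong to whitelisted
    elements. A result of 0 means the ion is not active."""
    # Zeros are treated as padding of the whitelist and never match an atom.
    whitelist = set(whitelist_proton_numbers) - {0}
    return sum(1 for z in ion_proton_numbers if z in whitelist)


TI, O, FE = 22, 8, 26  # proton numbers

tio_for_ti = element_multiplicity([TI, O], [TI])        # Ti O, whitelist Ti
tio_for_both = element_multiplicity([TI, O], [TI, O])   # Ti O, whitelist Ti and O
tio2_for_o = element_multiplicity([TI, O, O], [O])      # Ti O O, whitelist O
fe_for_both = element_multiplicity([FE], [TI, O])       # Fe, whitelist Ti and O
```

With a whitelist of Ti and O, a Ti O ion is counted twice at its reconstructed position, which is the effect on feature size and composition that the description refers to.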

ion_query_nuclide_vector: (required) NX_UINT (Rank: 2, Dimensions: [n_ions, n_ivec_max]) {units=NX_UNITLESS}

Matrix of nuclide vectors, with as many rows as there are different candidate iontypes to distinguish as possible source iontypes. In the simplest case, each row contains only the proton number of the element, with all other values set to zero.
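For the simplest case mentioned above, the matrix could be assembled as in the following sketch. Placing the proton number in the first column of each row is an assumption, and build_nuclide_matrix is a hypothetical helper; n_ivec_max is taken as 32, as the Symbols section suggests for paraprobe.

```python
N_IVEC_MAX = 32  # value of n_ivec_max suggested for paraprobe


def build_nuclide_matrix(proton_numbers):
    """Return an n_ions x n_ivec_max matrix (list of rows) with one
    element per row and all remaining entries set to zero."""
    matrix = []
    for z in proton_numbers:
        row = [0] * N_IVEC_MAX
        row[0] = z  # assumption: the proton number occupies the first column
        matrix.append(row)
    return matrix


# Two candidate source iontypes: titanium (22) and oxygen (8).
matrix = build_nuclide_matrix([22, 8])
```

The resulting list of rows has the shape [n_ions, n_ivec_max] = [2, 32] and can be converted to an unsigned integer array before it is written to the configuration file.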

reconstruction: (required) NXserialized

type: (required) NX_CHAR

path: (required) NX_CHAR

checksum: (required) NX_CHAR

algorithm: (required) NX_CHAR

position: (required) NX_CHAR

mass_to_charge: (required) NX_CHAR

ranging: (required) NXserialized

type: (required) NX_CHAR

path: (required) NX_CHAR

checksum: (required) NX_CHAR

algorithm: (required) NX_CHAR

ranging_definitions: (required) NX_CHAR

surface_distance: (optional) NXserialized

Distance between each ion and triangulated surface mesh.

type: (required) NX_CHAR

path: (required) NX_CHAR

checksum: (required) NX_CHAR

algorithm: (required) NX_CHAR

distance: (required) NX_CHAR

spatial_filter: (required) NXspatial_filter

windowing_method: (required) NX_CHAR

hexahedron_set: (optional) NXcg_hexahedron_set

dimensionality: (required) NX_POSINT

cardinality: (required) NX_POSINT

identifier_offset: (required) NX_INT

hexahedra: (required) NXcg_face_list_data_structure

vertices: (required) NX_UINT

cylinder_set: (optional) NXcg_cylinder_set

dimensionality: (required) NX_POSINT

cardinality: (required) NX_POSINT

identifier_offset: (required) NX_INT

center: (required) NX_NUMBER

height: (required) NX_NUMBER

radii: (required) NX_NUMBER

ellipsoid_set: (optional) NXcg_ellipsoid_set

dimensionality: (required) NX_POSINT

cardinality: (required) NX_POSINT

identifier_offset: (required) NX_INT

center: (required) NX_NUMBER

half_axes_radii: (required) NX_NUMBER

orientation: (required) NX_NUMBER

polyhedron_set: (optional) NXcg_polyhedron_set

bitmask: (optional) NXcs_filter_boolean_mask

number_of_objects: (required) NX_UINT

bitdepth: (required) NX_UINT

mask: (required) NX_UINT

evaporation_id_filter: (optional) NXsubsampling_filter

min_incr_max: (required) NX_INT

iontype_filter: (optional) NXmatch_filter

method: (required) NX_CHAR

match: (required) NX_NUMBER

hit_multiplicity_filter: (optional) NXmatch_filter

method: (required) NX_CHAR

match: (required) NX_NUMBER

dbscan: (required) NXprocess

Settings for DBScan clustering algorithm. For original details about the algorithm and (performance-relevant) details consider:

For details about how the DBScan algorithm underlies the specific modification known as the maximum-separation method in the atom probe community, consider E. Jägle et al.

high_throughput_method: (required) NX_CHAR

Strategy for how a set of cluster analyses with different parameters is executed:

  • For tuple as many runs are performed as parameter values have been defined.

  • For combinatorics individual parameter arrays are looped over.

As an example we may provide ten entries for eps and three entries for min_pts. If high_throughput_method is set to tuple, the analysis is invalid because we have an insufficient number of min_pts values to pair them with our ten eps values. By contrast, if high_throughput_method is set to combinatorics, the tool will run three individual min_pts runs for each eps value, resulting in a total of 30 analyses.

A typical example from the literature (M. Kühbach et al.) can be configured by setting eps to the array of values np.linspace(0.2, 5.0, num=241, endpoint=True), providing one min_pts value equal to 1, and setting high_throughput_method to combinatorics.

Any of these values: tuple | combinatorics
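The two strategies described for high_throughput_method can be sketched as follows. The function compose_runs is a hypothetical illustration, not part of paraprobe-clusterer: tuple pairs the parameter arrays element by element and rejects arrays of different length, while combinatorics forms every combination.

```python
from itertools import product


def compose_runs(method, **parameters):
    """Return one dict of parameter values per analysis run."""
    names = list(parameters)
    arrays = [list(parameters[name]) for name in names]
    if method == "tuple":
        # Element-wise pairing requires equally long parameter arrays.
        if len({len(values) for values in arrays}) != 1:
            raise ValueError("tuple requires arrays of equal length")
        combos = zip(*arrays)
    elif method == "combinatorics":
        # Every value of one array is combined with every value of the others.
        combos = product(*arrays)
    else:
        raise ValueError(f"unknown high_throughput_method: {method}")
    return [dict(zip(names, combo)) for combo in combos]


# Ten eps values and three min_pts values, as in the example above.
eps = [0.2 + 0.1 * i for i in range(10)]
min_pts = [1, 2, 3]
runs = compose_runs("combinatorics", eps=eps, min_pts=min_pts)

# Literature-style scan: 241 eps values from 0.2 to 5.0 and min_pts = 1.
scan = compose_runs(
    "combinatorics",
    eps=[0.2 + 0.02 * i for i in range(241)],
    min_pts=[1],
)
```

Calling compose_runs("tuple", eps=eps, min_pts=min_pts) with the ten eps and three min_pts values raises a ValueError, which mirrors the invalid configuration described above.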

eps: (required) NX_FLOAT (Rank: 1, Dimensions: [i]) {units=NX_LENGTH}

Array of epsilon (eps) parameter values.

min_pts: (required) NX_UINT (Rank: 1, Dimensions: [j]) {units=NX_UNITLESS}

Array of minimum points (min_pts) parameter values.

hdbscan: (required) NXprocess

Settings for the HPDBScan clustering algorithm.

See the hdbscan documentation for details about the parameters. Here we use the terminology of that documentation.

high_throughput_method: (required) NX_CHAR

Strategy for how runs with different parameter values are composed, following the explanation for high_throughput_method of dbscan.

Any of these values: tuple | combinatorics

min_cluster_size: (required) NX_NUMBER (Rank: 1, Dimensions: [i]) {units=NX_ANY}

Array of min_cluster_size parameter values.

min_samples: (required) NX_NUMBER (Rank: 1, Dimensions: [j]) {units=NX_ANY}

Array of min_samples parameter values.

cluster_selection_epsilon: (required) NX_NUMBER (Rank: 1, Dimensions: [k]) {units=NX_ANY}

Array of cluster_selection_epsilon parameter values.

alpha: (required) NX_NUMBER (Rank: 1, Dimensions: [m]) {units=NX_ANY}

Array of alpha parameter values.

common: (required) NXapm_paraprobe_tool_common

status: (required) NX_CHAR

programID: (required) NXprogram

program: (required) NX_CHAR

@version: (required) NX_CHAR

profiling: (recommended) NXcs_profiling

start_time: (required) NX_DATE_TIME

end_time: (required) NX_DATE_TIME

total_elapsed_time: (required) NX_FLOAT

current_working_directory: (required) NX_CHAR

NXDL Source:

https://github.com/FAIRmat-NFDI/nexus_definitions/tree/fairmat/contributed_definitions/NXapm_paraprobe_clusterer_config.nxdl.xml