Tool Reference

The agent chooses tools from your prompt, but you can also name the tool directly when you know the analysis you want. Most tools require a parsed dataset_id from parse_data; KMC growth simulation does not.

Data Tools

Tool             | Use it for                                           | Key outputs
-----------------|------------------------------------------------------|------------------------------------------------------------------------
parse_data       | Load supported lab files into Matter42               | dataset_id, dataset kind, data type, metadata summary, cached status
explore_data     | First-pass view of any parsed dataset                | Interactive spectra, spatial maps, statistics, figure IDs
plot_spectrum    | Plot a single spectrum or spectrum-like table        | Spectrum figure, peak positions, overlay support
register_dataset | Register a canonical HDF5 from the guided upload path | dataset_id for exotic formats

parse_data is the default parser for supported files. It auto-detects LabSpec maps, single spectra, tables, documents, and images. Use explore_data once after parsing; the default view returns interactive, map, and statistics figures together.

Defect Clustering

cluster_defects identifies spectroscopically distinct pixel populations. It works on PL, Raman, or paired PL/Raman maps.

Inputs that matter:

  • dataset_id: primary PL or Raman dataset.
  • aux_dataset_id: optional companion map from the same sample.
  • n_clusters: fixed cluster count, or omit for automatic selection from 2-4 clusters.
  • features: optional subset of extracted features.
  • region_mode, boundary_buffer_um, selection_policy, and min_component_area_um2: region controls.

How it works: the tool extracts physics-informed features, standardizes them, reduces them to principal components for visualization, and clusters with KMeans. PL features include intensity, peak position, FWHM, and asymmetry. Raman features include E2g and A1g intensity, center, FWHM, asymmetry where available, and A1g/E2g ratio.
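The standardize-then-cluster steps described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the tool's implementation: the function names `standardize` and `kmeans` are hypothetical, the feature rows are made-up numbers, the PCA step is omitted because the text describes it as a visualization aid, and the deterministic farthest-point initialization is an assumption chosen to keep the example reproducible.

```python
import math


def standardize(rows):
    """Scale every feature column to zero mean and unit variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(col) / n for col in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
        for col, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, n_iter=20):
    """Lloyd's algorithm with farthest-point initialization (an assumption).

    Returns one integer cluster label per input point.
    """
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center wins.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical PL features per pixel: (intensity, peak position in eV, FWHM in eV).
pixels = [
    [1000, 1.850, 0.050], [980, 1.850, 0.051], [1020, 1.851, 0.049],
    [300, 1.800, 0.090], [320, 1.801, 0.088], [290, 1.799, 0.091],
]
labels = kmeans(standardize(pixels), k=2)
```

Standardizing first matters because the raw features live on very different scales; without it, intensity would dominate every distance. The resulting labels are only population indices, which is why they should not be read as chemical identities.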

Use it to answer: "How many spectral populations are present, where are they, and how do their mean spectra differ?" Do not treat cluster labels as chemical identities by themselves.

Defect Density

estimate_defect_density estimates an equivalent defect percentage from Raman E2g linewidth broadening.

Inputs that matter:

  • dataset_id: Raman dataset.
  • material: MoS2 or WS2, if auto-detection is uncertain.
  • defect_type: optional specific calibration series.
  • instrument_fwhm: spectrometer contribution in cm^-1, subtracted in quadrature.
  • pl_dataset_id: optional PL map for quenching correlation.
  • Region controls.

How it works: the tool measures E2g FWHM per pixel, corrects instrumental broadening, and inverts the linewidth against MLIP calibration curves. It returns a density map, summary statistics, calibration candidates, and figures.
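As a rough illustration of the two numeric steps above, the sketch below subtracts the instrument linewidth in quadrature and then inverts a calibration curve by linear interpolation. The function names and the calibration points are hypothetical placeholders, not the MLIP curves the tool uses, and quadrature subtraction is strictly valid only when both contributions are close to Gaussian.

```python
import math


def intrinsic_fwhm(measured_fwhm, instrument_fwhm):
    """Remove the instrument contribution in quadrature (all values in cm^-1)."""
    if measured_fwhm <= instrument_fwhm:
        return 0.0  # Line is not resolved beyond the instrument response.
    return math.sqrt(measured_fwhm ** 2 - instrument_fwhm ** 2)


def invert_calibration(fwhm, curve):
    """Map a linewidth to a defect percentage by linear interpolation.

    `curve` is a list of (fwhm, percent) pairs sorted by increasing fwhm.
    Values outside the curve are clamped to its end points.
    """
    if fwhm <= curve[0][0]:
        return curve[0][1]
    for (w0, d0), (w1, d1) in zip(curve, curve[1:]):
        if w0 <= fwhm <= w1:
            return d0 + (d1 - d0) * (fwhm - w0) / (w1 - w0)
    return curve[-1][1]


# Hypothetical calibration points: (E2g FWHM in cm^-1, equivalent defect %).
curve = [(3.0, 0.0), (4.0, 1.0), (6.0, 3.0)]

corrected = invert_calibration(intrinsic_fwhm(5.0, 3.0), curve)  # uses FWHM 4.0
uncorrected = invert_calibration(5.0, curve)                     # uses raw FWHM 5.0
```

With these placeholder numbers the uncorrected linewidth maps to twice the corrected density, which shows why supplying instrument_fwhm matters.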

Interpretation: this is a model-dependent equivalent concentration, not a direct atom count. Raw linewidths without instrument correction will overestimate density.

Defect Type Classification

classify_defect_type ranks which calibrated defect family best matches a Raman fingerprint.

Inputs that matter:

  • dataset_id: Raman dataset.
  • material: MoS2 or WS2, if needed.
  • instrument_fwhm: spectrometer FWHM correction.
  • pl_dataset_id: optional PL evidence.
  • Region controls.

How it works: the tool uses the measured E2g linewidth to place the sample on each candidate calibration curve, predicts the associated Raman observables, and ranks candidates by weighted residual. A1g center shift is a strong discriminator; E2g/A1g asymmetry can help separate vacancy-like and substitution-like disorder.

Calibrated candidates currently include chalcogen vacancy, metal vacancy, S->O substitution, and S->C substitution for MoS2 and WS2. Treat the result as a ranked hypothesis over known calibration families.
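The ranking step can be pictured as a weighted sum of squared residuals between measured and predicted observables. The sketch below is a minimal illustration under stated assumptions: `rank_candidates`, the observable keys, the weights, and all numeric values are hypothetical, and the tool's actual residual definition may differ.

```python
def rank_candidates(measured, predictions, weights):
    """Return (candidate, score) pairs sorted by weighted squared residual.

    The lowest score is the best match. Only observables listed in
    `weights` contribute to the score.
    """
    scores = {}
    for name, predicted in predictions.items():
        scores[name] = sum(
            weights[key] * (measured[key] - predicted[key]) ** 2
            for key in weights
        )
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical observables: A1g center shift (cm^-1) and a line asymmetry value.
measured = {"a1g_shift": -0.8, "asymmetry": 0.10}
predictions = {
    "chalcogen_vacancy": {"a1g_shift": -0.7, "asymmetry": 0.12},
    "metal_vacancy": {"a1g_shift": 0.5, "asymmetry": 0.05},
}
# A larger weight on the A1g shift reflects its role as a strong discriminator.
weights = {"a1g_shift": 1.0, "asymmetry": 0.5}

ranking = rank_candidates(measured, predictions, weights)
```

Because every candidate receives a score, the output is a full ordering rather than a single verdict, which matches the advice to treat the result as a ranked hypothesis.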

Defect Activity From PL

estimate_defect_activity estimates which defects are electronically active, not just structurally present.

Inputs that matter:

  • dataset_id: PL dataset.
  • material: MoS2, WS2, WSe2, or MoSe2, if it cannot be inferred.
  • raman_dataset_id: optional companion Raman map for density correlation.
  • Region controls.

How it works: the tool combines PL quenching, trion/exciton enhancement, and sub-gap emission into a 0-1 activity score. Higher scores indicate stronger non-radiative recombination or trap-like electronic behavior.
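One simple way to fold three indicators into a single 0-1 score is a normalized weighted average of clipped components. The sketch below assumes exactly that; the function name `activity_score`, the weights, and the premise that each input is already scaled to 0-1 are illustrative assumptions, not the tool's documented formula.

```python
def clip01(value):
    """Clamp a value into the closed interval [0, 1]."""
    return max(0.0, min(1.0, value))


def activity_score(quenching, trion_enhancement, subgap_fraction,
                   weights=(0.5, 0.25, 0.25)):
    """Combine three 0-1 indicators into one 0-1 activity score.

    The default weights are placeholders chosen for illustration only.
    """
    parts = (
        clip01(quenching),
        clip01(trion_enhancement),
        clip01(subgap_fraction),
    )
    return sum(w * p for w, p in zip(weights, parts)) / sum(weights)


# Strong quenching, mild trion enhancement, no sub-gap emission.
score = activity_score(0.8, 0.2, 0.0)
```

Dividing by the sum of the weights keeps the result inside 0-1 even if the weights are later changed.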

Use it when device relevance matters. Some defects strongly broaden Raman modes but are less electronically active; others produce severe PL quenching.

Region Segmentation

segment_regions labels every pixel as interior, transition, or damaged.

Inputs that matter:

  • dataset_id: PL or Raman hyperspectral map.
  • aux_dataset_id: optional second dataset; when given, segmentation is restricted to the zone where the two maps overlap.
  • boundary_buffer_um: erosion distance separating clean interior from boundary halo.
  • selection_policy: all connected components or only the largest component.
  • min_component_area_um2: area threshold for dropping small islands.

Use this before density, classification, or clustering when the sample includes PFIB damage, holes, scratches, missing pixels, or boundary artifacts.
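The role of boundary_buffer_um can be illustrated with a small binary-erosion sketch. It assumes a boolean mask of usable pixels is already available, converts the buffer to a whole number of pixels, and treats the map edge as valid so that only damaged pixels create a halo. The function names, the `pixel_size_um` argument, and the 4-connected neighbourhood are assumptions made for this example.

```python
def erode(mask):
    """One-pixel binary erosion using the 4-connected neighbourhood.

    Positions outside the map are treated as valid (an assumption), so the
    map edge alone does not shrink the interior.
    """
    rows, cols = len(mask), len(mask[0])

    def is_valid(r, c):
        if not (0 <= r < rows and 0 <= c < cols):
            return True
        return bool(mask[r][c])

    return [
        [
            is_valid(r, c)
            and is_valid(r - 1, c) and is_valid(r + 1, c)
            and is_valid(r, c - 1) and is_valid(r, c + 1)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def segment(valid_mask, boundary_buffer_um, pixel_size_um):
    """Label each pixel as 'interior', 'transition', or 'damaged'."""
    steps = round(boundary_buffer_um / pixel_size_um)
    interior = valid_mask
    for _ in range(steps):
        interior = erode(interior)
    return [
        [
            "interior" if inner else "transition" if valid else "damaged"
            for valid, inner in zip(valid_row, interior_row)
        ]
        for valid_row, interior_row in zip(valid_mask, interior)
    ]


# 5x5 map with a single damaged pixel in the center, 1 um pixels, 1 um buffer.
valid = [[True] * 5 for _ in range(5)]
valid[2][2] = False
labels = segment(valid, boundary_buffer_um=1.0, pixel_size_um=1.0)
```

A larger buffer widens the transition halo and shrinks the interior, so downstream statistics are drawn from pixels farther from the damage.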

KMC Growth Simulation

request_kmc_params opens the interactive KMC controls in chat. run_kmc executes a fully specified run.

Supported run_kmc parameters:

  • material: MoS2, WS2, or WSe2.
  • T: 700-1200 K.
  • ratio: chalcogen-to-metal flux ratio from 5-200.
  • n_seeds: 1-10 nucleation seeds.
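Before submitting a run, the ranges listed above can be checked with a small validator. The sketch below is a hypothetical client-side helper, not part of Matter42; only the allowed values and ranges are taken from the list above.

```python
ALLOWED_MATERIALS = {"MoS2", "WS2", "WSe2"}


def validate_kmc_params(material, T, ratio, n_seeds):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if material not in ALLOWED_MATERIALS:
        errors.append(f"material must be one of {sorted(ALLOWED_MATERIALS)}")
    if not 700 <= T <= 1200:
        errors.append("T must be between 700 and 1200 K")
    if not 5 <= ratio <= 200:
        errors.append("ratio must be between 5 and 200")
    if not (isinstance(n_seeds, int) and 1 <= n_seeds <= 10):
        errors.append("n_seeds must be an integer between 1 and 10")
    return errors


ok = validate_kmc_params("MoS2", T=950, ratio=20, n_seeds=3)
bad = validate_kmc_params("MoTe2", T=1500, ratio=2, n_seeds=0)
```

Collecting all problems in one pass, instead of raising on the first, lets a user fix every out-of-range value at once.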

The model uses a BKL variable-time KMC algorithm on a 40x40 hexagonal lattice. It returns an animated growth trajectory, coverage, defect density percentage, vacancy density in cm^-2, grain count, and a CSV growth history.
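For readers unfamiliar with the BKL (rejection-free, variable-time) scheme, a single step looks roughly like the sketch below: pick one event with probability proportional to its rate, then advance the clock by an exponentially distributed increment. The function name `bkl_step` and the example rates are hypothetical; how the model derives rates from T, ratio, and the lattice state is not shown here.

```python
import math
import random


def bkl_step(rates, rng):
    """Perform one BKL step.

    Returns (event_index, dt), where the event is chosen with probability
    rates[i] / sum(rates) and dt = -ln(u) / sum(rates) for uniform u in (0, 1].
    """
    total = sum(rates)
    target = rng.random() * total
    cumulative = 0.0
    chosen = len(rates) - 1  # Fallback guards against floating-point round-off.
    for index, rate in enumerate(rates):
        cumulative += rate
        if target < cumulative:
            chosen = index
            break
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt


# Hypothetical event rates (per second) for attachment, detachment, diffusion.
rng = random.Random(0)
time_elapsed = 0.0
counts = [0, 0, 0]
for _ in range(1000):
    event, dt = bkl_step([5.0, 1.0, 4.0], rng)
    counts[event] += 1
    time_elapsed += dt
```

Every step executes an event, so no trial moves are wasted, and the time increment shrinks automatically when the total rate is high.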

Use it for qualitative growth-condition exploration, not for fitting a particular experimental map unless the parameters have been independently calibrated.

Figure Inspection

Many tools return figure IDs. The agent can call view_figure with a concrete focus, such as:

"Inspect the cluster map and check whether the high-defect label leaks into the PFIB halo."

Use figure inspection when the structured numbers look suspicious, when boundaries are ambiguous, or when a map contains spatial artifacts.

Stub Or Future Tools

simulate_raman_spectrum is reserved for a future MLIP-backed Raman simulation workflow and should not be treated as an active predictive model.

interpolate_spatial is a placeholder for spatial interpolation workflows. Use current map and segmentation tools for supported spatial analyses.

Copyright © 2026 Matter42. All rights reserved.