mattertune.configs
- class mattertune.configs.AdamConfig(*, name='Adam', lr, eps=1e-08, betas=(0.9, 0.999), weight_decay=0.0, amsgrad=False)[source]
- Parameters:
name (Literal['Adam'])
lr (Annotated[float, Gt(gt=0)])
eps (Annotated[float, Ge(ge=0)])
betas (tuple[Annotated[float, Gt(gt=0)], Annotated[float, Gt(gt=0)]])
weight_decay (Annotated[float, Ge(ge=0)])
amsgrad (bool)
- name: Literal['Adam']
Name of the optimizer.
- lr: C.PositiveFloat
Learning rate.
- eps: C.NonNegativeFloat
Epsilon.
- betas: tuple[C.PositiveFloat, C.PositiveFloat]
Betas.
- weight_decay: C.NonNegativeFloat
Weight decay.
- amsgrad: bool
Whether to use AMSGrad variant of Adam.
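A minimal usage sketch (values are illustrative; only lr is required, per the signature above):
```python
from mattertune.configs import AdamConfig

# Only `lr` is required; the other values shown are the documented defaults.
optimizer = AdamConfig(lr=1e-4, weight_decay=0.0, amsgrad=False)
```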
- class mattertune.configs.AdamWConfig(*, name='AdamW', lr, eps=1e-08, betas=(0.9, 0.999), weight_decay=0.01, amsgrad=False)[source]
- Parameters:
name (Literal['AdamW'])
lr (Annotated[float, Gt(gt=0)])
eps (Annotated[float, Ge(ge=0)])
betas (tuple[Annotated[float, Gt(gt=0)], Annotated[float, Gt(gt=0)]])
weight_decay (Annotated[float, Ge(ge=0)])
amsgrad (bool)
- name: Literal['AdamW']
Name of the optimizer.
- lr: C.PositiveFloat
Learning rate.
- eps: C.NonNegativeFloat
Epsilon.
- betas: tuple[C.PositiveFloat, C.PositiveFloat]
Betas.
- weight_decay: C.NonNegativeFloat
Weight decay.
- amsgrad: bool
Whether to use AMSGrad variant of Adam.
- class mattertune.configs.AutoSplitDataModuleConfig(*, batch_size, num_workers='auto', pin_memory=True, dataset, train_split, validation_split='auto', shuffle=True, shuffle_seed=42)[source]
- Parameters:
batch_size (int)
num_workers (int | Literal['auto'])
pin_memory (bool)
dataset (DatasetConfig)
train_split (float)
validation_split (float | Literal['auto', 'disable'])
shuffle (bool)
shuffle_seed (int)
- dataset: DatasetConfig
The configuration for the dataset.
- train_split: float
The proportion of the dataset to include in the training split.
- validation_split: float | Literal['auto', 'disable']
The proportion of the dataset to include in the validation split.
If set to “auto”, the validation split will be automatically determined as the complement of the training split, i.e. validation_split = 1 - train_split.
If set to “disable”, the validation split will be disabled.
- shuffle: bool
Whether to shuffle the dataset before splitting.
- shuffle_seed: int
The seed to use for shuffling the dataset.
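A sketch of the auto-split behaviour described above, assuming an XYZ dataset; the file path and hyperparameters are placeholders:
```python
from mattertune.configs import AutoSplitDataModuleConfig, XYZDatasetConfig

# With validation_split="auto", the validation fraction is the complement of
# train_split, i.e. 1 - 0.9 = 0.1. The file path is a placeholder.
data = AutoSplitDataModuleConfig(
    batch_size=32,
    dataset=XYZDatasetConfig(src="data/train.xyz"),
    train_split=0.9,
    validation_split="auto",
    shuffle=True,
    shuffle_seed=42,
)
```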
- class mattertune.configs.CSVLoggerConfig(*, type='csv', save_dir, name='lightning_logs', version=None, prefix='', flush_logs_every_n_steps=100)[source]
- Parameters:
type (Literal['csv'])
save_dir (str)
name (str)
version (int | str | None)
prefix (str)
flush_logs_every_n_steps (int)
- type: Literal['csv']
- save_dir: str
Save directory for logs.
- name: str
Experiment name. Default: 'lightning_logs'.
- version: int | str | None
Experiment version. If not specified, automatically assigns the next available version. Default: None.
- prefix: str
String to put at the beginning of metric keys. Default: ''.
- flush_logs_every_n_steps: int
How often to flush logs to disk. Default: 100.
- class mattertune.configs.CosineAnnealingLRConfig(*, type='CosineAnnealingLR', T_max, eta_min=0, last_epoch=-1)[source]
- Parameters:
type (Literal['CosineAnnealingLR'])
T_max (int)
eta_min (float)
last_epoch (int)
- type: Literal['CosineAnnealingLR']
Type of the learning rate scheduler.
- T_max: int
Maximum number of iterations.
- eta_min: float
Minimum learning rate.
- last_epoch: int
The index of last epoch.
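As an illustration, a scheduler config like the following (values are arbitrary) can be assigned to a backbone config's lr_scheduler field:
```python
from mattertune.configs import CosineAnnealingLRConfig

# Anneal the learning rate over 100 iterations down to eta_min; values are
# illustrative.
lr_scheduler = CosineAnnealingLRConfig(T_max=100, eta_min=1e-6)
```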
- class mattertune.configs.CutoffsConfig(*, main, aeaint, qint, aint)[source]
- Parameters:
main (float)
aeaint (float)
qint (float)
aint (float)
- main: float
- aeaint: float
- qint: float
- aint: float
- class mattertune.configs.DBDatasetConfig(*, type='db', src, energy_key=None, forces_key=None, stress_key=None, preload=True)[source]
Configuration for a dataset stored in an ASE database.
- Parameters:
type (Literal['db'])
src (Database | str | Path)
energy_key (str | None)
forces_key (str | None)
stress_key (str | None)
preload (bool)
- type: Literal['db']
Discriminator for the DB dataset.
- src: Database | str | Path
Path to the ASE database file or a database object.
- energy_key: str | None
Key for the energy label in the database.
- forces_key: str | None
Key for the force label in the database.
- stress_key: str | None
Key for the stress label in the database.
- preload: bool
Whether to load all the data at once or not.
- class mattertune.configs.DataModuleBaseConfig(*, batch_size, num_workers='auto', pin_memory=True)[source]
- Parameters:
batch_size (int)
num_workers (int | Literal['auto'])
pin_memory (bool)
- batch_size: int
The batch size for the dataloaders.
- num_workers: int | Literal['auto']
The number of workers for the dataloaders.
This is the number of processes that generate batches in parallel.
If set to “auto”, the number of workers will be automatically set based on the number of available CPUs.
Set to 0 to disable parallelism.
- pin_memory: bool
Whether to pin memory in the dataloaders.
This is useful for speeding up GPU data transfer.
- class mattertune.configs.DatasetConfigBase[source]
- prepare_data()[source]
Prepare the dataset for training.
Use this to download and prepare data. Downloading and saving data with multiple processes (distributed settings) will result in corrupted data. Lightning ensures this method is called only within a single process, so you can safely add your downloading logic within this method.
- class mattertune.configs.EarlyStoppingConfig(*, monitor='val/total_loss', min_delta=0.0, patience=3, verbose=False, mode='min', strict=True, check_finite=True, stopping_threshold=None, divergence_threshold=None, check_on_train_epoch_end=None, log_rank_zero_only=False)[source]
- Parameters:
monitor (str)
min_delta (float)
patience (int)
verbose (bool)
mode (Literal['min', 'max'])
strict (bool)
check_finite (bool)
stopping_threshold (float | None)
divergence_threshold (float | None)
check_on_train_epoch_end (bool | None)
log_rank_zero_only (bool)
- monitor: str
Quantity to be monitored.
- min_delta: float
Minimum change in monitored quantity to qualify as an improvement. Changes of less than or equal to min_delta will count as no improvement. Default: 0.0.
- patience: int
Number of validation checks with no improvement after which training will be stopped. Default: 3.
- verbose: bool
Whether to print messages when improvement is found or early stopping is triggered. Default: False.
- mode: Literal['min', 'max']
One of 'min' or 'max'. In 'min' mode, training stops when the monitored quantity stops decreasing; in 'max' mode it stops when the quantity stops increasing. Default: 'min'.
- strict: bool
Whether to raise an error if the monitored metric is not found in the validation metrics. Default: True.
- check_finite: bool
Whether to stop training when the monitor becomes NaN or infinite. Default: True.
- stopping_threshold: float | None
Stop training immediately once the monitored quantity reaches this threshold. Default: None.
- divergence_threshold: float | None
Stop training as soon as the monitored quantity becomes worse than this threshold. Default: None.
- check_on_train_epoch_end: bool | None
Whether to run early stopping at the end of the training epoch. If False, the check runs at validation end. Default: None.
- log_rank_zero_only: bool
Whether to log the status of early stopping only for the rank 0 process. Default: False.
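An illustrative configuration (the monitored key and thresholds are examples, not prescribed values):
```python
from mattertune.configs import EarlyStoppingConfig

# Stop when "val/total_loss" has not improved by at least 1e-4 for 10
# consecutive validation checks; all values are examples.
early_stopping = EarlyStoppingConfig(
    monitor="val/total_loss",
    min_delta=1e-4,
    patience=10,
    mode="min",
)
```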
- class mattertune.configs.EnergyPropertyConfig(*, name='energy', dtype='float', loss, loss_coefficient=1.0, type='energy')[source]
- Parameters:
name (str)
dtype (DType)
loss (LossConfig)
loss_coefficient (float)
type (Literal['energy'])
- type: Literal['energy']
- name: str
The name of the property.
This is the key that will be used to access the property in the output of the model.
- dtype: DType
The type of the property values.
- ase_calculator_property_name()[source]
If this property can be calculated by an ASE calculator, returns the name of the property that the ASE calculator uses. Otherwise, returns None.
This should only return non-None for properties that are supported by the ASE calculator interface, i.e. 'energy', 'forces', 'stress', 'dipole', 'charges', 'magmom', and 'magmoms'.
Note that this does not refer to the new experimental custom property prediction support feature in ASE, but rather the built-in properties that ASE can calculate in the ase.calculators.calculator.Calculator class.
- class mattertune.configs.EqV2BackboneConfig(*, properties, optimizer, lr_scheduler=None, ignore_gpu_batch_transform_error=True, normalizers={}, name='eqV2', checkpoint_path, atoms_to_graph)[source]
- Parameters:
properties (Sequence[PropertyConfig])
optimizer (OptimizerConfig)
lr_scheduler (LRSchedulerConfig | None)
ignore_gpu_batch_transform_error (bool)
normalizers (Mapping[str, Sequence[NormalizerConfig]])
name (Literal['eqV2'])
checkpoint_path (Path | CachedPath)
atoms_to_graph (FAIRChemAtomsToGraphSystemConfig)
- name: Literal['eqV2']
The type of the backbone.
- checkpoint_path: Path | CE.CachedPath
The path to the checkpoint to load.
- atoms_to_graph: FAIRChemAtomsToGraphSystemConfig
Configuration for converting ASE Atoms to a graph.
- classmethod ensure_dependencies()[source]
Ensure that all dependencies are installed.
This method should raise an exception if any dependencies are missing, with a message indicating which dependencies are missing and how to install them.
- properties: Sequence[PropertyConfig]
Properties to predict.
- optimizer: OptimizerConfig
Optimizer.
- lr_scheduler: LRSchedulerConfig | None
Learning rate scheduler.
- ignore_gpu_batch_transform_error: bool
Whether to ignore data processing errors during training.
- normalizers: Mapping[str, Sequence[NormalizerConfig]]
Normalizers for the properties.
Any property can be associated with multiple normalizers. This is useful for cases where we want to normalize the same property in different ways. For example, we may want to normalize the energy by subtracting the atomic reference energies, as well as by mean and standard deviation normalization.
The normalizers are applied in the order they are defined in the list.
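A sketch of the stacked-normalizer pattern described above; the "energy" key and all numeric values are placeholders:
```python
from mattertune.configs import (
    MeanStdNormalizerConfig,
    PerAtomReferencingNormalizerConfig,
)

# Normalizers are applied in list order: first subtract per-atom reference
# energies, then apply mean/std normalization. The "energy" key and all
# numbers are placeholders; this mapping would go into a backbone config's
# `normalizers` field.
normalizers = {
    "energy": [
        PerAtomReferencingNormalizerConfig(per_atom_references={1: -13.6, 8: -432.0}),
        MeanStdNormalizerConfig(mean=0.0, std=1.0),
    ],
}
```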
- class mattertune.configs.ExponentialConfig(*, type='ExponentialLR', gamma)[source]
- Parameters:
type (Literal['ExponentialLR'])
gamma (float)
- type: Literal['ExponentialLR']
Type of the learning rate scheduler.
- gamma: float
Multiplicative factor of learning rate decay.
- class mattertune.configs.FAIRChemAtomsToGraphSystemConfig(*, radius, max_num_neighbors)[source]
Configuration for converting ASE Atoms to a graph for the FAIRChem model.
- Parameters:
radius (float)
max_num_neighbors (int)
- radius: float
The radius for edge construction.
- max_num_neighbors: int
The maximum number of neighbours each node can send messages to.
- class mattertune.configs.FinetuneModuleBaseConfig(*, properties, optimizer, lr_scheduler=None, ignore_gpu_batch_transform_error=True, normalizers={})[source]
- Parameters:
properties (Sequence[PropertyConfig])
optimizer (OptimizerConfig)
lr_scheduler (LRSchedulerConfig | None)
ignore_gpu_batch_transform_error (bool)
normalizers (Mapping[str, Sequence[NormalizerConfig]])
- properties: Sequence[PropertyConfig]
Properties to predict.
- optimizer: OptimizerConfig
Optimizer.
- lr_scheduler: LRSchedulerConfig | None
Learning rate scheduler.
- ignore_gpu_batch_transform_error: bool
Whether to ignore data processing errors during training.
- normalizers: Mapping[str, Sequence[NormalizerConfig]]
Normalizers for the properties.
Any property can be associated with multiple normalizers. This is useful for cases where we want to normalize the same property in different ways. For example, we may want to normalize the energy by subtracting the atomic reference energies, as well as by mean and standard deviation normalization.
The normalizers are applied in the order they are defined in the list.
- abstract classmethod ensure_dependencies()[source]
Ensure that all dependencies are installed.
This method should raise an exception if any dependencies are missing, with a message indicating which dependencies are missing and how to install them.
- class mattertune.configs.ForcesPropertyConfig(*, name='forces', dtype='float', loss, loss_coefficient=1.0, type='forces', conservative)[source]
- Parameters:
name (str)
dtype (DType)
loss (LossConfig)
loss_coefficient (float)
type (Literal['forces'])
conservative (bool)
- type: Literal['forces']
- name: str
The name of the property.
This is the key that will be used to access the property in the output of the model.
- dtype: DType
The type of the property values.
- conservative: bool
Whether the forces are energy conserving.
This is used by the backbone to decide the type of output head to use for this property. Conservative force predictions are computed by taking the negative gradient of the energy with respect to the atomic positions, whereas non-conservative forces may be computed by other means.
- ase_calculator_property_name()[source]
If this property can be calculated by an ASE calculator, returns the name of the property that the ASE calculator uses. Otherwise, returns None.
This should only return non-None for properties that are supported by the ASE calculator interface, i.e. 'energy', 'forces', 'stress', 'dipole', 'charges', 'magmom', and 'magmoms'.
Note that this does not refer to the new experimental custom property prediction support feature in ASE, but rather the built-in properties that ASE can calculate in the ase.calculators.calculator.Calculator class.
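An illustrative energy-plus-conservative-forces property list; loss choices and coefficients are examples only:
```python
from mattertune.configs import EnergyPropertyConfig, ForcesPropertyConfig, MAELossConfig

# Energy plus conservative forces (forces computed as the negative gradient of
# the energy). Loss choices and coefficients are illustrative only.
properties = [
    EnergyPropertyConfig(loss=MAELossConfig(), loss_coefficient=1.0),
    ForcesPropertyConfig(loss=MAELossConfig(), loss_coefficient=10.0, conservative=True),
]
```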
- class mattertune.configs.GraphPropertyConfig(*, name, dtype, loss, loss_coefficient=1.0, type='graph_property', reduction)[source]
- Parameters:
name (str)
dtype (DType)
loss (LossConfig)
loss_coefficient (float)
type (Literal['graph_property'])
reduction (Literal['mean', 'sum', 'max'])
- type: Literal['graph_property']
- reduction: Literal['mean', 'sum', 'max']
The reduction to use for the output.
- "sum": Sum the property values for all atoms in the system. This is optimal for extensive properties (e.g. energy).
- "mean": Take the mean of the property values for all atoms in the system. This is optimal for intensive properties (e.g. density).
- "max": Take the maximum of the property values for all atoms in the system. This is optimal for properties like the last phdos peak of Matbench's phonons dataset.
- ase_calculator_property_name()[source]
If this property can be calculated by an ASE calculator, returns the name of the property that the ASE calculator uses. Otherwise, returns None.
This should only return non-None for properties that are supported by the ASE calculator interface, i.e. 'energy', 'forces', 'stress', 'dipole', 'charges', 'magmom', and 'magmoms'.
Note that this does not refer to the new experimental custom property prediction support feature in ASE, but rather the built-in properties that ASE can calculate in the ase.calculators.calculator.Calculator class.
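An illustrative intensive graph property using the "mean" reduction; the property name "band_gap" and the dtype value are placeholders:
```python
from mattertune.configs import GraphPropertyConfig, MAELossConfig

# An intensive scalar target, averaged over atoms; the property name
# "band_gap" and the dtype value are placeholders.
band_gap = GraphPropertyConfig(
    name="band_gap",
    dtype="float",
    loss=MAELossConfig(),
    reduction="mean",
)
```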
- class mattertune.configs.HuberLossConfig(*, name='huber', delta=1.0, reduction='mean')[source]
- Parameters:
name (Literal['huber'])
delta (float)
reduction (Literal['mean', 'sum'])
- name: Literal['huber']
- delta: float
The threshold value for the Huber loss function.
- reduction: Literal['mean', 'sum']
How to reduce the loss values across the batch.
"mean"
: The mean of the loss values."sum"
: The sum of the loss values.
- class mattertune.configs.JMPBackboneConfig(*, properties, optimizer, lr_scheduler=None, ignore_gpu_batch_transform_error=True, normalizers={}, name='jmp', ckpt_path, graph_computer)[source]
- Parameters:
properties (Sequence[PropertyConfig])
optimizer (OptimizerConfig)
lr_scheduler (LRSchedulerConfig | None)
ignore_gpu_batch_transform_error (bool)
normalizers (Mapping[str, Sequence[NormalizerConfig]])
name (Literal['jmp'])
ckpt_path (Path | CachedPath)
graph_computer (JMPGraphComputerConfig)
- name: Literal['jmp']
The type of the backbone.
- ckpt_path: Path | CE.CachedPath
The path to the pre-trained model checkpoint.
- graph_computer: JMPGraphComputerConfig
The configuration for the graph computer.
- classmethod ensure_dependencies()[source]
Ensure that all dependencies are installed.
This method should raise an exception if any dependencies are missing, with a message indicating which dependencies are missing and how to install them.
- properties: Sequence[PropertyConfig]
Properties to predict.
- optimizer: OptimizerConfig
Optimizer.
- lr_scheduler: LRSchedulerConfig | None
Learning rate scheduler.
- ignore_gpu_batch_transform_error: bool
Whether to ignore data processing errors during training.
- normalizers: Mapping[str, Sequence[NormalizerConfig]]
Normalizers for the properties.
Any property can be associated with multiple normalizers. This is useful for cases where we want to normalize the same property in different ways. For example, we may want to normalize the energy by subtracting the atomic reference energies, as well as by mean and standard deviation normalization.
The normalizers are applied in the order they are defined in the list.
- class mattertune.configs.JMPGraphComputerConfig(*, pbc, cutoffs=CutoffsConfig(main=12.0, aeaint=12.0, qint=12.0, aint=12.0), max_neighbors=MaxNeighborsConfig(main=30, aeaint=20, qint=8, aint=1000), per_graph_radius_graph=False)[source]
- Parameters:
pbc (bool)
cutoffs (CutoffsConfig)
max_neighbors (MaxNeighborsConfig)
per_graph_radius_graph (bool)
- pbc: bool
Whether to use periodic boundary conditions.
- cutoffs: CutoffsConfig
The cutoff for the radius graph.
- max_neighbors: MaxNeighborsConfig
The maximum number of neighbors for the radius graph.
- per_graph_radius_graph: bool
Whether to compute the radius graph per graph.
- class mattertune.configs.JSONDatasetConfig(*, type='json', src, tasks)[source]
- Parameters:
type (Literal['json'])
src (str | Path)
tasks (dict[str, str])
- type: Literal['json']
Discriminator for the JSON dataset.
- src: str | Path
The path to the JSON dataset.
- tasks: dict[str, str]
Attributes in the JSON file that correspond to the tasks to be predicted.
- class mattertune.configs.L2MAELossConfig(*, name='l2_mae', reduction='mean')[source]
- Parameters:
name (Literal['l2_mae'])
reduction (Literal['mean', 'sum'])
- name: Literal['l2_mae']
- reduction: Literal['mean', 'sum']
How to reduce the loss values across the batch.
"mean"
: The mean of the loss values."sum"
: The sum of the loss values.
- class mattertune.configs.M3GNetBackboneConfig(*, properties, optimizer, lr_scheduler=None, ignore_gpu_batch_transform_error=True, normalizers={}, name='m3gnet', ckpt_path, graph_computer)[source]
- Parameters:
properties (Sequence[PropertyConfig])
optimizer (OptimizerConfig)
lr_scheduler (LRSchedulerConfig | None)
ignore_gpu_batch_transform_error (bool)
normalizers (Mapping[str, Sequence[NormalizerConfig]])
name (Literal['m3gnet'])
ckpt_path (str | Path)
graph_computer (M3GNetGraphComputerConfig)
- name: Literal['m3gnet']
The type of the backbone.
- ckpt_path: str | Path
The path to the pre-trained model checkpoint.
- graph_computer: M3GNetGraphComputerConfig
Configuration for the graph computer.
- properties: Sequence[PropertyConfig]
Properties to predict.
- optimizer: OptimizerConfig
Optimizer.
- lr_scheduler: LRSchedulerConfig | None
Learning rate scheduler.
- ignore_gpu_batch_transform_error: bool
Whether to ignore data processing errors during training.
- normalizers: Mapping[str, Sequence[NormalizerConfig]]
Normalizers for the properties.
Any property can be associated with multiple normalizers. This is useful for cases where we want to normalize the same property in different ways. For example, we may want to normalize the energy by subtracting the atomic reference energies, as well as by mean and standard deviation normalization.
The normalizers are applied in the order they are defined in the list.
- class mattertune.configs.M3GNetGraphComputerConfig(*, element_types=<factory>, cutoff=None, threebody_cutoff=None, pre_compute_line_graph=False, graph_labels=None)[source]
Configuration for initializing a MatGL Atoms2Graph converter.
- Parameters:
element_types (tuple[str, ...])
cutoff (float | None)
threebody_cutoff (float | None)
pre_compute_line_graph (bool)
graph_labels (list[int | float] | None)
- element_types: tuple[str, ...]
The element types to consider, default is all elements.
- cutoff: float | None
The cutoff distance for the neighbor list. If None, the cutoff is loaded from the checkpoint.
- threebody_cutoff: float | None
The cutoff distance for the three-body interactions. If None, the cutoff is loaded from the checkpoint.
- pre_compute_line_graph: bool
Whether to pre-compute the line graph for three-body interactions in data preparation.
- graph_labels: list[int | float] | None
The graph labels to consider, default is None.
- class mattertune.configs.MAELossConfig(*, name='mae', reduction='mean')[source]
- Parameters:
name (Literal['mae'])
reduction (Literal['mean', 'sum'])
- name: Literal['mae']
- reduction: Literal['mean', 'sum']
How to reduce the loss values across the batch.
"mean"
: The mean of the loss values."sum"
: The sum of the loss values.
- class mattertune.configs.MPDatasetConfig(*, type='mp', api, fields, query)[source]
Configuration for a dataset stored in the Materials Project database.
- Parameters:
type (Literal['mp'])
api (str)
fields (list[str])
query (dict)
- type: Literal['mp']
Discriminator for the MP dataset.
- api: str
Input API key for the Materials Project database.
- fields: list[str]
Fields to retrieve from the Materials Project database.
- query: dict
Query to filter the data from the Materials Project database.
- class mattertune.configs.MPTrajDatasetConfig(*, type='mptraj', split='train', min_num_atoms=5, max_num_atoms=None, elements=None)[source]
Configuration for the MPTraj (Materials Project trajectory) dataset.
- Parameters:
type (Literal['mptraj'])
split (Literal['train', 'val', 'test'])
min_num_atoms (int | None)
max_num_atoms (int | None)
elements (list[str] | None)
- type: Literal['mptraj']
Discriminator for the MPTraj dataset.
- split: Literal['train', 'val', 'test']
Split of the dataset to use.
- min_num_atoms: int | None
Minimum number of atoms to be considered. Drops structures with fewer atoms.
- max_num_atoms: int | None
Maximum number of atoms to be considered. Drops structures with more atoms.
- elements: list[str] | None
List of elements to be considered. Drops structures with elements not in the list. Subsets are also allowed. For example, [“Li”, “Na”] will keep structures with either Li or Na.
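An illustrative configuration combining the filters above; all filter values are examples:
```python
from mattertune.configs import MPTrajDatasetConfig

# Keep training-split structures with 5-100 atoms whose elements fall within
# ["Li", "Na"]; all filter values are examples.
dataset = MPTrajDatasetConfig(
    split="train",
    min_num_atoms=5,
    max_num_atoms=100,
    elements=["Li", "Na"],
)
```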
- class mattertune.configs.MSELossConfig(*, name='mse', reduction='mean')[source]
- Parameters:
name (Literal['mse'])
reduction (Literal['mean', 'sum'])
- name: Literal['mse']
- reduction: Literal['mean', 'sum']
How to reduce the loss values across the batch.
"mean"
: The mean of the loss values."sum"
: The sum of the loss values.
- class mattertune.configs.ManualSplitDataModuleConfig(*, batch_size, num_workers='auto', pin_memory=True, train, validation=None)[source]
- Parameters:
batch_size (int)
num_workers (int | Literal['auto'])
pin_memory (bool)
train (DatasetConfig)
validation (DatasetConfig | None)
- train: DatasetConfig
The configuration for the training data.
- validation: DatasetConfig | None
The configuration for the validation data.
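A sketch with explicit train/validation datasets; file paths are placeholders:
```python
from mattertune.configs import ManualSplitDataModuleConfig, XYZDatasetConfig

# Explicit train/validation datasets; file paths are placeholders.
data = ManualSplitDataModuleConfig(
    batch_size=16,
    train=XYZDatasetConfig(src="data/train.xyz"),
    validation=XYZDatasetConfig(src="data/val.xyz"),
)
```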
- class mattertune.configs.MatbenchDatasetConfig(*, type='matbench', task=None, property_name=None, fold_idx=0)[source]
Configuration for the Matbench dataset.
- Parameters:
type (Literal['matbench'])
task (str | None)
property_name (str | None)
fold_idx (Literal[0, 1, 2, 3, 4])
- type: Literal['matbench']
Discriminator for the Matbench dataset.
- task: str | None
The name of the Matbench task to include in the dataset.
- property_name: str | None
The property name assigned to the task. Must match the name of a property head in the model.
- fold_idx: Literal[0, 1, 2, 3, 4]
The index of the fold to be used in the dataset.
- class mattertune.configs.MatterTunerConfig(*, data, model, trainer=TrainerConfig(accelerator='auto', strategy='auto', num_nodes=1, devices='auto', precision='32-true', deterministic=None, max_epochs=None, min_epochs=None, max_steps=-1, min_steps=None, max_time=None, val_check_interval=None, check_val_every_n_epoch=1, log_every_n_steps=None, gradient_clip_val=None, gradient_clip_algorithm=None, checkpoint=None, early_stopping=None, loggers='default', additional_trainer_kwargs={}))[source]
- Parameters:
data (DataModuleConfig)
model (ModelConfig)
trainer (TrainerConfig)
- data: DataModuleConfig
The configuration for the data.
- model: ModelConfig
The configuration for the model.
- trainer: TrainerConfig
The configuration for the trainer.
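A minimal end-to-end sketch assembling data, model, and trainer configs from this page; paths, the pretrained-model identifier, and all hyperparameters are placeholders, and the ORB backbone is chosen purely for illustration:
```python
from mattertune.configs import (
    AdamWConfig,
    AutoSplitDataModuleConfig,
    EnergyPropertyConfig,
    MAELossConfig,
    MatterTunerConfig,
    ORBBackboneConfig,
    TrainerConfig,
    XYZDatasetConfig,
)

# Paths, the pretrained-model identifier, and all hyperparameters are
# placeholders; the ORB backbone is chosen purely for illustration.
config = MatterTunerConfig(
    data=AutoSplitDataModuleConfig(
        batch_size=32,
        dataset=XYZDatasetConfig(src="data/train.xyz"),
        train_split=0.9,
    ),
    model=ORBBackboneConfig(
        pretrained_model="<orb-pretrained-model-name>",
        properties=[EnergyPropertyConfig(loss=MAELossConfig())],
        optimizer=AdamWConfig(lr=1e-4),
    ),
    trainer=TrainerConfig(max_epochs=10),
)
```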
- class mattertune.configs.MaxNeighborsConfig(*, main, aeaint, qint, aint)[source]
- Parameters:
main (int)
aeaint (int)
qint (int)
aint (int)
- main: int
- aeaint: int
- qint: int
- aint: int
- class mattertune.configs.MeanStdNormalizerConfig(*, mean, std)[source]
- Parameters:
mean (float)
std (float)
- mean: float
The mean of the property values.
- std: float
The standard deviation of the property values.
- class mattertune.configs.ModelCheckpointConfig(*, dirpath=None, filename=None, monitor=None, verbose=False, save_last=None, save_top_k=1, save_weights_only=False, mode='min', auto_insert_metric_name=True, every_n_train_steps=None, train_time_interval=None, every_n_epochs=None, save_on_train_epoch_end=None, enable_version_counter=True)[source]
- Parameters:
dirpath (str | None)
filename (str | None)
monitor (str | None)
verbose (bool)
save_last (Literal[True, False, 'link'] | None)
save_top_k (int)
save_weights_only (bool)
mode (Literal['min', 'max'])
auto_insert_metric_name (bool)
every_n_train_steps (int | None)
train_time_interval (timedelta | None)
every_n_epochs (int | None)
save_on_train_epoch_end (bool | None)
enable_version_counter (bool)
- dirpath: str | None
Directory to save the model file. Default: None.
- filename: str | None
Checkpoint filename. Can contain named formatting options. Default: None.
- monitor: str | None
Quantity to monitor. Default: None.
- verbose: bool
Verbosity mode. Default: False.
- save_last: Literal[True, False, 'link'] | None
When True or "link", saves a 'last.ckpt' checkpoint when a checkpoint is saved. Default: None.
- save_top_k: int
If save_top_k=k, save the k models with the best monitored quantity. Default: 1.
- save_weights_only: bool
If True, only save model weights. Default: False.
- mode: Literal['min', 'max']
One of {'min', 'max'}. Determines whether the best checkpoint corresponds to the minimum or the maximum of the monitored quantity. Default: 'min'.
- auto_insert_metric_name: bool
Whether to automatically insert the metric name in the checkpoint filename. Default: True.
- every_n_train_steps: int | None
Number of training steps between checkpoints. Default: None.
- train_time_interval: timedelta | None
Checkpoints are monitored at the specified time interval. Default: None.
- every_n_epochs: int | None
Number of epochs between checkpoints. Default: None.
- save_on_train_epoch_end: bool | None
Whether to run checkpointing at the end of the training epoch. Default: None.
- enable_version_counter: bool
Whether to append a version to existing filenames. Default: True.
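An illustrative checkpoint configuration; the monitored key and filename pattern are examples:
```python
from mattertune.configs import ModelCheckpointConfig

# Keep the single best checkpoint by validation loss; the monitored key and
# filename pattern are examples.
checkpoint = ModelCheckpointConfig(
    monitor="val/total_loss",
    mode="min",
    save_top_k=1,
    filename="best-{epoch:02d}",
)
```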
- class mattertune.configs.MultiStepLRConfig(*, type='MultiStepLR', milestones, gamma)[source]
- Parameters:
type (Literal['MultiStepLR'])
milestones (list[int])
gamma (float)
- type: Literal['MultiStepLR']
Type of the learning rate scheduler.
- milestones: list[int]
List of epoch indices. Must be increasing.
- gamma: float
Multiplicative factor of learning rate decay.
- class mattertune.configs.NormalizerConfigBase[source]
- class mattertune.configs.OMAT24DatasetConfig(*, type='omat24', src)[source]
- Parameters:
type (Literal['omat24'])
src (Path)
- type: Literal['omat24']
Discriminator for the OMAT24 dataset.
- src: Path
The path to the OMAT24 dataset.
- class mattertune.configs.ORBBackboneConfig(*, properties, optimizer, lr_scheduler=None, ignore_gpu_batch_transform_error=True, normalizers={}, name='orb', pretrained_model, system=ORBSystemConfig(radius=10.0, max_num_neighbors=20))[source]
- Parameters:
properties (Sequence[PropertyConfig])
optimizer (OptimizerConfig)
lr_scheduler (LRSchedulerConfig | None)
ignore_gpu_batch_transform_error (bool)
normalizers (Mapping[str, Sequence[NormalizerConfig]])
name (Literal['orb'])
pretrained_model (str)
system (ORBSystemConfig)
- name: Literal['orb']
The type of the backbone.
- pretrained_model: str
The name of the pretrained model to load.
- system: ORBSystemConfig
The system configuration, controlling how to featurize a system of atoms.
- classmethod ensure_dependencies()[source]
Ensure that all dependencies are installed.
This method should raise an exception if any dependencies are missing, with a message indicating which dependencies are missing and how to install them.
- properties: Sequence[PropertyConfig]
Properties to predict.
- optimizer: OptimizerConfig
Optimizer.
- lr_scheduler: LRSchedulerConfig | None
Learning rate scheduler.
- ignore_gpu_batch_transform_error: bool
Whether to ignore data processing errors during training.
- normalizers: Mapping[str, Sequence[NormalizerConfig]]
Normalizers for the properties.
Any property can be associated with multiple normalizers. This is useful for cases where we want to normalize the same property in different ways. For example, we may want to normalize the energy by subtracting the atomic reference energies, as well as by mean and standard deviation normalization.
The normalizers are applied in the order they are defined in the list.
- class mattertune.configs.ORBSystemConfig(*, radius, max_num_neighbors)[source]
Config controlling how to featurize a system of atoms.
- Parameters:
radius (float)
max_num_neighbors (int)
- radius: float
The radius for edge construction.
- max_num_neighbors: int
The maximum number of neighbours each node can send messages to.
- class mattertune.configs.PerAtomReferencingNormalizerConfig(*, per_atom_references)[source]
- Parameters:
per_atom_references (Mapping[int, float] | Sequence[float] | Path)
- per_atom_references: Mapping[int, float] | Sequence[float] | Path
The reference values for each element.
- If a dictionary is provided, it maps atomic numbers to reference values.
- If a list is provided, it is a list of reference values indexed by atomic number.
- If a path is provided, it should point to a JSON file containing the references.
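A sketch of two of the accepted input forms; atomic numbers, reference values, and the JSON path are placeholders:
```python
from pathlib import Path

from mattertune.configs import PerAtomReferencingNormalizerConfig

# Dictionary form: atomic number -> reference value (numbers are placeholders).
by_atomic_number = PerAtomReferencingNormalizerConfig(
    per_atom_references={1: -13.6, 6: -1029.0},
)

# Path form: a JSON file containing the references (path is a placeholder).
from_json = PerAtomReferencingNormalizerConfig(
    per_atom_references=Path("references.json"),
)
```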
- class mattertune.configs.PropertyConfigBase(*, name, dtype, loss, loss_coefficient=1.0)[source]
- Parameters:
name (str)
dtype (DType)
loss (LossConfig)
loss_coefficient (float)
- name: str
The name of the property.
This is the key that will be used to access the property in the output of the model.
This is also the key that will be used to access the property in the ASE Atoms object.
- dtype: DType
The type of the property values.
- loss: LossConfig
The loss function to use when training the model on this property.
- loss_coefficient: float
The coefficient to apply to this property’s loss function when training the model.
- abstract from_ase_atoms(atoms)[source]
Extract the property value from an ASE Atoms object.
- Parameters:
atoms (Atoms)
- Return type:
int | float | ndarray | Tensor
- classmethod metric_cls()[source]
- Return type:
type[MetricBase]
- abstract ase_calculator_property_name()[source]
If this property can be calculated by an ASE calculator, returns the name of the property that the ASE calculator uses. Otherwise, returns None.
This should only return non-None for properties that are supported by the ASE calculator interface, i.e. 'energy', 'forces', 'stress', 'dipole', 'charges', 'magmom', and 'magmoms'.
Note that this does not refer to the new experimental custom property prediction support feature in ASE, but rather the built-in properties that ASE can calculate in the ase.calculators.calculator.Calculator class.
- Return type:
ASECalculatorPropertyName | None
- class mattertune.configs.RMSNormalizerConfig(*, rms)[source]
- Parameters:
rms (float)
- rms: float
The root mean square of the property values.
- class mattertune.configs.ReduceOnPlateauConfig(*, type='ReduceLROnPlateau', mode, factor, patience, threshold=0.0001, threshold_mode='rel', cooldown=0, min_lr=0, eps=1e-08)[source]
- Parameters:
type (Literal['ReduceLROnPlateau'])
mode (str)
factor (float)
patience (int)
threshold (float)
threshold_mode (str)
cooldown (int)
min_lr (float)
eps (float)
- type: Literal['ReduceLROnPlateau']
Type of the learning rate scheduler.
- mode: str
One of {“min”, “max”}. Determines when to reduce the learning rate.
- factor: float
Factor by which the learning rate will be reduced.
- patience: int
Number of epochs with no improvement after which learning rate will be reduced.
- threshold: float
Threshold for measuring the new optimum.
- threshold_mode: str
One of {“rel”, “abs”}. Determines the threshold mode.
- cooldown: int
Number of epochs to wait before resuming normal operation.
- min_lr: float
A lower bound on the learning rate.
- eps: float
Minimal decay applied to the learning rate; if the difference between the new and old learning rates is smaller than eps, the update is ignored.
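An illustrative configuration (values are examples):
```python
from mattertune.configs import ReduceOnPlateauConfig

# Halve the learning rate after 5 epochs without improvement; values are
# illustrative.
lr_scheduler = ReduceOnPlateauConfig(mode="min", factor=0.5, patience=5)
```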
- class mattertune.configs.SGDConfig(*, name='SGD', lr, momentum=0.0, weight_decay=0.0, nestrov=False)[source]
- Parameters:
name (Literal['SGD'])
lr (Annotated[float, Gt(gt=0)])
momentum (Annotated[float, Ge(ge=0)])
weight_decay (Annotated[float, Ge(ge=0)])
nestrov (bool)
- name: Literal['SGD']
Name of the optimizer.
- lr: C.PositiveFloat
Learning rate.
- momentum: C.NonNegativeFloat
Momentum.
- weight_decay: C.NonNegativeFloat
Weight decay.
- nestrov: bool
Whether to use Nesterov momentum.
- class mattertune.configs.StepLRConfig(*, type='StepLR', step_size, gamma)[source]
- Parameters:
type (Literal['StepLR'])
step_size (int)
gamma (float)
- type: Literal['StepLR']
Type of the learning rate scheduler.
- step_size: int
Period of learning rate decay.
- gamma: float
Multiplicative factor of learning rate decay.
- class mattertune.configs.StressesPropertyConfig(*, name='stresses', dtype='float', loss, loss_coefficient=1.0, type='stresses', conservative)[source]
- Parameters:
name (str)
dtype (DType)
loss (LossConfig)
loss_coefficient (float)
type (Literal['stresses'])
conservative (bool)
- type: Literal['stresses']
- name: str
The name of the property.
This is the key that will be used to access the property in the output of the model.
- dtype: DType
The type of the property values.
- conservative: bool
Similar to the conservative parameter in ForcesPropertyConfig, this parameter specifies whether the stresses should be computed in a conservative manner.
- ase_calculator_property_name()[source]
If this property can be calculated by an ASE calculator, returns the name of the property that the ASE calculator uses. Otherwise, returns None.
This should only return non-None for properties that are supported by the ASE calculator interface, i.e. 'energy', 'forces', 'stress', 'dipole', 'charges', 'magmom', and 'magmoms'.
Note that this does not refer to the new experimental custom property prediction support feature in ASE, but rather the built-in properties that ASE can calculate in the ase.calculators.calculator.Calculator class.
- class mattertune.configs.TensorBoardLoggerConfig(*, type='tensorboard', save_dir, name='lightning_logs', version=None, log_graph=False, default_hp_metric=True, prefix='', sub_dir=None, additional_params={})[source]
- Parameters:
type (Literal['tensorboard'])
save_dir (str)
name (str | None)
version (int | str | None)
log_graph (bool)
default_hp_metric (bool)
prefix (str)
sub_dir (str | None)
additional_params (dict[str, Any])
- type: Literal['tensorboard']
- save_dir: str
Save directory where TensorBoard logs will be saved.
- name: str | None
Experiment name. If an empty string, no per-experiment subdirectory is used. Default: 'lightning_logs'.
- version: int | str | None
Experiment version. If not specified, the logger auto-assigns the next available version. If a string, it is used as the run-specific subdirectory name. Default: None.
- log_graph: bool
Whether to add the computational graph to TensorBoard. Requires model.example_input_array to be defined. Default: False.
- default_hp_metric: bool
Enables a placeholder metric with key hp_metric when logging hyperparameters without a metric. Default: True.
- prefix: str
String to put at the beginning of metric keys. Default: ''.
- sub_dir: str | None
Sub-directory to group TensorBoard logs. If provided, logs are saved in /save_dir/name/version/sub_dir/. Default: None.
- additional_params: dict[str, Any]
Additional parameters passed to tensorboardX.SummaryWriter. Default: {}.
- class mattertune.configs.TrainerConfig(*, accelerator='auto', strategy='auto', num_nodes=1, devices='auto', precision='32-true', deterministic=None, max_epochs=None, min_epochs=None, max_steps=-1, min_steps=None, max_time=None, val_check_interval=None, check_val_every_n_epoch=1, log_every_n_steps=None, gradient_clip_val=None, gradient_clip_algorithm=None, checkpoint=None, early_stopping=None, loggers='default', additional_trainer_kwargs={})[source]
- Parameters:
accelerator (str)
strategy (str | Strategy)
num_nodes (int)
devices (list[int] | str | int)
precision (Literal[64, 32, 16] | Literal['transformer-engine', 'transformer-engine-float16', '16-true', '16-mixed', 'bf16-true', 'bf16-mixed', '32-true', '64-true'] | Literal['64', '32', '16', 'bf16'] | None)
deterministic (bool | Literal['warn'] | None)
max_epochs (int | None)
min_epochs (int | None)
max_steps (int)
min_steps (int | None)
max_time (str | timedelta | dict[str, int] | None)
val_check_interval (int | float | None)
check_val_every_n_epoch (int | None)
log_every_n_steps (int | None)
gradient_clip_val (int | float | None)
gradient_clip_algorithm (str | None)
checkpoint (ModelCheckpointConfig | None)
early_stopping (EarlyStoppingConfig | None)
loggers (Sequence[LoggerConfig] | Literal['default'])
additional_trainer_kwargs (dict[str, Any])
- accelerator: str
Supports passing different accelerator types (“cpu”, “gpu”, “tpu”, “ipu”, “hpu”, “mps”, “auto”) as well as custom accelerator instances.
- strategy: str | Strategy
Supports different training strategies with aliases as well as custom strategies. Default: "auto".
- num_nodes: int
Number of GPU nodes for distributed training. Default: 1.
- devices: list[int] | str | int
The devices to use. Can be set to a sequence of device indices, "all" to indicate all available devices should be used, or "auto" for automatic selection based on the chosen accelerator. Default: "auto".
- precision: _PRECISION_INPUT | None
Double precision (64, '64' or '64-true'), full precision (32, '32' or '32-true'), 16-bit mixed precision (16, '16', '16-mixed') or bfloat16 mixed precision ('bf16', 'bf16-mixed'). Can be used on CPU, GPU, TPUs, HPUs or IPUs. Default: '32-true'.
- deterministic: bool | Literal['warn'] | None
If True, sets whether PyTorch operations must use deterministic algorithms. Set to "warn" to use deterministic algorithms whenever possible, throwing warnings on operations that don't support deterministic mode. If not set, defaults to False. Default: None.
- max_epochs: int | None
Stop training once this number of epochs is reached. Disabled by default (None). If both max_epochs and max_steps are not specified, defaults to max_epochs = 1000. To enable infinite training, set max_epochs = -1.
- min_epochs: int | None
Force training for at least this many epochs. Disabled by default (None).
- max_steps: int
Stop training after this number of steps. Disabled by default (-1). If max_steps = -1 and max_epochs = None, will default to max_epochs = 1000. To enable infinite training, set max_epochs to -1.
- min_steps: int | None
Force training for at least this number of steps. Disabled by default (None).
- max_time: str | timedelta | dict[str, int] | None
Stop training after this amount of time has passed. Disabled by default (None). The time duration can be specified in the format DD:HH:MM:SS (days, hours, minutes, seconds), as a datetime.timedelta, or as a dictionary with keys that will be passed to datetime.timedelta.
- val_check_interval: int | float | None
How often to check the validation set. Pass a float in the range [0.0, 1.0] to check after a fraction of the training epoch. Pass an int to check after a fixed number of training batches. An int value can only be higher than the number of training batches when check_val_every_n_epoch=None, which validates after every N training batches across epochs or during iteration-based training. Default: 1.0.
- check_val_every_n_epoch: int | None
Perform a validation loop after every N training epochs. If None, validation will be done solely based on the number of training batches, requiring val_check_interval to be an integer value. Default: 1.
- log_every_n_steps: int | None
How often to log within steps. Default: 50.
- gradient_clip_val: int | float | None
The value at which to clip gradients. Passing gradient_clip_val=None disables gradient clipping. If using Automatic Mixed Precision (AMP), the gradients will be unscaled before clipping. Default: None.
- gradient_clip_algorithm: str | None
The gradient clipping algorithm to use. Pass gradient_clip_algorithm="value" to clip by value, and gradient_clip_algorithm="norm" to clip by norm. By default it will be set to "norm".
- checkpoint: ModelCheckpointConfig | None
The configuration for the model checkpoint.
- early_stopping: EarlyStoppingConfig | None
The configuration for early stopping.
- loggers: Sequence[LoggerConfig] | Literal['default']
The loggers to use for logging training metrics. If "default", will use the CSV logger plus the W&B logger if available. Default: "default".
- additional_trainer_kwargs: dict[str, Any]
Additional keyword arguments for the Lightning Trainer.
This is for advanced users who want to customize the Lightning Trainer, and is not recommended for beginners.
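A sketch combining the trainer options documented above with the checkpoint, early-stopping, and logger configs from this page; all values are illustrative:
```python
from mattertune.configs import (
    CSVLoggerConfig,
    EarlyStoppingConfig,
    ModelCheckpointConfig,
    TrainerConfig,
)

# All values are illustrative; the checkpoint, early-stopping, and logger
# configs are documented elsewhere on this page.
trainer = TrainerConfig(
    max_epochs=100,
    gradient_clip_val=1.0,
    checkpoint=ModelCheckpointConfig(monitor="val/total_loss", mode="min"),
    early_stopping=EarlyStoppingConfig(monitor="val/total_loss", patience=10),
    loggers=[CSVLoggerConfig(save_dir="./logs")],
)
```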
- class mattertune.configs.WandbLoggerConfig(*, type='wandb', name=None, save_dir='.', version=None, offline=False, dir=None, id=None, anonymous=None, project=None, log_model=False, prefix='', experiment=None, checkpoint_name=None, additional_init_parameters={})[source]
- Parameters:
type (Literal['wandb'])
name (str | None)
save_dir (str)
version (str | None)
offline (bool)
dir (str | None)
id (str | None)
anonymous (bool | None)
project (str | None)
log_model (Literal['all'] | bool)
prefix (str)
experiment (Any | None)
checkpoint_name (str | None)
additional_init_parameters (dict[str, Any])
- type: Literal['wandb']
- name: str | None
Display name for the run. Default: None.
- save_dir: str
Path where data is saved. Default: '.'.
- version: str | None
Sets the version, mainly used to resume a previous run. Default: None.
- offline: bool
Run offline (data can be streamed later to wandb servers). Default: False.
- dir: str | None
Same as save_dir. Default: None.
- id: str | None
Same as version. Default: None.
- anonymous: bool | None
Enables or explicitly disables anonymous logging. Default: None.
- project: str | None
The name of the project to which this run will belong. Default: None.
- log_model: Literal['all'] | bool
Whether/how to log model checkpoints as W&B artifacts. If 'all', checkpoints are logged during training. If True, checkpoints are logged at the end of training. If False, no checkpoints are logged. Default: False.
- prefix: str
A string to put at the beginning of metric keys. Default: ''.
- experiment: Any | None
WandB experiment object. Automatically set when creating a run. Default: None.
- checkpoint_name: str | None
Name of the model checkpoint artifact being logged. Default: None.
- additional_init_parameters: dict[str, Any]
Additional parameters to pass to wandb.init(). Default: {}.
- class mattertune.configs.XYZDatasetConfig(*, type='xyz', src)[source]
- Parameters:
type (Literal['xyz'])
src (str | Path)
- type: Literal['xyz']
Discriminator for the XYZ dataset.
- src: str | Path
The path to the XYZ dataset.
Modules