mattertune.configs.data.mptraj

class mattertune.configs.data.mptraj.DatasetConfigBase[source]

abstract create_dataset()[source]

Return type: Dataset[Atoms]

Prepare the dataset for training.

Use this method to download and prepare data. Downloading and saving data from multiple processes (as in distributed settings) can corrupt the data. Lightning ensures this method is called from a single process only, so downloading logic can safely live here.

classmethod ensure_dependencies()[source]

Ensure that all dependencies are installed.

This method should raise an exception if any dependencies are missing, with a message indicating which dependencies are missing and how to install them.
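As an illustration of the contract described above, here is a minimal sketch of a dependency check. It is not the library's implementation: ExampleDatasetConfig, REQUIRED, and missing_dependencies are hypothetical names, and the sketch assumes that every dependency can be detected by its top-level module name.

```python
from __future__ import annotations

import importlib.util


def missing_dependencies(required: dict[str, str]) -> list[str]:
    """Return the install hint for every top-level module that cannot be found."""
    return [
        hint
        for module, hint in required.items()
        if importlib.util.find_spec(module) is None
    ]


class ExampleDatasetConfig:
    # Hypothetical mapping: top-level module name -> how to install it.
    REQUIRED = {"json": "bundled with Python"}

    @classmethod
    def ensure_dependencies(cls) -> None:
        hints = missing_dependencies(cls.REQUIRED)
        if hints:
            # Name what is missing and how to fix it, as the docstring asks.
            raise ImportError(
                "Missing dependencies. To install them: " + "; ".join(hints)
            )
```

Calling ExampleDatasetConfig.ensure_dependencies() returns silently when everything is importable and raises ImportError with the install hints otherwise.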

class mattertune.configs.data.mptraj.MPTrajDatasetConfig(*, type='mptraj', split='train', min_num_atoms=5, max_num_atoms=None, elements=None)[source]

Configuration for a dataset stored in the Materials Project database.

Parameters:

type (Literal['mptraj'])
split (Literal['train', 'val', 'test'])
min_num_atoms (int | None)
max_num_atoms (int | None)
elements (list[str] | None)

type: Literal['mptraj']: Discriminator for the MPTraj dataset.

split: Literal['train', 'val', 'test']: Split of the dataset to use.

min_num_atoms: int | None: Minimum number of atoms to be considered. Drops structures with fewer atoms.

max_num_atoms: int | None: Maximum number of atoms to be considered. Drops structures with more atoms.

elements: list[str] | None: List of elements to be considered. Drops structures that contain any element not in the list. Structures containing only a subset of the listed elements are kept. For example, ["Li", "Na"] keeps structures made up of Li, Na, or both.
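The three filters can be read as a single predicate over a structure's chemical symbols. The sketch below is one interpretation of the descriptions above, not the library's code: keep_structure is a hypothetical helper, a structure is represented simply as a list of element symbols, and the element check is assumed to be a subset test.

```python
from __future__ import annotations


def keep_structure(
    symbols: list[str],
    min_num_atoms: int | None = 5,
    max_num_atoms: int | None = None,
    elements: list[str] | None = None,
) -> bool:
    """Return True if a structure passes all configured filters."""
    num_atoms = len(symbols)
    # Drop structures with fewer atoms than the minimum.
    if min_num_atoms is not None and num_atoms < min_num_atoms:
        return False
    # Drop structures with more atoms than the maximum.
    if max_num_atoms is not None and num_atoms > max_num_atoms:
        return False
    # Assumed semantics: every element present must be in the allowed list.
    if elements is not None and not set(symbols).issubset(elements):
        return False
    return True


# A six-atom Li/Na structure passes; adding oxygen makes it fail.
print(keep_structure(["Li"] * 3 + ["Na"] * 3, elements=["Li", "Na"]))
print(keep_structure(["Li"] * 5 + ["O"], elements=["Li", "Na"]))
```

Setting a filter to None disables it, which matches the int | None and list[str] | None types in the signature.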

create_dataset()[source]