mattertune.main
Classes
- TuneOutput: The output of the MatterTuner.tune method.
- TrainerConfig
- MatterTunerConfig
- MatterTuner
- class mattertune.main.TuneOutput(model, trainer)[source]
The output of the MatterTuner.tune method.
- Parameters:
model (FinetuneModuleBase)
trainer (Trainer)
- model: FinetuneModuleBase
The trained model.
- trainer: Trainer
The trainer used to train the model.
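For orientation, here is a minimal sketch of accessing these attributes; `output` is assumed to be the TuneOutput returned by a prior MatterTuner.tune call (not shown here):

```python
# `output` is assumed to be a TuneOutput returned by MatterTuner.tune.
finetuned_model = output.model   # FinetuneModuleBase: the trained model
trainer = output.trainer         # the Lightning Trainer used to train the model
```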
- class mattertune.main.TrainerConfig(*, accelerator='auto', strategy='auto', num_nodes=1, devices='auto', precision='32-true', deterministic=None, max_epochs=None, min_epochs=None, max_steps=-1, min_steps=None, max_time=None, val_check_interval=None, check_val_every_n_epoch=1, log_every_n_steps=None, gradient_clip_val=None, gradient_clip_algorithm=None, checkpoint=None, early_stopping=None, ema=None, loggers='default', additional_trainer_kwargs={})[source]
- Parameters:
accelerator (str)
strategy (str | Strategy)
num_nodes (int)
devices (list[int] | str | int)
precision (Literal[64, 32, 16] | Literal['transformer-engine', 'transformer-engine-float16', '16-true', '16-mixed', 'bf16-true', 'bf16-mixed', '32-true', '64-true'] | Literal['64', '32', '16', 'bf16'] | None)
deterministic (bool | Literal['warn'] | None)
max_epochs (int | None)
min_epochs (int | None)
max_steps (int)
min_steps (int | None)
max_time (str | timedelta | dict[str, int] | None)
val_check_interval (int | float | None)
check_val_every_n_epoch (int | None)
log_every_n_steps (int | None)
gradient_clip_val (int | float | None)
gradient_clip_algorithm (str | None)
checkpoint (ModelCheckpointConfig | None)
early_stopping (EarlyStoppingConfig | None)
ema (EMAConfig | None)
loggers (Sequence[LoggerConfig] | Literal['default'])
additional_trainer_kwargs (dict[str, Any])
- accelerator: str
Supports passing different accelerator types ("cpu", "gpu", "tpu", "ipu", "hpu", "mps", "auto") as well as custom accelerator instances.
- strategy: str | Strategy
Supports different training strategies with aliases as well as custom strategies. Default: "auto".
- num_nodes: int
Number of GPU nodes for distributed training. Default: 1.
- devices: list[int] | str | int
The devices to use. Can be set to a sequence of device indices, "all" to indicate all available devices should be used, or "auto" for automatic selection based on the chosen accelerator. Default: "auto".
- precision: _PRECISION_INPUT | None
Double precision (64, '64' or '64-true'), full precision (32, '32' or '32-true'), 16-bit mixed precision (16, '16', '16-mixed') or bfloat16 mixed precision ('bf16', 'bf16-mixed'). Can be used on CPU, GPU, TPUs, HPUs or IPUs. Default: '32-true'.
- deterministic: bool | Literal['warn'] | None
If True, sets whether PyTorch operations must use deterministic algorithms. Set to "warn" to use deterministic algorithms whenever possible, throwing warnings on operations that don't support deterministic mode. If not set, defaults to False. Default: None.
- max_epochs: int | None
Stop training once this number of epochs is reached. Disabled by default (None). If both max_epochs and max_steps are not specified, defaults to max_epochs = 1000. To enable infinite training, set max_epochs = -1.
- min_epochs: int | None
Force training for at least this many epochs. Disabled by default (None).
- max_steps: int
Stop training after this number of steps. Disabled by default (-1). If max_steps = -1 and max_epochs = None, will default to max_epochs = 1000. To enable infinite training, set max_epochs to -1.
- min_steps: int | None
Force training for at least this number of steps. Disabled by default (None).
- max_time: str | timedelta | dict[str, int] | None
Stop training after this amount of time has passed. Disabled by default (None). The time duration can be specified in the format DD:HH:MM:SS (days, hours, minutes, seconds), as a datetime.timedelta, or as a dictionary with keys that will be passed to datetime.timedelta.
- val_check_interval: int | float | None
How often to check the validation set. Pass a float in the range [0.0, 1.0] to check after a fraction of the training epoch. Pass an int to check after a fixed number of training batches. An int value can only be higher than the number of training batches when check_val_every_n_epoch=None, which validates after every N training batches across epochs or during iteration-based training. Default: 1.0.
- check_val_every_n_epoch: int | None
Perform a validation loop after every N training epochs. If None, validation will be done solely based on the number of training batches, requiring val_check_interval to be an integer value. Default: 1.
- log_every_n_steps: int | None
How often to log within steps. Default: 50.
- gradient_clip_val: int | float | None
The value at which to clip gradients. Passing gradient_clip_val=None disables gradient clipping. If using Automatic Mixed Precision (AMP), the gradients will be unscaled before clipping. Default: None.
- gradient_clip_algorithm: str | None
The gradient clipping algorithm to use. Pass gradient_clip_algorithm="value" to clip by value, and gradient_clip_algorithm="norm" to clip by norm. By default it will be set to "norm".
- checkpoint: ModelCheckpointConfig | None
The configuration for the model checkpoint.
- early_stopping: EarlyStoppingConfig | None
The configuration for early stopping.
- loggers: Sequence[LoggerConfig] | Literal['default']
The loggers to use for logging training metrics.
If "default", will use the CSV logger + the W&B logger if available. Default: "default".
- additional_trainer_kwargs: dict[str, Any]
Additional keyword arguments for the Lightning Trainer.
This is for advanced users who want to customize the Lightning Trainer, and is not recommended for beginners.
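As an illustration of how these options compose, here is a hedged sketch of constructing a TrainerConfig using only keyword arguments documented above; the particular values are arbitrary and not a recommendation:

```python
from mattertune.main import TrainerConfig

# Illustrative values only; every keyword below is documented above.
trainer_config = TrainerConfig(
    accelerator="gpu",
    devices=[0, 1],
    precision="bf16-mixed",
    max_epochs=50,
    max_time={"hours": 12},          # keys are passed to datetime.timedelta
    val_check_interval=0.5,          # validate twice per training epoch
    gradient_clip_val=1.0,
    gradient_clip_algorithm="norm",
    loggers="default",               # CSV logger + W&B logger if available
)
```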
- class mattertune.main.MatterTunerConfig(*, data, model, trainer=TrainerConfig(accelerator='auto', strategy='auto', num_nodes=1, devices='auto', precision='32-true', deterministic=None, max_epochs=None, min_epochs=None, max_steps=-1, min_steps=None, max_time=None, val_check_interval=None, check_val_every_n_epoch=1, log_every_n_steps=None, gradient_clip_val=None, gradient_clip_algorithm=None, checkpoint=None, early_stopping=None, ema=None, loggers='default', additional_trainer_kwargs={}), recipes=[])[source]
- Parameters:
data (DataModuleConfig)
model (ModelConfig)
trainer (TrainerConfig)
recipes (Sequence[RecipeConfig])
- data: DataModuleConfig
The configuration for the data.
- model: ModelConfig
The configuration for the model.
- trainer: TrainerConfig
The configuration for the trainer.
- recipes: Sequence[RecipeConfig]
Recipes to modify the training process.
Recipes are configurable components that can modify how models are trained. Each recipe provides a specific capability like parameter-efficient fine-tuning, regularization, or advanced optimization techniques.
Recipes are applied in order when training starts. Multiple recipes can be combined to achieve the desired training behavior.
Examples
```python
# Use LoRA for memory-efficient training
recipes = [
    LoRARecipeConfig(
        lora=LoraConfig(r=8, target_modules=["linear1"])
    )
]
```
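A hedged sketch of composing a full MatterTunerConfig follows. The data and model configurations are placeholders here (`my_data_config`, `my_model_config`); building a concrete DataModuleConfig or ModelConfig is outside the scope of this module and not shown:

```python
from mattertune.main import MatterTunerConfig, TrainerConfig

# `my_data_config` and `my_model_config` are hypothetical placeholders for a
# DataModuleConfig and a ModelConfig constructed elsewhere.
config = MatterTunerConfig(
    data=my_data_config,
    model=my_model_config,
    trainer=TrainerConfig(max_epochs=10),
    recipes=[],          # optional; see the LoRA recipe example above
)
```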
- class mattertune.main.MatterTuner(config)[source]
- Parameters:
config (MatterTunerConfig)
- __init__(config)[source]
- Parameters:
config (MatterTunerConfig)
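Putting the pieces together, a minimal end-to-end sketch, assuming `config` is a fully populated MatterTunerConfig and that tune() is called without additional arguments:

```python
from mattertune.main import MatterTuner

tuner = MatterTuner(config)     # config: MatterTunerConfig
output = tuner.tune()           # returns a TuneOutput
model = output.model            # fine-tuned FinetuneModuleBase
trainer = output.trainer        # Lightning Trainer used for training
```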