mattertune.configs.callbacks
- class mattertune.configs.callbacks.EarlyStoppingConfig(*, monitor='val/total_loss', min_delta=0.0, patience=3, verbose=False, mode='min', strict=True, check_finite=True, stopping_threshold=None, divergence_threshold=None, check_on_train_epoch_end=None, log_rank_zero_only=False)[source]
- Parameters:
monitor (str)
min_delta (float)
patience (int)
verbose (bool)
mode (Literal['min', 'max'])
strict (bool)
check_finite (bool)
stopping_threshold (float | None)
divergence_threshold (float | None)
check_on_train_epoch_end (bool | None)
log_rank_zero_only (bool)
- monitor: str
Quantity to be monitored.
- min_delta: float
Minimum change in monitored quantity to qualify as an improvement. Changes of less than or equal to min_delta will count as no improvement. Default: 0.0.
- patience: int
Number of validation checks with no improvement after which training will be stopped. Default: 3.
- verbose: bool
Whether to print messages when improvement is found or early stopping is triggered. Default: False.
- mode: Literal['min', 'max']
One of ‘min’ or ‘max’. In ‘min’ mode, training stops when the monitored quantity stops decreasing; in ‘max’ mode, it stops when the quantity stops increasing. Default: 'min'.
- strict: bool
Whether to raise an error if the monitored metric is not found in the validation metrics. Default: True.
- check_finite: bool
Whether to stop training when the monitor becomes NaN or infinite. Default: True.
- stopping_threshold: float | None
Stop training immediately once the monitored quantity reaches this threshold. Default: None.
- divergence_threshold: float | None
Stop training as soon as the monitored quantity becomes worse than this threshold. Default: None.
- check_on_train_epoch_end: bool | None
Whether to run early stopping at the end of the training epoch. If False, the check runs at validation end. Default: None.
- log_rank_zero_only: bool
Whether to log the status of early stopping only for the rank 0 process. Default: False.
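The way min_delta, patience, mode, check_finite, and the two thresholds interact can be sketched in plain Python. This is a simplified illustration of the rules described above, not mattertune's implementation: the helper name `early_stop_index` is hypothetical, and the strict comparisons used for the thresholds are an assumption.

```python
import math


def early_stop_index(values, *, min_delta=0.0, patience=3, mode="min",
                     check_finite=True, stopping_threshold=None,
                     divergence_threshold=None):
    """Return the index of the validation check that would stop training.

    Hypothetical helper: `values` is the sequence of monitored values,
    one per validation check. Returns None if training is never stopped.
    """
    # Flip the sign in 'max' mode so the rest of the logic can minimize.
    sign = 1.0 if mode == "min" else -1.0
    best = math.inf
    wait = 0  # consecutive checks without improvement
    for i, value in enumerate(values):
        if check_finite and not math.isfinite(value):
            return i  # NaN or infinite monitor
        score = sign * value
        # Assumption: "reaches" and "worse than" are strict comparisons.
        if stopping_threshold is not None and score < sign * stopping_threshold:
            return i
        if divergence_threshold is not None and score > sign * divergence_threshold:
            return i
        if score < best - min_delta:
            # Improved by more than min_delta: reset the counter.
            best = score
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return i
    return None


# Best value 0.9 is reached at index 1; three checks without improvement
# follow, so the stop happens at index 4.
stop_at = early_stop_index([1.0, 0.9, 0.95, 0.91, 0.92], patience=3)
```

With a larger min_delta, small decreases no longer reset the counter, so the same patience is exhausted sooner.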
- class mattertune.configs.callbacks.ModelCheckpointConfig(*, dirpath=None, filename=None, monitor=None, verbose=False, save_last=None, save_top_k=1, save_weights_only=False, mode='min', auto_insert_metric_name=True, every_n_train_steps=None, train_time_interval=None, every_n_epochs=None, save_on_train_epoch_end=None, enable_version_counter=True)[source]
- Parameters:
dirpath (str | None)
filename (str | None)
monitor (str | None)
verbose (bool)
save_last (Literal[True, False, 'link'] | None)
save_top_k (int)
save_weights_only (bool)
mode (Literal['min', 'max'])
auto_insert_metric_name (bool)
every_n_train_steps (int | None)
train_time_interval (timedelta | None)
every_n_epochs (int | None)
save_on_train_epoch_end (bool | None)
enable_version_counter (bool)
- dirpath: str | None
Directory to save the model file. Default: None.
- filename: str | None
Checkpoint filename. Can contain named formatting options. Default: None.
- monitor: str | None
Quantity to monitor. Default: None.
- verbose: bool
Verbosity mode. Default: False.
- save_last: Literal[True, False, 'link'] | None
When True or “link”, saves a ‘last.ckpt’ checkpoint when a checkpoint is saved. Default: None.
- save_top_k: int
If save_top_k=k, save k models with best monitored quantity. Default: 1.
- save_weights_only: bool
If True, only save model weights. Default: False.
- mode: Literal['min', 'max']
One of {‘min’, ‘max’}. Determines whether a lower (‘min’) or a higher (‘max’) value of the monitored quantity counts as better when ranking checkpoints. Default: 'min'.
- auto_insert_metric_name: bool
Whether to automatically insert the metric name in the checkpoint filename. Default: True.
- every_n_train_steps: int | None
Number of training steps between checkpoints. Default: None.
- train_time_interval: timedelta | None
Checkpoints are monitored at the specified time interval. Default: None.
- every_n_epochs: int | None
Number of epochs between checkpoints. Default: None.
- save_on_train_epoch_end: bool | None
Whether to run checkpointing at the end of the training epoch. Default: None.
- enable_version_counter: bool
Whether to append a version to existing filenames. Default: True.
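As a rough illustration of how save_top_k and mode combine, the sketch below ranks a set of checkpoints by their monitored value and keeps the best k. The function name `top_k_checkpoints` and the (name, value) input format are hypothetical, chosen for this example only. The sketch covers only non-negative k and says nothing about how the library actually writes or deletes files.

```python
def top_k_checkpoints(scores, *, save_top_k=1, mode="min"):
    """Return the names of the checkpoints that would be kept, best first.

    Hypothetical helper: `scores` is a list of (checkpoint_name, value)
    pairs, where `value` is the monitored quantity for that checkpoint.
    """
    if save_top_k < 0:
        raise ValueError("this sketch only covers save_top_k >= 0")
    # In 'min' mode lower values rank first; in 'max' mode higher values do.
    ranked = sorted(scores, key=lambda item: item[1], reverse=(mode == "max"))
    return [name for name, _ in ranked[:save_top_k]]


history = [("epoch0.ckpt", 0.5), ("epoch1.ckpt", 0.3), ("epoch2.ckpt", 0.4)]
kept_min = top_k_checkpoints(history, save_top_k=2, mode="min")
kept_max = top_k_checkpoints(history, save_top_k=1, mode="max")
```

Here `kept_min` holds the two checkpoints with the lowest monitored values, while `kept_max` holds the single checkpoint with the highest.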