schedulers

Classes:

CosineAnnealingScheduler

CosineAnnealingScheduler(max_T: int, min_lr: float, epoch_offset: Optional[int] = 0)

Cosine annealing learning rate scheduler.

This scheduler updates the learning rate using a cosine annealing schedule between the current learning rate and a minimum learning rate over a specified number of epochs.

The schedule is applied after an optional epoch offset, during which the learning rate is left unchanged.

Parameters:

  • max_T

    (int) –

    Number of epochs over which to apply cosine annealing after epoch_offset. After epoch_offset + max_T, the learning rate is fixed to min_lr.

  • min_lr

    (float) –

    Minimum learning rate reached at the end of the annealing period.

  • epoch_offset

    (int, default: 0) –

    Number of initial epochs during which the scheduler does not modify the learning rate. Defaults to 0.
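The behavior described by these parameters follows the standard cosine annealing curve. A minimal sketch of the resulting learning-rate trajectory (the function name and the base_lr argument are illustrative, not part of this class's API):

```python
import math

def cosine_annealed_lr(epoch: int, base_lr: float, min_lr: float,
                       max_T: int, epoch_offset: int = 0) -> float:
    """Illustrative cosine annealing curve matching the documented behavior."""
    if epoch < epoch_offset:
        # before the offset, the learning rate is left unchanged
        return base_lr
    t = epoch - epoch_offset
    if t >= max_T:
        # after epoch_offset + max_T, the learning rate is fixed to min_lr
        return min_lr
    # cosine interpolation from base_lr (t = 0) down to min_lr (t = max_T)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t / max_T))
```

At t = 0 the cosine term equals 1, yielding base_lr; at t = max_T it equals -1, yielding min_lr; halfway through the interval the learning rate sits at the midpoint of the two.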

Source code in src/fpga_profile_reco/utils/schedulers.py
def __init__(self, max_T: int, min_lr: float, epoch_offset: Optional[int] = 0):
    """
    Initialize the cosine annealing scheduler.

    Parameters
    ----------
    max_T : int
        Number of epochs over which to apply cosine annealing after
        `epoch_offset`. After `epoch_offset + max_T`, the learning rate
        is fixed to `min_lr`.
    min_lr : float
        Minimum learning rate reached at the end of the annealing period.
    epoch_offset : int, optional
        Number of initial epochs during which the scheduler does not modify
        the learning rate. Defaults to 0.
    """
    self.max_T = max_T
    self.min_lr = min_lr
    self.epoch_offset = epoch_offset

CosineAnnealingWithRestartsScheduler

CosineAnnealingWithRestartsScheduler(restart_lrs: Sequence[float], min_lrs: Sequence[float], Ts: Sequence[int])

Cosine annealing with restarts learning rate scheduler.

This scheduler applies cosine annealing over multiple intervals (restarts). For each interval, a cosine schedule is used between a restart learning rate and a minimum learning rate with its own duration.

The intervals are defined by Ts, and each interval may have its own minimum learning rate. Optionally, each interval can also have its own restart learning rate.

Parameters:

  • restart_lrs

    (Sequence[float]) –

    Learning rates used at the beginning of each restart interval. If len(restart_lrs) == len(Ts) == len(min_lrs), the first value overrides the optimizer's initial learning rate as well. Otherwise, len(restart_lrs) + 1 == len(Ts) == len(min_lrs) and the optimizer's initial learning rate is used for the first interval while the elements of restart_lrs are used for the subsequent intervals.

  • min_lrs

    (Sequence[float]) –

    Minimum learning rate reached at the end of each interval. Must be the same length as Ts.

  • Ts

    (Sequence[int]) –

    Number of epochs for each cosine annealing interval (cycle). The sum of all Ts defines the total scheduled duration.
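The two accepted length conventions for restart_lrs determine which learning rate opens each interval. A small illustrative helper (not part of the module) makes the rule concrete:

```python
from typing import List, Sequence

def interval_start_lrs(restart_lrs: Sequence[float], initial_lr: float,
                       Ts: Sequence[int]) -> List[float]:
    """Return the learning rate at the start of each interval.

    Illustrative only: mirrors the documented length conventions of
    CosineAnnealingWithRestartsScheduler.
    """
    if len(restart_lrs) == len(Ts):
        # restart_lrs covers every interval; the optimizer's initial
        # learning rate is overridden by restart_lrs[0]
        return list(restart_lrs)
    if len(restart_lrs) + 1 == len(Ts):
        # the optimizer's initial learning rate starts the first interval
        return [initial_lr] + list(restart_lrs)
    raise ValueError("len(restart_lrs) must equal len(Ts) or len(Ts) - 1")
```

For example, with three intervals and two restart learning rates, the optimizer's initial learning rate is used for the first interval; with three restart learning rates, the first one takes over.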

Source code in src/fpga_profile_reco/utils/schedulers.py
def __init__(self, restart_lrs: Sequence[float], min_lrs: Sequence[float], Ts: Sequence[int]):
    """
    Initialize the cosine annealing with restarts scheduler.

    Parameters
    ----------
    restart_lrs : Sequence[float]
        Learning rates used at the beginning of each restart interval.
        If `len(restart_lrs) == len(Ts) == len(min_lrs)`, the first value
        overrides the optimizer's initial learning rate as well.
        Otherwise, `len(restart_lrs) + 1 == len(Ts) == len(min_lrs)` and
        the optimizer's initial learning rate is used for the first
        interval while the elements of `restart_lrs` are used for the
        subsequent intervals.
    min_lrs : Sequence[float]
        Minimum learning rate reached at the end of each interval. Must be
        the same length as `Ts`.
    Ts : Sequence[int]
        Number of epochs for each cosine annealing interval (cycle). The
        sum of all `Ts` defines the total scheduled duration.
    """
    if len(restart_lrs) == len(Ts) == len(min_lrs):
        print("WARNING: the initial learning rate passed to the optimizer will be ignored in favor of the first restart_lr")
    else:
        assert len(restart_lrs) + 1 == len(Ts) == len(min_lrs), "Length of restart_lrs must be one less than length of Ts and min_lrs or all three must be the same"

    self.restart_lrs = restart_lrs
    self.min_lrs = min_lrs
    self.Ts = Ts

    self.epoch_offsets = [0] + [sum(Ts[:i + 1]) for i in range(len(Ts) - 1)]
    self.cumulative_Ts = [sum(Ts[:i + 1]) for i in range(len(Ts))]

    # instantiate cosine schedulers for each interval
    self.cosine_schedulers = [
        CosineAnnealingScheduler(max_T=Ts[i], min_lr=min_lrs[i], epoch_offset=self.epoch_offsets[i])
        for i in range(len(Ts))
    ]
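The interval bookkeeping in the constructor can be traced with a concrete example. Assuming illustrative interval lengths Ts = [10, 20, 30], the start epochs and cumulative boundaries work out as follows:

```python
Ts = [10, 20, 30]  # illustrative interval lengths, not fixed by the class

# start epoch of each interval (mirrors the constructor's epoch_offsets)
epoch_offsets = [0] + [sum(Ts[:i + 1]) for i in range(len(Ts) - 1)]

# end epoch of each interval (mirrors the constructor's cumulative_Ts)
cumulative_Ts = [sum(Ts[:i + 1]) for i in range(len(Ts))]

print(epoch_offsets)   # [0, 10, 30]
print(cumulative_Ts)   # [10, 30, 60]
```

Each per-interval CosineAnnealingScheduler thus starts annealing at its interval's offset and reaches its minimum at the corresponding cumulative boundary; the total scheduled duration is sum(Ts) = 60 epochs in this example.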