training

Notes on pretraining loops, optimizers, loss, and scaling laws.

No notes yet.