learning
training
Notes on pretraining loops, optimizers, loss, and scaling laws.
No notes yet.