learning

model architecture

Notes on transformers, attention, MLPs, and model design choices.