post-training
Notes on SFT, RLHF, DPO, and alignment.
No notes yet.