Decoder-only Transformer language model built from scratch in PyTorch and trained on Tiny Shakespeare
pytorch transformer gpt language-model autoregressive from-scratch p100 causal-attention tiny-shakespeare
Updated Mar 28, 2026 - Python