Tag: small models
TIL: training small models can be more energy intensive than training large models
As I read more about AI, I came across this snippet from a recent post by Sayash Kapoor, which initially felt really counterintuitive: Paradoxically, smaller models require more training to reach the same level of performance. So the downward pressure on model size is putting upward pressure on training compute. In effect,…