Tag: scaling laws

Hyperparameters That Matter Most in Large Language Model Pretraining 25 January 2026

Hyperparameters That Matter Most in Large Language Model Pretraining

Learn which hyperparameters matter most in LLM pretraining: learning rate and batch size. Discover the Step Law formula that predicts optimal settings using model size and dataset size, saving time and improving performance.

Susannah Greenwood 5 Comments