optimizers · math · advanced
Second-order optimization methods
Newton, quasi-Newton (L-BFGS), natural gradient, K-FAC — using curvature information for faster convergence.
Depth levels
L0 · Intro · ~1h
Knows Newton's method uses both gradient and Hessian.
L1 · Basics · ~8h
Implements Newton for a convex quadratic; understands the Hessian's role.
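A minimal sketch of this level's exercise: on a convex quadratic f(x) = ½xᵀAx − bᵀx the Hessian is the constant matrix A, so a single Newton step x − H⁻¹∇f lands exactly on the minimizer A⁻¹b. The matrix A and vector b below are made-up example data.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])   # symmetric positive definite -> f is convex
b = np.array([1.0, 1.0])

x = np.zeros(2)                       # arbitrary starting point
grad = A @ x - b                      # gradient of f at x
hess = A                              # Hessian of f (constant for a quadratic)
x = x - np.linalg.solve(hess, grad)   # one Newton step; solve, don't invert

# For a quadratic, one step reaches the exact minimizer A^{-1} b.
print(np.allclose(x, np.linalg.solve(A, b)))
```

Solving the linear system instead of forming `np.linalg.inv(hess)` is the standard practice; it is cheaper and numerically more stable.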
L2 · Working · ~12h
Uses scipy.optimize L-BFGS for classical ML; understands why full Newton is impractical for DL.
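A sketch of the scipy usage this level targets, on the standard 2-D Rosenbrock test function: L-BFGS never forms the Hessian, instead approximating curvature from the last few (gradient, step) pairs, which is why it scales where a dense Hessian would not.

```python
import numpy as np
from scipy.optimize import minimize

def rosen(x):
    # Rosenbrock function; global minimum at (1, 1)
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def rosen_grad(x):
    # Analytic gradient; supplying jac avoids finite-difference calls
    return np.array([
        -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
        200 * (x[1] - x[0]**2),
    ])

res = minimize(rosen, x0=np.array([-1.2, 1.0]), jac=rosen_grad,
               method="L-BFGS-B")
print(res.x)  # close to [1, 1]
```

The same pattern (objective plus analytic gradient, `method="L-BFGS-B"`) covers most classical ML fitting problems, e.g. logistic-regression likelihoods.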
L3 · Advanced · ~25h
Natural gradient (Fisher info), K-FAC, Shampoo for deep networks.
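A toy natural-gradient sketch (not K-FAC) for the one-parameter case: a Bernoulli model p(y=1) = sigmoid(θ). Plain gradient ascent on the log-likelihood crawls when the sigmoid saturates; dividing by the Fisher information F = s(1 − s) undoes that scaling. The damping constant and starting values are illustrative assumptions; K-FAC and Shampoo apply the same preconditioning idea blockwise to deep-network layers.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

y = 1.0        # single observed label (made-up example data)
theta = -4.0   # start deep in the saturated region
damping = 1e-3 # keeps the step finite when F -> 0, as in practical K-FAC

for _ in range(5):
    s = sigmoid(theta)
    grad = y - s                       # d/dtheta log p(y | theta)
    fisher = s * (1.0 - s)             # Fisher information of the Bernoulli
    theta += grad / (fisher + damping) # damped natural-gradient ascent step

print(sigmoid(theta))  # near 1: the model fits the observation quickly
```

With the ordinary gradient, the first step from θ = −4 has size ≈ 0.98; the Fisher-preconditioned step is roughly 1/s ≈ 50 times larger, exactly compensating for the flat loss surface in the saturated region.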
L4 · Research · ~60h
Tractable second-order methods for LLM-scale training.