Installation
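No installation snippet appears on this page; assuming the package is the madgrad package published on CRAN (consistent with the download statistics below), a minimal sketch would be:

```r
# Install the released version from CRAN (assumed distribution channel)
install.packages("madgrad")
```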
About
MADGRAD (a Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization) is a 'best-of-both-worlds' optimizer: it matches the generalization performance of stochastic gradient descent while converging at least as fast as Adam, and often faster. A drop-in optim_madgrad() implementation is provided, based on Defazio et al (2021) <arXiv:2101.11075>.
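Since optim_madgrad() is described as a drop-in optimizer, a usage sketch follows, assuming the standard torch-for-R optimizer loop (zero_grad()/backward()/step()); the toy data and the learning rate are illustrative assumptions, not recommendations:

```r
library(torch)
library(madgrad)

torch_manual_seed(1)

# Toy regression data: one observation with 3 features.
x <- torch_randn(1, 3)
y <- torch_randn(1, 1)

# Parameter tensor to optimize.
w <- torch_randn(3, 1, requires_grad = TRUE)

# optim_madgrad() is used like any other torch optimizer;
# lr = 0.1 is an arbitrary choice for this sketch.
opt <- optim_madgrad(w, lr = 0.1)

for (i in 1:100) {
  opt$zero_grad()
  loss <- torch_mean((torch_mm(x, w) - y)^2)
  loss$backward()
  opt$step()
}
```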
Key Metrics
Downloads
| Period        | Downloads | Change |
|---------------|----------:|-------:|
| Yesterday     |         3 |     0% |
| Last 7 days   |        32 |   -40% |
| Last 30 days  |       133 |     0% |
| Last 90 days  |       394 |   -41% |
| Last 365 days |     2,028 |   -14% |