ModelKeeper: Accelerating DNN Training via Automated Training Warmup
1 minute read ∼ Filed in: A paper note

Introduction
Context
Practitioners train many DNN models to customize the latency-accuracy trade-off for different hardware.
Motivation: existing training optimizations do not take similarity across neural networks into consideration, even though one can reduce the amount of training needed for convergence by leveraging a well-trained model's weights to warm up the training of a new model (as sketched below).
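As a minimal sketch of the warm-up idea (my own illustration in PyTorch/torchvision, not ModelKeeper's actual weight transformation), one can copy over any parent parameters whose names and shapes match the new model and leave the rest randomly initialized:

```python
# Toy warm-up: copy parent parameters whose names and shapes match into the
# child model; unmatched layers keep their random initialization.
import torchvision.models as models

parent = models.resnet18()   # stand-in for a well-trained model
child = models.resnet34()    # the new model to be trained

parent_state = parent.state_dict()
child_state = child.state_dict()

transferred = 0
for name, tensor in child_state.items():
    if name in parent_state and parent_state[name].shape == tensor.shape:
        child_state[name] = parent_state[name].clone()
        transferred += 1

child.load_state_dict(child_state)
print(f"warm-started {transferred}/{len(child_state)} tensors from the parent")
```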
Gap
Model similarity is neither automatically identified nor captured across models in existing systems.
Goal & Contributions
The paper proposes ModelKeeper, a cluster-wide training warmup system that reduces the training execution needed for model convergence via automated model weight transformation.
Challenges
How to determine the similarity between models?
If multiple models are similar to the new one, which model should be used, and how should its weights be transferred? (See the sketch after this list.)
How to serve dynamic workloads at scale?
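To make the first two challenges concrete, a naive baseline (my own illustration, not ModelKeeper's architecture-aware matching algorithm; it assumes PyTorch/torchvision) might score each candidate parent by the fraction of the new model's parameters it can directly warm-start, then pick the best-scoring one:

```python
# Toy parent selection: score each candidate by how much of the new model it
# could warm-start via name- and shape-matched parameters, then pick the best.
# ModelKeeper's real matcher is more sophisticated; this is only a sketch.
import torch
import torchvision.models as models

def warmup_coverage(parent: torch.nn.Module, child: torch.nn.Module) -> float:
    parent_state = parent.state_dict()
    matched = total = 0
    for name, tensor in child.state_dict().items():
        total += tensor.numel()
        if name in parent_state and parent_state[name].shape == tensor.shape:
            matched += tensor.numel()
    return matched / total

candidates = {"resnet18": models.resnet18(), "vgg11": models.vgg11()}
child = models.resnet34()

scores = {name: warmup_coverage(m, child) for name, m in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```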