This work presents a novel knowledge distillation framework that uses multiple intermediate assistant models of varying sizes and architectures to facilitate knowledge transfer from a teacher (source) model to a student (target) model.
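To make the idea concrete, the sketch below shows one way such a chain could be trained sequentially: the teacher distills into the largest assistant, each assistant distills into the next smaller one, and the last assistant distills into the student. This is a minimal illustration assuming the standard softened-softmax distillation loss (Hinton et al.); the model widths, `temperature`, `alpha`, and the training loop are illustrative assumptions, not the framework's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def distill_step(teacher, student, x, y, temperature=4.0, alpha=0.5):
    """One training step: blend hard-label CE with softened KL to the teacher."""
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    ce = F.cross_entropy(s_logits, y)
    kl = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2  # standard correction for the temperature's gradient scaling
    return alpha * ce + (1 - alpha) * kl

# Illustrative chain: teacher -> larger assistant -> smaller assistant -> student.
# Each stage is trained to convergence before serving as the next stage's teacher.
def mlp(width):
    return nn.Sequential(nn.Linear(32, width), nn.ReLU(), nn.Linear(width, 10))

teacher, assistants, student = mlp(512), [mlp(256), mlp(128)], mlp(64)
chain = [teacher] + assistants + [student]

for src, tgt in zip(chain[:-1], chain[1:]):
    opt = torch.optim.Adam(tgt.parameters(), lr=1e-3)
    for _ in range(100):  # placeholder loop over random data
        x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
        loss = distill_step(src.eval(), tgt.train(), x, y)
        opt.zero_grad(); loss.backward(); opt.step()
```

The sequential handoff is the key design choice: each assistant narrows the capacity gap between its teacher and its student, which is the motivation for using several intermediate models rather than distilling directly from teacher to student.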