Abstract: Gradient sharing is widely used to safeguard the privacy of training data. However, a growing body of research reveals that the shared gradients or model parameters ...
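As an illustration of why shared gradients can leak training data (a minimal sketch, not the attack from any specific paper): for a single example passing through a linear layer with a bias, the parameter gradients alone determine the input exactly. With pre-activation z = W x + b and upstream gradient g = dL/dz, we have dL/dW = g xᵀ and dL/db = g, so dividing any row of dL/dW by the matching entry of dL/db recovers x.

```python
import numpy as np

# Sketch: recover a private input x from the gradients of a linear layer.
rng = np.random.default_rng(0)
d_in, d_out = 8, 4
x = rng.normal(size=d_in)            # private training input
g = rng.normal(size=d_out)           # some upstream gradient dL/dz

dW = np.outer(g, x)                  # dL/dW = g x^T  (shared with the server)
db = g                               # dL/db = g      (shared with the server)

# "Attacker" sees only dW and db; pick a row with a nonzero bias gradient
i = int(np.argmax(np.abs(db)))
x_recovered = dW[i] / db[i]          # row i of g x^T divided by g_i gives x

print(np.allclose(x_recovered, x))   # True: the input is fully reconstructed
```

Real attacks on deeper networks are iterative (optimizing a dummy input to match the observed gradients), but this one-layer case shows the leakage is exact, not approximate.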
Abstract: Feature-based knowledge distillation has been applied to compress modern recommendation models, typically using projectors that align the feature dimensions of the small student recommendation model with ...
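A minimal sketch of the projector mechanism the abstract mentions (names and shapes are illustrative, not taken from the paper): the projector is a learned linear map that lifts the student's low-dimensional features to the teacher's dimension so a feature-matching loss can be computed between them.

```python
import numpy as np

# Sketch: align student features with teacher features via a linear projector.
rng = np.random.default_rng(1)
d_student, d_teacher, batch = 16, 64, 32

h_student = rng.normal(size=(batch, d_student))   # student embeddings
h_teacher = rng.normal(size=(batch, d_teacher))   # teacher embeddings

P = rng.normal(size=(d_student, d_teacher)) * 0.1  # projector weights (trained jointly)

projected = h_student @ P                          # (batch, d_teacher)
distill_loss = np.mean((projected - h_teacher) ** 2)  # feature-matching MSE

print(projected.shape)   # (32, 64): dimensions now match the teacher's
```

In practice the projector is trained jointly with the student and discarded at inference time, so the deployed model keeps the small student dimension.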