Multi-View Feature Representation for Dialogue Generation with Bidirectional Distillation

Published in AAAI, 2021

Authors: Shaoxiong Feng, Hongshen Chen, Xuancheng Ren, Zhuoye Ding, Kan Li, Xu Sun

Abstract

Collaborative learning has successfully applied knowledge transfer to guide a pool of small student networks towards robust local minima. However, previous approaches typically struggle with drastically aggravated student homogenization when the number of students rises. In this paper, we propose Collaborative Group Learning, an efficient framework that aims to diversify the feature representation and conduct an effective regularization. Intuitively, similar to the human group study mechanism, we induce students to learn and exchange different parts of course knowledge as collaborative groups. First, each student is established by randomly routing on a modular neural network, which facilitates flexible knowledge communication between students due to random levels of representation sharing and branching. Second, to resist the student homogenization, students first compose diverse feature sets by exploiting the inductive bias from sub-sets of training data, and then aggregate and distill different complementary knowledge by imitating a random sub-group of students at each time step. Overall, the above mechanisms are beneficial for maximizing the student population to further improve the model generalization without sacrificing computational efficiency. Empirical evaluations on both image and text tasks indicate that our method significantly outperforms various state-of-the-art collaborative approaches whilst enhancing computational efficiency.
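The sub-group imitation step described in the abstract can be sketched in a minimal, framework-free form: at each training step, a student picks a random sub-group of its peers, averages their temperature-softened predictions, and minimizes a KL-divergence imitation loss against that aggregated soft target. The function names, the simple-averaging aggregation rule, and the temperature value below are illustrative assumptions, not the paper's exact formulation.

```python
import math
import random

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of logits.
    exps = [math.exp(x / temperature) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) between two discrete distributions (eps avoids log(0)).
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def group_distillation_targets(student_logits, student_idx, group_size,
                               temperature=2.0, rng=random):
    # For one student, sample a random sub-group of the *other* students,
    # average their softened predictions, and return the student's own
    # softened prediction plus the aggregated soft target.
    peers = [i for i in range(len(student_logits)) if i != student_idx]
    group = rng.sample(peers, min(group_size, len(peers)))
    peer_probs = [softmax(student_logits[i], temperature) for i in group]
    target = [sum(col) / len(peer_probs) for col in zip(*peer_probs)]
    own = softmax(student_logits[student_idx], temperature)
    return own, target

# Example: 4 students with 3-class logits; student 0 imitates a random pair of peers.
logits = [[2.0, 0.5, 0.1], [1.5, 1.0, 0.2], [0.3, 2.2, 0.4], [1.0, 1.0, 1.0]]
own, target = group_distillation_targets(logits, student_idx=0, group_size=2)
loss = kl_divergence(target, own)  # imitation loss for student 0
```

Because each student only matches a small random sub-group rather than the full pool, the peer targets differ across students and steps, which is what resists the homogenization the abstract describes.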


Recommended citation:

Shaoxiong Feng, Hongshen Chen, Xuancheng Ren, Zhuoye Ding, Kan Li, and Xu Sun. Multi-View Feature Representation for Dialogue Generation with Bidirectional Distillation. In Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI 2021).