
Some thoughts on size generalization

Abstract: This talk explores the question: for existing models capable of handling inputs of arbitrary sizes (e.g. DeepSet, GNN, transformer), what properties of the data and task enable effective size generalization? My answer involves two key points: (1) task-model alignment; (2) training and test distributions being "similar" relative to the model's inductive bias.
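To make "handling inputs of arbitrary sizes" concrete, here is a minimal DeepSet-style sketch in pure Python. All weights and dimensions are hypothetical and for illustration only: each element is embedded independently by phi, mean pooling makes the model permutation-invariant and size-normalized, and rho reads out the pooled vector, so one fixed set of weights accepts a set of any size.

```python
import math
import random

random.seed(0)
D_IN, D_H = 3, 4  # hypothetical input and hidden dimensions

# Hypothetical fixed weights; their shapes never depend on the set size n.
W_phi = [[random.gauss(0, 1) for _ in range(D_H)] for _ in range(D_IN)]
W_rho = [random.gauss(0, 1) for _ in range(D_H)]

def phi(x):
    # Embed one element: tanh(x @ W_phi), applied independently per element.
    return [math.tanh(sum(x[i] * W_phi[i][j] for i in range(D_IN)))
            for j in range(D_H)]

def deepset(xs):
    # xs: a set of n elements (each a length-3 vector); n can be anything.
    hs = [phi(x) for x in xs]
    # Mean pooling: permutation-invariant and normalized by set size.
    pooled = [sum(h[j] for h in hs) / len(hs) for j in range(D_H)]
    return sum(pooled[j] * W_rho[j] for j in range(D_H))  # rho readout

small = [[random.gauss(0, 1) for _ in range(D_IN)] for _ in range(5)]
large = [[random.gauss(0, 1) for _ in range(D_IN)] for _ in range(50)]
out_small, out_large = deepset(small), deepset(large)  # same weights, both sizes
```

Whether such a model generalizes from small training sets to large test sets then depends on the two points above: the pooling operation must align with the task (e.g. mean vs. sum), and the per-element distribution must stay "similar" across sizes.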

Slides
