Some thoughts on size generalization
Abstract: This talk explores the question: for existing models capable of handling inputs of arbitrary sizes (e.g., DeepSet, GNN, transformer), what properties of the data and task enable effective size generalization? My answer involves two key points: (1) task-model alignment; (2) training and test distributions are “similar” relative to the model’s inductive bias.
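As context for the question, a minimal sketch (not from the talk; the class name and layer sizes are illustrative assumptions) of why a DeepSet-style model accepts inputs of arbitrary size: a permutation-invariant sum pooling lets the same parameters process sets of any cardinality, so the open question is whether predictions stay accurate on sets larger than those seen in training.

```python
import torch
import torch.nn as nn

class DeepSet(nn.Module):
    """Illustrative DeepSet: per-element encoder phi, sum pooling, readout rho."""
    def __init__(self, in_dim=2, hidden=64, out_dim=1):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))

    def forward(self, x):                         # x: (set_size, in_dim)
        return self.rho(self.phi(x).sum(dim=0))   # sum pooling -> size-agnostic

model = DeepSet()
small = torch.randn(10, 2)     # a training-scale set
large = torch.randn(1000, 2)   # a much larger test set
print(model(small).shape, model(large).shape)  # same output shape either way
```

The model runs on both set sizes without modification; whether it generalizes to the larger size depends on the data/task properties discussed in the talk.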