A Formal Framework for Understanding Length Generalization in Transformers
by Xinting Huang, Andy Yang, Satwik Bhattamishra, Yash Sarrof, Andreas Krebs, Hattie Zhou, Preetum Nakkiran, Michael Hahn
Reference:
A Formal Framework for Understanding Length Generalization in Transformers. Xinting Huang, Andy Yang, Satwik Bhattamishra, Yash Sarrof, Andreas Krebs, Hattie Zhou, Preetum Nakkiran, Michael Hahn. The Thirteenth International Conference on Learning Representations (ICLR), 2025.
Bibtex Entry:
@inproceedings{huang2024formalframeworkunderstandinglength,
  title={A Formal Framework for Understanding Length Generalization in Transformers},
  author={Xinting Huang and Andy Yang and Satwik Bhattamishra and Yash Sarrof and Andreas Krebs and Hattie Zhou and Preetum Nakkiran and Michael Hahn},
  month={June},
  year={2025},
  booktitle={The Thirteenth International Conference on Learning Representations (ICLR)},
  eprint={2410.02140},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  preprint={https://arxiv.org/abs/2410.02140},
  url={https://openreview.net/forum?id=U49N5V51rU},
  github={https://github.com/lacoco-lab/length_generalization}
}