I am a post-doc at the Language Technologies Institute (LTI) of Carnegie Mellon University, working with Prof. Graham Neubig, and a member of NeuLab.
My research interests are broad and include programming language processing (PLP), natural language processing (NLP), and deep learning in general.
I obtained my PhD from the Computer Science Department at the Technion, advised by Prof. Eran Yahav.
Previously, I served for 7 years as an officer aboard a missile ship in the Israeli Navy. Later, I completed my BSc summa cum laude at the Computer Science Department at the Technion, as an alumnus of the Rothschild-Technion Scholars Program for Excellence. From 2014 to 2016, I worked at the Microsoft R&D center in Haifa, developing data security services for the cloud. From June to September 2018, I interned at Google New York, researching neural models for speech recognition.
In addition, I hold a B.A. in Humanities.
I am happily married to Lee and the father of Gur.
News
- January 2023 - Our DocPrompting paper was accepted to ICLR'2023 as a Spotlight!
- January 2023 - A new preprint: Why do Nearest Neighbor Language Models Work?
- December 2022 - A new demo for PAL!
- December 2022 - I was invited to the explAInable podcast (Hebrew)
- November 2022 - A new preprint: PaL: Program-aided Language Models
- October 2022 - Our paper Language Models of Code are Few-Shot Commonsense Learners was accepted to EMNLP'2022!
- September 2022 - We released code-bert-score, a new repository for evaluating code generation, along with pretrained models for several programming languages, based on CodeBERT.
- August 2022 - A new preprint: DocPrompting: Generating Code by Retrieving the Docs
- July 2022 - I released a new HuggingFace 🤗 transformers implementation of RetoMaton, kNN-language models, and kNN-machine translation: https://github.com/neulab/knn-transformers
- June 2022 - I was selected for the ACM SIGPLAN Reynolds Doctoral Dissertation Award (formerly "SIGPLAN Outstanding Doctoral Dissertation Award")!
- May 2022 - Our RetoMaton paper was accepted to ICML'2022!
- April 2022 - Our PolyCoder paper will appear in ICLR 2022's DL4Code and PLDI 2022's MAPS workshops.
- March 2022 - A new preprint: A Systematic Evaluation of Large Language Models of Code
- February 2022 - A new preprint: Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval
- January 2022 - Our paper How Attentive are Graph Attention Networks? was accepted to ICLR'2022!
Publications
Preprints
- Why do Nearest Neighbor Language Models Work?
- PaL: Program-aided Language Models
- Luyu Gao, Aman Madaan, Shuyan Zhou, Uri Alon, Pengfei Liu, Yiming Yang, Jamie Callan, Graham Neubig
- Online demo: https://huggingface.co/spaces/JavaFXpert/gpt-math-techniques
- [PDF] [Code] [Tweet] [BibTex]
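The core idea of program-aided language models, having the LM write a program and offloading the actual computation to an interpreter, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the "model output" below is hard-coded, whereas in PaL it would be generated by a large language model.

```python
# Minimal sketch of the program-aided idea: instead of asking the LM for a
# final answer, ask it to emit Python that computes the answer, then run it.
# The model output here is hard-coded for illustration only.
model_output = """
# Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each.
# How many tennis balls does he have now?
balls = 5
balls += 2 * 3
answer = balls
"""

namespace = {}
exec(model_output, namespace)   # offload the arithmetic to the interpreter
print(namespace["answer"])      # -> 11
```

The LM only has to produce correct reasoning steps as code; the arithmetic itself is done exactly by the interpreter rather than approximately by the model.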
Accepted Papers
- DocPrompting: Generating Code by Retrieving the Docs
- Language Models of Code are Few-Shot Commonsense Learners
- Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval (RetoMaton)
- Uri Alon, Frank F. Xu, Junxian He, Sudipta Sengupta, Dan Roth, Graham Neubig
- Appeared in ICML'2022
- [PDF] [Poster] [5-min Video] [1-hour Video] [Slides] [Tweet] [BibTex]
- [Code - fairseq implementation]
- [Code - HuggingFace 🤗 transformers implementation] [Trained models]
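RetoMaton builds on kNN language models, which interpolate the base LM's next-token distribution with a distribution induced by nearest neighbors of the current hidden state retrieved from a datastore. A minimal numpy sketch of that interpolation (toy values and shapes are assumptions; this is not the knn-transformers implementation):

```python
import numpy as np

def knn_lm_interpolate(p_lm, neighbor_tokens, distances, vocab_size,
                       lam=0.25, temp=1.0):
    """Blend the LM distribution with a nearest-neighbor distribution:
    p = lam * p_knn + (1 - lam) * p_lm."""
    # turn neighbor distances into a distribution over their target tokens
    weights = np.exp(-np.asarray(distances) / temp)
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for tok, w in zip(neighbor_tokens, weights):
        p_knn[tok] += w
    return lam * p_knn + (1 - lam) * p_lm

# toy example: a 3-token vocabulary, three retrieved neighbors
p_lm = np.array([0.7, 0.2, 0.1])
p = knn_lm_interpolate(p_lm, neighbor_tokens=[2, 2, 1],
                       distances=[0.1, 0.4, 0.9], vocab_size=3)
print(p)  # still a valid distribution; token 2 gains mass from its neighbors
```

RetoMaton's contribution is to avoid running this retrieval at every step by organizing the datastore as an automaton, but the interpolation above is the underlying mechanism.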
- How Attentive are Graph Attention Networks?
- Shaked Brody, Uri Alon, Eran Yahav
- Appeared in ICLR'2022
- [PDF] [Poster] [Code] [Video] [BibTex]
- GATv2 implementations:
- [PyTorch Geometric]: from torch_geometric.nn.conv.gatv2_conv import GATv2Conv
- [DGL]: from dgl.nn.pytorch import GATv2Conv
- [TensorFlow GNN]: from tensorflow_gnn.keras.layers import GATv2
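GATv2's "dynamic" attention applies the LeakyReLU before the attention vector, so a neighbor's score can depend on the query node (unlike the original GAT, where the neighbor ranking is the same for every query). A minimal single-head numpy sketch of the scoring; the weight shapes and toy graph are assumptions, and this is not any of the library implementations listed above:

```python
import numpy as np

rng = np.random.default_rng(0)

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def gatv2_attention(h, neighbors, W, a):
    """Single-head GATv2 attention weights.
    h: (n, d) node features; neighbors: dict node -> list of neighbor ids;
    W: (d_out, 2*d) shared weight; a: (d_out,) attention vector."""
    attn = {}
    for i, nbrs in neighbors.items():
        # score each edge (i, j): e_ij = a^T LeakyReLU(W [h_i || h_j])
        scores = np.array([a @ leaky_relu(W @ np.concatenate([h[i], h[j]]))
                           for j in nbrs])
        scores = scores - scores.max()                 # stable softmax
        weights = np.exp(scores) / np.exp(scores).sum()
        attn[i] = dict(zip(nbrs, weights))
    return attn

h = rng.normal(size=(4, 8))         # 4 nodes, 8-dim features (toy values)
W = rng.normal(size=(16, 16))
a = rng.normal(size=(16,))
attn = gatv2_attention(h, {0: [1, 2, 3]}, W, a)
print(attn[0])  # attention weights over node 0's neighbors, summing to 1
```

The only change from GAT is the order of operations in the score: GAT computes a^T [LeakyReLU applied after concatenating Wh_i and Wh_j], which collapses to a static neighbor ranking.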
- On the Bottleneck of Graph Neural Networks and its Practical Implications
- A Structural Model for Contextual Code Changes
- Adversarial Examples for Models of Code
- Neural Reverse Engineering of Stripped Binaries using Augmented Control Flow Graphs
- Structural Language Models of Code
- Contextual Speech Recognition with Difficult Negative Training Examples
- code2seq: Generating Sequences from Structured Representations of Code
- code2vec: Learning Distributed Representations of Code
- Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav
- Appeared in POPLâ2019
- ACM SIGPLAN Research Highlight
- Online demo: https://www.code2vec.org
- [PDF] [Slides (PDF)] [Slides (PPT)] [Video] [Blog] [Code] [BibTex]
- A General Path-Based Representation for Predicting Program Properties
PhD Thesis
- Machine Learning for Programming Language Processing
- Computer Science Department, Technion, 2021
- Awarded the Reynolds Doctoral Dissertation Award (formerly "SIGPLAN Outstanding Doctoral Dissertation Award")
- [PDF]
Technical Reports
- Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
- Jonathan Shen, …, Uri Alon, …
- [PDF]
Workshops
- A Systematic Evaluation of Large Language Models of Code (PolyCoder)
- Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn
- Appeared in MAPS'2022
- Appeared in Deep Learning for Code, ICLR'2022 workshop
- Press: [Forbes] [ZDNet] [VentureBeat] [MarkTechPost]
- [PDF] [Code] [BibTex]
- HuggingFace 🤗 model: NinedayWang/PolyCoder-2.7B
- Single-Node Attack for Fooling Graph Neural Networks
Awards
- 2022 – Reynolds Doctoral Dissertation Award (formerly "SIGPLAN Outstanding Doctoral Dissertation Award")
- 2021-2022 – Rothschild Post-Doctoral Fellowship
- 2021-2022 – Fulbright Post-Doctoral Fellowship (declined)
- 2020 – ACM SIGPLAN Research Highlight, "code2vec: Learning Distributed Representations of Code" (POPL'2019)
- 2019 – Jacobs Excellence Scholarship
- 2019 – Department Funding Excellence Scholarship
- 2018 – Department Funding Excellence Scholarship
- 2016 – Excellent Teaching Assistant
- 2016 – Dean's Excellent Scholarship
- 2016 – Alumnus of the Rothschild-Technion Program for Excellence
- 2015 – SAMBA – CS Excellent Students
Demos
Service
- Reviewer: ICLR'2023, NeurIPS'2022 (Outstanding Reviewer), TMLR, ICML'2022 (Outstanding Reviewer - top 10%), ICLR'2022 (Highlighted Reviewer), AIPLANS NeurIPS 2021 workshop, ICML'2021 (top 10% Best Reviewers), ICLR'2021, NeurIPS'2020, ICLR'2020
- Program Committee: MAPS'2022, Deep Learning for Code ICLR'22 workshop, PLDI'2021, NeurIPS'2020 CAP workshop, AIDM'20, AIDM'19
- Area Chair: Learning on Graphs 2022