I am a postdoctoral researcher at the Language Technologies Institute (LTI) of Carnegie Mellon University, where I work with Prof. Graham Neubig as a member of NeuLab. I am a Rothschild Fellow.
I obtained my PhD from the Department of Computer Science at the Technion, where I was fortunate to be advised by Prof. Eran Yahav. My PhD dissertation was awarded the ACM SIGPLAN Reynolds Doctoral Dissertation Award (formerly the “SIGPLAN Outstanding Doctoral Dissertation Award”).
Previously, I served for 7 years as an officer aboard a missile ship in the Israeli Navy. I then completed my BSc summa cum laude at the Computer Science Department at the Technion, as an alumnus of the Rothschild-Technion Scholars Program for Excellence. From 2014 to 2016, I worked at the Microsoft R&D center in Haifa, developing data security services for the cloud. From June to September 2018, I interned at Google New York, researching neural models for speech recognition.
In addition, I hold a B.A. in Humanities.
I am happily married to Lee and the father of Gur 🙂
News
- March 2023 - Learning Performance-Improving Code Edits and CodeBERTScore (Spotlight!) will appear in the Deep Learning for Code ICLR’2023 workshop
- February 2023 - A new preprint: Learning Performance-Improving Code Edits
- February 2023 - A new preprint: CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code
- January 2023 - Our DocPrompting paper was accepted to ICLR’2023 as a Spotlight!
- January 2023 - A new preprint: Why do Nearest Neighbor Language Models Work?
- December 2022 - A new demo for PAL!
- December 2022 - I was invited to the explAInable podcast (Hebrew)
- November 2022 - A new preprint: PaL: Program-aided Language Models
- October 2022 - Our paper Language Models of Code are Few-Shot Commonsense Learners was accepted to EMNLP’2022!
- September 2022 - We released a new repository for evaluation of code generation: code-bert-score, along with pretrained models of several programming languages, based on CodeBERT.
- August 2022 - A new preprint: DocPrompting: Generating Code by Retrieving the Docs
- July 2022 - I released a new HuggingFace 🤗 `transformers` implementation of RetoMaton, kNN-language models, and kNN-machine translation: https://github.com/neulab/knn-transformers
- June 2022 - I was selected for the ACM SIGPLAN Reynolds Doctoral Dissertation Award (formerly “SIGPLAN Outstanding Doctoral Dissertation Award”)!
- May 2022 - Our RetoMaton paper was accepted to ICML’2022!
- April 2022 - Our PolyCoder paper will appear in ICLR 2022’s DL4Code and PLDI 2022’s MAPS workshops.
- March 2022 - A new preprint: A Systematic Evaluation of Large Language Models of Code
- February 2022 - A new preprint: Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval
- January 2022 - Our paper How Attentive are Graph Attention Networks? was accepted to ICLR’2022!
Publications
Preprints
- Why do Nearest Neighbor Language Models Work?
- PaL: Program-aided Language Models
Accepted Papers
- DocPrompting: Generating Code by Retrieving the Docs
- Shuyan Zhou, Uri Alon, Frank F. Xu, Zhengbao Jiang, Graham Neubig
- To appear in ICLR’2023 (Spotlight)
- Press: [MarkTechPost] [Medium] [Prophet-Con] [Synched]
- [PDF] [Code] [BibTex]
- Language Models of Code are Few-Shot Commonsense Learners
- Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval (RetoMaton)
- Uri Alon, Frank F. Xu, Junxian He, Sudipta Sengupta, Dan Roth, Graham Neubig
- Appeared in ICML’2022
- [PDF] [Poster] [5-min Video] [1-hour Video] [Slides] [Tweet] [BibTex]
- [Code - `fairseq` implementation] [Code - HuggingFace 🤗 `transformers` implementation] [Trained models]
- How Attentive are Graph Attention Networks?
- Shaked Brody, Uri Alon, Eran Yahav
- Appeared in ICLR’2022
- [PDF] [Poster] [Code] [Video] [BibTex]
- GATv2 implementations:
- [PyTorch Geometric]: `from torch_geometric.nn.conv.gatv2_conv import GATv2Conv`
- [DGL]: `from dgl.nn.pytorch import GATv2Conv`
- [TensorFlow GNN]: `from tensorflow_gnn.keras.layers import GATv2`
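For example, here is a minimal usage sketch with the PyTorch Geometric implementation (the toy graph, feature sizes, and head count below are illustrative only):

```python
# Minimal sketch: running GATv2 on a toy graph with PyTorch Geometric.
import torch
from torch_geometric.nn import GATv2Conv

x = torch.randn(4, 8)                      # 4 nodes, 8 features each
edge_index = torch.tensor([[0, 1, 2, 3],   # source nodes
                           [1, 0, 3, 2]])  # target nodes

conv = GATv2Conv(in_channels=8, out_channels=16, heads=2)
out = conv(x, edge_index)
print(out.shape)  # torch.Size([4, 32]); the two heads are concatenated by default
```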
- On the Bottleneck of Graph Neural Networks and its Practical Implications
- A Structural Model for Contextual Code Changes
- Adversarial Examples for Models of Code
- Neural Reverse Engineering of Stripped Binaries using Augmented Control Flow Graphs
- Structural Language Models of Code
- Contextual Speech Recognition with Difficult Negative Training Examples
- code2seq: Generating Sequences from Structured Representations of Code
- code2vec: Learning Distributed Representations of Code
- Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav
- Appeared in POPL’2019
- ACM SIGPLAN Research Highlight
- Online demo: https://www.code2vec.org
- [PDF] [Slides (PDF)] [Slides (PPT)] [Video] [Blog] [Code] [BibTex]
- A General Path-Based Representation for Predicting Program Properties
Workshops
- Learning Performance-Improving Code Edits
- Aman Madaan, Alexander Shypula, Uri Alon, Milad Hashemi, Parthasarathy Ranganathan, Yiming Yang, Graham Neubig, Amir Yazdanbakhsh
- To appear in Deep Learning for Code, ICLR’2023 workshop
- [PDF] [Code] [Website] [BibTex]
- CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code
- Shuyan Zhou, Uri Alon, Sumit Agarwal, Graham Neubig
- To appear in Deep Learning for Code, ICLR’2023 workshop (Spotlight)
- [PDF] [Code] [Huggingface Models] [BibTex]
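A hypothetical usage sketch of the released code-bert-score package, assuming it mirrors the `bert_score` scoring interface (the function signature and return values here are assumptions, not confirmed API):

```python
# Hypothetical sketch for code-bert-score; the score() signature and its
# return values are assumptions modeled on the bert_score package.
import code_bert_score

predictions = ["def add(a, b):\n    return a + b"]
references = ["def sum_two(x, y):\n    return x + y"]

precision, recall, f1, f3 = code_bert_score.score(
    cands=predictions, refs=references, lang="python"
)
print(f1)
```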
- A Systematic Evaluation of Large Language Models of Code (PolyCoder)
- Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn
- Appeared in MAPS’2022
- Appeared in Deep Learning for Code, ICLR’2022 workshop
- Press: [Forbes] [ZDNet] [VentureBeat] [MarkTechPost]
- [PDF] [Code] [BibTex]
- HuggingFace 🤗 model: NinedayWang/PolyCoder-2.7B
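As a minimal sketch, the released checkpoint can be loaded through the standard `transformers` causal-LM interface (the prompt and generation settings below are illustrative only):

```python
# Minimal sketch: loading PolyCoder-2.7B via HuggingFace transformers.
# The prompt and generation settings are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("NinedayWang/PolyCoder-2.7B")
model = AutoModelForCausalLM.from_pretrained("NinedayWang/PolyCoder-2.7B")

inputs = tokenizer("def binary_search(arr, target):", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0]))
```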
- Single-Node Attack for Fooling Graph Neural Networks
PhD Thesis
- Machine Learning for Programming Language Processing
- Computer Science Department, Technion, 2021
- Awarded the Reynolds Doctoral Dissertation Award (formerly “SIGPLAN Outstanding Doctoral Dissertation Award”)
- [PDF]
Technical Reports
- Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
- Jonathan Shen, …, Uri Alon, …
- [PDF]
Awards
- 2022 - Reynolds Doctoral Dissertation Award (formerly “SIGPLAN Outstanding Doctoral Dissertation Award”)
- 2021-2022 – Rothschild Post-Doctoral Fellowship
- 2021-2022 – Fulbright Post-Doctoral Fellowship (declined)
- 2020 – ACM SIGPLAN Research Highlight, “code2vec: Learning Distributed Representations of Code” (POPL’2019)
- 2019 – Jacobs Excellence Scholarship
- 2019 – Department Funding Excellence Scholarship
- 2018 – Department Funding Excellence Scholarship
- 2016 – Excellent Teaching Assistant
- 2016 – Dean’s Excellent Scholarship
- 2016 – Alumnus of the Rothschild-Technion Program for Excellence
- 2015 – SAMBA – CS Excellent Students
Service
- Reviewer: ICLR’2023, NeurIPS’2022 (Outstanding Reviewer), TMLR, ICML’2022 (Outstanding Reviewer - top 10%), ICLR’2022 (Highlighted Reviewer), AIPLANS NeurIPS 2021 workshop, ICML’2021 (top 10% Best Reviewers), ICLR’2021, NeurIPS’2020, ICLR’2020
- Program Committee: MAPS’2022, Deep Learning for Code ICLR’22 workshop, PLDI’2021, NeurIPS’2020 CAP workshop, AIDM’20, AIDM’19
- Area Chair: Learning on Graphs’2022