Zijian Wang

I am currently a Tech Lead Manager at AWS AI Labs in the Amazon Q Developer team. We work on large language models (LLMs) for code. Please feel free to email me about opportunities in our team.

Previously, I was at Stanford and I was part of the Stanford NLP Group, advised by Prof. Chris Potts. Before that, I was at the University of Michigan, working with Prof. David Jurgens and Prof. Kevyn Collins-Thompson. Earlier, I studied at Shanghai Jiao Tong University.

I actively contribute to the community. I co-organize the LLM4Code workshop at ICSE'25 (submit here!) and previously co-organized the Deep Learning for Code (DL4C) workshop at ICLR'23. I serve as an Area Chair for ARR and regularly review for major ML conferences.

zijwang@cs.stanford.edu | Google Scholar | LinkedIn

Publications

*=equal contribution; =author is an intern

Learning Code Preference via Synthetic Evolution
Jiawei Liu, Thanh Nguyen, Mingyue Shang, Hantian Ding, Xiaopeng Li, Yu Yu, Varun Kumar, and Zijian Wang
arXiv, 2024
paper / summary
Horizon-Length Prediction: Advancing Fill-in-the-Middle Capabilities for Code Generation with Lookahead Planning
Yifeng Ding, Hantian Ding, Shiqi Wang, Qing Sun, Varun Kumar, and Zijian Wang
arXiv, 2024
paper / summary
Fewer Truncations Improve Language Modeling
Hantian Ding, Zijian Wang, Giovanni Paolini, Varun Kumar, Anoop Deoras, Dan Roth, and Stefano Soatto
ICML, 2024
paper / summary / Blogpost / 机器之心 (Blogpost in Chinese)
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
The BigCode Team
arXiv, 2024
paper / webpage
CodeFort: Robust Training for Code Generation Models
Yuhao Zhang, Shiqi Wang, Haifeng Qian, Zijian Wang, Mingyue Shang, Linbo Liu, Sanjay Krishna Gouda, Baishakhi Ray, Murali Krishna Ramanathan, Xiaofei Ma, and Anoop Deoras
Findings of EMNLP, 2024
paper
Token Alignment via Character Matching for Subword Completion
Ben Athiwaratkun*, Shiqi Wang*, Mingyue Shang*, Yuchen Tian, Zijian Wang, Sujan Kumar Gonugondla, Sanjay Krishna Gouda, Rob Kwiatkowski, Ramesh Nallapati, and Bing Xiang
Findings of ACL, 2024
paper
StarCoder 2 and The Stack v2: The Next Generation
The BigCode Team
arXiv, 2024
paper / blogpost
CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context
Yangruibo Ding*, Zijian Wang*, Wasi Ahmad*, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, and Bing Xiang
LREC-COLING, 2024
paper
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion
Yangruibo Ding*, Zijian Wang*, Wasi Ahmad*, Hantian Ding, Ming Tan, Nihal Jain, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, and Bing Xiang
NeurIPS (Datasets and Benchmarks Track), 2023
paper / webpage / data / code
ReCode: Robustness Evaluation of Code Generation Models
Shiqi Wang*, Zheng Li*, Haifeng Qian, Chenghao Yang, Zijian Wang, Mingyue Shang, Varun Kumar, Samson Tan, Baishakhi Ray, Parminder Bhatia, Ramesh Nallapati, Murali Krishna Ramanathan, Dan Roth, and Bing Xiang
ACL, 2023
paper / code + data
ContraCLM: Effective Contrastive Learning For Causal Language Model
Nihal Jain*, Dejiao Zhang*, Wasi Ahmad*, Zijian Wang, Feng Nan, Xiaopeng Li, Ming Tan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Xiaofei Ma, and Bing Xiang
ACL, 2023
paper
A Static Evaluation of Code Completion by Large Language Models
Hantian Ding, Varun Kumar, Yuchen Tian, Zijian Wang, Rob Kwiatkowski, Xiaopeng Li, Murali Krishna Ramanathan, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, and Bing Xiang
ACL (Industry), 2023
paper
Towards Greener Yet Powerful Code Generation via Quantization: An Empirical Study
Xiaokai Wei, Sujan Gonugondla, Wasi Ahmad, Shiqi Wang, Baishakhi Ray, Haifeng Qian, Xiaopeng Li, Varun Kumar, Zijian Wang, Yuchen Tian, Qing Sun, Ben Athiwaratkun, Mingyue Shang, Murali Krishna Ramanathan, Parminder Bhatia, and Bing Xiang
ESEC/FSE, 2023
paper
Multi-lingual Evaluation of Code Generation Models
Ben Athiwaratkun, Sanjay Krishna Gouda, Zijian Wang, ...19 other authors..., Sudipta Sengupta, Dan Roth, and Bing Xiang
ICLR, 2023
paper / code + data
Debiasing Neural Retrieval via In-batch Balancing Regularization
Yuantong Li, Xiaokai Wei*, Zijian Wang*, Shen Wang*, Xiaofei Ma, Parminder Bhatia, and Andrew Arnold
4th Workshop on Gender Bias in Natural Language Processing at NAACL, 2022
paper
Towards Differential Relational Privacy and its use in Question Answering
Simone Bombari, Alessandro Achille, Zijian Wang, Yu-Xiang Wang, Yusheng Xie, Kunwar Yashraj Singh, Srikar Appalaraju, Vijay Mahadevan, and Stefano Soatto
arXiv, 2022
paper
DQ-BART: Efficient Sequence-to-Sequence Model via Joint Distillation and Quantization
Zheng Li*, Zijian Wang*, Ming Tan, Ramesh Nallapati, Parminder Bhatia, Andrew Arnold, Bing Xiang, and Dan Roth
ACL, 2022
paper / code
NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
The NL-Augmenter Team
arXiv, 2021
paper / code + data
Modeling Subjective Assessments of Guilt in Newspaper Crime Narratives
Elisa Kreiss*, Zijian Wang*, and Christopher Potts
CoNLL, 2020
paper / code + data / video
TalkDown: A Corpus for Condescension Detection in Context
Zijian Wang and Christopher Potts
EMNLP-IJCNLP, 2019
paper / code + data
Answering Complex Open-Domain Questions Through Iterative Query Generation
Peng Qi, Xianwen Lin*, Leo Mehr*, Zijian Wang*, and Christopher D. Manning
EMNLP-IJCNLP, 2019
paper / code / blog post
Demographic Inference and Representative Population Estimates from Social Media Data (Best Poster Award)
Zijian Wang, Scott A. Hale, David Adelani, Przemyslaw A. Grabowicz, Timo Hartmann, Fabian Flöck, and David Jurgens
TheWebConf (WWW), 2019 (also presented at IC2S2 2019)
paper / demo / code / pip-installable package / poster
It's going to be okay: Measuring Access to Support in Online Communities
Zijian Wang and David Jurgens
EMNLP, 2018
paper / project webpage / pip-installable package
Social work in the classroom? A tool to evaluate topical relevance in student writing
Heeryung Choi, Zijian Wang, Christopher Brooks, Kevyn Collins-Thompson, Beth Glover Reed, and Dale Fitch
EDM, 2017
paper
Academic Services

Organizer/Program Committee/Reviewer:

  • Co-organizer of the second Deep Learning for Code (DL4C) workshop at ICLR'23
  • Co-organizer of the second LLM4Code workshop at ICSE'25
  • Area Chair of ARR
  • Outstanding Reviewer at ACL'21
  • Current or past reviewer of ICML, NeurIPS, ICLR, COLM, *ACL/ARR, ICWSM, WebSci, AAAI, IJCAI, and many workshops

Teaching Assistant:

Other:

  • Volunteer: EMNLP'19
  • Webmaster: Stanford NLP Group (2019-2020)
  • Transfer Student Leader: University of Michigan (2017-2018)


Homepage credits: Jon Barron