About
I am a tenured full professor in the School of Computer Science at Peking University. I obtained my Ph.D. from Peking University in 2006. I was a visiting associate professor at the Artificial Intelligence Laboratory of Stanford University in 2013-2014. My current research mainly concerns applications of probabilistic methods for machine learning, including Program Language Processing, Natural Language Processing, and Software Engineering.
Preprints
- [arXiv 2025] Yihong Dong, Ge Li, Xue Jiang, Yongding Tao, Kechi Zhang, Hao Zhu, Huanyu Liu, Jiazheng Ding, Jia Li, Jinliang Deng, and Hong Mei; FANformer: Improving Large Language Models Through Effective Periodicity Modeling, arXiv preprint, arXiv: 2502.21309, 2025.
- [arXiv 2025] Kechi Zhang, Ge Li, Jia Li, Yihong Dong, Jia Li, Zhi Jin; Focused-DPO: Enhancing Code Generation Through Focused Preference Optimization on Error-Prone Points, arXiv preprint, arXiv: 2502.11475, 2025. (Cited by 2)
- [arXiv 2024] Lecheng Wang, Xianjie Shi, Ge Li, Jia Li, Yihong Dong, Xuanming Zhang, Wenpin Jiao, Hong Mei; Why Language Models Collapse When Trained on Recursively Generated Text; arXiv preprint, arXiv: 2412.14872, 2024.
- [arXiv 2024] Yihong Dong, Ge Li, Yongding Tao, Xue Jiang, Kechi Zhang, Jia Li, Jing Su, Jun Zhang, Jingjing Xu; FAN: Fourier Analysis Networks; arXiv preprint, arXiv: 2410.02675, 2024. (Cited by 4)
- [arXiv 2024] Jia Li, Ge Li, Lecheng Wang, Hao Zhu, Zhi Jin; Generating Equivalent Representations of Code By A Self-Reflection Approach; arXiv preprint, arXiv: 2410.03351, 2024.
- [arXiv 2024] Kechi Zhang, Ge Li, Yihong Dong, Jingjing Xu, Jun Zhang, Jing Su, Yongfei Liu, Zhi Jin; CodeDPO: Aligning Code Models with Self Generated and Verified Source Code; arXiv preprint, arXiv: 2410.05605, 2024. (Cited by 9)
- [arXiv 2024] Kaibo Liu, Yiyang Liu, Zhenpeng Chen, Jie M Zhang, Yudong Han, Yun Ma, Ge Li, Gang Huang; LLM-Powered Test Case Generation for Detecting Tricky Bugs; arXiv preprint, arXiv: 2404.10304, 2024. (Cited by 20)
- [arXiv 2024] Xue Jiang, Yihong Dong, Zhi Jin, Ge Li; SEED: Customize Large Language Models with Sample-Efficient Adaptation for Code Generation; arXiv preprint, arXiv: 2403.00046, 2024. (Cited by 5)
- [arXiv 2023] Kechi Zhang, Ge Li, Jia Li, Zhuo Li, Zhi Jin; ToolCoder: Teach Code Generation Models to Use API Search Tool; arXiv preprint, arXiv: 2305.04032, 2023. (Cited by 50)
- [arXiv 2023] Zejun Wang, Jia Li, Ge Li, Zhi Jin; ChatCoder: Chat-based Refine Requirement Improves LLMs' Code Generation; arXiv preprint, arXiv: 2311.00272, 2023. (Cited by 850)
- [arXiv 2022] Kechi Zhang, Ge Li, Zhi Jin; What Does Transformer Learn About Source Code?; arXiv preprint, arXiv: 2207.08466, 2022. (Cited by 11)
Selected Publications
- [ICSE, 2025] Siyuan Jiang, Jia Li, He Zong, Huanyu Liu, Hao Zhu, Shukai Hu, Erlu Li, Jiazheng Ding, Yu Han, Wei Ning, Ge Li; aiXcoder-7B: A Lightweight and Effective Large Language Model for Code Completion; Proceedings of the 47th IEEE/ACM International Conference on Software Engineering (ICSE), Ottawa, Ontario, Canada, Apr. 27 - May 3, 2025 (Accepted). (Cited by 5)
- [ICSE, 2025] Xue Jiang, Yihong Dong, Yongding Tao, Huanyu Liu, Zhi Jin, Ge Li; ROCODE: Integrating Backtracking Mechanism and Program Analysis in Large Language Models for Code Generation; Proceedings of the 47th IEEE/ACM International Conference on Software Engineering (ICSE), Ottawa, Ontario, Canada, Apr. 27 - May 3, 2025 (Accepted). (Cited by 2)
- [TOSEM, 2024] Jia Li, Ge Li, Yongmin Li, Zhi Jin; Structured Chain-of-Thought Prompting for Code Generation; ACM Transactions on Software Engineering and Methodology (TOSEM), 2024. (Accepted) (Cited by 104)
- [TOSEM, 2024] Yihong Dong, Jiazheng Ding, Xue Jiang, Ge Li, Zhuo Li, Zhi Jin; Evaluating Code Generation by Learning Code Execution; ACM Transactions on Software Engineering and Methodology (TOSEM), Vol. 33, No. 7, Article 182, Sep. 2024. [PDF] (Cited by 52)
- [ASE 2024] Zejun Wang, Kaibo Liu, Ge Li, Zhi Jin; SlicePromptTest4J: High-coverage Test Generation using LLM via Method Slicing; Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering (ASE 2024), Sacramento, California, United States, 27 October - 1 November 2024, pp.1258 - 1268. [PDF] (Cited by 15)
- [NeurIPS 2024] Jia Li, Ge Li, Xuanming Zhang, Yunfei Zhao, Yihong Dong, Zhi Jin, Binhua Li, Fei Huang, Yongbin Li; EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific Evaluations; Proceedings of the 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024); Vancouver, Canada, Dec. 10-15, 2024. [PDF] (Cited by 6)
- [TOSEM, 2024] Jia Li, Yunfei Zhao, Yongmin Li, Ge Li, Zhi Jin; AceCoder: An Effective Prompting Technique Specialized in Code Generation; ACM Transactions on Software Engineering and Methodology (TOSEM), Vol. 33, No. 8, Article 204, Nov. 2024. [PDF] (Cited by 20)
- [TOSEM, 2024] Yihong Dong, Xue Jiang, Zhi Jin, Ge Li; Self-collaboration Code Generation via ChatGPT; ACM Transactions on Software Engineering and Methodology (TOSEM), Vol. 33, No. 7, Article 189, Sep. 2024. [PDF] (Cited by 243)
- [TOSEM, 2024] Xue Jiang, Yihong Dong, Lecheng Wang, Zheng Fang, Qiwei Shang, Ge Li, Zhi Jin, Wenpin Jiao; Self-Planning Code Generation with Large Language Model; ACM Transactions on Software Engineering and Methodology (TOSEM), Vol. 33, No. 7, Article 182, Sep. 2024. [PDF] (Cited by 155)
- [JSEP, 2024] Huangzhao Zhang, Zhuo Li, Zhi Jin, Ge Li; WELL: Applying Bug Detectors to Bug Localization via Weakly Supervised Learning; Journal of Software: Evolution and Process, Vol. 36, No. 9, Sep 01, 2024. doi: 10.1002/smr.2669. pp 1-23. [PDF] (Cited by 3)
- [ACL 2024] Kechi Zhang, Ge Li, Huangzhao Zhang, Zhi Jin; HiRoPE: Length Extrapolation for Code Models Using Hierarchical Position; Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024), Bangkok, Thailand, Aug. 11-16, 2024. [PDF] (Cited by 7)
- [ACL 2024] Kechi Zhang, Jia Li, Ge Li, Xianjie Shi, Zhi Jin; CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges; Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024), Bangkok, Thailand, Aug. 11-16, 2024. [PDF] (Cited by 69)
- [ACL 2024] Yihong Dong, Xue Jiang, Huanyu Liu, Zhi Jin, Ge Li; Generalization or Memorization: Data Contamination and Trustworthy Evaluation for Large Language Models; Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024), Bangkok, Thailand, Aug. 11-16, 2024. [PDF] (Cited by 61)
- [ACL 2024] Jia Li, Ge Li, Yunfei Zhao, Yongmin Li, Zhi Jin, Hao Zhu, Huanyu Liu, Kaibo Liu, Lecheng Wang, Zheng Fang, Lanshen Wang, Jiazheng Ding, Xuanming Zhang, Yihong Dong, Yuqi Zhu; DevEval: A Manually-Annotated Code Generation Benchmark Aligned with Real-World Code Repositories; Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024), Bangkok, Thailand, Aug. 11-16, 2024. [PDF] (Cited by 16)
- [ACL 2024] Yihong Dong, Kangcheng Luo, Xue Jiang, Zhi Jin, Ge Li; PACE: Improving Prompt with Actor-Critic Editing for Large Language Model; Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024), Bangkok, Thailand, Aug. 11-16, 2024. [PDF] (Cited by 13)
- [FSE 2024] Bolun Li, Zhihong Sun, Tao Huang, Hongyu Zhang, Yao Wan, Ge Li, Zhi Jin, Chen Lyu; IRCoCo: Immediate Rewards-Guided Deep Reinforcement Learning for Code Completion; Proceedings of the 2024 ACM International Conference on the Foundations of Software Engineering (FSE), Porto de Galinhas, Brazil, July 15-19, 2024. [PDF] (Cited by 7)
- [FSE 2024] Zhen Yang, Fang Liu, Zhongxing Yu, Jacy Wai Keung, Jia Li, Shuo Liu, Yifan Hong, Xiaoxue Ma, Zhi Jin, Ge Li; Exploring and Unleashing the Power of Large Language Models in Automated Code Translation; Proceedings of the 2024 ACM International Conference on the Foundations of Software Engineering (FSE), Porto de Galinhas, Brazil, July 15-19, 2024. [PDF] (Cited by 49)
- [TSE, 2024] Xin-Cheng Wen, Cuiyun Gao, Feng Luo, Haoyu Wang, Ge Li, and Qing Liao; LIVABLE: Exploring Long-Tailed Classification of Software Vulnerability Types; IEEE Transactions on Software Engineering (TSE), Vol. 50, Iss. 6, Jun. 2024, pp. 1325-1339. [PDF] (Cited by 11)
- [LREC-COLING 2024] Zhihong Sun, Chen Lyu, Yao Wan, Hongyu Zhang, Ge Li, Zhi Jin; Enhancing Code Generation Performance of Smaller Models by Distilling the Reasoning Ability of LLMs; Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), Torino, Italia, May 20-25, 2024. [PDF] (Cited by 12)
- [ICPC 2024] Tao Huang, Zhihong Sun, Zhi Jin, Ge Li, Chen Lyu; Knowledge-Aware Code Generation with Large Language Models; Proceedings of the 32nd ACM/IEEE International Conference on Program Comprehension (ICPC), Lisbon, Portugal, April 15-16, 2024. [PDF] (Cited by 10)
- [ICSE 2024] Tao Huang, Zhihong Sun, Zhi Jin, Ge Li, Chen Lyu; KareCoder: A New Knowledge-Enriched Code Generation System; Proceedings of the 46th ACM/IEEE International Conference on Software Engineering (ICSE), Lisbon, Portugal, April 14-20, 2024. (Short) [PDF] (Cited by 3)
- [JSEP, 2024] Huangzhao Zhang, Shuai Lu, Zhuo Li, Zhi Jin, Lei Ma, Yang Liu, Ge Li; Codebert-Attack: Adversarial Attack against Source Code Deep Learning Models via Pre-trained Model; Journal of Software: Evolution and Process, Vol. 36, Iss. 3, Mar., 2024 [PDF] (Cited by 11)
- [SCIS, 2024] Huangzhao Zhang, Kechi Zhang, Zhuo Li, Jia Li, Jia Li, Yongmin Li, Yunfei Zhao, Yuqi Zhu, Fang Liu, Ge Li, Zhi Jin; Deep Learning for Code Generation: A Survey; Science China Information Sciences (SCIS), doi: 10.1007/s11432-023-3956-3, Feb 6, 2024. [PDF] (Cited by 5)
- [TOSEM, 2024] Jia Li, Zhuo Li, Huangzhao Zhang, Ge Li, Zhi Jin, Xing Hu, Xin Xia; Poison Attack and Poison Detection on Deep Source Code Processing Models; ACM Transactions on Software Engineering and Methodology (TOSEM), Vol. 33, No. 62, Mar. 14, 2024, pp 1-31. [PDF] (Cited by 23)
- [AAAI 2024] Yuqi Zhu, Jia Allen Li, Ge Li, YunFei Zhao, Jia Li, Zhi Jin, Hong Mei; Hot or Cold? Adaptive Temperature Sampling for Code Generation with Large Language Models; Proceedings of the 38th Annual AAAI Conference on Artificial Intelligence (AAAI), Vancouver, Canada, Feb 20-27, 2024. [PDF] (Cited by 32)
- [ASEJ, 2023] Zejun Wang, Fang Liu, Yiyang Hao, Zhi Jin; AdaComplete: improve DL-based code completion method’s domain adaptability; Automated Software Engineering (ASEJ), Vol. 30, No. 1, Mar 06, 2023, pp 28-39. [PDF] (Cited by 42)
- [Internetware 2023] Jia Li, Fang Liu, Jia Allen Li, Yunfei Zhao, Ge Li, and Zhi Jin; Mcodesearcher: Multi-view Contrastive Learning for Code Search; Proceedings of the 14th Asia-Pacific Symposium on Internetware (Internetware), Hangzhou, China, August 4-6, 2023. [PDF] (Cited by 3)
- [Internetware 2023] Yunfei Zhao, Yihong Dong, Ge Li; Seq2Seq or Seq2Tree: Generating Code Using Both Paradigms via Mutual Learning; Proceedings of the 14th Asia-Pacific Symposium on Internetware (Internetware), Hangzhou, China, August 4-6, 2023, pp 238 - 248. [PDF] (Cited by 3)
- [ECAI 2023] Yihong Dong, Ge Li, Xue Jiang, Zhi Jin; Antecedent Predictions Are More Important Than You Think: An Effective Method for Tree-Based Code Generation; Proceedings of the 26th European Conference on Artificial Intelligence (ECAI), Kraków, Poland, Sept. 30 - Oct. 4, 2023. pp 565-574. [PDF] (Cited by 1)
- [ASE 2023] Jia Li, Chongyang Tao, Zhi Jin, Fang Liu, Jia Allen Li, Ge Li; ZC3: Zero-Shot Cross-Language Code Clone Detection; Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), Kirchberg, Luxembourg, September 11-15, 2023. [PDF] (Cited by 7)
- [ACL 2023] Kechi Zhang, Zhuo Li, Jia Allen Li, Ge Li, Zhi Jin; Self-Edit: Fault-Aware Code Editor for Code Generation; Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), Toronto, Canada, July 9-14, 2023. [PDF] (Cited by 116)
- [TOSEM, 2023] Jia Allen Li, Ge Li, Zhuo Li, Zhi Jin, Xing Hu, Kechi Zhang, Zhiyi Fu; CodeEditor: Learning to Edit Source Code with Pre-trained Models; ACM Transactions on Software Engineering and Methodology (TOSEM), Vol. 32, No. 6, May 22, 2023, pp 143-165. [PDF] (Cited by 35)
- [ISSTA 2023] Yihong Dong, Ge Li, Jiazheng Ding, Zhi Jin; CODEP: Grammatical Seq2Seq Model for General-Purpose Code Generation; Proceedings of the ACM Sigsoft International Symposium on Software Testing and Analysis (ISSTA'23), Seattle, Washington, United States, July 17-21, 2023. [PDF] (Cited by 22)
- [ICSE 2023] Jia Allen Li, Yongmin Li, Ge Li, Zhi Jin, Xing Hu; SkCoder: A Sketch-based Approach for Automatic Code Generation; Proceedings of the 45th International Conference on Software Engineering (ICSE), Melbourne, Australia, May 14-20, 2023. [PDF] (Cited by 68)
- [JSEP, 2023] Huangzhao Zhang, Shuai Lu, Zhi Jin, Lei Ma, Zhuo Li, Yang Liu, Ge Li; CodeBERT-Attack: Adversarial Attack against Source Code Deep Learning Models via Pre-Trained Model; Journal of Software: Evolution and Process, Vol. 36, No. 3, Apr. 11, 2023. pp 1-29. [PDF] (Cited by 11)
- [ICPC 2023] Kechi Zhang, Zhuo Li, Zhi Jin, Ge Li; Implant Global and Local Hierarchy Information to Sequence based Code Representation Models; Proceedings of the 31st IEEE/ACM International Conference on Program Comprehension (ICPC), Melbourne, Australia, May 15-16, 2023. (ACM SIGSOFT Distinguished Paper Award) [PDF] (Cited by 9)
- [SANER 2023] Wenhan Wang, Kechi Zhang, Ge Li, Shangqing Liu, Anran Li, Zhi Jin, Yang Liu; Learning Program Representations with a Tree-Structured Transformer; Proceedings of the 30th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Macao SAR, China, March 21st-24th, 2023. [PDF] (Cited by 16)
- [EMNLP 2022] Han Peng, Ge Li, Yunfei Zhao and Zhi Jin; Rethinking Positional Encoding in Tree Transformer for Code Representation; Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022), Abu Dhabi, December 7–11, 2022, pp 3204 - 3214. [PDF] (Cited by 19)
- [NeurIPS 2022] Haojie Zhang, Ge Li, Jia Allen Li, Zhongjin Zhang, Yuqi Zhu, Zhi Jin; Fine-Tuning Pre-Trained Language Models Effectively by Optimizing Subnetworks Adaptively; Proceedings of the 36th Conference on Neural Information Processing Systems (NeurIPS), Online, Nov. 29 - Dec. 1, 2022. [PDF] (Cited by 34)
- [FSE 2022] Sijie Shen, Xiang Zhu, Yihong Dong, Qizhi Guo, Yankun Zhen, Ge Li; Incorporating Domain Knowledge through Task Augmentation for Front-End JavaScript Code Generation; Proceedings of The ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), Singapore, 14th - 16th November 2022. [PDF] (Cited by 29)
- [FSE 2022] Lin Shi, Fangwen Mu, Xiao Chen, Song Wang, Junjie Wang, Ye Yang, Ge Li, Xin Xia, Qing Wang; Are We Building on the Rock? On the Importance of Data Preprocessing for Code Summarization; Proceedings of The ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), Singapore, 14th - 16th November 2022. [PDF] (Cited by 43)
- [CIKM 2022] Jia Li, Yuyuan Zhao, Zhi Jin, Ge Li, Tao Shen, Zhengwei Tao, Chongyang Tao; SK2: Integrating Implicit Sentiment Knowledge and Explicit Syntax Knowledge for Aspect-Based Sentiment Analysis; Proceedings of 31st ACM International Conference on Information and Knowledge Management, Atlanta, Georgia, USA, Oct. 17-21, 2022. [PDF] (Cited by 17)
- [TOSEM, 2022] Hao Yu, Xing Hu, Ge Li, Ying Li, Qianxiang Wang, Tao Xie; Assessing and Improving an Evaluation Dataset for Detecting Semantic Code Clones via Deep Learning; ACM Transactions on Software Engineering and Methodology (TOSEM), Vol. 31, No. 4, Article 62, July, 2022, pp 1–25. [PDF] (Cited by 6)
- [ICSE 2022] Fang Liu, Ge Li, Zhiyi Fu, Shuai Lu, Yiyang Hao, Zhi Jin; Learning to Recommend Method Names with Global Context; Proceedings of the 44th International Conference on Software Engineering (ICSE 2022), Pittsburgh, PA, USA, May 21-29, 2022. [PDF] (Cited by 30)
- [ICSE 2022] Hao Yu, Yiling Lou, Ke Sun, Dezhi Ran, Tao Xie, Dan Hao, Ying Li, Ge Li, Qianxiang Wang; Automated Assertion Generation via Information Retrieval and Its Integration with Deep Learning; Proceedings of the 44th International Conference on Software Engineering (ICSE 2022), Pittsburgh, PA, USA, May 21-29, 2022. [PDF] (Cited by 47)
- [ICPC 2022] Kechi Zhang, Wenhan Wang, Huangzhao Zhang, Ge Li, Zhi Jin; Learning to Represent Programs with Heterogeneous Graphs; Proceedings of the 30th ACM/IEEE International Conference on Program Comprehension (ICPC), Pittsburgh, PA, USA, May 16-17, 2022. [PDF] (Cited by 71)
- [EMSE, 2022] Fang Liu, Ge Li, Bolin Wei, Xin Xia, Zhiyi Fu, Zhi Jin; A Unified Multi-task Learning Model for AST-level and Token-level Code Completion; Empirical Software Engineering (EMSE), Vol. 27, Iss. 4, Apr. 18, 2022, pp. 1-38. [PDF] (Cited by 30)
- [TOSEM, 2022] Huangzhao Zhang, Zhiyi Fu, Ge Li, Lei Ma, Zhehao Zhao, Hua’an Yang, Yizhe Sun, Yang Liu, Zhi Jin; Towards Robustness of Deep Program Processing Models—Detection, Estimation, and Enhancement; ACM Transactions on Software Engineering and Methodology (TOSEM), Vol. 31, Iss. 3, Apr. 9, 2022, pp. 1-40. [PDF] (Cited by 49)
- [TSE, 2022] Hui Liu, Mingzhu Shen, Jiaqi Zhu, Nan Niu, Ge Li, Lu Zhang; Deep Learning Based Program Generation From Requirements Text: Are We There Yet?; IEEE Transactions on Software Engineering (TSE), Vol. 48, Iss. 4, Apr. 1, 2022. [PDF] (Cited by 55)
- [JSS, 2022] Zhehao Zhao, Bo Yang, Ge Li, Huai Liu, Zhi Jin; Precise Learning of Source Code Contextual Semantics via Hierarchical Dependence Structure and Graph Attention Networks; Journal of Systems and Software, Volume 184, February 2022. [PDF] (Cited by 24)
- [NeurIPS 2021] Han Peng, Ge Li, Wenhan Wang, Yunfei Zhao, Zhi Jin; Integrating Tree Path in Transformer for Code Representation; Proceedings of the 35th Conference on Neural Information Processing Systems (NeurIPS), Online, December 6-14, 2021. [PDF] (Cited by 54)
- [ASE 2021] Jia Allen Li, Yongmin Li, Ge Li, Xing Hu, Xin Xia, Zhi Jin; EDITSUM: A Retrieve-and-Edit Framework for Source Code Summarization; Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), Melbourne, Australia, Sun 14 - Sat 20 November, 2021. [PDF] (Cited by 70)
- [IJCAI 2020] Wenjie Zhang, Zeyu Sun, Qihao Zhu, Ge Li, Shaowei Cai, Yingfei Xiong, Lu Zhang; NLocalSAT: Boosting Local Search with Solution Prediction; Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI), Yokohama, Japan, January 7-15, 2021, pp. 1177-1183. [PDF]
- [ASE 2020] Fang Liu, Ge Li, Yunfei Zhao, Zhi Jin; Multi-task Learning based Pre-trained Language Model for Code Completion; Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), Melbourne, Australia, Sep. 21-25, 2020. [PDF] (Cited by 226)
- [ASE 2020] Bolin Wei, Yongmin Li, Ge Li, Xin Xia, Zhi Jin; Retrieve and Refine: Exemplar-based Neural Comment Generation; Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), Melbourne, Australia, Sep. 21-25, 2020. [PDF] (Cited by 137)
- [TOSEM, 2020] Wenhan Wang, Ge Li, Sijie Shen, Xin Xia, Zhi Jin; Modular Tree Network for Source Code Representation Learning; ACM Transactions on Software Engineering and Methodology (TOSEM), Vol. 29, No. 4, Article 31, September 2020. [PDF] (Cited by 55)
- [ICPC 2020] Fang Liu, Ge Li, Xin Xia, Bolin Wei, Zhi Jin; A Self-Attentional Neural Architecture for Code Completion with Multi-Task Learning; Proceedings of the 28th IEEE/ACM International Conference on Program Comprehension (ICPC), Seoul, South Korea, May 23-24, 2020, Pages 37–47. (ACM SIGSOFT Distinguished Paper Award) [PDF] (Cited by 90)
- [SANER 2020] Wenhan Wang, Ge Li, Bo Ma, Xin Xia, Zhi Jin; Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Tree; Proceedings of the 27th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), London, Ontario, Canada, February 18-21, 2020. [PDF] (Cited by 310)
- [AAAI 2020] Huangzhao Zhang, Zhuo Li, Ge Li, Lei Ma, Yang Liu, Zhi Jin; Generating Adversarial Examples for Holding Robustness of Source Code Processing Models; Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI), New York, USA, Feb 7-12, 2020. [PDF] (Cited by 130)
- [NeurIPS 2019] Bolin Wei, Ge Li, Xin Xia, Zhiyi Fu, Zhi Jin; Code Generation as a Dual Task of Code Summarization; Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS), Vancouver, Canada, Dec 8-14, 2019, pp.6563-6573. [PDF] (Cited by 258)
- [ICASSP 2019] Bolin Wei, Shuai Lu, Lili Mou, Hao Zhou, Pascal Poupart, Ge Li, Zhi Jin; Why Do Neural Dialog Systems Generate Short and Meaningless Replies? A Comparison between Dialog and Translation; Proceedings of 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, May 12-17, 2019, pp.7290-7294. [PDF] (Cited by 36)
- [COMPSAC 2019] Xing Hu, Rui Men, Ge Li, Zhi Jin; Deep-AutoCoder: Learning to Complete Code Precisely with Induced Code Tokens; Proceedings of 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC), Milwaukee, Wisconsin, USA, Jul. 15-19, 2019. [PDF] (Cited by 10)
- [EMSE, 2019] Xing Hu, Ge Li, Xin Xia, David Lo, Zhi Jin; Deep Code Comment Generation with Hybrid Lexical and Syntactical Information; Empirical Software Engineering (EMSE), Vol. 25, Iss. 3, Jun. 18, 2019. pp 2179–2217. [PDF] (Cited by 850)
- [ICPC 2019] Hao Yu, Wing Lam, Long Chen, Ge Li, Tao Xie, Qianxiang Wang; Neural Detection of Semantic Code Clones via Tree-based Convolution; Proceedings of the 27th International Conference on Program Comprehension (ICPC), Montreal, QC, Canada, May 25-31, 2019, pp. 70-80. [PDF] (Cited by 164)
- [AAAI 2019] Zeyu Sun, Qihao Zhu, Lili Mou, Yingfei Xiong, Ge Li, Lu Zhang; A Grammar-Based Structural CNN Decoder for Code Generation; Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI), Honolulu, Hawaii, USA, Jan. 27 – Feb. 1, 2019. [PDF] (Cited by 149)
- [ICPC 2018] Xiaochen Li, He Jiang, Dong Liu, Zhilei Ren, Ge Li; Unsupervised Deep Bug Report Summarization; Proceedings of the 26th Conference on Program Comprehension (ICPC), 2018. pp. 144-155. [PDF] (Cited by 75)
- [IJCAI 2018] Xing Hu, Ge Li, Xin Xia, David Lo, Shuai Lu, Zhi Jin; Summarizing Source Code with Transferred API Knowledge; Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI), July 13-19, 2018, Stockholm, Sweden. pp. 2269-2275. [PDF] (Cited by 331)
- [ICPC 2018] Xing Hu, Ge Li, Xin Xia, David Lo, Zhi Jin; Deep Code Comment Generation; Proceedings of IEEE/ACM 26th International Conference on Program Comprehension (ICPC), Gothenburg, Sweden, 27-28 May 2018, pp.200-210. (ACM SIGSOFT Distinguished Paper Award) [PDF] (Cited by 850)
- [KSEM 2017] Yunchuan Chen, Ge Li and Zhi Jin; Learning Sparse Overcomplete Word Vectors without Intermediate Dense Representations; Proceedings of the 10th International Conference on Knowledge Science, Engineering and Management (KSEM), Melbourne, Australia, August 19-20, 2017. [PDF] (Cited by 5)
- [KSEM 2017] Yangyang Lu, Ge Li, Zelong Zhao, Lingfeng Wen and Zhi Jin; Learning To Infer API Mappings From API Documents; Proceedings of the 10th International Conference on Knowledge Science, Engineering and Management (KSEM), Melbourne, Australia, August 19-20, 2017. [PDF] (Cited by 15)
- [KSEM 2017] Wenhao Huang, Ge Li and Zhi Jin; Improved Knowledge Base Completion by the Path-Augmented TransR Model; Proceedings of the 10th International Conference on Knowledge Science, Engineering and Management (KSEM), Melbourne, Australia, August 19-20, 2017. [PDF] (Cited by 17)
- [COLING 2016] Lili Mou, Yiping Song, Rui Yan, Ge Li, Lu Zhang, Zhi Jin; Sequence to Backward and Forward Sequences: A Content-Introducing Approach to Generative Short-Text Conversation; Proceedings of the 26th International Conference on Computational Linguistics (COLING), Osaka, Japan, December 11-17, 2016, pp. 3349–3358. [PDF] (Cited by 296)
- [COLING 2016] Yan Xu, Ran Jia, Lili Mou, Ge Li, Yunchuan Chen, Yangyang Lu and Zhi Jin; Improved Relation Classification by Deep Recurrent Neural Networks with Data Augmentation; Proceedings of the 26th International Conference on Computational Linguistics (COLING), Osaka, Japan, December 11-17, 2016, pp. 1461–1470. [PDF] (Cited by 292)
- [EMNLP 2016] Lili Mou, Zhao Meng, Rui Yan, Ge Li, Yan Xu, Lu Zhang, Zhi Jin; How Transferable are Neural Networks in NLP Applications?; Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP), Austin, Texas, November 1-5, 2016, pp. 479–489. [PDF] (Cited by 376)
- [ACL 2016] Yunchuan Chen, Lili Mou, Yan Xu, Ge Li, Zhi Jin; Compressing Neural Language Models by Sparse Word Representations; Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL), Berlin, Germany, August 7-12, 2016, pp. 226–235. [PDF] (Cited by 34)
- [ACL 2016] Lili Mou, Rui Men, Ge Li, Yan Xu, Lu Zhang, Rui Yan, Zhi Jin; Natural Language Inference by Tree-Based Convolution and Heuristic Matching; Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL), Berlin, Germany, August 7-12, 2016, pp. 130–136. [PDF] (Cited by 420)
- [AAAI 2016] Lili Mou, Ge Li, Lu Zhang, Tao Wang, Zhi Jin; Convolutional Neural Networks over Tree Structures for Programming Language Processing; Proceedings of 2016 AAAI Conference on Artificial Intelligence, pages 1287-1293, Phoenix, USA, January 12-18, 2016. [PDF] (Cited by 1040)
- [CIKM 2016] Lili Mou, Ran Jia, Yan Xu, Ge Li, Lu Zhang, Zhi Jin; Distilling Word Embeddings: An Encoding Approach; Proceedings of the 25th ACM International Conference on Information and Knowledge Management, Indianapolis, USA, October 24-28, 2016. [PDF] (Cited by 28)
- [KSEM 2016] Zhao Meng, Lili Mou, Ge Li and Zhi Jin; Context-Aware Tree-Based Convolutional Neural Networks for Natural Language Inference; Proceedings of 9th International Conference on Knowledge Science, Engineering and Management, Passau, Germany, October 4-8, 2016, LNAI 9983, pp. 515–526. [PDF] (Cited by 1)
- [KSEM 2016] Yangyang Lu, Ge Li, Rui Miao and Zhi Jin; Learning Embeddings Of API Tokens To Facilitate Deep Learning Based Program Processing; Proceedings of 9th International Conference on Knowledge Science, Engineering and Management, Passau, Germany, October 4-8, 2016, LNAI 9983, pp. 527–539. [PDF] (Cited by 4)
- [EMNLP 2015] Hao Peng, Lili Mou, Ge Li, Yan Xu, Lu Zhang, Zhi Jin; A Comparative Study on Regularization Strategies for Embedding-based Neural Networks; Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisboa, Portugal, September 17–21, 2015. [PDF] (Cited by 39)
- [KSEM 2015] Hao Peng, Lili Mou, Ge Li, Yuxuan Liu, Lu Zhang and Zhi Jin; Building Program Vector Representations for Deep Learning; Proceedings of the 8th International Conference on Knowledge Science, Engineering and Management, Chongqing, China, October 28-30, 2015. pp. 547-553. [PDF] (Cited by 222)
- [EMNLP 2015] Lili Mou, Hao Peng, Ge Li, Yan Xu, Lu Zhang, Zhi Jin; Discriminative Neural Sentence Modeling by Tree-Based Convolution; Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, 17-21 September, 2015. pp. 2315–2325. [PDF] (Cited by 158)
- [EMNLP 2015] Yan Xu, Lili Mou, Ge Li, Lu Zhang, Zhi Jin; Classifying Relations via Long Short Term Memory Networks along Shortest Dependency Paths; Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisboa, Portugal, September 17–21, 2015. [PDF] (Cited by 864)
- [KSEM 2014] Lili Mou, Ge Li, Zhi Jin and Lu Zhang; Verification based on Hyponymy Hierarchical Characteristics for Web-based Hyponymy Discovery; Proceedings of the International Conference on Knowledge Science, Engineering and Management 2014, Lecture Notes in Computer Science Volume 8793, 2014, pp 81-92. [PDF]
- [IJSEKE, 2014] Yan Xu, Ge Li, Lili Mou, Yangyang Lu; Learning Non-taxonomy Relations on Demand for Ontology Extension; International Journal of Software Engineering and Knowledge Engineering (IJSEKE), Vol. 24, No. 08, October 2014, pp. 1159-1175. [PDF] (Cited by 11)
Basic Papers on Deep Learning based Code Processing
- [arXiv 2015] Lili Mou, Rui Men, Ge Li, Lu Zhang, Zhi Jin; On End-to-End Program Generation from User Intention by Deep Neural Networks; arXiv preprint, arXiv: 1510.07211, 2015. (Cited by 80)
- [arXiv 2014] Lili Mou, Ge Li, Zhi Jin, Lu Zhang, Tao Wang; TBCNN: A Tree-Based Convolutional Neural Network for Programming Language Processing; arXiv preprint, arXiv: 1409.5718, 2014. (Cited by 1040)
- [arXiv 2014] Lili Mou, Ge Li, Yuxuan Liu, Hao Peng, Zhi Jin, Yan Xu, Lu Zhang; Building Program Vector Representations for Deep Learning; arXiv preprint, arXiv: 1409.3358, 2014. (Cited by 222)