Kejing Yang
Researcher
Kejing Yang has been engaged in research on code representation and code large language models (Code LLMs) since 2021, when she joined Alibaba Group's Security Department as an algorithm engineer. From 2021 to 2023, her work focused on detecting vulnerable interfaces by analyzing source code semantics.
In late 2022, her work shifted toward source code pre-training: she built data pipelines and designed masked language modeling and contrastive learning tasks based on the RoBERTa architecture. In 2023, Yang began training and fine-tuning Code LLMs such as StarCoder, CodeLlama, and CodeQwen, using both full-parameter and LoRA-based approaches, and has optimized these models for downstream tasks including code summarization and code generation.