About me
Hi there! My name is Yixiao Ma, an LLM and IR enthusiast from Beijing, China. I received my bachelor's and master's degrees in Computer Science and Technology from Tsinghua University, where I was supervised by Prof. Yiqun Liu. I currently work as an LLM algorithm expert at Huawei.
Education Background
- 08.2020-06.2023 Master, Department of Computer Science and Technology, Tsinghua University
- 03.2019-09.2019 Research Intern, Department of Computer Science, Carnegie Mellon University
- 08.2016-06.2020 Bachelor, Department of Computer Science and Technology, Tsinghua University
Working Experience
- 06.2023-Now Algorithm engineer (Top Mind), Pangu Large Language Model team, Huawei
- Owner of Legal LLM training project.
- General LLM SFT for RAG and ToB projects.
- High-quality SFT data construction.
- Information retrieval algorithm research.
- (Intern) 07.2022-09.2022 Algorithm engineer, Shopee
- Short video search
- (Intern) 08.2020-09.2020 Algorithm engineer, Bytedance
- User growth
- (Intern) 05.2020-07.2020 Algorithm engineer, WeChat, Tencent
- Multimodal sentiment analysis
Publications
- Yixiao Ma, Yueyue Wu, Weihang Su, Qingyao Ai, and Yiqun Liu. CaseEncoder: A Knowledge-enhanced Pre-trained Model for Legal Case Encoding (EMNLP’23).
- Yixiao Ma, Qingyao Ai, Yueyue Wu, Yunqiu Shao, Yiqun Liu, Min Zhang, and Shaoping Ma. Incorporating Retrieval Information into the Truncation of Ranking Lists for Better Legal Search (SIGIR’22).
- Yixiao Ma, Yunqiu Shao, Yueyue Wu, Yiqun Liu, Ruizhe Zhang, Min Zhang, and Shaoping Ma. LeCaRD: A Legal Case Retrieval Dataset for Chinese Law System (SIGIR’21).
- Yixiao Ma, Yueyue Wu, Qingyao Ai, Yiqun Liu, Yunqiu Shao, Min Zhang, and Shaoping Ma. Incorporating Structural Information into Legal Case Retrieval (TOIS).
- Yixiao Ma, Yunqiu Shao, Yiqun Liu, Min Zhang, and Shaoping Ma. Retrieving Legal Cases from a Large-scale Candidate Corpus (ICAIL’21).
- Yufeng Yang*, Yixiao Ma*, Zhengyu Wang, and Min Xu. AttNet: Attention-based Deep Neural Network for 3D Point Set Analysis (Sensors, *co-first author).
- Haitao Li, Yunqiu Shao, Yueyue Wu, Qingyao Ai, Yixiao Ma, and Yiqun Liu. LeCaRDv2: A Large-Scale Chinese Legal Case Retrieval Dataset (SIGIR’24).
- Wenmeng Yu, Fanyang Meng, Yilin Zhu, Yixiao Ma et al. Improving Multimodal Sentiment Analysis with Independent Unimodal Annotations (ACL’20).
- Yunqiu Shao, Haitao Li, Yueyue Wu, Qingyao Ai, Jiaxin Mao, Yixiao Ma, and Yiqun Liu. An Intent Taxonomy of Legal Case Retrieval (TOIS).
- Ruizhe Zhang, Qingyao Ai, Yueyue Wu, Yixiao Ma, and Yiqun Liu. Result Diversification for Legal Case Retrieval (SIGIR-AP’23).
Projects
- 1st place in CAIL2023 conversational legal case retrieval track (2023).
- 1st place in CAIL2023 legal case retrieval track (2023).
- 1st place in COLIEE2021 legal case search track (2021).
- Patent: Chong Chen and Yixiao Ma. A Generalized Semantic Embedding Training Method for Information Retrieval.
- Patent: Yixiao Ma, Chong Chen, and Chao Feng. A Generalized Large Language Model Training Algorithm for Retrieval-Augmented Generation.
- Reviewer for ACL, EMNLP, and TOIS.
Honors and Awards
- Beijing Outstanding Graduate and Tsinghua University Computer Science Department Outstanding Graduate awards (2023).
- 84 Innovative Future Scholarship (2023).
- Hye-yeon Excellence Scholarship (2022).
- Longhu Scholarship (2022).
- Tsinghua University Outstanding Graduation Design (2020).
- Tsinghua University Scholarship (2019).
- Tsinghua University Practice Detachment School Level Grand Prize
Others
- Has received CS offers from Cornell, Georgia Tech, Columbia, etc.
- Has received full-time job offers from Huawei (Top Mind), Kwai (Kwai Star), ByteDance, Baichuan, Alibaba, etc.
- Computer Science Student Union (2017).
- National Level II Athlete (Go)