Huiqiang Jiang (姜慧强)

A fake MLSys/NLPer Google Scholar
Research focus: Efficient Methods (in LLMs)

An unpopular blogger Blog & Zhihu
A programming enthusiast @iofu728

Phone: +86 178 xxxx xxxx
Email: iofu728[aT]gmail[DoT.]com


Huiqiang Jiang obtained his Master's degree in Software Engineering from Peking University, where he worked with A.P. Xiang Jing. He was also a research intern at the KC Group, Microsoft Research Asia (19/6-21/3), with Börje Karlsson and Guoxin Wang, and at the search group, Ant Group (20/6-20/8). He was a Research SDE at Microsoft Research Asia (Shanghai).
Huiqiang's research primarily focuses on system-algorithm co-design, particularly on efficient methods to accelerate inference and training, including dynamic sparse attention (MInference, RetrievalAttention, MMInference), KV cache-centric analysis (SCBench), prompt compression (LLMLingua), speculative decoding, model compression, sparse inference (PIT), neural architecture search, and efficient tuning, with a particular emphasis on LLMs. He is also interested in addressing typical challenges in natural language processing.

I'm actively seeking research interns to collaborate on efficient LLM methods. If you're interested in these research topics, please contact me at iofu728[aT]gmail[DoT]com.

Selected Publications

† equal contribution, ‡ student I advised, § corresponding.

NLP & MLSys

  1. MTraining: Efficient Distributed Training for Ultra-Long Contexts via Dynamic Sparse Attention
    Wenxuan Li, Chengruidong Zhang, Huiqiang Jiang, Yucheng Li, Yuqing Yang, Lili Qiu.
    In ICML Workshop Efficient Systems for Foundation Models (Es-FoMo), 2025

  2. SortedRL: Accelerating RL Training for LLMs through Online Length-aware Scheduling
    Yiqi Zhang, Huiqiang Jiang, Xufang Luo, Zhihe Yang, Chengruidong Zhang, Yifei Shen, Dongsheng Li, Yuqing Yang, Lili Qiu, Yang You.
    In ICML Workshop Efficient Systems for Foundation Models (Es-FoMo), 2025

  3. Chain-of-Model Learning for Language Model
    Kaitao Song, Xiaohua Wang, Xu Tan, Huiqiang Jiang, Chengruidong Zhang, Yongliang Shen, Cen LU, Zihao Li, Zifan Song, Caihua Shan, Yansen Wang, Kan Ren, Xiaoqing Zheng, Tao Qin, Yuqing Yang, Dongsheng Li, Lili Qiu.
    In Proc. of NeurIPS'25

  4. GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
    Qianhui Wu, Kanzhi Cheng, Rui Yang, Chaoyun Zhang, Jianwei Yang, Huiqiang Jiang, Jian Mu, Baolin Peng, Bo Qiao, Reuben Tan, Si Qin, Lars Liden, Qingwei Lin, Huan Zhang, Tong Zhang, Jianbing Zhang, Dongmei Zhang, Jianfeng Gao.
    In Proc. of NeurIPS'25

  5. RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference
    Yaoqi Chen, Jinkai Zhang, Baotong Lu, Qianxi Zhang, Chengruidong Zhang, Jingjia Luo, Di Liu, Huiqiang Jiang, Qi Chen, Jing Liu, Bailu Ding, Xiao Yan, Jiawei Jiang, Chen Chen, Mingxing Zhang, Weiming Zhang, Yuqing Yang, Fan Yang, Mao Yang.
    [Code] [Project Page]

  6. MMInference: Accelerating Pre-filling for Long-Context Visual Language Models via Modality-Aware Permutation Sparse Attention
    Yucheng Li, Huiqiang Jiang§, Chengruidong Zhang, Qianhui Wu, Xufang Luo, Surin Ahn, Amir H. Abdi, Dongsheng Li, Jianfeng Gao, Yuqing Yang, Lili Qiu.
    In Proc. of ICML'25
    also appeared in ICLR Workshop FM-Wild, 2025
    [Code] [Project Page]

  7. SCBench: A KV Cache-Centric Analysis of Long-Context Methods
    Yucheng Li, Huiqiang Jiang§, Qianhui Wu, Xufang Luo, Surin Ahn, Chengruidong Zhang, Amir H. Abdi, Dongsheng Li, Jianfeng Gao, Yuqing Yang, Lili Qiu.
    In Proc. of ICLR'25
    also appeared in NeurIPS Workshop ENLSP-IV, 2024
    [Code] [Project Page] [Dataset]

  8. SeCom: On Memory Construction and Retrieval for Personalized Conversational Agents
    Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Xufang Luo, Hao Cheng, Dongsheng Li, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, Jianfeng Gao.
    In Proc. of ICLR'25
    also appeared in NeurIPS Workshop AFM, 2024
    [Code] [Project Page]

  9. RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
    Di Liu, Meng Chen, Baotong Lu, Huiqiang Jiang, Zhenhua Han, Qianxi Zhang, Qi Chen, Chengruidong Zhang, Bailu Ding, Kai Zhang, Chen Chen, Fan Yang, Yuqing Yang, Lili Qiu.
    In Proc. of NeurIPS'25
    also appeared in NeurIPS Workshop ENLSP-IV (Best Paper Award), 2024
    [Code]

  10. MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention
    Huiqiang Jiang§, Yucheng Li, Chengruidong Zhang, Qianhui Wu, Xufang Luo, Surin Ahn, Zhenhua Han, Amir H. Abdi, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, Lili Qiu.
    In Proc. of NeurIPS'24 (Spotlight)
    also appeared in ICML Workshop Efficient Systems for Foundation Models (Es-FoMo), 2024
    [Code] [Project Page] [Demo]

  11. SecurityLingua: Efficient Defense of LLM Jailbreak Attacks via Security-Aware Prompt Compression
    Yucheng Li, Surin Ahn, Huiqiang Jiang, Amir H. Abdi, Yuqing Yang, Lili Qiu.
    In Proc. of COLM'25
    [Code]

  12. Mitigate Position Bias in Large Language Models via Scaling a Single Dimension
    Yijiong Yu, Huiqiang Jiang, Xufang Luo, Qianhui Wu, Chin-Yew Lin, Dongsheng Li, Yuqing Yang, Yongfeng Huang, Lili Qiu.
    In Proc. of ACL'25 Findings
    also appeared in ICML Workshop Long Context Foundation Models (LCFM) (Oral), 2024
    [Code]

  13. LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
    Zhuoshi Pan, Qianhui Wu, Huiqiang Jiang, Menglin Xia, Xufang Luo, Jue Zhang, Qingwei Lin, Victor Rühle, Yuqing Yang, Chin-Yew Lin, H. Vicky Zhao, Lili Qiu, Dongmei Zhang.
    In Proc. of ACL'24 Findings
    [Code] [Project Page] [Demo]

  14. LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
    Huiqiang Jiang, Qianhui Wu, Xufang Luo, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, Lili Qiu.
    In Proc. of ACL'24
    also appeared in ICLR Workshop Mathematical and Empirical Understanding of Foundation Models (ME-FoMo), 2024
    [Code] [Project Page] [Demo]

  15. LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models
    Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang, Lili Qiu.
    In Proc. of EMNLP'23 (Oral)
    [Code] [Project Page] [Demo]

  16. PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation
    Ningxin Zheng, Huiqiang Jiang, Quanlu Zhang, Zhenhua Han, Lingxiao Ma, Yuqing Yang, Fan Yang, Chengruidong Zhang, Lili Qiu, Mao Yang, Lidong Zhou.
    In Proc. of SOSP'23
    [Code]

  17. Accurate and Structured Pruning for Efficient Automatic Speech Recognition
    Huiqiang Jiang, Li Lyna Zhang, Yuang Li, Yu Wu, Shijie Cao, Ting Cao, Yuqing Yang, Jinyu Li, Mao Yang, Lili Qiu.
    In Proc. of Interspeech'23

  18. TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning
    Shivam Shandilya, Menglin Xia, Supriyo Ghosh, Huiqiang Jiang, Jue Zhang, Qianhui Wu, Victor Rühle, Saravan Rajmohan.
    In Proc. of ACL'25 Findings

  19. Position Engineering: Boosting Large Language Models through Positional Information Manipulation
    Zhiyuan He, Huiqiang Jiang, Zilong Wang, Yuqing Yang, Luna Qiu, Lili Qiu.
    In Proc. of EMNLP'24

  20. CoLaDa: A Collaborative Label Denoising Framework for Cross-lingual Named Entity Recognition
    Tingting Ma, Qianhui Wu, Huiqiang Jiang, Börje Karlsson, Tiejun Zhao, Chin-Yew Lin.
    In Proc. of ACL'23
    [Code]

  21. Multi-Level Knowledge Distillation for Out-of-Distribution Detection in Text
    Qianhui Wu, Huiqiang Jiang, Haonan Yin, Börje F. Karlsson, Chin-Yew Lin.
    In Proc. of ACL'23
    [Code]

  22. Decomposed Meta-Learning for Few-Shot Sequence Labeling
    Tingting Ma, Qianhui Wu, Huiqiang Jiang, Jieru Lin, Börje F Karlsson, Tiejun Zhao, Chin-Yew Lin.
    IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2024
    [Code]

  23. Decomposed Meta-Learning for Few-Shot Named Entity Recognition
    Tingting Ma, Huiqiang Jiang, Qianhui Wu, Tiejun Zhao, Chin-Yew Lin.
    In Proc. of ACL'22 Findings
    [Code]

  24. AdvPicker: Effectively Leveraging Unlabeled Data via Adversarial Discriminator for Cross-Lingual NER
    Weile Chen, Huiqiang Jiang, Qianhui Wu, Börje F. Karlsson, Yi Guan.
    In Proc. of ACL'21
    [Code]

  25. BoningKnife: Joint Entity Mention Detection and Typing for Nested NER via prior Boundary Knowledge
    Huiqiang Jiang, Guoxin Wang, Weile Chen, Chengxi Zhang, Börje F. Karlsson.
    arXiv (work done in 2019-2020)

CV

  1. ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices
    Chen Tang, Li Lyna Zhang, Huiqiang Jiang, Jiahang Xu, Ting Cao, Quanlu Zhang, Yuqing Yang, Zhi Wang, Mao Yang.
    In Proc. of ICCV'23
    [Code]

  2. Attentive Mask CLIP
    Yifan Yang, Weiquan Huang, Yixuan Wei, Houwen Peng, Xinyang Jiang, Huiqiang Jiang, Fangyun Wei, Yin Wang, Han Hu, Lili Qiu, Yuqing Yang.
    In Proc. of ICCV'23
    [Code]

Selected Honors & Awards

  • Best Paper Award, ENLSP-IV @ NeurIPS'24, 2024.
  • Top Reviewer, NeurIPS, 2024.
  • Microsoft Global Hackathon Executive Challenge Winner Award, 2023, 2024.
  • Microsoft Machine Learning, AI & Data Science Conference Distinguished Contribution Award Winner, 2024.
  • Zhejiang Province Excellent Graduate Award, 2018.
  • Zhejiang Province Scholarship (top 5%), 2017.
  • Zhejiang University First-Class Scholarship for Outstanding Students, 2015-2017.
  • Zhejiang University Excellent Student Award, 2017.
  • First Prize, College Students' Mathematical Contest in Modeling, Zhejiang Province, 2016.

Academic Service

  • Area Chair: ICLR (26), ARR (25)
  • Conference Reviewer: ICLR (24/25), NeurIPS (24/25), ICML (25), ARR (23-25), KDD (25), AAAI (26), EMNLP (23), COLING (24/25)
  • Journal Reviewer: TMLR, TASLP, TIST

Last Updated: Sep, 2025