1. Bingyang Wu, Shengyu Liu, Yinmin Zhong, Peng Sun, Xuanzhe Liu, Xin Jin. LoongServe: Efficiently Serving Long-Context Large Language Models with Elastic Sequence Parallelism. ACM SOSP 2024.
2. Bingyang Wu, Ruidong Zhu, Zili Zhang, Peng Sun, Xuanzhe Liu, Xin Jin. dLoRA: Dynamically Orchestrating Requests and Adapters for LoRA LLM Serving. USENIX OSDI 2024.
3. Bingyang Wu, Kun Qian, Bo Li, Yunfei Ma, Qi Zhang, Zhigang Jiang, Jiayu Zhao, Dennis Cai, Ennan Zhai, Xuanzhe Liu, Xin Jin. XRON: A Hybrid Elastic Cloud Overlay Network for Video Conferencing at Planetary Scale. ACM SIGCOMM 2023.
4. Bingyang Wu, Zili Zhang, Zhihao Bai, Xuanzhe Liu, Xin Jin. Transparent GPU Sharing in Container Clouds for Deep Learning Workloads. USENIX NSDI 2023.