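Distributed Large Model Inference
Apr 16, 2025 · 1 min read

This project studies inference optimization for large models along three directions: communication-efficient model partitioning, memory and cache management for long sequences, and cost-effective batch inference frameworks.

Bowen Zhou 周博文
PhD Student
My research interests include distributed robotics, mobile computing and programmable matter.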