David Zheng dzhengAP

vllm-project/vllm Public

A high-throughput and memory-efficient inference and serving engine for LLMs

Python, 85.2k stars, 18.9k forks

vllm-project/llm-compressor Public

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Python, 3.5k stars, 562 forks

On-Device-Agent-for-adaptive-display-optimization Public

We present a novel on-device hybrid agent combining LLMs with retrieval-augmented generation for real-time display optimization. The system achieves 92% accuracy with CoreML acceleration delivering…

Swift, 1 star

ARS-Adaptive-Reasoning-Suppression-for-Efficient-Large-Reasoning-Language-Models Public

Adaptive Reasoning Suppression for Efficient Large Reasoning Language Models

distributed-inference-engine-nano-vLLM Public

Python

distributed-training-infra-demo-megatron Public

Python