@douglarek posted in "How well does everyone's Qwen3.5-27b deployment perform?"

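I found some time to run a simple performance test on qwen3.5-27b-fp8 served with sglang (input 4096 tokens, output 2048 tokens, concurrency 10, 100 requests in total; the model as deployed by default has thinking enabled). Summary: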
============ Serving Benchmark Result ============
Backend:                                 sglang    
Traffic request rate:                    inf       
Max request concurrency:                 10        
Successful requests:                     100       
Benchmark duration (s): ...
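For anyone who wants to sanity-check the throughput lines of a summary like this one, here is a minimal Python sketch of the arithmetic. It assumes every request uses exactly the nominal lengths from the post (4096 in, 2048 out), which a real run may not. The function name `nominal_throughput` and its `duration_s` parameter are hypothetical, and the duration has to be taken from your own benchmark output since it is truncated above.

```python
def nominal_throughput(num_requests, input_len, output_len, duration_s):
    """Return nominal token totals and per-second rates for a benchmark run.

    Assumes every request has exactly input_len prompt tokens and
    output_len generated tokens (an idealization, not a measurement).
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    total_in = num_requests * input_len
    total_out = num_requests * output_len
    return {
        "total_input_tokens": total_in,
        "total_output_tokens": total_out,
        "request_throughput": num_requests / duration_s,        # req/s
        "input_token_throughput": total_in / duration_s,        # tok/s
        "output_token_throughput": total_out / duration_s,      # tok/s
    }


if __name__ == "__main__":
    # duration_s here is a placeholder, NOT the measured value from the post;
    # substitute the "Benchmark duration (s)" figure from your own run.
    stats = nominal_throughput(100, 4096, 2048, duration_s=1.0)
    print(stats["total_input_tokens"], stats["total_output_tokens"])
```

With the settings in the post, the nominal totals are 409,600 input tokens and 204,800 output tokens; the per-second rates follow once the real duration is plugged in.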
 
 