Open-source LLMs encode tool calls in incompatible wire formats, forcing each of M frameworks (vLLM, SGLang, TensorRT-LLM, transformers) to implement a custom parser for each of N models. Unlike closed-source APIs, where tool calling is abstracted away, the open-source ecosystem requires each framework to reverse-engineer and maintain model-specific format knowledge independently. The article argues for declarative format specifications, similar to Hugging Face's standardized chat templates.
Models
The M×N problem of tool calling and open-source models
Each of M open-source inference frameworks (such as vLLM, SGLang, and TensorRT-LLM) must independently reverse-engineer and maintain tool-calling parsers for N incompatible model formats, creating an unsustainable M×N maintenance burden that standardized declarative specs could eliminate.
Tuesday, April 14, 2026 12:00 PM UTC | 2 min read | Source: Hacker News | By sys://pipeline
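To make the M×N argument concrete, here is a minimal Python sketch of what a declarative format specification might look like. Everything in it is an assumption made for illustration: the `ToolCallFormat` fields, the `FORMATS` table, the delimiter strings, and the model names are hypothetical and do not describe any real model or framework API. It also assumes the simplest possible case, a JSON payload wrapped in literal start and end markers, which real formats often go beyond.

```python
import json
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCallFormat:
    """Hypothetical declarative description of one model's tool-call wire format."""
    start: str  # literal marker that opens a tool call
    end: str    # literal marker that closes it


# Invented delimiters for two made-up models; not taken from any real model.
FORMATS = {
    "model-a": ToolCallFormat(start="<<call>>", end="<</call>>"),
    "model-b": ToolCallFormat(start="[[TOOL]]", end="[[/TOOL]]"),
}


def parse_tool_calls(text: str, fmt: ToolCallFormat) -> list:
    """One generic parser driven by the spec, instead of one parser per model."""
    pattern = re.escape(fmt.start) + r"(.*?)" + re.escape(fmt.end)
    payloads = re.findall(pattern, text, flags=re.DOTALL)
    return [json.loads(payload) for payload in payloads]


def parser_count(frameworks: int, models: int, shared_spec: bool) -> int:
    """Rough count of artifacts to maintain under each approach.

    Without a shared spec, every framework writes a parser for every model (M*N).
    With one, each framework keeps one generic parser and each model ships
    one spec (M+N). This accounting is a simplifying assumption.
    """
    return frameworks + models if shared_spec else frameworks * models


output = 'Checking. <<call>>{"name": "get_weather", "arguments": {"city": "Paris"}}<</call>>'
calls = parse_tool_calls(output, FORMATS["model-a"])
print(calls[0]["name"])
print(parser_count(4, 10, shared_spec=False), parser_count(4, 10, shared_spec=True))
```

Under these assumptions, supporting a new model means adding one entry to `FORMATS` rather than writing a new parser in every framework. The sketch ignores streaming output, non-JSON argument encodings, and malformed generations, all of which a real specification would need to address.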