llama.cpp python library is a simple Python bindings for@ggerganov
llama.cpp. This package provides:
- Low-level access to C API via ctypes interface.
- High-level Python API for text completion
OpenAI
-like APILangChain
compatibilityLlamaIndex
compatibility- OpenAI compatible web server
- Local Copilot replacement
- Function Calling support
- Vision API support
- Multiple Models
Class | Package | Local | Serializable | JS support |
---|---|---|---|---|
ChatLlamaCpp | langchain-community | ✅ | ❌ | ❌ |
Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs |
---|---|---|---|---|---|---|---|---|---|
✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ |
Hermes 2 Pro is an upgraded version of Nous Hermes 2, consisting of an updated and cleaned version of the OpenHermes 2.5 Dataset, as well as a newly introduced Function Calling and JSON Mode dataset developed in-house. This new version of Hermes maintains its excellent general task and conversation capabilities - but also excels at Function CallingSee our guides on local models to go deeper:
langchain-community
and llama-cpp-python
packages:
ChatLlamaCpp.bind_tools
, we can easily pass in Pydantic classes, dict schemas, LangChain tools, or even functions as tools to the model. Under the hood, these are converted to an OpenAI tool schema, which looks like:
{"type": "function", "function": {"name": <<tool_name>>}}.