Class | Package | Local | Serializable | JS support | Package downloads | Package latest |
---|---|---|---|---|---|---|
ChatXinference | langchain-xinference | ✅ | ❌ | ✅ | ✅ | ✅ |
Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs |
---|---|---|---|---|---|---|---|---|---|
✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ |
Install Xinference through PyPI: `pip install xinference`.
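As a sketch of the install step, the PyPI command above can be run as follows; the `xinference-local` launch line is an assumption here (it starts a single-node server for quick testing) and is not part of the text above:

```shell
# Install the Xinference package from PyPI.
pip install xinference

# Assumed helper for local testing: start a single-node Xinference server.
xinference-local
```

For cluster deployments, use the supervisor/worker commands described below instead of `xinference-local`.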
To deploy Xinference in a cluster, first start an Xinference supervisor with the `xinference-supervisor` command. You can use the option `-p` to specify the port and `-H` to specify the host; the default port is 8080 and the default host is 0.0.0.0.
Then start an Xinference worker with `xinference-worker` on each server you want to run one on.
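The supervisor/worker launch might look like the following sketch; the hostname and the `-e` endpoint flag for pointing workers at the supervisor are illustrative assumptions, so adjust them to your cluster:

```shell
# On the supervisor node (defaults per the text: port 8080, host 0.0.0.0):
xinference-supervisor -H 0.0.0.0 -p 8080

# On each worker node, point the worker at the supervisor endpoint
# (replace <supervisor-host> with the supervisor's address):
xinference-worker -e "http://<supervisor-host>:8080"
```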
You can consult the README file from Xinference for more information.
Install the `langchain-xinference` package:
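The install command itself appears to have been cut off here; presumably it is the standard pip install for the package named above:

```shell
# Install (or upgrade) the LangChain integration package for Xinference.
pip install -U langchain-xinference
```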