TitanML helps businesses build and deploy better, smaller, cheaper, and faster NLP models through our training, compression, and inference optimization platform.
Our inference server, Titan Takeoff, enables deployment of LLMs locally on your hardware in a single command. Most embedding models are supported out of the box; if you experience trouble with a specific model, please let us know at hello@titanml.co.
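As a minimal usage sketch, this is how embedding a single query could look (it assumes `langchain_community` is installed and a Takeoff server is already running locally with an embedding reader in a consumer group named `"embed"`; the query text and group name are placeholders, and the code is wrapped in a function so nothing contacts the server at import time):

```python
# Usage sketch: embed one query against a locally running Titan Takeoff
# server via LangChain's TitanTakeoffEmbed wrapper. The consumer group
# name and query string are illustrative placeholders.
def embed_one_query() -> list:
    from langchain_community.embeddings import TitanTakeoffEmbed

    # Assumes the default local Takeoff endpoint; pass base_url/port
    # explicitly if your server runs elsewhere.
    embed = TitanTakeoffEmbed()

    # Returns the embedding vector (a list of floats) for the query.
    return embed.embed_query(
        "What is the weather in London in August?",
        consumer_group="embed",
    )
```

Calling `embed_one_query()` with a live server returns one embedding vector for the query string.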
If the models you need are not already running, you can start them when you initialize the TitanTakeoffEmbed object by passing their configurations as the models parameter.
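A sketch of that initialization path follows (the model name, device, and consumer group are illustrative assumptions, and the call requires a local Takeoff installation that can launch the reader, so the example is wrapped in a function rather than executed):

```python
# Sketch: starting an embedding reader at initialization by passing its
# configuration via the `models` parameter. Field values are examples.
def start_reader_and_embed() -> list:
    from langchain_community.embeddings import TitanTakeoffEmbed

    embedding_model = {
        "model_name": "BAAI/bge-small-en-v1.5",  # example model
        "device": "cpu",                          # or "cuda"
        "consumer_group": "embed",
    }

    # Passing `models` asks Takeoff to start this reader when the
    # TitanTakeoffEmbed object is created.
    embed = TitanTakeoffEmbed(models=[embedding_model])
    return embed.embed_query("example query", consumer_group="embed")
```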
You can use the embed_documents method to embed multiple documents at once. It expects a list of strings, rather than the single string expected by the embed_query method.
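To make the two input/output shapes concrete, here is a hypothetical stand-in (not the TitanML client; it merely mimics the two-method embeddings interface) showing that embed_query maps one string to one vector, while embed_documents maps a list of strings to a list of vectors:

```python
# Hypothetical stand-in mimicking the embed_query / embed_documents
# interface; real vectors would come from the Takeoff server.
class FakeEmbed:
    def embed_query(self, text: str) -> list:
        # One string in -> one embedding vector out.
        return [float(len(text)), 0.0]

    def embed_documents(self, texts: list) -> list:
        # A list of strings in -> one embedding vector per document out.
        return [self.embed_query(t) for t in texts]


embed = FakeEmbed()
single = embed.embed_query("hello")          # one vector
batch = embed.embed_documents(["a", "bcd"])  # one vector per document
```

The key point is the shape contract: `single` is a flat vector, while `batch` has one vector per input document.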