OneLLM.dev API Documentation
Welcome to the official API documentation for OneLLM.dev. This document provides a comprehensive guide to interacting with our API. For a formal definition of the API, please refer to the OpenAPI Specification.
Authentication
The OneLLM.dev API uses API keys for authentication. You can obtain your API key from the OneLLM dashboard.
All API requests must include an `Authorization` header with a Bearer token containing your API key:

```
Authorization: Bearer YOUR_API_KEY
```
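As a minimal sketch in Python (standard library only; `YOUR_API_KEY` is a placeholder for the key from your OneLLM dashboard):

```python
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder: substitute the key from the OneLLM dashboard


def build_request(url: str, body: bytes) -> urllib.request.Request:
    """Create a POST request carrying the Bearer token in the Authorization header."""
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Any HTTP client works the same way; the only requirement is that the `Authorization: Bearer ...` header is present on every request.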
API Endpoint
POST https://onellm.dev/api

Note: streaming is not currently supported.
This is the primary endpoint for interacting with the language models. It allows you to send a chat conversation and receive a response from the specified model.
Request Body
The request body must be a JSON object containing the details of your request.
| Parameter | Type | Description |
|---|---|---|
| `model` | string | Required. The ID of the model to use for the completion. See the Supported Models section for a list of available models. |
| `messages` | array | Required. An array of message objects representing the conversation history. |
| `temperature` | number | Optional. Controls randomness; a lower value makes the model more deterministic. Range: 0.0 to 2.0. |
| `max_tokens` | integer | Optional. The maximum number of tokens to generate in the response. If the value is too large, it is automatically adjusted based on your account balance. |
| `stream` | boolean | Optional. If set to `true`, the response is streamed as server-sent events. Defaults to `false`. |
| `top_p` | number | Optional. The nucleus sampling probability. The model considers only the tokens comprising the top `top_p` probability mass. |
| `stop_sequences` | array | Optional. A list of strings that will cause the model to stop generating tokens. |

For a complete list of all possible request parameters, please refer to the OpenAPI Specification.
Example Request:

```json
{
  "model": "GPT-4.1",
  "messages": [
    {
      "role": "user",
      "content": "Tell me a joke about computers."
    }
  ],
  "max_tokens": 50
}
```
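Putting the pieces together, the request above can be sent with a short Python sketch. This is illustrative, not an official client; `YOUR_API_KEY` is a placeholder, and `build_payload`/`chat` are hypothetical helper names:

```python
import json
import urllib.request


def build_payload(model: str, messages: list, **options) -> dict:
    """Assemble the JSON body described in the Request Body section.

    `options` may carry any optional parameter (temperature, max_tokens, ...).
    """
    return {"model": model, "messages": messages, **options}


def chat(payload: dict, api_key: str) -> dict:
    """POST the payload to the OneLLM endpoint and return the parsed JSON response."""
    req = urllib.request.Request(
        "https://onellm.dev/api",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (performs a live, billable API call):
# payload = build_payload(
#     "GPT-4.1",
#     [{"role": "user", "content": "Tell me a joke about computers."}],
#     max_tokens=50,
# )
# result = chat(payload, "YOUR_API_KEY")
```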
Response Body
The response will be a JSON object containing the model's output.
| Parameter | Type | Description |
|---|---|---|
| `provider` | string | The name of the underlying model provider (e.g., `openai`). |
| `model` | string | The model that was used for the completion. |
| `role` | string | The role of the message author, typically `assistant`. |
| `content` | string | The content of the message generated by the model. |
| `usage` | object | An object containing token usage information for the request. |
| `finish_reason` | string | The reason the model stopped generating tokens (e.g., `stop`). |
Example Response:

```json
{
  "provider": "openai",
  "model": "GPT-4.1",
  "role": "assistant",
  "content": "Why did the computer show up at work late? It had a hard drive!",
  "usage": {
    "input_tokens": 15,
    "output_tokens": 12,
    "total_tokens": 27
  },
  "finish_reason": "stop"
}
```
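Since the response is a flat JSON object, extracting the fields above is straightforward. A small sketch (the `summarize` helper is hypothetical, shown against the example response):

```python
def summarize(response: dict) -> str:
    """Condense the fields from the Response Body section into one line."""
    usage = response["usage"]
    return (
        f"{response['model']} via {response['provider']}: "
        f"finish_reason={response['finish_reason']}, "
        f"total_tokens={usage['total_tokens']}"
    )


example = {
    "provider": "openai",
    "model": "GPT-4.1",
    "role": "assistant",
    "content": "Why did the computer show up at work late? It had a hard drive!",
    "usage": {"input_tokens": 15, "output_tokens": 12, "total_tokens": 27},
    "finish_reason": "stop",
}
print(summarize(example))
```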
Supported Models
The following models are supported and can be used in the `model` parameter of your API requests:

`GPT-5`, `GPT-5-Mini`, `GPT-5-Nano`, `GPT-5-Chat-Latest`, `GPT-4.1`, `GPT-4.1-Mini`, `GPT-4.1-Nano`, `GPT-o3`, `GPT-o3-pro`, `GPT-o3-DeepResearch`, `GPT-o3-Mini`, `GPT-o4-mini`, `GPT-4o`, `GPT-4o-mini`, `GPT-o1`, `GPT-o1-Mini`, `Opus-4`, `Sonnet-4`, `Haiku-3.5`, `Opus-3`, `Sonnet-3.7`, `Haiku-3`, `DeepSeek-Reasoner`, `DeepSeek-Chat`, `2.5-Flash-preview`, `2.5-Pro-preview`, `2.0-Flash`, `2.0-Flash-lite`, `1.5-Flash`, `1.5-Flash-8B`, `1.5-Pro`, `Mistral-Medium-3`, `Magistral-Medium`, `Codestral`, `Devstral-Medium`, `Mistral-Saba`, `Mistral-Large`, `Pixtral-Large`, `Ministral-8B-24.10`, `Ministral-3B-24.10`, `Mistral-Small-3.2`, `Magistral-Small`, `Devstral-Small`, `Pixtral-12B`, `Mistral-NeMo`, `Mistral-7B`, `Mixtral-8x7B`, `Mixtral-8x22B`
Important Notes
- If the `max_tokens` value is too large, the server will automatically set it to the highest amount that your balance allows.
- A minimum balance of USD $0.10 is required to make API requests.