Chat completions
Text and vision LLMs are served at POST /v1/chat/completions with the standard OpenAI schema.
Basic request
curl https://api.ecohash.com/v1/chat/completions \
  -H "Authorization: Bearer eco_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain retrieval-augmented generation in 3 sentences."}
    ]
  }'
Parameters
All standard OpenAI chat.completions parameters are supported:
| Parameter | Type | Notes |
|---|---|---|
| model | string | Required. Model ID from the catalog |
| messages | array | Required. {role, content} objects |
| max_tokens | integer | Max completion length (defaults to the model's remaining budget) |
| temperature | number, 0–2 | Defaults to 1.0 |
| top_p | number, 0–1 | Nucleus sampling |
| stream | bool | SSE streaming |
| stop | string or array | Stop sequences |
| presence_penalty | number, -2 to 2 | Penalty on repeated topics |
| frequency_penalty | number, -2 to 2 | Penalty on repeated tokens |
| seed | integer | Deterministic output (when supported by the model) |
| response_format | object | {"type": "json_object"} for JSON mode; supported on most models |
| tools / tool_choice | array / string | Function calling; supported on Llama 3.1 and Gemma 4 |
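As a rough illustration of the ranges in the table, the sketch below assembles a request body and rejects out-of-range sampling values before anything is sent. The helper `build_chat_payload` and the `_RANGES` mapping are hypothetical names introduced here, not part of the API; the server remains the authority on validation.

```python
from typing import Any

# Ranges copied from the parameter table; this client-side check is an
# illustrative assumption, not something the API requires.
_RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}


def build_chat_payload(model: str, messages: list[dict[str, Any]], **params: Any) -> dict[str, Any]:
    """Return a request body, rejecting values outside the documented ranges."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    for name, (low, high) in _RANGES.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    # Leave unset options out so the server applies its own defaults.
    payload: dict[str, Any] = {"model": model, "messages": messages}
    payload.update({k: v for k, v in params.items() if v is not None})
    return payload


payload = build_chat_payload(
    "meta-llama/Llama-3.1-8B-Instruct",
    [{"role": "user", "content": "Hello"}],
    temperature=0.2,
    max_tokens=64,
    stop=["\n\n"],
)
```

The resulting dictionary can be sent as the JSON body of the request, or unpacked as keyword arguments to `client.chat.completions.create(**payload)`.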
Streaming
from openai import OpenAI

# Point the OpenAI SDK at this API's base URL.
client = OpenAI(api_key="eco_...", base_url="https://api.ecohash.com/v1")

stream = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Tell a short joke"}],
    stream=True,
)

# Print each text delta as soon as it arrives.
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
Vision (image input)
For Qwen2.5-VL-7B-Instruct, pass images in the message content:
curl https://api.ecohash.com/v1/chat/completions \
  -H "Authorization: Bearer eco_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-VL-7B-Instruct",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
      ]
    }]
  }'
You can also pass images as base64 data URLs: "url": "data:image/jpeg;base64,<base64 bytes>".
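A data URL of that form can be built with the standard library. The following is a minimal sketch; `image_data_url` and `image_part` are hypothetical helper names, and guessing the MIME type from the filename is a convenience assumption rather than a requirement of the API.

```python
import base64
import mimetypes


def image_data_url(data: bytes, filename: str = "image.jpg") -> str:
    """Encode raw image bytes as a base64 data URL."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type from {filename!r}")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def image_part(data: bytes, filename: str = "image.jpg") -> dict:
    """Build the image_url content part used in the vision request."""
    return {"type": "image_url", "image_url": {"url": image_data_url(data, filename)}}


# Placeholder bytes stand in for a real file read with open(path, "rb").read().
part = image_part(b"abc", "photo.jpg")
```

The returned dictionary takes the place of the `image_url` entry in the `content` array of the request.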
Function / tool calling
Tool calling is supported on Llama 3.1 8B and Gemma 4 31B and uses the standard OpenAI tool-calling schema:
# Reuses the `client` created in the Streaming example.
resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "What's the weather in SF?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "unit": {"type": "string", "enum": ["c", "f"]}
                },
                "required": ["city"]
            }
        }
    }],
)

# tool_calls is None when the model answers directly, hence the `or []`.
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
JSON mode
Force JSON output with response_format:
import json

# Reuses the `client` created in the Streaming example.
resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[
        {"role": "system", "content": "Reply only as JSON: {\"summary\": \"...\"}"},
        {"role": "user", "content": "Summarize: quantum entanglement"}
    ],
    response_format={"type": "json_object"},
)

# The reply is a JSON string; decode it into a Python dict.
data = json.loads(resp.choices[0].message.content)
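Even with JSON mode enabled, a reply can fail to parse, for example if it is cut off when `max_tokens` is reached. A defensive parse is sketched below; `parse_json_reply` is a hypothetical helper, and returning `None` on failure is just one possible policy.

```python
import json
from typing import Any, Optional


def parse_json_reply(content: Optional[str]) -> Optional[dict[str, Any]]:
    """Return the decoded object, or None if the reply is empty, malformed, or not an object."""
    if not content:
        return None
    try:
        data = json.loads(content)
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None


ok = parse_json_reply('{"summary": "Entangled particles share a joint state."}')
truncated = parse_json_reply('{"summary": "Entangled partic')
```

When the result is `None`, a caller might retry the request with a larger `max_tokens` or surface the raw text for inspection.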
Errors
| HTTP | Meaning | What to do |
|---|---|---|
| 401 | Bad or missing API key | Check header, regenerate key |
| 402 | insufficient_balance or model_not_priced | Balance can't cover the request, or the model doesn't have pricing set yet; add funds or choose a priced model |
| 404 | Unknown model | Check the model ID spelling |
| 429 | Rate limited | Back off and retry |
| 503 | All regions unavailable | Rare; retry after a few seconds |
See Errors for the full error reference.
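The "back off and retry" advice for 429 and 503 can be sketched as exponential backoff with jitter. Everything named here is an assumption for illustration: `ApiError` is a hypothetical exception carrying the HTTP status, `with_retries` is a hypothetical wrapper, and the delay values are arbitrary starting points rather than documented limits.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Statuses the error table marks as worth retrying.
RETRYABLE = {429, 503}


class ApiError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed call."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def with_retries(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on 429/503 with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except ApiError as err:
            is_last = attempt == max_attempts - 1
            if err.status not in RETRYABLE or is_last:
                raise  # 401/402/404 will not succeed on retry
            # Roughly 1s, 2s, 4s, ... plus jitter to avoid synchronized retries.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay / 2))
    raise RuntimeError("unreachable")
```

If you use the `openai` Python SDK, you would catch its own exception types and read their status code instead of the hypothetical `ApiError`; the `sleep` parameter exists only so the wrapper can be tested without waiting.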