Chat completions

Text and vision LLMs are served at POST /v1/chat/completions with the standard OpenAI schema.

Basic request

curl https://api.ecohash.com/v1/chat/completions \
  -H "Authorization: Bearer eco_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain retrieval-augmented generation in 3 sentences."}
    ]
  }'
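Because the endpoint follows the OpenAI schema, the reply text should be found under `choices[0].message.content`. The sketch below reads it from a sample body; the `id`, token counts, and reply text are invented for illustration, and `extract_reply` is a hypothetical helper rather than part of any SDK.

```python
import json

# Illustrative response body in the OpenAI chat-completions shape.
# The id, token counts, and text are made up for this example.
SAMPLE_RESPONSE = json.loads("""
{
  "id": "chatcmpl-example",
  "object": "chat.completion",
  "model": "meta-llama/Llama-3.1-8B-Instruct",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "RAG retrieves documents and feeds them to the model."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 25, "completion_tokens": 12, "total_tokens": 37}
}
""")


def extract_reply(response: dict) -> tuple[str, str]:
    """Return (assistant text, finish_reason) from the first choice."""
    choice = response["choices"][0]
    return choice["message"]["content"] or "", choice["finish_reason"]


text, finish_reason = extract_reply(SAMPLE_RESPONSE)
print(text)
print(finish_reason)  # "length" would mean max_tokens cut the reply short
```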

Parameters

All standard OpenAI chat.completions parameters are supported:

Parameter	Type	Notes
`model`	string	Required. Model ID from the catalog
`messages`	array	Required. `{role, content}` objects
`max_tokens`	integer	Max completion length (defaults to model's remaining budget)
`temperature`	number	0–2. Defaults to 1.0
`top_p`	number	0–1. Nucleus sampling
`stream`	boolean	Stream the response as server-sent events (SSE)
`stop`	string or array	Stop sequences
`presence_penalty`	number	-2 to 2. Penalty on repeated topics
`frequency_penalty`	number	-2 to 2. Penalty on repeated tokens
`seed`	integer	Deterministic output (when supported by the model)
`response_format`	object	`{"type": "json_object"}` for JSON mode; supported on most models
`tools` / `tool_choice`	array / string	Function calling; supported on Llama 3.1 8B and Gemma 4 31B
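The numeric ranges in the table can be checked on the client before a request is sent. This is a minimal sketch: `check_params` and `RANGES` are hypothetical names, and the server remains the authority on what it accepts.

```python
# Ranges copied from the parameter table; the checker itself is a
# hypothetical client-side helper, not part of the API.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}


def check_params(params: dict) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    for required in ("model", "messages"):
        if required not in params:
            problems.append(f"missing required parameter: {required}")
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name}={value} is outside {low}..{high}")
    return problems


request = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Hi"}],
    "temperature": 2.5,  # deliberately out of range
}
problems = check_params(request)
print(problems)
```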

Streaming

from openai import OpenAI
client = OpenAI(api_key="eco_...", base_url="https://api.ecohash.com/v1")

stream = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Tell a short joke"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
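If the full text is needed after streaming, the deltas can be accumulated as they arrive. The sketch below assumes chunks shaped like those in the loop above; `collect_stream` and `fake_chunk` are hypothetical helpers, and the fake stream stands in for a real network response so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(stream) -> str:
    """Join the text deltas of a streamed completion into one string."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # OpenAI-style streams may include a chunk with no choices
            continue
        content = chunk.choices[0].delta.content
        if content:  # role-only and final chunks carry no text
            parts.append(content)
    return "".join(parts)


def fake_chunk(content):
    """Build a stand-in object shaped like a streamed chunk (demo helper)."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [
    fake_chunk(None),
    fake_chunk("Why did "),
    fake_chunk("the chicken..."),
    SimpleNamespace(choices=[]),
]
full_text = collect_stream(fake_stream)
print(full_text)
```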

Vision (image input)

For Qwen2.5-VL-7B-Instruct, pass images in the message content:

curl https://api.ecohash.com/v1/chat/completions \
  -H "Authorization: Bearer eco_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-VL-7B-Instruct",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
      ]
    }]
  }'

You can also pass images as base64 data URLs: "url": "data:image/jpeg;base64,<base64 bytes>".
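A data URL of that form can be built from raw image bytes with the standard library. This is a sketch under the assumption that the MIME type can be guessed from the file name; `image_to_data_url` and `image_part` are hypothetical helper names.

```python
import base64
import mimetypes


def image_to_data_url(data: bytes, filename: str = "photo.jpg") -> str:
    """Encode raw image bytes as a base64 data URL."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type from {filename!r}")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def image_part(url: str) -> dict:
    """Wrap a URL (http(s) or data URL) as an image_url content part."""
    return {"type": "image_url", "image_url": {"url": url}}


# The first three bytes of a JPEG file, used here as stand-in image data.
url = image_to_data_url(b"\xff\xd8\xff", "photo.jpg")
part = image_part(url)
print(part)
```

In real use the bytes would come from something like `open(path, "rb").read()`, and `part` would go into the `content` array next to the text part.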

Function / tool calling

Tool calling is supported on Llama 3.1 8B and Gemma 4 31B and uses the standard OpenAI tool-calling schema:

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "What's the weather in SF?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "unit": {"type": "string", "enum": ["c", "f"]}
                },
                "required": ["city"]
            }
        }
    }],
)
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
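Once the model returns tool calls, the usual OpenAI-style flow is to run each function locally and send the results back as `tool` messages. The sketch below assumes that flow applies here; the local `get_weather` implementation, its canned data, and the `run_tool_calls` helper are all hypothetical, and a fake call object replaces the network response.

```python
import json
from types import SimpleNamespace


def get_weather(city: str, unit: str = "c") -> dict:
    """Hypothetical local implementation; returns canned data."""
    return {"city": city, "unit": unit, "temperature": 18}


TOOLS = {"get_weather": get_weather}


def run_tool_calls(tool_calls) -> list[dict]:
    """Execute each requested tool and build the `tool` messages to send back."""
    messages = []
    for call in tool_calls or []:
        args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
        result = TOOLS[call.function.name](**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,  # ties the result to the request
            "content": json.dumps(result),
        })
    return messages


# Stand-in for an entry of resp.choices[0].message.tool_calls.
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather", arguments='{"city": "SF", "unit": "f"}'),
)
tool_messages = run_tool_calls([fake_call])
print(tool_messages)
```

To continue the conversation, append the assistant message that contained the tool calls, then these `tool` messages, and call `client.chat.completions.create` again with the extended `messages` list.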

JSON mode

Force JSON output with response_format:

import json

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[
        {"role": "system", "content": "Reply only as JSON: {\"summary\": \"...\"}"},
        {"role": "user", "content": "Summarize: quantum entanglement"}
    ],
    response_format={"type": "json_object"},
)
# The JSON object arrives as a string in message.content
data = json.loads(resp.choices[0].message.content)
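JSON mode can still yield unparseable text, for example when `max_tokens` truncates the reply. A defensive parse might look like the sketch below; `parse_json_reply` is a hypothetical helper, and treating `finish_reason == "length"` as a truncation signal assumes the OpenAI convention holds for this endpoint.

```python
import json


def parse_json_reply(content, finish_reason="stop") -> dict:
    """Parse a JSON-mode reply, raising ValueError with a clear reason on failure."""
    if finish_reason == "length":
        raise ValueError("reply was cut off by max_tokens; JSON is likely incomplete")
    try:
        data = json.loads(content or "")
    except json.JSONDecodeError as exc:
        raise ValueError(f"model did not return valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data


parsed = parse_json_reply('{"summary": "Entangled particles share a joint state."}')
print(parsed["summary"])

try:
    parse_json_reply('{"summary": "Entangled part', finish_reason="length")
except ValueError as err:
    truncated_error = str(err)
    print(truncated_error)
```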

Errors

HTTP	Meaning	What to do
401	Bad or missing API key	Check header, regenerate key
402	`insufficient_balance` or `model_not_priced`	Balance can't cover the request, or the model doesn't have pricing set yet; top up or choose another model
404	Unknown model	Check the model ID spelling
429	Rate limited	Back off and retry
503	All regions unavailable	Rare; retry after a few seconds

See Errors for the full error reference.
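For the two retryable statuses in the table (429 and 503), "back off and retry" is commonly done with exponential backoff plus jitter. The sketch below is one way to do it, not a prescribed policy: `with_retries` is a hypothetical helper, the delays are arbitrary defaults, and it assumes the raised exception exposes a `status_code` attribute, as the `openai` client's status errors do.

```python
import random
import time

RETRYABLE = {429, 503}  # rate limited, all regions unavailable


def with_retries(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call(); on a retryable HTTP status, wait and try again."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            if status not in RETRYABLE or attempt == max_attempts - 1:
                raise  # 401, 402, 404 and the final failure are not retried
            # Exponential backoff (1s, 2s, 4s, ...) plus up to base_delay of jitter.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))


class FakeRateLimit(Exception):
    """Stand-in for a 429 error so the demo runs without a network call."""
    status_code = 429


attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise FakeRateLimit("slow down")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = with_retries(flaky, sleep=delays.append)
print(result, len(delays))
```

In real use, `call` would be something like `lambda: client.chat.completions.create(...)`. The `openai` client also accepts a `max_retries` setting, which may be sufficient on its own.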
