Chat completions

OpenAI-compatible chat endpoint. Works with any text or vision LLM — platform models and your own user inference instances alike.

POST https://api.ecohash.com/v1/chat/completions

Headers

Header	Value
`Authorization`	`Bearer eco_YOUR_KEY`
`Content-Type`	`application/json`

Request body

Field	Type	Required	Notes
`model`	string	yes	Model ID — either a platform model (e.g. `meta-llama/Llama-3.1-8B-Instruct`) or a user inference instance (`<name>:<id>` format)
`messages`	array	yes	`{role, content}` objects
`max_tokens`	integer	no	Upper bound on completion length
`temperature`	number	no	0–2, default 1.0
`top_p`	number	no	0–1
`stream`	bool	no	`true` for SSE streaming
`stop`	string or array	no	Stop sequences
`presence_penalty`	number	no	-2 to 2
`frequency_penalty`	number	no	-2 to 2
`seed`	integer	no	Deterministic sampling (when supported)
`response_format`	object	no	`{"type": "json_object"}` for JSON mode
`tools`	array	no	Function-calling tool definitions
`tool_choice`	string or object	no	`"auto"`, `"none"`, or a specific tool
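
The ranges in the table can be checked client-side before a request is sent. The sketch below is one way to do that in Python; `validate_chat_request` is a hypothetical helper, not part of any EcoLink SDK, and it assumes the documented bounds are inclusive. The server remains the authority on what it accepts.

```python
from typing import Any

# Numeric ranges copied from the request-body table (assumed inclusive).
_RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}


def validate_chat_request(body: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a request body (empty if none)."""
    problems = []
    # `model` and `messages` are the only required fields.
    for field in ("model", "messages"):
        if field not in body:
            problems.append(f"missing required field: {field}")
    if "messages" in body and not isinstance(body["messages"], list):
        problems.append("messages must be an array")
    # Optional numeric fields are checked only when present.
    for field, (low, high) in _RANGES.items():
        if field not in body:
            continue
        value = body[field]
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not low <= value <= high:
            problems.append(f"{field} must be a number between {low} and {high}")
    return problems
```

Calling `validate_chat_request({"model": "m", "messages": []})` returns an empty list, while a body with `"temperature": 3` yields one problem string for that field.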

Example request

curl https://api.ecohash.com/v1/chat/completions \
  -H "Authorization: Bearer eco_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a concise assistant."},
      {"role": "user", "content": "What is EcoLink?"}
    ],
    "max_tokens": 200,
    "temperature": 0.7
  }'
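
The same request can be built in Python with only the standard library. This is a minimal sketch mirroring the curl call above; `build_chat_request` and `send` are illustrative names, and the 60-second timeout is an arbitrary assumption.

```python
import json
import urllib.request

API_URL = "https://api.ecohash.com/v1/chat/completions"


def build_chat_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat endpoint."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_chat_request(
    "eco_YOUR_KEY",
    {
        "model": "meta-llama/Llama-3.1-8B-Instruct",
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": "What is EcoLink?"},
        ],
        "max_tokens": 200,
        "temperature": 0.7,
    },
)
# reply = send(request)  # uncomment to call the API with a real key
```

Because the endpoint is OpenAI-compatible, an OpenAI client library pointed at this base URL may also work, but that is not confirmed here.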

Response body

Standard OpenAI schema:

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1776391234,
  "model": "meta-llama/Llama-3.1-8B-Instruct",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "EcoLink is a GPU cloud platform ..."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 24,
    "total_tokens": 42
  }
}
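
Reading the reply out of this structure takes a few lookups. The sketch below assumes a single choice, as in the sample above; `extract_reply` is a hypothetical helper name.

```python
def extract_reply(response: dict) -> tuple[str, str, int]:
    """Return (assistant text, finish reason, total tokens) from a response."""
    choice = response["choices"][0]  # the sample response has one choice
    return (
        choice["message"]["content"],
        choice["finish_reason"],
        response["usage"]["total_tokens"],
    )


# The sample response from above, as a Python dict.
sample = {
    "id": "chatcmpl-...",
    "object": "chat.completion",
    "created": 1776391234,
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "content": "EcoLink is a GPU cloud platform ...",
        },
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 18, "completion_tokens": 24, "total_tokens": 42},
}

text, finish_reason, total_tokens = extract_reply(sample)
```

In the OpenAI schema, a `finish_reason` of `"length"` rather than `"stop"` usually means the reply was cut off by `max_tokens`, so it is worth checking before using the text.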

Streaming

With `"stream": true`, the response is a sequence of `data: <json>\n\n` server-sent events:

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant","content":""},"index":0}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},"index":0}]}

data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"content":" there"},"index":0}]}

data: [DONE]
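
A client reassembles the reply by concatenating the `content` of each chunk's `delta` until it sees `[DONE]`. The following is a minimal sketch of that loop over already-decoded text lines; `iter_content_deltas` is an illustrative name, and it assumes each event fits on a single `data:` line, as in the sample above.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by a stream of SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            fragment = choice.get("delta", {}).get("content")
            if fragment:  # the first chunk carries only the role
                yield fragment


# The sample stream from above, one list entry per line.
sample_stream = [
    'data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant","content":""},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},"index":0}]}',
    "",
    'data: {"id":"chatcmpl-...","object":"chat.completion.chunk","choices":[{"delta":{"content":" there"},"index":0}]}',
    "",
    "data: [DONE]",
]

reply = "".join(iter_content_deltas(sample_stream))
```

With a live connection, the same function can be fed the response's lines after decoding them from bytes to text.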

Vision input

For vision LLMs, images go in the `content` array as `image_url` parts:

{
  "model": "Qwen/Qwen2.5-VL-7B-Instruct",
  "messages": [{
    "role": "user",
    "content": [
      {"type": "text", "text": "What is in this image?"},
      {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}}
    ]
  }]
}

`url` can be either an `https://` URL or a data URL: `data:image/jpeg;base64,<base64-bytes>`.
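
For a local image, the data URL can be assembled from the file's bytes. This is a sketch using Python's standard `base64` module; `image_data_url` and `vision_message` are hypothetical helpers. Only `image/jpeg` appears in the format above, so the `mime_type` parameter is an assumption and other image types may or may not be accepted.

```python
import base64


def image_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def vision_message(prompt: str, image_url: str) -> dict:
    """Build a user message with one text part and one image part."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }
```

For example, `vision_message("What is in this image?", image_data_url(open("photo.jpg", "rb").read()))` produces a message that can be placed in the `messages` array, where `photo.jpg` stands in for any local file.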

Response headers

Header	Meaning
`x-ecolink-region`	Region that served this request
`x-ecolink-request-id`	Unique ID for this request (include in support requests)

Error responses

See Errors for the full list. The most common status codes:

HTTP	Meaning
401	Bad or missing API key
402	Balance insufficient or model not priced
404	Unknown model ID
429	Rate limited
503	No healthy region for this model
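
A client can turn these codes into actionable messages and decide which ones to retry. The sketch below treats 429 and 503 as transient and the rest as permanent; that split, the exponential backoff schedule, and the helper names are assumptions, since no retry policy is documented here.

```python
# Meanings copied from the table above.
ERROR_MEANINGS = {
    401: "Bad or missing API key",
    402: "Balance insufficient or model not priced",
    404: "Unknown model ID",
    429: "Rate limited",
    503: "No healthy region for this model",
}

# Assumption: only these two codes are worth retrying.
RETRYABLE = {429, 503}


def is_retryable(status: int) -> bool:
    """Return True if a later retry might succeed."""
    return status in RETRYABLE


def describe_error(status: int, headers: dict) -> str:
    """Format an error line that includes the request ID for support."""
    meaning = ERROR_MEANINGS.get(status, "Unrecognized error")
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    request_id = lowered.get("x-ecolink-request-id", "unknown")
    return f"HTTP {status}: {meaning} (request ID: {request_id})"


def backoff_delays(max_retries: int, base_seconds: float = 1.0) -> list[float]:
    """Return exponentially growing wait times, one per retry attempt."""
    return [base_seconds * 2 ** attempt for attempt in range(max_retries)]
```

A retry loop would sleep for each value from `backoff_delays` in turn while `is_retryable` holds, and surface `describe_error` to the user otherwise.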
