Platform Models

Platform models are inference services EcoLink runs for you — always on, OpenAI-compatible, billed per-request. Nothing to deploy, no GPU to manage; just hit the API with your key.

What's available

Text LLMs — Llama 3.1 8B Instruct, Gemma 4 31B IT
Vision LLMs — Qwen2.5-VL-7B-Instruct
Embeddings — text-embedding models for search, RAG, clustering
Reranker — cross-encoder reranker for search pipelines
Image generation — FLUX.1 Schnell (text → image)
Text-to-speech — Kokoro-82M (a dozen natural voices)
Speech-to-text — Whisper Large v3
Video generation — Wan2.1 T2V 1.3B (text → short video, async)

See the full list in the Model catalog with context windows, pricing, and region availability.

How to call

Every platform model uses an OpenAI-compatible endpoint at https://api.ecohash.com/v1/...:

Category	Endpoint
Chat / vision LLM	`POST /v1/chat/completions`
Embeddings	`POST /v1/embeddings`
Reranker	`POST /v1/rerank`
Image generation	`POST /v1/images/generations`
Text-to-speech	`POST /v1/audio/speech`
Speech-to-text	`POST /v1/audio/transcriptions`
Video generation	`POST /v1/video/generations` (async — returns a job, poll for result)

Authentication is a standard Authorization: Bearer eco_... header. See API Keys.

How routing works

When you call /v1/chat/completions with model: "meta-llama/Llama-3.1-8B-Instruct", EcoLink routes your request to the healthiest region serving that model. You don't pick a region — the platform does it for you based on current load and region health.

Regional routing is transparent; the only thing you'll see is the x-ecolink-region header on the response indicating where it was served.

Unpriced models

If you try to call a model that's registered but doesn't yet have pricing set, you'll get:

HTTP 402 Payment Required
{ "error": "model_not_priced: <model_id> is registered but not yet priced — contact admin to set pricing before use" }

Ping the #ecolink-support Slack channel and we'll set pricing.

What about my own models?

If you want to deploy your own model (HuggingFace checkpoint, fine-tune, custom container), that's user inference — a separate feature where you launch your own inference endpoint, get a unique URL, and call it via the same API key.

What's available​

How to call​

How routing works​

Unpriced models​

What about my own models?​

Next steps​

What's available

How to call

How routing works

Unpriced models

What about my own models?

Next steps