Model catalog

All models are accessible at https://api.ecohash.com/v1/... using your API key. Pass the model ID in the model field of your request.

note

The model list evolves. Run GET https://api.ecohash.com/platform/models for a JSON snapshot of what's currently deployed with live pricing.

Text LLMs

`meta-llama/Llama-3.1-8B-Instruct`

General-purpose chat model, fast and cheap.

Context: 128K tokens
Endpoint: POST /v1/chat/completions
Regions: multi-region
Autoscaling: yes
Best for: everyday chat, summarization, extraction, simple reasoning

`google/gemma-4-31b-it`

Larger, higher-quality chat model. Slower and more expensive than Llama 3.1 8B.

Context: 128K tokens
Endpoint: POST /v1/chat/completions
Regions: multi-region (large-VRAM GPU required)
Autoscaling: yes
Best for: reasoning-heavy tasks where you need more headroom than Llama 3.1 8B

Vision LLMs

`Qwen/Qwen2.5-VL-7B-Instruct`

Vision + language. Pass images as base64 or URLs in the chat messages.

Context: 32K tokens
Endpoint: POST /v1/chat/completions
Regions: mv (bf16), tn (fp8), hk (fp8)
Autoscaling: yes
Best for: image captioning, OCR-like tasks, visual QA

Embeddings

`text-embedding-...`

Endpoint: POST /v1/embeddings
Regions: multi-region
Output: 1024-dimensional float32 vectors
Input: single string or array of strings, up to 8192 tokens each

Reranker

`BAAI/bge-reranker-...`

Cross-encoder reranker. Given a query and candidate documents, returns a relevance score per document.

Endpoint: POST /v1/rerank (non-OpenAI extension, but the shape mirrors Cohere's /rerank)
Regions: multi-region

Image generation

`flux-1-schnell`

Fast 4-step diffusion model from Black Forest Labs. Great quality for the speed.

Endpoint: POST /v1/images/generations
Sizes: 512×512, 768×768, 1024×1024, 1024×768, 768×1024
Regions: multi-region
Autoscaling: fixed 1 replica per region

Audio

`large-v3` (Whisper)

Speech-to-text. Handles a wide range of languages and accents.

Endpoint: POST /v1/audio/transcriptions
Input: audio file (mp3, wav, m4a, webm, ogg)
Regions: multi-region
Max duration: 30 minutes per request

`kokoro-82m` (Kokoro TTS)

Text-to-speech. ~12 natural voices available via voice parameter.

Endpoint: POST /v1/audio/speech
Input: text up to 4000 characters
Output: mp3 (default) or wav
Voices: af_bella, af_nicole, af_sarah, af_sky, am_adam, am_michael, bf_emma, bf_isabella, bm_george, bm_lewis and more (see TTS guide)
Regions: multi-region

Video generation

`wan21-t2v-1-3b`

Text-to-video. Short clips (typically 2–4 seconds, 16 fps, up to 480p).

Endpoint: POST /v1/video/generations (async — see Video generation guide)
Regions: mv
Autoscaling: fixed 1 replica

Availability note

A region being listed above means the model is routinely served there. Actual routing depends on real-time load and health — we may route your request to a different region than your "nearest" one. The response header x-ecolink-region tells you which region served each request.

If a model is temporarily unavailable in all regions (rare), you get 503 Service Unavailable and can retry. If you get 404 Not Found, the model ID isn't registered — double-check the spelling against GET /platform/models.

Text LLMs​

meta-llama/Llama-3.1-8B-Instruct​

google/gemma-4-31b-it​

Vision LLMs​

Qwen/Qwen2.5-VL-7B-Instruct​

Embeddings​

text-embedding-...​

Reranker​

BAAI/bge-reranker-...​

Image generation​

flux-1-schnell​

Audio​

large-v3 (Whisper)​

kokoro-82m (Kokoro TTS)​

Video generation​

wan21-t2v-1-3b​

Availability note​