Skip to main content

Model catalog

All models are accessible at https://api.ecohash.com/v1/... using your API key. Pass the model ID in the model field of your request.

note

The model list evolves. Run GET https://api.ecohash.com/platform/models for a JSON snapshot of what's currently deployed with live pricing.

Text LLMs

meta-llama/Llama-3.1-8B-Instruct

General-purpose chat model, fast and cheap.

  • Context: 128K tokens
  • Endpoint: POST /v1/chat/completions
  • Regions: multi-region
  • Autoscaling: yes
  • Best for: everyday chat, summarization, extraction, simple reasoning

google/gemma-4-31b-it

Larger, higher-quality chat model. Slower and more expensive than Llama 3.1 8B.

  • Context: 128K tokens
  • Endpoint: POST /v1/chat/completions
  • Regions: multi-region (large-VRAM GPU required)
  • Autoscaling: yes
  • Best for: reasoning-heavy tasks where you need more headroom than Llama 3.1 8B

Vision LLMs

Qwen/Qwen2.5-VL-7B-Instruct

Vision + language. Pass images as base64 or URLs in the chat messages.

  • Context: 32K tokens
  • Endpoint: POST /v1/chat/completions
  • Regions: mv (bf16), tn (fp8), hk (fp8)
  • Autoscaling: yes
  • Best for: image captioning, OCR-like tasks, visual QA

Embeddings

text-embedding-...

  • Endpoint: POST /v1/embeddings
  • Regions: multi-region
  • Output: 1024-dimensional float32 vectors
  • Input: single string or array of strings, up to 8192 tokens each

Reranker

BAAI/bge-reranker-...

Cross-encoder reranker. Given a query and candidate documents, returns a relevance score per document.

  • Endpoint: POST /v1/rerank (non-OpenAI extension, but the shape mirrors Cohere's /rerank)
  • Regions: multi-region

Image generation

flux-1-schnell

Fast 4-step diffusion model from Black Forest Labs. Great quality for the speed.

  • Endpoint: POST /v1/images/generations
  • Sizes: 512×512, 768×768, 1024×1024, 1024×768, 768×1024
  • Regions: multi-region
  • Autoscaling: fixed 1 replica per region

Audio

large-v3 (Whisper)

Speech-to-text. Handles a wide range of languages and accents.

  • Endpoint: POST /v1/audio/transcriptions
  • Input: audio file (mp3, wav, m4a, webm, ogg)
  • Regions: multi-region
  • Max duration: 30 minutes per request

kokoro-82m (Kokoro TTS)

Text-to-speech. ~12 natural voices available via voice parameter.

  • Endpoint: POST /v1/audio/speech
  • Input: text up to 4000 characters
  • Output: mp3 (default) or wav
  • Voices: af_bella, af_nicole, af_sarah, af_sky, am_adam, am_michael, bf_emma, bf_isabella, bm_george, bm_lewis and more (see TTS guide)
  • Regions: multi-region

Video generation

wan21-t2v-1-3b

Text-to-video. Short clips (typically 2–4 seconds, 16 fps, up to 480p).

  • Endpoint: POST /v1/video/generations (async — see Video generation guide)
  • Regions: mv
  • Autoscaling: fixed 1 replica

Availability note

A region being listed above means the model is routinely served there. Actual routing depends on real-time load and health — we may route your request to a different region than your "nearest" one. The response header x-ecolink-region tells you which region served each request.

If a model is temporarily unavailable in all regions (rare), you get 503 Service Unavailable and can retry. If you get 404 Not Found, the model ID isn't registered — double-check the spelling against GET /platform/models.