Playground
The playground lives in the console at console.ecohash.com, under the Playground item in the sidebar. It's a single page with a tab for each model category (Chat, Vision, Image, Speech, Video, Embeddings, Reranker). Every call from the playground is billed to your account exactly like an API call; only the interface differs.
Chat playground
Test text and vision LLMs.
Features:
- Model selector (Llama 3.1 8B, Gemma 4 31B, Qwen2.5-VL, or any of your user inference instances)
- System prompt field
- Message history
- Streaming token display
- Token count and latency metrics (TTFT / TPOT shown in the corner)
- Drag-and-drop images for vision models
- Copy button for the last response
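The TTFT and TPOT figures can be reproduced from token arrival times when you stream responses in your own code. The following is a minimal sketch, assuming the common definitions: TTFT (time to first token) is the delay between sending the request and receiving the first token, and TPOT (time per output token) is the mean gap between the tokens that follow. The playground's exact formula is not documented here, and `latency_metrics` is a hypothetical helper name.

```python
from typing import Sequence


def latency_metrics(request_sent: float, token_times: Sequence[float]) -> dict:
    """Compute TTFT and TPOT (both in seconds) from arrival timestamps.

    request_sent: time the request was sent.
    token_times: arrival time of each streamed token, in order.
    """
    if not token_times:
        raise ValueError("no tokens received")
    ttft = token_times[0] - request_sent
    # TPOT averages the time spent on every token after the first one.
    if len(token_times) > 1:
        tpot = (token_times[-1] - token_times[0]) / (len(token_times) - 1)
    else:
        tpot = 0.0  # a single token has no inter-token gap
    return {"ttft": ttft, "tpot": tpot}


# Made-up timestamps: first token at 0.5 s, then one token every 0.25 s.
metrics = latency_metrics(0.0, [0.5, 0.75, 1.0, 1.25])
print(metrics)
```

In real use, the timestamps would come from something like `time.perf_counter()` called as each streamed chunk arrives.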
Typical flow:
- Pick a model
- Set a system prompt
- Send a message
- Iterate on the prompt / parameters
- Once it works, copy the equivalent curl, Python, or TypeScript snippet from the Code tab
When the backend is briefly saturated under heavy load, the playground shows a soft notice ("Warning: backend is experiencing too many concurrent inference requests now, retry shortly") rather than a hard error. Wait a moment and resend the message.
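In application code, the same "resend shortly" advice is usually automated with a retry loop. The following is a sketch only: how the API signals a busy backend is not described on this page, so `BackendBusy` is a hypothetical exception that your own client code would raise when it sees that condition, and `call_with_retry` is an illustrative helper, not part of any SDK.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class BackendBusy(Exception):
    """Hypothetical error raised by your client when the backend is busy."""


def call_with_retry(
    send: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(), retrying with exponential backoff on BackendBusy."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return send()
        except BackendBusy:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...
    raise RuntimeError("unreachable")


# Demo with a stand-in request that is busy twice, then succeeds.
calls = {"n": 0}


def flaky_send() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise BackendBusy("too many concurrent inference requests")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = call_with_retry(flaky_send, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the demo instant; in production the default `time.sleep` applies the real delays.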
Omni & multimodal models
Some chat models accept more than text and can reply with more than text. The Chat playground adapts to whatever the selected model supports — the attach options and the speech toggle only appear when the model actually handles them.
Multimodal input. When the selected model accepts images, audio, or video, an attach control appears next to the message box. Add a file (or several) alongside your text prompt and send — the model sees them together. The attach options shown are exactly the input types the model declares: a vision model offers image attach; an omni model additionally offers audio and video.
Audio output (speech). When the model can speak, a speaker toggle (🔊) appears. Turn it on and the model's reply comes back as audio in addition to text — the response renders with an inline audio player you can play back right in the page. Leave it off for a normal text reply.
Which models? The model selector lists every chat model your account can reach, including vision and omni models. Pick one and the controls reconfigure to match its capabilities — there's nothing to configure manually.
Audio and video count toward the request's cost the same as in an API call: multimodal inputs are billed as input tokens, and generated speech is billed by its duration. See How billing works for the per-request model.
Image playground
Test FLUX.1 Schnell (and any of your user inference instances deployed in the image category).
Features:
- Prompt input (multiline, up to 1000 characters)
- Size selector (512, 768, 1024, portrait, landscape)
- Seed field for reproducibility
- Generated image displays with right-click-to-save
- History sidebar showing previous generations
Embeddings playground
Inspect what an embedding model actually produces.
- Paste text
- See the vector preview (first 16 dimensions as a sparkline, full vector downloadable as JSON)
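A common next step with downloaded vectors is to compare two texts by cosine similarity. The sketch below assumes each downloaded JSON file contains a flat array of floats; the actual download format may differ, and the vectors shown are made-up 4-dimensional stand-ins rather than real model output.

```python
import json
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Stand-ins for downloaded files, assumed to be flat JSON arrays.
vec_a = json.loads("[1.0, 0.0, 2.0, 0.0]")
vec_b = json.loads("[2.0, 0.0, 4.0, 0.0]")  # same direction as vec_a
vec_c = json.loads("[0.0, 3.0, 0.0, 1.0]")  # orthogonal to vec_a

sim_ab = cosine_similarity(vec_a, vec_b)
sim_ac = cosine_similarity(vec_a, vec_c)
preview = vec_a[:16]  # same idea as the 16-dimension preview in the UI
print(sim_ab, sim_ac, preview)
```

Values near 1 indicate texts the model considers similar; values near 0 indicate unrelated texts.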
Reranker playground
Test the reranker with a query and candidate documents.
- Paste a query
- Paste a numbered list of candidate documents
- See the re-ranked order with scores
Useful for checking that a reranker actually helps your retrieval quality before plumbing it into production.
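One way to make "actually helps" measurable is to compare where a known-relevant document lands before and after reranking. The sketch below uses reciprocal rank for that comparison; the scores are made-up placeholders for values you would read off the playground, and `rerank` and `reciprocal_rank` are illustrative helper names.

```python
from typing import List, Sequence


def rerank(docs: Sequence[str], scores: Sequence[float]) -> List[str]:
    """Return docs sorted by score, highest first (ties keep input order)."""
    if len(docs) != len(scores):
        raise ValueError("docs and scores must have the same length")
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in order]


def reciprocal_rank(ranking: Sequence[str], relevant: str) -> float:
    """1 / position of the relevant document, or 0.0 if it is absent."""
    for position, doc in enumerate(ranking, start=1):
        if doc == relevant:
            return 1.0 / position
    return 0.0


# Candidates in their original retrieval order, with made-up reranker scores.
candidates = ["doc about billing", "doc about GPUs", "doc about rerankers"]
scores = [0.12, 0.05, 0.91]
relevant = "doc about rerankers"

reranked = rerank(candidates, scores)
before = reciprocal_rank(candidates, relevant)
after = reciprocal_rank(reranked, relevant)
print(reranked, before, after)
```

Averaging this value over a set of test queries gives mean reciprocal rank, which can be compared with and without the reranker.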
Audio playground
Text-to-speech (TTS) and speech-to-text (STT) on one page.
- TTS tab: pick a voice, type text, play / download the MP3.
- STT tab: drag-and-drop or record audio in-browser, see the transcript with timestamps.
Cost of playground use
Every playground call is a real API call and is billed the same way. A typical chat message with a small LLM costs a fraction of a cent, an image generation costs a few cents, and a video generation takes one to two minutes and costs a few cents.
User inference instances in the playground
Once you launch your own inference instance, it appears as an additional model option in the chat / image / embeddings / reranker playgrounds (depending on its category). That lets you test your deployed model the same way as a platform model.
Code export
Most playgrounds have a Code tab that shows the exact curl, Python, and TypeScript snippets that reproduce what you just did: model ID, parameters, message history, everything. Copy one into your application to go from "works in the playground" to "works in my code" without rewriting.