GPU Compute
EcoLink's GPU Compute lets you launch containerized GPU environments — like a GPU dev box — and manage them through the console or API. Two flavors:
- GPU Instances — single-pod environments with 1, 2, 4, or 8 GPUs. Think "an EC2 instance but with GPUs and dev-friendly tooling." Browser-based terminal, Jupyter access, attachable storage.
- GPU Clusters — multi-replica Deployments behind a single endpoint URL. Use when you want to run the same container on N GPUs with load-balanced traffic (e.g., a home-grown inference service you want to scale horizontally).
Both run on our multi-region NVIDIA GPU fleet.
Which do I want?
| Use case | Pick |
|---|---|
| Interactive dev / experimentation / training | GPU Instance |
| Jupyter notebook for ML work | GPU Instance with a Jupyter image |
| Terminal shell for running scripts and installing packages | GPU Instance (via browser terminal) |
| Deploy your own inference service behind a managed endpoint | User Inference Instance (not a cluster) — see User Inference |
| Run your own API/backend container on GPU behind a single URL, load-balanced, with autoscaling off | GPU Cluster |
| I just want to call Llama / FLUX / Whisper without deploying anything | Platform Models — see Platform Models |
Default behavior
- `estimated_duration_hours` on launch is a commitment, not a hint. When it elapses, the instance/cluster auto-terminates and we refund unused credit. If you want to keep running, click Extend before expiry (or after, if you're lucky: the watchdog is generous, but don't rely on it).
- Credit hold at launch covers the full estimated duration. If your account balance can't cover it, the launch fails with `402 Payment Required`.
- Auto-stop at duration: when the estimated duration is reached, we kill the pod cleanly and refund the unused portion of the hold.
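
To make the hold-and-refund arithmetic concrete, here is a minimal sketch in Python. It is not EcoLink's billing code: the hourly rate, the linear per-hour proration, and the names `place_hold`, `refund_on_stop`, and `InsufficientCredit` are all assumptions made for illustration. The only parts taken from the behavior described above are that the hold covers the full estimated duration, that a launch the balance cannot cover is rejected (the `402 Payment Required` case), and that the unused portion of the hold is refunded.

```python
from dataclasses import dataclass


class InsufficientCredit(Exception):
    """Hypothetical stand-in for a launch rejected with 402 Payment Required."""


@dataclass
class Hold:
    hourly_rate: float      # assumed price per hour, in credits
    estimated_hours: float  # the estimated_duration_hours given at launch

    @property
    def amount(self) -> float:
        # The hold covers the full estimated duration.
        return self.hourly_rate * self.estimated_hours


def place_hold(balance: float, hourly_rate: float, estimated_hours: float) -> Hold:
    """Reserve credit for the whole estimated duration, or refuse the launch."""
    hold = Hold(hourly_rate, estimated_hours)
    if hold.amount > balance:
        raise InsufficientCredit(
            f"need {hold.amount} credits, balance is {balance}"
        )
    return hold


def refund_on_stop(hold: Hold, hours_used: float) -> float:
    """Return the unused portion of the hold.

    Assumes usage is charged linearly by time and capped at the
    estimated duration, since the pod is stopped at that point.
    """
    billable_hours = min(hours_used, hold.estimated_hours)
    return hold.amount - billable_hours * hold.hourly_rate


if __name__ == "__main__":
    hold = place_hold(balance=10.0, hourly_rate=2.0, estimated_hours=4)
    print("held:", hold.amount)
    print("refund after stopping at 1.5 h:", refund_on_stop(hold, 1.5))
```

Under these assumptions, stopping early is what leaves part of the hold unused. An instance that runs for its full estimated duration uses the entire hold, so the refund is zero.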