GPU Compute

EcoLink's GPU Compute lets you launch containerized GPU environments — like a GPU dev box — and manage them through the console or API. Two flavors:

GPU Instances — single-pod environments with 1, 2, 4, or 8 GPUs. Think "an EC2 instance but with GPUs and dev-friendly tooling." Browser-based terminal, Jupyter access, attachable storage.
GPU Clusters — multi-replica Deployments behind a single endpoint URL. Use when you want to run the same container on N GPUs with load-balanced traffic (e.g., a home-grown inference service you want to scale horizontally).

Both run on our multi-region NVIDIA GPU fleet.

Which do I want?

Use case	Pick
Interactive dev / experimentation / training	GPU Instance
Jupyter notebook for ML work	GPU Instance with a Jupyter image
Terminal shell for running scripts and installing packages	GPU Instance (via browser terminal)
Deploy your own inference service behind a managed endpoint	User Inference Instance (not a cluster) — see User Inference
Run your own API/backend container on GPU behind a single load-balanced URL, with autoscaling turned off	GPU Cluster
I just want to call Llama / FLUX / Whisper without deploying anything	Platform Models — see Platform Models

estimated_duration_hours on launch is a commitment, not a hint. When it elapses, the instance or cluster auto-terminates and we refund any unused credit. To keep running, click Extend before expiry. Extending after expiry may still work if the watchdog has not acted yet, but don't rely on it.
Credit hold at launch covers the full estimated duration. If your account balance can't cover it, the launch fails with 402 Payment Required.
Auto-stop at duration: when the estimated duration is reached, we stop the pod cleanly and refund the unused portion of the hold.
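
The hold-and-refund arithmetic described above can be sketched roughly as follows. This is an illustration only, not EcoLink's billing code: it assumes flat per-GPU-hour pricing and a refund equal to the hold minus usage, and the names estimate_hold, refund_on_stop, and rate_per_gpu_hour are hypothetical. Check real rates and hold amounts in the console before relying on any numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HoldEstimate:
    hold_credits: float   # credit reserved at launch for the full estimated duration
    launch_allowed: bool  # False models the "402 Payment Required" outcome


def estimate_hold(rate_per_gpu_hour, gpu_count, estimated_duration_hours, balance):
    """Estimate the launch hold. Flat per-GPU-hour pricing is an assumption."""
    if gpu_count <= 0 or estimated_duration_hours <= 0:
        raise ValueError("gpu_count and estimated_duration_hours must be positive")
    hold = rate_per_gpu_hour * gpu_count * estimated_duration_hours
    # The launch only succeeds if the balance covers the whole hold.
    return HoldEstimate(hold_credits=hold, launch_allowed=balance >= hold)


def refund_on_stop(hold_credits, rate_per_gpu_hour, gpu_count, hours_used):
    """Unused portion of the hold, assuming usage is billed linearly (an assumption)."""
    used = rate_per_gpu_hour * gpu_count * hours_used
    # Never return a negative refund if usage somehow exceeds the hold.
    return max(hold_credits - used, 0.0)


if __name__ == "__main__":
    # Hypothetical numbers: 4 GPUs, 8 estimated hours, stopped after 3 hours.
    est = estimate_hold(rate_per_gpu_hour=2.5, gpu_count=4,
                        estimated_duration_hours=8, balance=100.0)
    print(est)
    print(refund_on_stop(est.hold_credits, 2.5, 4, hours_used=3))
```

Running an estimate like this before launch makes it easier to pick an estimated_duration_hours value your balance can actually cover, since the hold is taken for the full duration up front.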