
GPU Compute

EcoLink's GPU Compute lets you launch containerized GPU environments — like a GPU dev box — and manage them through the console or API. Two flavors:

  • GPU Instances — single-pod environments with 1, 2, 4, or 8 GPUs. Think "an EC2 instance but with GPUs and dev-friendly tooling." Browser-based terminal, Jupyter access, attachable storage.
  • GPU Clusters — multi-replica Deployments behind a single endpoint URL. Use when you want to run the same container on N GPUs with load-balanced traffic (e.g., a home-grown inference service you want to scale horizontally).

Both run on our multi-region NVIDIA GPU fleet.
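As a rough illustration of the two flavors, the sketch below checks a launch request against the constraints described above: instances are single-pod with 1, 2, 4, or 8 GPUs, and clusters run one or more replicas. Only estimated_duration_hours is a name taken from this page; the LaunchRequest class and the kind, image, gpu_count, and replicas fields are hypothetical and may not match the real API schema.

```python
from dataclasses import dataclass
from typing import List

# GPU counts listed for GPU Instances on this page.
ALLOWED_INSTANCE_GPUS = (1, 2, 4, 8)


@dataclass(frozen=True)
class LaunchRequest:
    # All field names except estimated_duration_hours are hypothetical.
    kind: str                           # "instance" or "cluster"
    image: str                          # container image to run
    gpu_count: int                      # GPUs per pod
    replicas: int = 1                   # only meaningful for clusters
    estimated_duration_hours: float = 1


def validate(req: LaunchRequest) -> List[str]:
    """Return a list of problems; an empty list means the request looks sane."""
    errors = []
    if req.kind not in ("instance", "cluster"):
        errors.append(f"unknown kind: {req.kind!r}")
    if req.kind == "instance":
        if req.gpu_count not in ALLOWED_INSTANCE_GPUS:
            errors.append("instances support 1, 2, 4, or 8 GPUs")
        if req.replicas != 1:
            errors.append("instances are single-pod; use a cluster for replicas")
    if req.kind == "cluster" and req.replicas < 1:
        errors.append("clusters need at least one replica")
    if req.estimated_duration_hours <= 0:
        errors.append("estimated_duration_hours must be positive")
    return errors
```

For example, validate(LaunchRequest("instance", "my-image", 3)) reports the unsupported GPU count, while a 4-GPU instance request passes. The real service may enforce further limits that this page does not describe.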

Which do I want?

Use case | Pick
Interactive dev / experimentation / training | GPU Instance
Jupyter notebook for ML work | GPU Instance with a Jupyter image
Terminal shell for running scripts and installing packages | GPU Instance (via browser terminal)
Deploy your own inference service behind a managed endpoint | User Inference Instance (not a cluster); see User Inference
Run your own API/backend container on GPU behind a single URL, load-balanced, autoscaling off | GPU Cluster
I just want to call Llama / FLUX / Whisper without deploying anything | Platform Models; see Platform Models

Default behavior

  • estimated_duration_hours on launch is a commitment, not a hint. When it elapses, the instance or cluster auto-terminates and we refund any unused credit. To keep running, click Extend before expiry. Extending shortly after expiry sometimes works because the watchdog allows some slack, but don't rely on it.
  • Credit hold at launch covers the full estimated duration. If your account balance can't cover it, the launch fails with 402 Payment Required.
  • Auto-stop at duration: when the estimated duration elapses, we shut the pod down cleanly and refund the unused portion of the hold.
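The hold and refund arithmetic above can be sketched as follows. This is a simplified model, not the billing implementation: it assumes the hold equals a flat hourly rate times estimated_duration_hours and that the refund is linear in unused hours. The function names and the rate of 2.5 credits per hour are made up for illustration.

```python
def credit_hold(hourly_rate: float, estimated_duration_hours: float) -> float:
    """Credits reserved at launch, assuming a flat hourly rate (assumption)."""
    return hourly_rate * estimated_duration_hours


def can_launch(balance: float, hold: float) -> bool:
    """False models the case where the launch fails with 402 Payment Required."""
    return balance >= hold


def refund(hold: float, hourly_rate: float, hours_used: float) -> float:
    """Unused portion of the hold, assuming linear usage (assumption)."""
    used = min(hourly_rate * hours_used, hold)  # never charge more than the hold
    return hold - used


hold = credit_hold(hourly_rate=2.5, estimated_duration_hours=8)
print(hold)                      # 20.0
print(can_launch(15.0, hold))    # False: balance cannot cover the hold
print(refund(hold, 2.5, 3))      # 12.5: stopped after 3 of 8 hours
```

Under this model, an instance that runs for its full estimated duration gets no refund, and one stopped early gets back the hours it did not use.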

Next steps