Uploading files

A few ways to move files into a running GPU instance. Pick based on file size and your workflow.

Small files — paste into the browser terminal

For a requirements.txt, a small Python script, or a short config:

Open the browser terminal.
Run cat > my_file.txt <<'EOF' and press Enter.
Paste your content.
Press Enter, then type EOF on its own line and press Enter again to finish.

Works for anything under a few KB. Awkward for larger files.
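As a concrete sketch of the steps above, the block below writes a two-line config with a here-document. The file contents are made up for illustration. Quoting the delimiter ('EOF') tells the shell not to expand anything in the pasted text, so a literal $HOME stays as typed.

```shell
# Quoted delimiter: the pasted text is written exactly as-is,
# with no variable or command expansion.
cat > my_file.txt <<'EOF'
data_dir = $HOME/data
batch_size = 32
EOF

# Confirm what landed on disk.
cat my_file.txt
```

If you drop the quotes around EOF, the shell substitutes variables such as $HOME before writing, which is rarely what you want when pasting scripts or configs.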

Medium files — `wget` / `curl` from a public URL

# Download a dataset, model checkpoint, or file from a public bucket
wget https://example.com/my-dataset.tar.gz
curl -O https://example.com/my-file.zip

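# From Hugging Face
pip install huggingface_hub
# Gated repos such as this one need an accepted license and a login token first
huggingface-cli login
huggingface-cli download meta-llama/Llama-3.1-8B --local-dir ./llama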

This is the fastest route for large public files, since bandwidth in our data centers is generous.
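For larger downloads it can help to resume an interrupted transfer and to confirm the archive is readable before using it. The helper names below (fetch_resumable, archive_ok) are hypothetical, and the sketch assumes a gzip-compressed tarball like the placeholder URL above.

```shell
# Resume a partial download instead of restarting it (-c means "continue").
fetch_resumable() {
  wget -c "$1"
}

# Succeed only if the gzip-compressed tarball can be listed end to end.
archive_ok() {
  tar -tzf "$1" > /dev/null 2>&1
}

# Example, using the placeholder URL from above:
# fetch_resumable https://example.com/my-dataset.tar.gz && archive_ok my-dataset.tar.gz
```

Listing the archive does not prove the contents are what the publisher intended. If a checksum is published alongside the file, comparing against it is the stronger check.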

From Git

apt-get update && apt-get install -y git
git clone https://github.com/you/your-repo.git
cd your-repo

If the repo is private, use a personal access token or deploy key.
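One way to use a personal access token is to pass it in the HTTPS clone URL. This is a sketch, not a platform feature: token_clone_url is a hypothetical helper, and GIT_USER and GIT_TOKEN are assumed environment variables that you set yourself.

```shell
# Hypothetical helper: build an HTTPS clone URL carrying a username and token.
# Arguments: username, token, owner/repo path on github.com.
token_clone_url() {
  printf 'https://%s:%s@github.com/%s.git\n' "$1" "$2" "$3"
}

# GIT_USER and GIT_TOKEN are assumed to be exported in your shell beforehand.
# git clone "$(token_clone_url "$GIT_USER" "$GIT_TOKEN" you/your-repo)"
```

A URL built this way is saved in the clone's .git/config. After cloning, git remote set-url origin https://github.com/you/your-repo.git removes the token from the stored remote.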

Using a Jupyter image

If you launched with a Jupyter image (quay.io/jupyter/pytorch-notebook:cuda12-latest etc.), the Jupyter UI has a file upload button in the file browser panel. Drag-and-drop small files directly into the current working directory.

Persistent files — use storage

Files in the container's default filesystem are lost when the instance terminates. For anything you want to keep between sessions:

Cloud drive (block storage, attached to one instance at a time): good for notebooks, model checkpoints, and datasets used by a single instance. Mount at /workspace or /mnt/data. See Storage.
Shared filesystem (network storage, attachable to multiple instances): good for team datasets or model weights read by several instances. Mount at /shared.

Typical workflow with a cloud drive

Create a my-notebooks cloud drive (say, 50 GB).
Launch an instance, attach the drive at /workspace.
Work in /workspace — git clone, save notebooks, download datasets.
Terminate the instance when done. Drive persists.
Next session: launch a new instance, attach the same drive, everything's still there.
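Before saving work, it can be worth checking that /workspace really is the attached drive and not a plain directory on the container filesystem. The sketch below assumes the drive shows up as its own mount point at the path you chose; is_mount_root is a hypothetical helper, not a platform command.

```shell
# Succeeds when DIR is itself the root of a mounted filesystem.
# Pass the path without a trailing slash.
is_mount_root() {
  dir="$1"
  [ -d "$dir" ] || return 1
  # df -P prints a portable table; column 6 of the data row is the mount point.
  mounted_on=$(df -P "$dir" | awk 'NR==2 {print $6}')
  [ "$mounted_on" = "$dir" ]
}

if is_mount_root /workspace; then
  echo "/workspace is a mounted drive"
else
  echo "/workspace is not a separate mount; files there may not persist" >&2
fi
```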

Typical workflow with a shared filesystem

One team member creates a team-datasets shared filesystem, uploads the dataset once.
Every team member launches their own instance with team-datasets attached at /shared.
Everyone reads from /shared/datasets/* — no need to re-download per user.
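To use a shared dataset from inside a project without copying it, a symbolic link is usually enough. The helper name link_shared and the ./data link path are illustrative assumptions; only /shared/datasets comes from the workflow above.

```shell
# Hypothetical helper: expose a shared dataset inside a project without copying it.
# $1 = directory on the shared filesystem, $2 = link path inside the project.
link_shared() {
  src="$1"
  link="$2"
  [ -d "$src" ] || { echo "missing shared directory: $src" >&2; return 1; }
  # -s symbolic link, -f replace an existing link, -n treat an old link as a file.
  ln -sfn "$src" "$link"
}

# Example, using the mount point from the workflow above:
# link_shared /shared/datasets ./data
```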

Very large files (> 20 GB)

From a public source: use wget / huggingface-cli as above. The transfer runs between servers, so your local connection is not a bottleneck.
From your laptop: slower, limited by your home/office upload. Options:
- Upload to S3 / GCS first, then run aws s3 cp or gsutil cp inside the instance. The instance's connection to cloud providers is typically much faster than a home or office uplink.
- scp is not an option because we don't offer SSH access. Use the Jupyter upload or a cloud bucket as an intermediary instead.
- Ping the #ecolink-support Slack channel if you're moving TB-scale data regularly — we can arrange bulk-transfer tooling.
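The bucket route can be sketched as a small wrapper that picks the CLI from the URI scheme. This assumes the aws or gsutil CLI is installed on the instance and already has credentials; pull_from_bucket and the bucket names in the examples are hypothetical.

```shell
# Hypothetical helper: copy one object from a cloud bucket into the instance.
# $1 = source URI (s3://... or gs://...), $2 = destination path (default: current dir).
pull_from_bucket() {
  src="$1"
  dest="${2:-.}"
  case "$src" in
    s3://*) aws s3 cp "$src" "$dest" ;;
    gs://*) gsutil cp "$src" "$dest" ;;
    *) echo "unsupported source: $src" >&2; return 2 ;;
  esac
}

# Examples (bucket and object names are placeholders):
# pull_from_bucket s3://my-bucket/my-dataset.tar.gz /workspace
# pull_from_bucket gs://my-bucket/my-dataset.tar.gz /workspace
```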

Downloading files back out

Same principle:

Small files: run base64 my_file in the terminal, copy the output, and decode it locally with base64 --decode.
Medium/large: push to a cloud bucket (aws s3 cp, gcloud storage cp, or huggingface-cli upload) — outbound bandwidth is also plentiful.
Jupyter file browser: right-click → download.
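The base64 route for small files looks like this end to end. The file contents are a stand-in for illustration, and both halves run in one place here only so the round trip can be checked; in practice the second half runs on your own machine.

```shell
# On the instance: encode the file as text that is safe to copy from the terminal.
printf 'lr=0.001\nepochs=3\n' > my_file      # stand-in for a real small file
base64 < my_file > my_file.b64
cat my_file.b64

# On your machine: paste the copied text into my_file.b64, then decode it.
base64 --decode < my_file.b64 > my_file.restored

# cmp exits 0 only when the two files are byte-for-byte identical.
cmp my_file my_file.restored && echo "round trip OK"
```

Base64 output is about a third larger than the original, which is one reason this only suits small files.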

Related

Storage: cloud drives and shared filesystems
Jupyter access
Duration and extending: so the instance doesn't terminate while you're mid-upload
