Product · Apr 24, 2026 · 7 min read

Podflare ships interactive PTY + smarter nearest-healthy routing

Two major improvements in one release: first-class pseudo-terminal sessions inside every sandbox (run htop, npm init, interactive installers, and REPLs across agent turns), plus an edge router that routes around down regions with zero added milliseconds via an in-memory SWR cache. E2B parity on PTY, Podflare-native reconnect semantics, and a 5-10 ms faster happy path.

Robel Tegegne, Podflare founder

Two improvements shipped today. The first is an architectural addition: PTY as a first-class sandbox primitive, reaching parity with E2B's interactive terminal surface. The second is an invisible performance win: the edge router now picks the nearest healthy region with zero added milliseconds on the happy path, and evicts regions that 5xx almost instantly.

What we shipped

1. Interactive PTY — sandbox.pty.*

Podflare's run_code gives you stdout / stderr / exit_code for one-shot scripts. That's the right primitive for 90% of what LLM agents do. But it breaks the moment the program wants a real terminal:

  • Interactive installers: npm init and apt install (without -y) block forever on a Y/N prompt under pipe-backed stdin
  • REPLs: python3 -i, node, irb don't echo, don't show prompts, and expect a tty for readline
  • TUIs: htop, vim, less, top can't render at all without ncurses ioctls
  • Programs that check isatty(): typically turn off colors, switch to a simplified layout, or refuse outright
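
The failure modes above are easy to reproduce with Python's stdlib on any POSIX system: a child run behind pipes (the run_code model) sees isatty() as false, while the same child behind an os.openpty() pair believes it has a real terminal. A minimal sketch:

```python
import os
import subprocess
import sys

CHECK = "import sys; print(sys.stdin.isatty(), sys.stdout.isatty())"

# Pipe-backed stdio: how one-shot run_code-style execution looks to the child.
piped = subprocess.run(
    [sys.executable, "-c", CHECK],
    capture_output=True, text=True,
)
print("pipe:", piped.stdout.strip())   # False False

# PTY-backed stdio: the child sees a real terminal on both ends.
parent_fd, child_fd = os.openpty()
subprocess.run(
    [sys.executable, "-c", CHECK],
    stdin=child_fd, stdout=child_fd, stderr=child_fd,
)
os.close(child_fd)
pty_out = os.read(parent_fd, 1024).decode().strip()
print("pty:", pty_out)                 # True True
os.close(parent_fd)
```

The isatty() result is what flips installers into non-interactive refusal, REPLs into silent mode, and TUIs into a render failure.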

sandbox.pty.* is the fix. A first-class pseudo-terminal inside every Podflare Pod microVM, backed by a real openpty(3) pair with binary-safe streaming, reconnect support, and the full suite of primitives E2B ships — plus a few of our own.

The full API in one example:

from podflare import Sandbox

with Sandbox(region="eu") as sbx:
    # Spawn a real tty. timeout_ms=0 = keep alive until close()
    pty = sbx.pty.create(
        cmd="npm init",
        cols=120, rows=40,
        on_data=lambda chunk: print(chunk.decode(), end=""),
        timeout_ms=0,
    )

    pty.send_input("\r" * 9)         # accept every default
    pty.send_input(b"\x03")           # Ctrl-C
    pty.resize(cols=200, rows=50)     # child receives SIGWINCH
    exit_code = pty.wait()            # blocks until child exits

Reconnect works because hostd keeps a 1 MiB ring buffer of recent output per session + a tokio::sync::broadcast channel for live events. When you attach to an existing session, the server first drains the ring buffer (historical output you missed) then joins you to the live broadcast:

# Turn 1 — start a long build
pty = sbx.pty.create(cmd="bash", timeout_ms=0)
pty.send_input(b"./build.sh 2>&1 | tee /tmp/build.log\n")
pty_id = pty.id

# Turn 2 (different process / network blip / hours later)
pty = sbx.pty.attach(pty_id, on_data=stream_to_user)
# Replays last 1 MiB of buffered output, then streams live

The session only ends when the child process exits — never when a client disconnects. That's the semantic difference from a simple streaming exec: you can close your laptop, come back three hours later, and the same Bash shell is still sitting at the same prompt.
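
hostd implements this in Rust (the 1 MiB ring buffer plus a tokio::sync::broadcast channel), but the attach semantics are easy to sketch. A minimal Python model, with hypothetical names, of "drain history, then join the live stream":

```python
from collections import deque

class PtySession:
    """Sketch of drain-then-subscribe attach semantics (names hypothetical)."""
    def __init__(self, capacity=1 * 1024 * 1024):
        self.ring = deque(maxlen=capacity)   # last `capacity` bytes of output
        self.subscribers = []                # live listeners (broadcast fan-out)

    def on_child_output(self, chunk: bytes):
        self.ring.extend(chunk)              # keep history for late attachers
        for on_data in self.subscribers:     # broadcast to everyone attached now
            on_data(chunk)

    def attach(self, on_data):
        on_data(bytes(self.ring))            # 1) replay output the client missed
        self.subscribers.append(on_data)     # 2) join the live broadcast

session = PtySession()
session.on_child_output(b"build step 1 done\n")   # emitted before anyone attached

received = []
session.attach(received.append)                    # replays history first
session.on_child_output(b"build step 2 done\n")   # then streams live
print(b"".join(received).decode())
```

Because the session object outlives any individual subscriber, disconnects only shrink the subscriber list; the child process and its ring buffer carry on untouched.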

2. Interactive PTY via MCP (new tools)

LLM agents driving Podflare via mcp.podflare.ai get five new tools for multi-turn interactive sessions, plus a one-shot tool for commands that just need a tty:

  • pty_exec: one-shot commands that need a tty — apt install -y, bun x tsc, npm init -y
  • pty_create: start a background session, get a pty_id that survives across tool calls
  • pty_input: send keystrokes (UTF-8, or base64 raw bytes for Ctrl-C / arrow keys)
  • pty_read: long-poll for new output since a byte cursor — no streaming connection required
  • pty_resize: TIOCSWINSZ / SIGWINCH for reflowing TUIs
  • pty_kill: SIGKILL the session's process group

A typical multi-turn flow driving python3 REPL across four tool calls:

Agent: pty_create  {command: "python3"}           → {pty_id: "abc…"}
Agent: pty_read    {pty_id, since_seq: 0}         → ">>> "
Agent: pty_input   {pty_id, data: "2 + 2\n"}
Agent: pty_read    {pty_id, since_seq: 4}         → "2 + 2\n4\n>>> "
Agent: pty_input   {pty_id, data: "exit()\n"}
Agent: pty_read    {pty_id, since_seq: 14}        → {done: true, exit_code: 0}

Tool descriptions explicitly steer the LLM toward the right primitive (run_python for plain Python, pty_exec for one-shot tty, pty_create+input+read for multi-turn). This protects against agents defaulting to PTY for everything.

3. Nearest-healthy edge routing (invisible win)

The api.podflare.ai Worker routes every inbound request to the best regional hostd. It already ranked by haversine distance and skipped unhealthy regions via a KV-cached capacity snapshot. Two changes today make the hot path tighter:

  1. In-memory SWR cache per Worker isolate. The Worker reads health from an isolate-local Map instead of KV on each request. Entries under 15 s are served as-is (no I/O); 15–90 s entries are served stale while the Worker schedules a refresh via ctx.waitUntil() that costs the current request zero ms.
  2. Eager eviction on 5xx / timeout. A region that errors on a request is immediately evicted from the cache and marked unhealthy in KV. The next request in the same isolate skips it outright, without re-incurring the 3.5 s upstream timeout.
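
The production cache lives in a Workers isolate in TypeScript; the two age windows and the eviction path can be sketched in Python, with a plain callback standing in for ctx.waitUntil() and all names hypothetical:

```python
import time

FRESH_S, STALE_S = 15, 90  # fresh window, stale-while-revalidate window

class HealthCache:
    """Sketch of the per-isolate SWR health cache (names hypothetical)."""
    def __init__(self, fetch_from_kv, schedule):
        self.entries = {}            # region -> (healthy, fetched_at)
        self.fetch = fetch_from_kv   # slow path: a KV read
        self.schedule = schedule     # stand-in for ctx.waitUntil()

    def get(self, region):
        hit = self.entries.get(region)
        if hit:
            healthy, fetched_at = hit
            age = time.monotonic() - fetched_at
            if age < FRESH_S:
                return healthy                            # fresh: zero I/O
            if age < STALE_S:
                self.schedule(lambda: self.refresh(region))
                return healthy                            # serve stale, refresh later
        return self.refresh(region)                       # miss/expired: block on KV

    def refresh(self, region):
        healthy = self.fetch(region)
        self.entries[region] = (healthy, time.monotonic())
        return healthy

    def evict(self, region):
        """Eager eviction after a 5xx / timeout from this region."""
        self.entries[region] = (False, time.monotonic())

background = []
cache = HealthCache(fetch_from_kv=lambda region: True, schedule=background.append)
print(cache.get("eu"))   # first read blocks on KV: True
cache.evict("eu")        # region 5xx'd
print(cache.get("eu"))   # next request skips it instantly: False
```

The key property is that evict() writes a fresh negative entry, so the very next get() in the same isolate returns False from memory rather than waiting out another upstream timeout.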

We also cut the upstream time-to-first-byte timeout from 8 s to 3.5 s, so a TCP-blackholed region fails over in half the time. Streaming endpoints (exec, pty_create) aren't affected: the timer clears once headers arrive, and the body streams with no deadline.
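
The important detail is that the timeout guards only the wait for headers, never the body. A minimal asyncio sketch of that shape (the router itself is a TypeScript Worker; these names are illustrative):

```python
import asyncio

TTFB_TIMEOUT_S = 3.5  # down from 8 s

async def proxy(get_headers, body_chunks):
    """The TTFB timer covers only the header wait; body streaming is untimed."""
    await asyncio.wait_for(get_headers(), TTFB_TIMEOUT_S)
    out = []
    async for chunk in body_chunks():
        out.append(chunk)            # a pty stream can run for hours here
    return b"".join(out)

async def demo():
    async def headers():             # healthy upstream answers fast
        await asyncio.sleep(0.01)

    async def body():                # streams after the timer has cleared
        for chunk in (b"hello ", b"world"):
            await asyncio.sleep(0.01)
            yield chunk

    return await proxy(headers, body)

result = asyncio.run(demo())
print(result)  # b'hello world'
```

If get_headers() stalls past 3.5 s, wait_for raises TimeoutError and the router can fail over; once headers land, a long-lived exec or PTY stream is never cut off mid-body.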

Every response now carries an x-podflare-origin-ms header with per-request origin RTT, so slow regions are visible without log diving.

How to try it today

Python:

pip install -U podflare  # 0.0.22 ships pty

from podflare import Sandbox
with Sandbox(region="eu") as sbx:
    pty = sbx.pty.create(cmd="htop", on_data=print)
    # ... interact ...
    pty.kill()

TypeScript:

npm install podflare@latest  # 0.0.21 ships pty

import { Sandbox } from "podflare";
const sbx = await Sandbox.create({ region: "eu" });
const pty = await sbx.pty.create({
  cmd: "htop",
  onData: chunk => process.stdout.write(chunk),
});

MCP / any AI tool: no install. Point your agent at mcp.podflare.ai with your API key and the six new PTY tools show up in tools/list automatically.

What's next

PTY is fully rolled out on the EU region today. us-west / us-east / us-central / singapore will follow as each node finishes its hostd rebuild — the same ~10-second downtime per region we saw for eu. The router already skips regions that don't have the new binary, so your PTY calls land on a PTY-capable region automatically until the rollout completes.

Further down the roadmap: a WebSocket endpoint for browser-embedded terminals (xterm.js) and an in-dashboard live terminal view. If your use case benefits from either, drop us a note at hello@podflare.ai.


Frequently asked questions

Why did Podflare need a PTY primitive — isn't run_code enough?

run_code gives you stdout / stderr / exit_code for one-shot scripts, which is perfect for 90% of what LLM agents do. It breaks down the moment the program expects a real terminal: interactive installers refuse to run (apt install -y works; apt install without -y blocks on a Y/N prompt), REPLs don't echo, TUIs like htop can't render, and programs that check isatty() take a different code path (no color output, simplified layouts). PTY is the escape hatch — a real pseudo-terminal with stdin, stdout, and ioctl support, fully streamable, reconnect-capable.

How is this different from E2B's PTY API?

Same primitives at the SDK level: pty.create(cmd, onData, onExit), sendInput, resize, wait, kill, attach(pty_id) for reconnect. Same timeoutMs: 0 for indefinite sessions. Two Podflare-specific wrinkles: (1) reconnect replays the last 1 MiB of output from a hostd-side ring buffer so your agent can attach mid-stream and see recent history, (2) a matching MCP tool surface (pty_create, pty_input, pty_read, pty_kill, pty_resize) lets LLMs drive interactive sessions across tool-call turns without holding a streaming connection.

What problem does nearest-healthy routing solve?

Previously the edge router read per-region health from KV on every request (1-5 ms per lookup) and used an 8-second upstream timeout — so a region going down meant 8 s of waiting before the failover fired, multiplied by however many requests hit during the cron's 1-minute poll window. Now an in-memory SWR cache serves health state in zero milliseconds on the happy path, eager-eviction on 5xx means the second request after a failure skips the bad region instantly, and a 3.5 s upstream timeout halves the worst-case failover latency. Net effect: customers don't notice when a region blips.

Do I need to update my SDK to get this?

The routing improvement is edge-side — every customer gets it automatically, no version bump. For PTY, pip install -U podflare (0.0.22) or npm install podflare@latest gets you sandbox.pty.*. If you use Podflare via MCP only, the six new tools show up automatically in tools/list — no install needed.

Can I try PTY without writing code — from my AI tool?

Yes. Any MCP-compatible AI tool (Claude Code, Cursor, Cline, Codex, Windsurf, ChatGPT custom GPTs) pointed at mcp.podflare.ai now sees pty_exec for one-shot tty commands and pty_create/input/read/kill/resize for multi-turn interactive sessions. Agents can drive npm init, apt install, python3 REPL, or an interactive debugger across several tool-call turns without holding a streaming connection.

Keep reading

Ship an AI agent on Podflare in under a minute.

Hardware-isolated microVM per sandbox, ~190 ms round-trip, 80 ms fork(), full Python REPL persistence. Free tier includes $200 credit.

Get started free