Vercel’s eve turns agents into directories

OraCore Editors

[TOOLS] June 19, 2026 · 13 min read · OraCore Editors

I break down Vercel’s eve framework and show the directory-first pattern you can copy for agent workflows.

Vercel’s eve turns agent workflows into a directory structure you can copy.

I’ve been building with agent frameworks long enough to know when something feels tidy on the surface and annoying underneath. The demos look great. The prompts are clean. The tools are wired up. Then you actually try to ship it and the whole thing turns into a pile of hidden state, weird conventions, and one giant file that nobody wants to touch twice. I keep running into the same mess: where do I put the instructions, where do I keep the tools, how do I version a workflow without breaking the rest of the app, and why does every framework make me feel like I’m reverse-engineering somebody else’s habits?

That’s why Vercel’s eve caught my attention. Not because it promises magical agents, and not because I need another “AI platform” pitch. I care because it tries to make agent work look like something I already understand: directories, files, and a structure I can inspect without opening a black box. That’s a much better starting point for real software teams. The New Stack covered it in “Vercel launches eve, an open-source framework that treats agents as directories.”

The part I wanted to unpack is simple: if agents are going to live inside product code, they need to behave more like code and less like a cloud of prompts. Eve is Vercel’s attempt to make that happen.

Why I trust directories more than agent abstractions

“Vercel launches eve, an open-source framework that treats agents as directories”

What this actually means is that the framework is trying to make agent behavior discoverable through the filesystem. Instead of hiding everything in a runtime config blob, you organize agents the way you organize features: folder by folder, concern by concern.

I like this because I’ve spent too many hours hunting through agent projects where the prompt lived in one place, the tool schema in another, the routing logic in a third, and the actual business rules were scattered across README files and half-finished examples. That’s not architecture. That’s scavenger hunt design.

When a framework says “directory,” I immediately think about a few things: can I grep it, can I diff it, can I review it in a pull request, and can I tell a teammate where to start without giving them a three-hour tour? If the answer is yes, I’m interested. If the answer is no, I already know how the postmortem goes.

How to apply it: if you’re building an agent system today, start by mapping each agent to a folder. Put the instructions, tools, tests, and examples next to each other. Don’t centralize everything into one “agent.ts” file unless you enjoy pain. The filesystem is boring, but boring is good when the software is still changing every week.
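
One way to keep that convention honest is a small check you can run in CI. Here’s a minimal sketch, assuming you feed it repo-relative paths (for example the output of `git ls-files`) and that each agent lives under `agents/<name>/`. The required entries and the function name `findMissingFiles` are my own placeholders, not anything eve prescribes.

```typescript
// Hypothetical convention: every folder under agents/ must contain these entries.
// A trailing slash means "a directory with at least one file in it".
const REQUIRED = ["instructions.md", "tools.ts", "tests/"];

// Given repo-relative paths, return the required entries each agent folder lacks.
function findMissingFiles(paths: string[]): Map<string, string[]> {
  const byAgent = new Map<string, string[]>();
  for (const p of paths) {
    const match = /^agents\/([^/]+)\/(.+)$/.exec(p);
    if (!match) continue; // not inside an agent folder
    const [, agent, rest] = match;
    byAgent.set(agent, [...(byAgent.get(agent) ?? []), rest]);
  }

  const missing = new Map<string, string[]>();
  byAgent.forEach((files, agent) => {
    const gaps = REQUIRED.filter((req) =>
      req.endsWith("/")
        ? !files.some((f) => f.startsWith(req)) // directory: any file inside counts
        : !files.includes(req),
    );
    if (gaps.length > 0) missing.set(agent, gaps);
  });
  return missing;
}
```

An empty map means every agent folder has the shape you agreed on. Anything else is a list you can print and fail the build with.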

Open source matters here because teams need to inspect the guts

The article frames eve as an open-source framework, and that matters more than the branding. If I’m going to build on top of a new agent system, I want to see the implementation, not just the marketing page. I want to know what assumptions it makes about execution, state, and tool calls.

Open source also changes the adoption math. A proprietary agent platform asks me to trust the vendor’s opinionated runtime. An open-source framework asks me to inspect it, patch it, and maybe even fork it when my use case gets weird. And agent use cases get weird fast. One team needs human approval steps. Another needs per-tenant tool permissions. Another wants the agent to act like a support triage bot, not a code generator. One abstraction rarely fits all of that.

I’ve seen teams get burned by frameworks that were elegant in a demo and brittle in production. The moment they needed logging, replay, or a custom tool boundary, they were stuck waiting on someone else’s roadmap. Open source doesn’t solve that automatically, but it at least gives you a fighting chance.

Inspect the runtime behavior before you commit.
Check whether the framework makes state explicit or hides it.
Look for extension points around tools, memory, and routing.

How to apply it: if you’re evaluating eve or anything like it, don’t start with the happy path demo. Start with the ugly questions. Can you debug a failed run? Can you test an agent without calling the real tools? Can you version one agent without redeploying the whole system? If the answer is fuzzy, keep digging.
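
The second question, testing an agent without calling the real tools, mostly comes down to passing tools in rather than importing them. A rough sketch of that idea follows. The `Tools` interface, the `triage` function, and the keyword rule standing in for the model are all hypothetical; the only point is that a test can swap in fakes and nothing real gets touched.

```typescript
// Hypothetical tool surface for one agent. Real tools would likely be async;
// they are synchronous here to keep the sketch short.
interface Tools {
  lookupCustomer(id: string): { plan: string };
  updateTicket(ticketId: string, queue: string): void;
}

interface Ticket {
  id: string;
  customerId: string;
  text: string;
}

// Stand-in for the agent's decision step. A real agent would call a model here.
function triage(ticket: Ticket, tools: Tools): string {
  const customer = tools.lookupCustomer(ticket.customerId);
  let queue = "general";
  if (/refund|charge/i.test(ticket.text)) {
    queue = "billing";
  } else if (customer.plan === "enterprise") {
    queue = "priority";
  }
  tools.updateTicket(ticket.id, queue);
  return queue;
}

// Fake tools record calls instead of touching real systems.
function makeFakeTools(plan: string): { tools: Tools; calls: string[] } {
  const calls: string[] = [];
  const tools: Tools = {
    lookupCustomer: (id) => {
      calls.push(`lookupCustomer:${id}`);
      return { plan };
    },
    updateTicket: (ticketId, queue) => {
      calls.push(`updateTicket:${ticketId}:${queue}`);
    },
  };
  return { tools, calls };
}
```

In a test, you call `triage` with `makeFakeTools("free").tools` and assert on both the returned queue and the recorded `calls`. If a framework makes this kind of substitution hard, that’s one of the fuzzy answers worth digging into.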

Directory-first design is really a packaging decision

What I think Vercel is aiming at here is not just organization. It’s packaging. A directory is a unit you can move, copy, review, and deploy. That makes the agent itself feel like a feature module instead of a mystical service somewhere off to the side.

This is the part that clicks for me. Most teams don’t fail because they can’t write a prompt. They fail because they can’t package the prompt with the rest of the behavior. The prompt says one thing, the tool expects another, the guardrails live elsewhere, and nobody knows which version is actually live. A directory can hold all of that together.

I ran into this exact problem on a project where we had four small agents doing different support tasks. The prompt logic was fine. The problem was coordination. Every time we changed one workflow, we accidentally changed another because the shared config was too shared. If those agents had lived in separate directories with explicit interfaces, we would have saved ourselves a week of cleanup and a lot of annoyed Slack messages.

How to apply it: define a standard agent folder layout and stick to it. I’d want something like instructions, tools, tests, examples, and metadata in predictable places. If your framework lets you co-locate those pieces, use that aggressively. If it doesn’t, build your own wrapper so your team can still reason about the structure.
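
For the metadata piece, a predictable place only helps if the contents are predictable too. Below is a minimal sketch of a validator. The field names mirror the metadata.json in the template at the end of this piece; `parseAgentMetadata` and the specific rules (non-empty strings, a three-part version number) are my assumptions.

```typescript
interface AgentMetadata {
  name: string;
  owner: string;
  version: string;
  triggers: string[];
  tools: string[];
}

const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every((x) => typeof x === "string");

// Parse raw JSON text and fail loudly if the shape is off.
function parseAgentMetadata(json: string): AgentMetadata {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Invalid metadata: expected a JSON object");
  }
  const raw = parsed as Record<string, unknown>;
  const errors: string[] = [];

  for (const key of ["name", "owner", "version"]) {
    const value = raw[key];
    if (typeof value !== "string" || value === "") {
      errors.push(`${key} must be a non-empty string`);
    }
  }
  const version = raw["version"];
  if (typeof version === "string" && !/^\d+\.\d+\.\d+$/.test(version)) {
    errors.push("version must look like 1.0.0");
  }
  for (const key of ["triggers", "tools"]) {
    if (!isStringArray(raw[key])) errors.push(`${key} must be an array of strings`);
  }

  if (errors.length > 0) throw new Error(`Invalid metadata: ${errors.join("; ")}`);
  return raw as unknown as AgentMetadata;
}
```

Collecting every error before throwing is a deliberate choice: when a new agent folder fails the check, the author sees the whole list at once instead of fixing one field per CI run.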

The real win is reviewability, not novelty

Agent systems fail in boring ways. A tool gets renamed. A prompt changes tone. A guardrail is removed because it was “blocking progress.” Then the next release quietly starts doing the wrong thing. This is exactly why directory-based organization matters: it makes changes visible.

When I review code, I want to see what changed and why. When I review an agent workflow, I want the same thing. If the agent is just a blob of runtime state, I can’t tell whether the new behavior came from the prompt, the tool schema, or a hidden default. That’s a bad place to be when the agent is touching real customer data or internal systems.

Vercel’s framing suggests that eve is trying to make agents feel more like ordinary software artifacts. That’s not glamorous, but it’s useful. The best infrastructure tools I’ve used were never the ones with the loudest pitch. They were the ones that made the next debugging session less miserable.

How to apply it: make every agent change reviewable in a pull request. Keep prompts in versioned files. Keep tool contracts in code, not in a wiki nobody checks. Add tests that prove the agent still behaves the way you expect after a change. If a directory layout helps you do that, it’s doing real work.
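
Here’s roughly what I mean by a tool contract in code. This is a sketch under my own assumptions: the `ToolContract` shape, the `ticketUpdate` fields, and `undeclaredTools` are illustrative names, and the only thing borrowed from the template later in this piece is the tool name `ticket.update`.

```typescript
// A tool contract: name, description, and an input check in one reviewable place.
interface ToolContract<I> {
  name: string;
  description: string;
  parse(input: unknown): I; // throws on invalid input
}

const STATUSES = ["open", "pending", "closed"] as const;

interface TicketUpdateInput {
  ticketId: string;
  status: (typeof STATUSES)[number];
}

const ticketUpdate: ToolContract<TicketUpdateInput> = {
  name: "ticket.update",
  description: "Change the status of an existing support ticket.",
  parse(input) {
    const obj = (typeof input === "object" && input !== null ? input : {}) as Record<string, unknown>;
    const ticketId = obj["ticketId"];
    if (typeof ticketId !== "string" || ticketId === "") {
      throw new Error("ticketId is required");
    }
    const status = STATUSES.find((s) => s === obj["status"]);
    if (!status) throw new Error(`status must be one of ${STATUSES.join(", ")}`);
    return { ticketId, status };
  },
};

// Catches the "a tool gets renamed" failure: every tool name an agent declares
// must still exist in the registry.
function undeclaredTools(declared: string[], registry: ToolContract<unknown>[]): string[] {
  const known = new Set(registry.map((t) => t.name));
  return declared.filter((name) => !known.has(name));
}
```

A test that asserts `undeclaredTools(...)` returns an empty list turns a silent rename into a red build, which is the kind of visibility the pull request needs.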

Don’t confuse a neat file tree with a finished system

I do want to be careful here. A directory-first framework can still be a mess if the underlying runtime is sloppy. Good organization doesn’t automatically give you observability, permission control, retry logic, or safe execution boundaries. It just gives you a better place to put the mess.

That’s why I’m interested in the pattern, not in pretending the pattern solves everything. If eve really treats agents as directories, then the next question is what happens at runtime. How are agent runs traced? How are tools sandboxed? How are failures represented? How do you stop one agent from stepping on another?

Directory structure helps humans navigate the system.
Runtime controls help the system survive production.
You need both, or you just get prettier chaos.

I’ve been around enough platform work to know the trap: teams adopt a clean abstraction, then skip the operational plumbing because “the framework handles it.” That sentence has caused more trouble than I care to count. If you can’t inspect, trace, and test the behavior, the directory tree is just decoration.

How to apply it: treat the directory layout as the source of truth for humans, then add explicit runtime checks for machines. Log every tool call. Record agent inputs and outputs where appropriate. Add approval points for risky actions. And if the framework doesn’t make that easy, don’t pretend it’s fine. It isn’t.
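
To make that concrete, here’s a minimal sketch of a wrapper that sits between the agent and its tools. Everything in it is an assumption for illustration: `guardTool`, the `approve` hook, and the record shape are mine, and the tools are synchronous only to keep the example short.

```typescript
type ToolArgs = Record<string, unknown>;
type ToolFn = (args: ToolArgs) => unknown;

interface ToolCallRecord {
  tool: string;
  args: ToolArgs; // in practice, redact PII before this reaches a log sink
  outcome: "ok" | "denied" | "error";
}

interface GuardOptions {
  risky: Set<string>; // tool names that need approval
  approve: (tool: string, args: ToolArgs) => boolean; // hypothetical approval hook
  log: (record: ToolCallRecord) => void;
}

// Wrap a tool so every call is logged and risky calls need approval first.
function guardTool(name: string, fn: ToolFn, opts: GuardOptions): ToolFn {
  return (args) => {
    if (opts.risky.has(name) && !opts.approve(name, args)) {
      opts.log({ tool: name, args, outcome: "denied" });
      throw new Error(`Approval denied for ${name}`);
    }
    try {
      const result = fn(args);
      opts.log({ tool: name, args, outcome: "ok" });
      return result;
    } catch (err) {
      opts.log({ tool: name, args, outcome: "error" });
      throw err; // logging must not swallow the failure
    }
  };
}
```

The useful property is that the check lives outside the prompt. The model can be told to ask for approval, but the wrapper is what actually enforces it.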

The best part is that this fits how developers already think

What makes this idea stick is that it doesn’t ask me to learn a totally alien mental model. I already think in folders, modules, and boundaries. I already expect a repo to tell me where the important stuff lives. Eve seems to be betting that agent workflows should fit into that same habit instead of fighting it.

That matters because adoption is mostly about friction. If I can drop an agent into a directory, understand its shape, and review it like normal code, I’m far more likely to use it. If I need to learn a brand-new control plane for every tiny workflow, I’m going to get annoyed and look for a simpler path.

There’s also a team effect here. A directory is a shared object. It gives frontend, backend, and platform folks the same thing to point at. That lowers the translation tax. Nobody has to guess where the agent lives or which file matters. That sounds small until you’ve sat in a meeting where three engineers are describing the same workflow using three different mental models.

How to apply it: standardize the shape of your agent repos. Put a README at the root, keep one agent per folder when possible, and make the folder names describe outcomes, not implementation details. I’d rather see support-triage than agent-v2-final-final. We’ve all seen enough of that nonsense.
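
If you want the naming rule to be more than a suggestion, a tiny lint can enforce it. This is a sketch of one possible rule set, not a standard: the kebab-case requirement and the banned words are my own picks, and `folderNameProblems` is a made-up name.

```typescript
// Hypothetical naming rule: lowercase kebab-case, no version or history markers.
const KEBAB = /^[a-z]+(-[a-z]+)*$/;
const BANNED_PARTS = ["agent", "final", "old", "tmp", "copy"];

// Return a list of human-readable problems; an empty list means the name passes.
function folderNameProblems(name: string): string[] {
  const problems: string[] = [];
  if (!KEBAB.test(name)) problems.push("use lowercase kebab-case without digits");

  // Report each offending part once, even if it repeats in the name.
  const uniqueParts = name.split("-").filter((p, i, all) => all.indexOf(p) === i);
  for (const part of uniqueParts) {
    if (BANNED_PARTS.includes(part) || /^v\d+$/.test(part)) {
      problems.push(`"${part}" describes implementation or history, not an outcome`);
    }
  }
  return problems;
}
```

Run against the two names above, `support-triage` comes back clean and `agent-v2-final-final` comes back with a list of complaints, which is about right.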

The template you can copy

If I were adopting the directory-first idea today, I’d start with a layout like this. It keeps the agent readable, testable, and easy to hand off without a long explanation.

# agent-repo/

agent-repo/
  README.md
  agents/
    support-triage/
      instructions.md
      tools.ts
      policy.md
      examples/
        happy-path.json
        edge-case.json
      tests/
        triage.spec.ts
      metadata.json

    refund-helper/
      instructions.md
      tools.ts
      policy.md
      examples/
        refund-approved.json
        refund-denied.json
      tests/
        refund.spec.ts
      metadata.json

  shared/
    tools/
      customer.ts
      billing.ts
    policies/
      safety.md
      pii.md
    utils/
      logging.ts
      tracing.ts

  runtime/
    executor.ts
    router.ts
    approvals.ts

  docs/
    how-to-add-an-agent.md
    how-to-test-an-agent.md

# instructions.md
You are the support triage agent. Classify the request, gather missing context,
and either resolve it or route it to the right workflow.

# policy.md
- Never expose PII in logs.
- Ask for approval before any billing mutation.
- Prefer a short answer when confidence is high.
- Escalate when the request is ambiguous or high risk.

# metadata.json
{
  "name": "support-triage",
  "owner": "customer-support-platform",
  "version": "1.0.0",
  "triggers": ["ticket.created", "chat.message"],
  "tools": ["customer.lookup", "billing.lookup", "ticket.update"]
}

# tests/triage.spec.ts
import { describe, it } from "vitest";

// Placeholders until the agent can be invoked from tests.
// it.todo keeps them visible in the report instead of passing vacuously.
describe("support triage agent", () => {
  it.todo("routes billing issues to the refund helper when needed");

  it.todo("does not reveal sensitive customer data");
});

# how to use this layout
1. Put one agent in one folder.
2. Keep instructions, tools, policy, and tests together.
3. Make shared code boring and explicit.
4. Review agent changes like normal code changes.
5. Add runtime logging and approvals outside the prompt layer.
6. Version the folder, not just the prompt text.

That’s the whole point for me. If the framework makes this kind of structure natural, it’s worth paying attention to. If it doesn’t, I’d still borrow the pattern and build it myself. Either way, the directory-first idea is the part I’d keep.

Source attribution: I broke this down from The New Stack’s article “Vercel launches eve, an open-source framework that treats agents as directories.” The directory-first framing is theirs; the workflow advice and template here are my own synthesis.

For related context, I also looked at Vercel, Vercel on GitHub, and the broader open-source agent tooling conversation around LangChain and OpenAI’s agents tooling. Those are useful reference points, but the point here is simpler: if you can make agents look like directories, you make them easier to ship.
