[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-prompt-to-harness-ai-engineering-shift-en":3,"tags-prompt-to-harness-ai-engineering-shift-en":30,"related-lang-prompt-to-harness-ai-engineering-shift-en":42,"related-posts-prompt-to-harness-ai-engineering-shift-en":46,"series-industry-f4f874da-d4bb-4c33-b808-2115085e35e9":83},{"id":4,"title":5,"content":6,"summary":7,"source":8,"source_url":9,"author":10,"image_url":11,"keywords":12,"language":18,"translated_content":10,"views":19,"is_premium":20,"created_at":21,"updated_at":21,"cover_image":11,"published_at":22,"rewrite_status":23,"rewrite_error":10,"rewritten_from_id":24,"slug":25,"category":26,"related_article_id":27,"status":28,"google_indexed_at":29,"x_posted_at":10,"tweet_text":10,"title_rewritten_at":10,"title_original":10,"key_takeaways":10,"topic_cluster_id":10,"embedding":10,"is_canonical_seed":20},"f4f874da-d4bb-4c33-b808-2115085e35e9","From Prompting to Harness Engineering","\u003Cp>OpenAI says one team built a product with more than 1 million lines of code using \u003Ca href=\"https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-codex\u002F\" target=\"_blank\" rel=\"noopener\">Codex\u003C\u002Fa>, three engineers, and five months of work. The team merged about 1,500 pull requests, which works out to roughly 3.5 PRs per engineer per day. That number matters because it hints at a deeper shift: the scarce skill is moving from typing code to shaping the system around the agent that writes it.\u003C\u002Fp>\u003Cp>The word for that system is \u003Cem>harness\u003C\u002Fem>. In OpenAI’s framing, a harness is the environment, tooling, checks, and feedback loops that let an agent work reliably inside a real codebase. If prompting was the first wave of AI engineering, and tool use was the second, harness design is the next one.\u003C\u002Fp>\u003Cp>That sounds abstract until you look at what actually changed. 
The best teams are no longer asking, “How do we get the model to produce code?” They are asking, “How do we make the model safe, fast, and useful inside our repo, our CI, and our review process?” That is a much harder question, and it is where the real engineering work is moving.\u003C\u002Fp>\u003Ch2>What OpenAI’s Codex team actually did\u003C\u002Fh2>\u003Cp>OpenAI’s \u003Ca href=\"https:\u002F\u002Fopenai.com\u002Findex\u002Fharness-engineering\u002F\" target=\"_blank\" rel=\"noopener\">Harness engineering\u003C\u002Fa> post is interesting because it is packed with operational detail instead of marketing language. The team did not treat Codex like a chat interface. They treated it like a contributor that needed guardrails, fast feedback, and a workspace designed around agent behavior.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775630059508-4wd6.png\" alt=\"From Prompting to Harness Engineering\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>The headline numbers are the part people will remember, but the process is the part worth studying. Three engineers, five months, about 1,500 merged PRs, and a codebase that crossed 1 million lines. That is a lot of output for a small team, but it also implies a lot of invisible work: setting up tests, making failures legible, defining task boundaries, and deciding when the agent should stop and ask for help.\u003C\u002Fp>\u003Cul>\u003Cli>Team size: 3 engineers\u003C\u002Fli>\u003Cli>Timeline: 5 months\u003C\u002Fli>\u003Cli>Pull requests merged: about 1,500\u003C\u002Fli>\u003Cli>Codebase size: more than 1 million lines\u003C\u002Fli>\u003Cli>Average output: about 3.5 PRs per engineer per day\u003C\u002Fli>\u003C\u002Ful>\u003Cp>Those numbers do not mean engineers are obsolete. They mean the bottleneck has moved. 
The hard part is no longer generating code. The hard part is creating a setup where generated code is worth merging.\u003C\u002Fp>\u003Ch2>Why the harness matters more than the prompt\u003C\u002Fh2>\u003Cp>Prompting helped people get useful outputs from large language models, but it was always fragile. A prompt can describe what you want, yet it cannot enforce test coverage, code style, repo conventions, or deployment safety. A harness can do those things because it is built into the workflow around the model.\u003C\u002Fp>\u003Cp>This is why agent-first development is different from the old “ask the model for a snippet” habit. A good harness gives the agent access to the right files, constrains the task size, runs checks automatically, and feeds failures back into the loop. In practice, that means the model is not improvising in a vacuum. It is operating inside a system that narrows mistakes and makes correction cheap.\u003C\u002Fp>\u003Cp>OpenAI’s post makes a subtle but important point: the best productivity gains come from engineering the environment, not from writing a better prompt. That is a big shift for teams that still think AI adoption means asking developers to chat with a model more often.\u003C\u002Fp>\u003Cblockquote>“The future of software development is not about replacing developers, but about augmenting them with AI.” — Satya Nadella, Microsoft Build 2023 keynote\u003C\u002Fblockquote>\u003Cp>Nadella’s quote is broad, but the Codex example gives it a concrete shape. Augmentation is not a vague promise here. It is a workflow where the human designs the system, and the agent executes inside it with measurable output.\u003C\u002Fp>\u003Ch2>How this compares with older AI coding workflows\u003C\u002Fh2>\u003Cp>For the last couple of years, most AI coding talk centered on copilots, autocomplete, and chat-based code generation. Those tools were useful, but they still assumed the human would do most of the integration work. 
The agent-first model changes the ratio.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775630043962-dsgx.png\" alt=\"From Prompting to Harness Engineering\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>Instead of asking a model for a function and then manually stitching it into a product, the engineer defines a task, lets the agent work, and reviews the result. That sounds similar on paper, but the numbers show the difference in practice. When a small team can merge around 1,500 PRs in five months, the review loop becomes a production line for validated changes, not a pile of half-finished suggestions.\u003C\u002Fp>\u003Cul>\u003Cli>\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Ffeatures\u002Fcopilot\" target=\"_blank\" rel=\"noopener\">GitHub Copilot\u003C\u002Fa> is mostly an inline assistant for writing code faster\u003C\u002Fli>\u003Cli>\u003Ca href=\"https:\u002F\u002Fopenai.com\u002Findex\u002Fintroducing-codex\u002F\" target=\"_blank\" rel=\"noopener\">Codex\u003C\u002Fa> in an agent setup can take on multi-step repo tasks\u003C\u002Fli>\u003Cli>\u003Ca href=\"https:\u002F\u002Fwww.anthropic.com\u002Fclaude-code\" target=\"_blank\" rel=\"noopener\">Claude Code\u003C\u002Fa> follows a similar agentic pattern for terminal-first coding\u003C\u002Fli>\u003Cli>\u003Ca href=\"https:\u002F\u002Faider.chat\u002F\" target=\"_blank\" rel=\"noopener\">Aider\u003C\u002Fa> shows how much of the work can be pushed into a repo-aware loop\u003C\u002Fli>\u003C\u002Ful>\u003Cp>The difference is not cosmetic. Autocomplete helps you type faster. An agent with a good harness can work on a branch, run tests, fix failures, and come back with something reviewable. 
That is a higher level of abstraction, and it changes how teams think about staffing, scheduling, and code ownership.\u003C\u002Fp>\u003Ch2>What engineers need to build now\u003C\u002Fh2>\u003Cp>If \u003Ca href=\"\u002Fnews\u002Fharness-engineering-long-running-multi-agent-systems-en\">harness engineering\u003C\u002Fa> becomes the default pattern, then the job description for strong engineers changes fast. You still need people who understand architecture and debugging, but you also need people who can design agent-friendly workflows. That means clear task decomposition, reliable test suites, readable logs, and repo structures that do not confuse automated contributors.\u003C\u002Fp>\u003Cp>It also means teams need to think about failure modes in a more disciplined way. An agent can be fast and still be wrong in expensive ways. The job is to make mistakes cheap to catch. That includes tighter CI, better diff review, stronger permission boundaries, and clearer instructions for what the agent should not touch.\u003C\u002Fp>\u003Cp>For teams trying to adopt this style, the practical checklist is already visible:\u003C\u002Fp>\u003Cul>\u003Cli>Break work into tasks the agent can finish in one pass\u003C\u002Fli>\u003Cli>Make tests fast enough that the agent can iterate without waiting forever\u003C\u002Fli>\u003Cli>Keep logs and errors readable for humans and machines\u003C\u002Fli>\u003Cli>Limit write access so the agent cannot wander across unrelated parts of the repo\u003C\u002Fli>\u003Cli>Measure merged output, not just generated output\u003C\u002Fli>\u003C\u002Ful>\u003Cp>That last point is the one many teams miss. Generated code is cheap. Merged code is what counts. A harness is good when it turns messy model output into code that survives review, tests, and deployment.\u003C\u002Fp>\u003Ch2>What this means for the next wave of AI teams\u003C\u002Fh2>\u003Cp>The most interesting part of OpenAI’s example is not that a model wrote code. 
Plenty of teams can get a model to write code. The interesting part is that a small group of engineers built a system where the model could keep producing acceptable changes over months, inside a real product workflow.\u003C\u002Fp>\u003Cp>That suggests a near-term split in the market. Teams that treat AI as a chat assistant will get incremental gains. Teams that build harnesses around their agents will get compounding gains, because every better test, every clearer repo rule, and every cleaner review loop makes the next task easier.\u003C\u002Fp>\u003Cp>My guess is that the next hiring signal will not be “prompt engineer,” which already feels dated. It will be people who can design \u003Ca href=\"\u002Fnews\u002Fcursor-3-agent-interface-update-en\">agent workflows\u003C\u002Fa>, debug model failures in production-like settings, and turn messy codebases into places where agents can work without constant rescue.\u003C\u002Fp>\u003Cp>If you want to see where software engineering is heading, watch the teams that can merge more useful code with fewer humans, not the teams that can generate the longest prompt. 
The real question now is simple: when your agent writes code, what is the harness doing for it?\u003C\u002Fp>","OpenAI says one team shipped a 1M-line product with 3 engineers and Codex, merging about 1,500 PRs in 5 months.","zhuanlan.zhihu.com","https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F2024239882059399752",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775630059508-4wd6.png",[13,14,15,16,17],"OpenAI","Codex","harness engineering","AI coding agents","software engineering","en",0,false,"2026-04-08T06:33:44.872395+00:00","2026-04-08T06:33:44.679+00:00","done","f04ed45f-2343-47e0-9dd3-d7d706eab71e","prompt-to-harness-ai-engineering-shift-en","industry","fcc8d167-dc0f-4514-8d6b-4f4230547616","published","2026-04-08T09:00:45.986+00:00",[31,33,36,38,40],{"name":13,"slug":32},"openai",{"name":34,"slug":35},"Harness Engineering","harness-engineering",{"name":16,"slug":37},"ai-coding-agents",{"name":14,"slug":39},"codex",{"name":17,"slug":41},"software-engineering",{"id":27,"slug":43,"title":44,"language":45},"prompt-to-harness-ai-engineering-shift-zh","從 Prompt 到 Harness 工程","zh",[47,53,59,65,71,77],
{"id":48,"slug":49,"title":50,"cover_image":51,"image_url":51,"created_at":52,"category":26},"cf1863f5-624d-4b5f-bc32-d469c2149866","why-ai-infrastructure-is-now-the-real-moat-en","Why AI infrastructure is now the real moat","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778875858866-4ikl.png","2026-05-15T20:10:38.090619+00:00",{"id":54,"slug":55,"title":56,"cover_image":57,"image_url":57,"created_at":58,"category":26},"6ff3920d-c8ea-4cf3-8543-9cf9efc3fe36","circles-agent-stack-targets-machine-speed-payments-en","Circle’s Agent Stack targets machine-speed payments","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778871659638-hur1.png","2026-05-15T19:00:44.756112+00:00",{"id":60,"slug":61,"title":62,"cover_image":63,"image_url":63,"created_at":64,"category":26},"1270e2f4-6f3b-4772-9075-87c54b07a8d1","iren-signs-nvidia-ai-infrastructure-pact-en","IREN signs Nvidia AI infrastructure pact","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778871059665-3vhi.png","2026-05-15T18:50:38.162691+00:00",{"id":66,"slug":67,"title":68,"cover_image":69,"image_url":69,"created_at":70,"category":26},"b308c85e-ee9c-4de6-b702-dfad6d8da36f","circle-agent-stack-ai-payments-en","Circle launches Agent Stack for AI payments","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778870450891-zv1j.png","2026-05-15T18:40:31.462625+00:00",{"id":72,"slug":73,"title":74,"cover_image":75,"image_url":75,"created_at":76,"category":26},"f7028083-46ba-493b-a3db-dd6616a8c21f","why-nebius-ai-pivot-is-more-real-than-hype-en","Why Nebius’s AI Pivot Is More Real Than Hype","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778823055711-tbfv.png","2026-05-15T05:30:26.829489+00:00",{"id":78,"slug":79,"title":80,"cover_image":81,"image_url":81,"created_at":82,"category":26},"b63692ed-db6a-4dbd-b771-e1babdc94af7","nvidia-backs-corning-factories-with-billions-en","Nvidia backs Corning factories with billions","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778822444685-tvx6.png","2026-05-15T05:20:28.914908+00:00",[84,89,94,99,104,109,114,119,124,129],
{"id":85,"slug":86,"title":87,"created_at":88},"d35a1bd9-e709-412e-a2df-392df1dc572a","ai-impact-2026-developments-market-en","AI's Impact in 2026: Key Developments and Market Shifts","2026-03-25T16:20:33.205823+00:00",{"id":90,"slug":91,"title":92,"created_at":93},"5ed27921-5fd6-492e-8c59-78393bf37710","trumps-ai-legislative-framework-en","Trump's AI Legislative Framework: What's Inside?","2026-03-25T16:22:20.005325+00:00",{"id":95,"slug":96,"title":97,"created_at":98},"e454a642-f03c-4794-b185-5f651aebbaca","nvidia-gtc-2026-key-highlights-innovations-en","NVIDIA GTC 2026: Key Highlights and Innovations","2026-03-25T16:22:47.882615+00:00",{"id":100,"slug":101,"title":102,"created_at":103},"0ebb5b16-774a-4922-945d-5f2ce1df5a6d","claude-usage-diversifies-learning-curves-en","Claude Usage Diversifies, Learning Curves Emerge","2026-03-25T16:25:50.770376+00:00",{"id":105,"slug":106,"title":107,"created_at":108},"69934e86-2fc5-4280-8223-7b917a48ace8","openclaw-ai-commoditization-concerns-en","OpenClaw's Rise Raises Concerns of AI Model Commoditization","2026-03-25T16:26:30.582047+00:00",{"id":110,"slug":111,"title":112,"created_at":113},"b4b2575b-2ac8-46b2-b90e-ab1d7c060797","google-gemini-ai-rollout-2026-en","Google's Gemini AI Rollout Extended to 2026","2026-03-25T16:28:14.808842+00:00",{"id":115,"slug":116,"title":117,"created_at":118},"6e18bc65-42ae-4ad0-b564-67d7f66b979e","meta-llama4-fabricated-results-scandal-en","Meta's Llama 4 Scandal: Fabricated AI Test Results Unveiled","2026-03-25T16:29:15.482836+00:00",{"id":120,"slug":121,"title":122,"created_at":123},"bf888e9d-08be-4f47-996c-7b24b5ab3500","accenture-mistral-ai-deployment-en","Accenture and Mistral AI Team Up for AI Deployment","2026-03-25T16:31:01.894655+00:00",{"id":125,"slug":126,"title":127,"created_at":128},"5382b536-fad2-49c6-ac85-9eb2bae49f35","mistral-ai-high-stakes-2026-en","Mistral AI: Facing High Stakes in 2026","2026-03-25T16:31:39.941974+00:00",{"id":130,"slug":131,"title":132,"created_at":133},"9da3d2d6-b669-4971-ba1d-17fdb3548ed5","cursors-meteoric-rise-pressures-en","Cursor's Meteoric Rise Faces Industry Pressures","2026-03-25T16:32:21.899217+00:00"]