AI agent papers worth tracking in one repo

OraCore Editors


[IND] June 18, 2026 | 5 min read | OraCore Editors


A curated repo of 4 agent paper themes helps you find planning, skills, harnesses, and surveys fast.

GitHub AI agents


This repo curates AI agent papers by theme so you can scan the field fast.

This GitHub collection sorts AI agent research into themed buckets and is updated biweekly; its 1,494 stars suggest steady community use. If you want to follow planning, skills, harnesses, and surveys without reading every arXiv feed, this list shows where to start.

Item	What it covers	Example signals
Harness	Runtime structure for agent execution	Safety, search, production workflows
Skills	Reusable agent abilities	Skill creation, governance, evaluation
Survey	Field overviews	Taxonomy, trends, benchmarks
Architecture	How agents are organized	Single-agent, multi-agent, ops
Applications	Where agents are used	Web, software, data, research

1. Harness papers for runtime design


The harness section is the best entry point if you care about how agents actually run in production. It gathers papers on execution substrates, safety checks, search behavior, and architecture patterns, which makes it useful for builders who need more than model prompts.

Representative papers in this bucket include "AI Harness Engineering: A Runtime Substrate for Foundation-Model Software Agents", "Is Grep All You Need? How Agent Harnesses Reshape Agentic Search", and "Harness-Bench: Measuring Harness Effects across Models in Realistic Agent Workflows".

- Focuses on execution, not just prompting
- Useful for agent ops, evaluation, and safety work
- Includes survey and benchmark entries
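
To make "execution, not just prompting" concrete, here is a minimal sketch of a harness loop. It is not taken from any of the papers above: the `Harness` class, the `scripted_policy` stand-in for a model, and the toy `grep` tool are all hypothetical names chosen for illustration. The sketch assumes a harness does three things: caps the number of steps, refuses tools that are not registered, and records a trace.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical type for illustration: (tool name, argument).
Action = Tuple[str, str]


@dataclass
class Harness:
    tools: Dict[str, Callable[[str], str]]  # tools the agent is allowed to call
    max_steps: int = 5                      # hard cap on loop iterations
    trace: List[str] = field(default_factory=list)

    def run(self, policy: Callable[[List[str]], Optional[Action]]) -> List[str]:
        """Drive the policy until it returns None or the step budget runs out."""
        for _ in range(self.max_steps):
            action = policy(self.trace)
            if action is None:              # policy signals it is done
                break
            name, arg = action
            if name not in self.tools:      # safety check: unregistered tools are refused
                self.trace.append(f"blocked:{name}")
                continue
            self.trace.append(f"{name}:{self.tools[name](arg)}")
        return self.trace


def scripted_policy(trace: List[str]) -> Optional[Action]:
    """Stand-in for a model: replays a fixed plan, one action per step."""
    plan = [("grep", "agent"), ("shell", "rm -rf /"), ("grep", "harness")]
    return plan[len(trace)] if len(trace) < len(plan) else None


corpus = ["agent harness notes", "skills survey"]
# The toy grep tool returns how many documents contain the query.
harness = Harness(tools={"grep": lambda q: str(sum(q in doc for doc in corpus))})
result = harness.run(scripted_policy)
print(result)  # ['grep:1', 'blocked:shell', 'grep:1']
```

The point of the sketch is that the allowlist and the step budget live in the harness, so they hold regardless of what the model proposes.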

2. Skills papers for reusable agent abilities

If your interest is what agents can learn to do repeatedly, the skills section is the most practical cluster. It covers skill creation, selection, governance, and self-evolution, so you can compare papers that treat skills as modular parts of an agent system.

That makes it a strong fit for teams building long-lived agents. Papers such as "SkillOS: Learning Skill Curation for Self-Evolving Agents", "SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution", and "SkillGrad: Optimizing Agent Skills Like Gradient Descent" show how broad the topic has become.

Skill themes you will see here:
- skill generation
- skill memory and management
- least-privilege enforcement
- skill evaluation
- self-evolving skill systems
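
As a rough illustration of two of these themes, skill management and least-privilege enforcement, the sketch below treats a skill as a named, modular unit that declares the permissions it needs. Nothing here comes from the listed papers: `Skill`, `SkillRegistry`, and the permission strings are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class Skill:
    name: str
    required: FrozenSet[str]        # permissions the skill needs to run
    run: Callable[[str], str]


class SkillRegistry:
    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def invoke(self, name: str, arg: str, granted: FrozenSet[str]) -> str:
        """Run a skill only if every permission it requires has been granted."""
        skill = self._skills[name]
        missing = skill.required - granted
        if missing:
            raise PermissionError(f"{name} lacks: {sorted(missing)}")
        return skill.run(arg)


registry = SkillRegistry()
registry.register(Skill("word_count", frozenset({"read"}), lambda text: str(len(text.split()))))
registry.register(Skill("publish", frozenset({"read", "network"}), lambda text: f"posted {text}"))

granted = frozenset({"read"})       # this agent may read but not use the network
count = registry.invoke("word_count", "skills as modular parts", granted)
try:
    registry.invoke("publish", "draft", granted)
    publish_blocked = False
except PermissionError:
    publish_blocked = True
print(count, publish_blocked)  # 4 True
```

Keeping the permission check in the registry rather than in each skill means a newly generated or evolved skill cannot skip it.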

3. Survey papers for fast field orientation

The survey bucket is the quickest way to understand where the research is going. Instead of one method, these papers map taxonomies, techniques, and open questions, which is helpful when you need a clean overview before choosing a subtopic.

For a broad starting point, "A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications" and "Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning" show the repository’s survey style. The collection also points to related work on collaboration, failure attribution, and self-evaluation.

- Good for literature reviews and slide prep
- Helps identify subfields worth deeper reading
- Pairs well with benchmark papers

4. Architecture papers for agent system design

The architecture section organizes papers around single-agent, multi-agent, and agent-ops patterns. That is useful if you are deciding how to structure a product, because the papers here are about system shape as much as model behavior.

Use this section when you need to compare coordination styles or operational patterns. The repo’s links make it easy to jump from broad architecture choices to more specific application areas like digital agents or enterprise agents.

- Single-agent setups for focused tasks
- Multi-agent setups for coordination and division of labor
- Agent-ops and UX for production deployment

5. Application papers for domain-specific use cases

The application sections are where the repository becomes especially useful for practitioners. Instead of staying abstract, it sorts papers into embodied, web, mobile, software, data, research, API, deep research, enterprise, and finance agents.

That lets you jump straight to the environment you care about. If you are building a browser worker, a coding assistant, or a research copilot, the application pages narrow the reading list quickly and reduce time spent on irrelevant papers.

Examples of application clusters:
- Web agents
- GUI agents
- Software agents
- Research agents
- Enterprise agents

How to decide

Pick harness papers if you care about execution and safety, skills papers if you want reusable capabilities, surveys if you need orientation, and architecture or application papers if you are building a system for a specific setting. For most readers, the fastest path is survey first, then harness or skills, then the application area that matches the product.
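
The rule of thumb above can be written down as a small lookup. This is only a sketch: the section names follow this article, while the goal keywords in `FOCUS_BY_GOAL` and the `reading_order` helper are illustrative assumptions, not part of the repo.

```python
from typing import List

# Section names follow the article; the goal keywords are illustrative assumptions.
FOCUS_BY_GOAL = {
    "execution": "Harness",
    "safety": "Harness",
    "reusable capabilities": "Skills",
    "system design": "Architecture",
}


def reading_order(goal: str, application: str = "") -> List[str]:
    """Survey first, then the focus section, then the matching application area."""
    order = ["Survey"]
    focus = FOCUS_BY_GOAL.get(goal.lower())
    if focus:                       # unknown goals fall back to surveys only
        order.append(focus)
    if application:
        order.append(f"Applications: {application}")
    return order


plan = reading_order("safety", application="Web agents")
print(plan)  # ['Survey', 'Harness', 'Applications: Web agents']
```

A goal that is not in the table, such as plain orientation, returns only the survey step, which matches the advice to start there.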

Because the repo is updated biweekly, it works well as a living reading list rather than a one-time roundup.
