Long-running Agents in Research Preview

Cursor

Cursor introduced, in research preview, autonomous agents capable of operating over extended periods to tackle larger and more complex tasks. These agents plan before executing and complete difficult work without human intervention, producing larger and more complete pull requests with fewer obvious follow-ups. During internal testing, long-running agents sustained over 1,000 commits per hour across hundreds of concurrent agents over a week-long test run.


Overview

Cursor announced a major advancement in autonomous coding with the release of long-running agents in research preview. These agents represent a significant leap forward in AI-assisted development, capable of working autonomously over extended time horizons to complete larger and more complex tasks than previously possible.

Core Capabilities

Long-running agents employ a fundamentally different approach compared to standard agents. Rather than immediately executing tasks, these agents begin by creating a comprehensive plan. This planning phase allows the agents to break down complex work into manageable components and establish clear execution strategies before making any code changes.
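The plan-before-execute pattern described above can be sketched in a few lines. This is a hypothetical illustration only: the names (`plan_task`, `execute`, `Step`, `Plan`) and the hard-coded decomposition are assumptions for clarity, not Cursor's actual internals or API.

```python
# Hypothetical sketch of the plan-before-execute pattern.
# All names and the fixed step list are illustrative, not Cursor's design.
from dataclasses import dataclass, field

@dataclass
class Step:
    description: str
    done: bool = False

@dataclass
class Plan:
    goal: str
    steps: list[Step] = field(default_factory=list)

def plan_task(goal: str) -> Plan:
    """Break a high-level goal into ordered steps before touching any code."""
    # A real agent would derive these steps with a model; this sketch
    # hard-codes a plausible decomposition.
    return Plan(goal, [
        Step("survey the affected files and map dependencies"),
        Step("implement the change behind existing interfaces"),
        Step("run tests and fix regressions"),
    ])

def execute(plan: Plan) -> Plan:
    """Work through each step in order, recording progress as it goes."""
    for step in plan.steps:
        # ... perform the step autonomously (edits, test runs, etc.) ...
        step.done = True
    return plan

completed = execute(plan_task("refactor the payment module"))
print(all(s.done for s in completed.steps))  # True
```

The key point the sketch captures is that the plan exists as an explicit artifact before any execution begins, so the agent can track progress against it over a long run.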

The agents are designed to operate independently without requiring constant human intervention. This autonomous operation enables developers to assign substantial features, complex refactors, or challenging bug fixes and return hours later to find completed work ready for review.

Performance and Results

During Cursor’s internal testing and research preview period, long-running agents demonstrated remarkable capabilities. The system scaled to over 1,000 commits per hour across hundreds of concurrent agents during a week-long continuous test run. This level of sustained autonomous activity represents a new frontier in AI-assisted software development.

Early participants in the research preview used long-running agents to accomplish tasks that were previously too challenging for standard agents. Common use cases included implementing large features spanning multiple files and systems, refactoring complex codebases with intricate dependencies, fixing challenging bugs requiring deep investigation, overhauling performance across entire applications, and creating comprehensive test coverage.

Quality of Output

The output quality from long-running agents showed measurable improvements over standard agent workflows. Cursor reported that the agents produced larger and more complete pull requests, meaning the initial implementation covered more edge cases and requirements. Additionally, the work required fewer obvious follow-ups, indicating that the agents anticipated and addressed common issues during their autonomous execution rather than requiring iterative human feedback.

Agent Architecture

Long-running agents utilize multiple specialized agents that check each other’s work. This multi-agent architecture creates a system of checks and balances, where different agents can review code quality, verify test coverage, and ensure implementations align with the original plan. This collaborative approach among agents helps maintain high code quality even during extended autonomous operation.

Availability

Cursor made long-running agents available at cursor.com/agents for users on Ultra, Teams, and Enterprise plans. The feature launched as a research preview, allowing Cursor to gather real-world usage data and continue refining the system based on developer feedback.