Spent the last week at Jellyfish researching what autonomy in coding agents actually looks like in practice, specifically for Claude. We found that most sessions are short, but once you zoom into the top 0.1%, the picture changes, and a lot.

Three things jumped out:

1. Productivity scales steeply with duration. Going from sub-P90 turns to P99.9+ turns, commits per turn go up roughly 20x and merged PRs about 11x. The work mix barely shifts (slightly more features, slightly less refactoring), so these long runs look like generalists doing way more, not specialists doing something different.

2. Over 80% of long-running turns happen with the human barely involved. And with new models like Opus 4.7 pushing how long agents can stay on task, coding agents are entering another era of scale and volume.

3. Output concentrates in those low-supervision long runs. Average net lines of code lands around 11,800 per turn, vs. ~1,000 when supervision is high for the same duration: about 11x.

My guess for 2026-2027: the bottleneck shifts away from model coherence and toward CI queues, review capacity, and whatever infra has to sit underneath fleets of these agents running in the background. Having a surface area that agents can interact with 24/7 will become business critical.

Link in comments to the complete details.

#AI #Agents #Claude #Autonomy