The Fetch: Week 25, 2026

OCR without page limits, agents joining the org chart, fund sleuthing, and faster Tailwind cleanup

Anshul Desai

Jun 24, 2026

Unlimited Ocr: OCR Stops Thinking in Pages

github.com/baidu/Unlimited-OCR | License: MIT

The Motion: One Shot Beats Document Stitching

Unlimited Ocr is chasing a very specific pain point: OCR that falls apart the second a document gets long, multi-page, or visually messy. The big sell is one-shot long-horizon parsing, which lets the model read across an entire document instead of treating every page like an isolated island. It handles single images, multi-page inputs, and PDFs, with a 32,768-token context window that pushes way past the usual OCR comfort zone. Honestly, that’s why stars showed up fast. The repo landed right as people were getting tired of brittle page-by-page extraction pipelines.

The Wave: Document AI Wants Longer Memory

The interesting part is who this pulls in next. Anyone building invoice extraction, report ingestion, research pipelines, or enterprise document tools should be watching closely, because multi-page parsing is where a lot of OCR stacks still get weird. This feels like the next step after strong OCR models that can read a page but still struggle to understand a document. What would make this unstoppable is a dead-simple benchmark story for real-world PDFs, especially ugly scans, tables, and mixed-layout files. If those wins stay obvious, Unlimited Ocr has a real shot at becoming the default thing people test first.

Stars: 6,215 | Language: Python

Zhengxi Views: Fund Research With Receipts

github.com/lyra81604/zhengxi-views | License: Other

The Motion: Traceable Alpha Brain

This repo turns one fund manager’s public record into a surprisingly usable AI skill. Zhengxi Views is built on 2012 to 2026 source material from Zheng Xi, then layers in method distillation backed by direct quotes and real fund data across his own products plus roughly 27,000 market-wide funds. That combo is why people are starring it now. AI finance tools usually sound confident and make things up. This one is obsessed with provenance, and that hits a nerve fast when everyone is tired of slick investment answers with zero receipts.

The Wave: Niche Today, Sticky Tomorrow

The interesting part is how portable this feels. It is not just a knowledge base. It is an opinionated Agent Skill that works across Claude, Cursor, ChatGPT, Gemini, and WorkBuddy, with extras like scorecards and fund comparison baked in. That makes it bigger than a fan project. Anyone building domain-specific AI research tools should pay attention, especially in finance where trust matters more than fluency. The next move would be making the trust layer even more visible with cleaner demos, benchmark evals, and side-by-side examples of sourced answers versus generic AI output. That would make this unstoppable.

Stars: 987 | Language: Python

Agent Apprenticeship: Agent Work Should Compound

github.com/Forsy-AI/agent-apprenticeship | License: MIT

The Motion: Training Data From Actual Work

This one is chasing a big idea with surprisingly concrete parts. Agent Apprenticeship turns everyday agent runs into reusable experience, with Contribution Bundles, Experience Packs, and a public ecosystem for sharing traces, lessons, and task rollouts. The gap is obvious: agents keep re-solving the same problems from scratch, and all that useful execution data disappears. Here, each run can feed the next one. Honestly, the traction makes sense because it ships with 500+ seed tasks, 1,000+ execution traces, and support for tools people already use like Claude Code, Cursor, and Codex.

The Wave: Open Source Apprenticeships for Agents

If this keeps moving, it could become the default exchange layer for agent learning outside closed labs. That matters for anyone building long-horizon workflows, local agent tooling, or post-training pipelines that need more than benchmark theater. The interesting part is the economic angle: tasks are not just completed, they are treated like reusable job training for future agents. The next move is making trust and quality signals impossible to ignore. Better ranking, provenance, and clear feedback loops would make this feel unstoppable as more community-contributed experience starts flooding in.

Stars: 901

Agent Space: Slack for Digital Coworkers

github.com/HKUDS/AgentSpace | License: Apache-2.0

The Motion: Agents Move Into the Org Chart

Agent Space is building an agent-native workspace where humans and AI workers actually share the same operational context. The standout is AgentRouter, which lets the same agent hop across Claude Code, Codex, OpenClaw, Hermes, and OpenCode without losing identity, task history, or permissions. That matters because most agent setups still live in one person’s terminal and fall apart the second a team gets involved. Honestly, the traction makes sense. This is catching stars now because it turns scattered agent experiments into something durable, governable, and team-shaped.

The Wave: Governance Becomes the Product

The bigger bet here is that companies will want digital employees with owners, approvals, audit trails, and transferability, not just clever chats with tools attached. Agent Space is early, but the direction is very right for ops teams, AI platform builders, and anyone trying to make agents survive contact with real orgs. The interesting part is how much it treats permissioning and handoffs as product features, not cleanup work. The next move that would make this unstoppable is a dead-simple onboarding path with killer templates for common teams, because distribution follows clarity.

Stars: 349 | Language: TypeScript

Cnfast: Tailwind Class Merging, Minus the Drag

github.com/aidenybai/cnfast | License: Other

The Motion: The Smallest Bottleneck Finally Matters

Cnfast is a drop-in replacement for the cn helper that powers half the Tailwind component ecosystem, especially in shadcn-heavy apps. The pitch is suspiciously clean: same API, byte-identical output to tailwind-merge, and benchmarks showing it running 3.8x faster on average, with bigger wins on re-render-heavy UIs like data grids and dashboards. The interesting part is the timing. Frontend teams are getting more performance-sensitive, and this attacks a tiny utility that gets called everywhere. The built-in migrate command and shadcn registry install make switching almost frictionless.
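For anyone who has never looked inside a cn-style helper, here is a rough sketch of the job it does: drop falsy inputs, split everything into class tokens, and let the last conflicting utility win. This is a toy, not Cnfast's or tailwind-merge's code. The name toyCn and the four-entry CONFLICT_PREFIXES table are made up for illustration, and the real libraries understand far more of Tailwind's class groups than this does.

```typescript
// Toy stand-in for a cn-style helper. NOT cnfast or tailwind-merge:
// the names and the tiny conflict table below are hypothetical.
type ClassInput = string | false | null | undefined;

// Assumption: only these spacing prefixes are treated as conflicting groups.
const CONFLICT_PREFIXES = ["px-", "py-", "mx-", "my-"];

function conflictKey(cls: string): string {
  // Keep variants such as "hover:" in the key, so "hover:px-2"
  // only competes with other "hover:" padding classes.
  const idx = cls.lastIndexOf(":");
  const variant = idx === -1 ? "" : cls.slice(0, idx + 1);
  const utility = cls.slice(idx + 1);
  const prefix = CONFLICT_PREFIXES.find((p) => utility.startsWith(p));
  // Unknown utilities use the full class as their key, so they only
  // ever replace exact duplicates of themselves.
  return prefix ? variant + prefix : cls;
}

function toyCn(...inputs: ClassInput[]): string {
  // Drop falsy inputs, then split the rest into individual class tokens.
  const tokens = inputs
    .filter((v): v is string => typeof v === "string" && v.length > 0)
    .join(" ")
    .split(/\s+/)
    .filter(Boolean);

  // Remember the last position at which each conflict key appears.
  const lastIndexByKey = new Map<string, number>();
  tokens.forEach((t, i) => lastIndexByKey.set(conflictKey(t), i));

  // Keep a token only if it is the last one seen for its key.
  return tokens
    .filter((t, i) => lastIndexByKey.get(conflictKey(t)) === i)
    .join(" ");
}
```

Calling toyCn("px-2 py-1", false, "px-4") returns "py-1 px-4": the falsy input disappears and the later px-4 replaces px-2. A helper like this runs on nearly every component render, which is why shaving time off it can add up.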

The Wave: Performance Nerd Candy With Mass Appeal

This has real breakout energy because it turns an invisible primitive into an obvious upgrade. Teams shipping Tailwind at scale, design system maintainers, and anybody living in component churn should pay attention. The tagged template mode is especially sharp, with call-site caching that squeezes even more speed from repeated renders. Honestly, projects like this spread fast because they ask for almost nothing and give back measurable gains. The next move that would make this unstoppable is broader engine-level guidance, especially around Bun and Safari, so adoption feels just as confident outside the V8 world.
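Call-site caching works because JavaScript hands a tagged template the same frozen strings array every time a given call site runs. Here is a minimal sketch of the general technique, not Cnfast's actual implementation. The tw tag, the normalize function, and the normalizeCalls counter are all hypothetical names, and normalize is just whitespace cleanup standing in for a real class merge.

```typescript
// Hypothetical sketch of call-site caching with a tagged template.
// The strings array is identical across evaluations of one call site,
// so it can serve as a WeakMap key.
const callSiteCache = new WeakMap<TemplateStringsArray, string>();

// Counts slow-path runs so the cache's effect is observable.
let normalizeCalls = 0;

// Stand-in for an expensive class merge: collapses whitespace only.
function normalize(raw: string): string {
  normalizeCalls++;
  return raw.split(/\s+/).filter(Boolean).join(" ");
}

function tw(
  strings: TemplateStringsArray,
  ...values: Array<string | false | null | undefined>
): string {
  if (values.length === 0) {
    // Fully static template: the result can never change for this call site.
    const cached = callSiteCache.get(strings);
    if (cached !== undefined) return cached;
    const result = normalize(strings.join(""));
    callSiteCache.set(strings, result);
    return result;
  }
  // Interpolated values can differ per call, so this sketch skips the cache.
  const raw = strings.reduce(
    (acc, s, i) => acc + s + (i < values.length ? values[i] || "" : ""),
    ""
  );
  return normalize(raw);
}

// One call site: every invocation passes the same strings array to tw.
function buttonClasses(): string {
  return tw`px-4   py-2 rounded`;
}
```

Calling buttonClasses() twice runs normalize only once, because the second call hits the WeakMap. This sketch only caches templates with no interpolations. A production version would need some way to account for dynamic values, which is left out here.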

Stars: 870 | Language: TypeScript
