
Peek into the practical side of AI agent development. We cover key frameworks like LangGraph, essential protocols like the new MCP and A2A, and tips to help developers build effective AI agents.
You've seen it, right? LLMs got incredibly good, incredibly fast. They're now at the level of an intern dev: making plenty of mistakes, but at least making them very enthusiastically.
That rapid progress naturally makes us ask: what's next? And how do we get the most out of what we have today?
The continuing maturation of "AI agents" is one trend we can be sure of: AI systems that use LLMs as their brain and don't just talk, but take actions in the digital world.
Usually, we see three key functions: planning, memory, and tool use.
Still, the current tech has its challenges, like LLM context limits and the occasional hallucination.
Though there are ways to manage them, you often need use-case-specific workarounds.
In this post, we want to cut through the hype and give you a practical overview of where AI agent development is right now—plus some predictions for the near future. We'll look at the core ideas, the frameworks we use, and some tools that might change how you build things.
For the broad strokes of how you should effectively build agentic systems, Anthropic’s blog post is probably the gold standard. Key points:
1. The distinction between predefined agentic workflows and fully autonomous agents
2. Core principles: keep the design simple, make the agent's planning transparent, and craft the agent-computer interface carefully
3. Tailoring the pattern to your use case instead of reaching for the most complex setup
These points resonate with our own earlier overview of agent development, which is very accessible even for non-technical audiences. (Give it a read if you need to cover more basics.)
So we’ve covered how to think about building agents, but which actual tools and concepts do we use to put them into practice?
Now, AI agents didn't appear overnight. Here are some highlights from recent years that made an impact on our daily work:
First, foundational concepts like ReAct and Tree of Thoughts established how the building blocks of agents should work together. These ideas were then operationalized by software frameworks like LangChain and LlamaIndex.
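To make that concrete, here's a from-scratch sketch of the ReAct pattern's thought-action-observation loop. Everything in it (the `call_llm` stub, the toy tool registry, the `FINAL:`/`ACTION:` markers) is illustrative, not taken from any particular framework:

```python
# A minimal ReAct-style loop, sketched from scratch (no framework assumed).
# `call_llm` is a stand-in for a real model call; here it ends immediately.
def call_llm(prompt: str) -> str:
    return "FINAL: stub answer"

TOOLS = {"search": lambda query: f"(stub) results for: {query}"}

def react_agent(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # The model alternates Thought -> Action -> Observation until it
        # decides it can answer (signalled here with a FINAL: prefix).
        step = call_llm(transcript + "Thought:")
        transcript += f"Thought: {step}\n"
        if step.startswith("FINAL:"):
            return step.removeprefix("FINAL:").strip()
        if step.startswith("ACTION:"):
            name, _, arg = step.removeprefix("ACTION:").strip().partition(" ")
            observation = TOOLS.get(name, lambda a: "unknown tool")(arg)
            transcript += f"Observation: {observation}\n"
    return "No answer within the step budget."

print(react_agent("What is ReAct?"))
```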
Initial excitement around early agents like AutoGPT exposed real challenges, which led us to focus on more structured, controllable systems.
There's also a clear trend towards frameworks offering explicit control over execution logic and state, moving from simple chains to graph-based approaches and multi-agent orchestration.
That brings us to LangGraph, built specifically for stateful agents with controllable graph workflows. It quickly became our go-to for building agents in 2024, and still is today.

A huge plus is its checkpointing system: every step and every state the agent passes through in the graph can be saved. You can see exactly how many tokens were used and what decisions were made, which makes monitoring far easier; there's no need to build a custom logging system from scratch. You can keep this state in memory or in a SQL database, and easily pull out all that info after the agent has run. It gives you incredible visibility into what the agent actually did (see the demo towards the end of our latest webinar).

Of course, it's far from the only option if you're looking for these capabilities, and there are other players who specialize in different interaction patterns.
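Here's roughly what that looks like in practice: a minimal sketch of a one-node LangGraph graph with in-memory checkpointing. The node, state fields, and thread ID are our own illustrative choices:

```python
# Minimal LangGraph checkpointing sketch; node and state are illustrative.
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.checkpoint.memory import MemorySaver

class AgentState(TypedDict):
    question: str
    answer: str

def answer_node(state: AgentState) -> dict:
    # In a real agent this would call an LLM; we stub it out here.
    return {"answer": f"(stub) answering: {state['question']}"}

builder = StateGraph(AgentState)
builder.add_node("answer", answer_node)
builder.add_edge(START, "answer")
builder.add_edge("answer", END)

# MemorySaver persists every state the graph passes through, per thread.
graph = builder.compile(checkpointer=MemorySaver())

config = {"configurable": {"thread_id": "demo-1"}}
graph.invoke({"question": "What changed in agent tooling this year?"}, config)

# Replay the saved states after the run, e.g. for monitoring or debugging.
for snapshot in graph.get_state_history(config):
    print(snapshot.values)
```

Swapping `MemorySaver` for a SQL-backed checkpointer is how you'd persist the same history to a database.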
Initially, we eagerly followed progress along what seemed like two parallel tracks: flexible open-source frameworks versus the integrated platform solutions of the big labs. Luckily, there is now a stronger push for interoperability and standardization in context management and tool integration.
OpenAI has just released their own 34-page practical guide to building agents. (See page 17 for multi-agent systems.) The white paper is great for starting out; however, some question its contents.
Harrison Chase, the CEO of LangChain, took it to heart when OpenAI described their own Agents SDK 'framework' as non-declarative, subtly criticizing more declarative frameworks like LangGraph or LlamaIndex Workflows as inflexible, complex, and hard to maintain, which is misleading.
In reality, those frameworks let developers clearly define high-level structure while retaining precise control over behavior and logic, precisely because they combine declarative and imperative approaches.
This is missing from OpenAI's Agents SDK, and from Google's for that matter. Both are sets of abstractions: great for quickly putting together demos and spinning up tool-calling agents, but without robust orchestration layers. Simply defining handoffs and tool calls yields an AI app with less human control (high variance) and 'fake' flexibility.
Chase was motivated enough to start comparing agent frameworks and put together a spreadsheet (which will hopefully be updated from time to time).
With all that said, the trend is clear:
Agentic systems today are much more reliable and easier to build, despite the growing number and complexity of available options. How is that possible?
Building an agent's "brain" is one thing. The trick was always getting them to reliably interact with APIs, databases, files, other tools, and other AI agents.
Historically, every time you wanted your agent to use a new tool or data source, you had to build a custom integration, which could have scaling issues, reliability issues, or both.
Enter the Model Context Protocol (MCP).
Anthropic kicked off MCP in late 2024, aiming to create a universal standard for connecting LLMs to anything. Not just their own models, but anyone’s.
How’s it different from tool calling? Well, it works consistently across different models and platforms.
Whether it's talking to a Git repo, a SQL database, or a web API, MCP aims for minimal setup.
It basically works like this: your application (the MCP host) connects through MCP clients to one or more MCP servers, and each server exposes tools, resources, and prompts in a standard format. The host can discover what a server offers and call it without any custom glue code.
Why is this a big deal for developers?
It makes adding tools to your agents way easier, especially for frameworks like LangChain or even ReAct-style agents.
Define your tool server once and hook it into your own code, whatever the stack. No more reinventing the wheel for every integration.
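As an illustration, here's what a tiny MCP tool server can look like with the official Python SDK's `FastMCP` helper; the `word_count` tool is a made-up example:

```python
# A minimal MCP tool server sketch using the official Python SDK's FastMCP
# helper (pip install mcp). The tool itself is a made-up example.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-tools")

@mcp.tool()
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

if __name__ == "__main__":
    # Serve over stdio so any MCP-capable host can launch and use it.
    mcp.run()
```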
But MCP only really started gaining steam when OpenAI also embraced the standard in March 2025, alongside their SDK and Responses API releases.
The community jumped on it. We quickly saw collections like the awesome-mcp-servers repo pop up, linking agents to hundreds of real tools.
Just last week, LangChain even launched an open-source library to connect any LLM to MCP tools easily.
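In case it helps, here's roughly how that wiring looks. Note that `langchain-mcp-adapters` is young and its client API has been changing between versions, so treat the exact calls below as an assumption rather than a verbatim recipe:

```python
# Sketch of loading MCP tools into LangChain via langchain-mcp-adapters.
# The connection details point at the server sketched above; adjust to taste.
import asyncio
from langchain_mcp_adapters.client import MultiServerMCPClient

async def load_tools():
    client = MultiServerMCPClient(
        {
            "demo": {
                "command": "python",
                "args": ["server.py"],  # the MCP server file from earlier
                "transport": "stdio",
            }
        }
    )
    # Returns the servers' tools as ordinary LangChain tools.
    return await client.get_tools()

tools = asyncio.run(load_tools())
```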
Then, Google entered the chat with their own Agent Development Kit (ADK) and Agent2Agent (A2A) protocol. Things are moving fast!
A2A focuses specifically on how multiple agents can collaborate securely and effectively, especially across different platforms or when tasks take a long time.
MCP acts as the standard for an agent (the "Host") to talk to its tools (via MCP Servers). A2A, on the other hand, is designed for agents to talk to each other. It adds features MCP doesn't focus on, like secure authentication between agents, managing shared tasks and state, and discovering what other agents can do.
They're complementary. You might have multiple agents (communicating via A2A) where each agent uses MCP to access its own set of tools. This distinction is quite important.
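To illustrate the discovery side of A2A, here's a mock "agent card": the document an agent publishes (typically at `/.well-known/agent.json`) so other agents can find out what it can do. The agent, its URL, and the exact field names are our illustrative reading of the spec:

```python
# Mock A2A agent card, written as a Python dict for readability.
# All values are made up; field names follow our reading of the A2A spec
# and may not match it exactly.
agent_card = {
    "name": "report-writer",
    "description": "Drafts summary reports from structured data.",
    "url": "https://agents.example.com/report-writer",  # hypothetical endpoint
    "capabilities": {"streaming": True},
    "skills": [
        {
            "id": "draft_report",
            "name": "Draft report",
            "description": "Turns a dataset reference into a draft report.",
        }
    ],
}
```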
It's still early days, and we might see some competition as both protocols evolve. But the overall trend towards standardization is great news for us devs trying to build more complex, interconnected systems.
So, these tools and frameworks are great incremental steps. But what's the bigger picture?
Many expect the next leap to come from inference scaling, i.e., letting models run much longer, more complex planning and action sequences.
Agents would be one step closer to automating entire workflows, not just small parts of tasks.
There's this idea floating around: a new Moore's Law for AI agents. The length of a task an AI can complete autonomously is doubling every 7 months or so.
Today, agents can tackle tasks that take humans an hour. If trends hold, in a few years, they might handle tasks that take us a month. That's a path towards seriously powerful capabilities.
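A quick back-of-the-envelope check of that claim (assuming a work month of roughly 160 hours, which is our assumption, plus the 7-month doubling period above):

```python
# Rough arithmetic behind the "hour today, month in a few years" claim.
import math

doubling_months = 7        # claimed doubling period
current_hours = 1          # task length agents handle today
target_hours = 160         # ~one human work month (assumption)

doublings = math.log2(target_hours / current_hours)  # ~7.3 doublings
months = doublings * doubling_months                 # ~51 months
print(f"~{doublings:.1f} doublings -> ~{months:.0f} months (~{months / 12:.1f} years)")
```

So "a few years" checks out: a bit over four, if the trend holds.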
What does this mean for us developers (besides a lot of reward hacking)?
Will AI make our jobs obsolete? We don't think so, at least not in the next 5-6 years.
Instead, we foresee a shift: the generator-versus-validator split (AI generates, humans validate the results) won't go away for some time, even though we already see o3 handling almost 50% of OpenAI's internal pull requests (see figure 20 of their System Card).
But the idea of "vibe coding"—prompting without deep technical understanding—probably won't scale very well in the short term.
Enterprise-grade applications still need skilled humans to design the architecture, validate what the AI produces, and keep security, compliance, and deployment under control.
Until we reach superintelligence or the literal technological singularity, it's still humans augmenting AI for almost all real-world uses.
But even in the most conservative, grounded scenario of AI evolution, one trend is undeniable: coding assistants themselves are becoming increasingly autonomous and capable. Tools like GitHub Copilot, Cursor, Replit, Firebase Studio, or Codex CLI can tackle bigger tasks across your whole project, suggesting edits, analyzing errors, and even helping generate full-stack apps from prompts in record time, if you know what you're doing.
You could say that, yeah, AI assistance in your IDE is great for speeding things up, but deploying full-stack applications is a bit more involved. Indeed, it is.
Next month, we will be releasing a write-up with specific best practices and architecture blueprints for every step of the development lifecycle (code review, bug fixes, testing, etc.). This white paper will be sourced directly from hands-on experience in our most recent, real-world client projects. If you really don't want to miss it, follow us on LinkedIn.
Beyond the immediate impact on developer workflows and tools, deploying agents at scale across an organization suggests a fundamental shift in how enterprises might operate. Thinking about that future, here are some potential operating principles that seem to be emerging:
Some takeaways from our recent projects and the overall trends discussed above:
So, what's the bottom line?
A clear direction towards simplification.
Building basic agents, especially ones that call tools, is definitely getting easier.
As the underlying models get smarter, they also need less hand-holding.
Expect less reliance on rigidly defined workflows; we'll only be needed for very high-level agent design.
The goal is for agents to deeply understand what you need and figure out how to get there themselves. That's the exciting part.