Smolagents is amazing
Smolagents is one of those projects that will change the industry. It was released at just the right time, building on recent open-source advances in reasoning models like DeepSeek-R1.
The key insight here is that LLMs have been trained on a large corpus of Python code and can express procedures in Python better than in English or JSON. So let the LLMs do that, then evaluate the code and give the model feedback. Hugging Face calls this process an agent.
Reasoning models build upon the LLM's knowledge of Python to reason about code. They do this by talking through the problem using thinking tokens. "Talking to themselves" gives them room to express concepts and then prompt themselves to improve their decisions.
In my experimentation, I have found that reasoning models respond well to feedback and can fix errors in their code without any help from humans; they only need feedback from the real world. This goes a long way toward eliminating hallucinations, because the models can reconcile their knowledge with reality and correct their mistakes.
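To illustrate the shape of that loop, here is a hypothetical sketch. This is not smolagents' code: `llm_complete` is a stand-in for whatever chat-completion call you use, and plain `exec` stands in for the sandboxed interpreter discussed below.

```python
def run_agent(task: str, llm_complete, max_steps: int = 5) -> str:
    """Generate code, execute it, and feed errors back until it works."""
    history = [f"Task: {task}\nReply with Python code that sets a variable `result`."]
    for _ in range(max_steps):
        code = llm_complete("\n".join(history))
        namespace: dict = {}
        try:
            exec(code, namespace)  # run the model's code against the real world
            return str(namespace["result"])
        except Exception as err:
            # Real-world feedback: show the model its own error and retry.
            history.append(f"Your code raised {err!r}. Please send corrected code.")
    return "gave up after max_steps attempts"
```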
Let’s put aside the abilities of the reasoning models for a moment and look at the code. This repository is an example of really great software architecture and development. It uses exactly the right amount of abstraction for the problem at hand, which is not very much.
The CodeAgent is the innovative part of the project. It uses a LocalPythonInterpreter (or a remote sandbox) to evaluate the code generated by the model. The LocalPythonInterpreter code is a great example of how to build an interpreter for a language.
The LocalPythonInterpreter is a great starting point if you wanted to build a Python interpreter of your own. First, the code is parsed into an abstract syntax tree:
```python
expression = ast.parse(code)
```
After it is parsed into an expression, it is recursively evaluated. There are `evaluate_*` functions for each kind of AST node, all dispatched through `evaluate_ast`:

```python
result = evaluate_ast(node, state, static_tools, custom_tools, authorized_imports)
```
There is a `state` dictionary that is passed along through the evaluation of the AST. I have used it to trace the tools that were called during the evaluation.
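To make that concrete, here is a toy evaluator in the same recursive spirit, using only the standard library `ast` module. It is not smolagents' actual implementation; the node types handled and the `state` handling are heavily simplified for illustration.

```python
import ast

def evaluate_node(node: ast.AST, state: dict):
    """Toy recursive AST evaluator (not smolagents' actual code)."""
    if isinstance(node, ast.Module):
        result = None
        for statement in node.body:
            result = evaluate_node(statement, state)
        return result
    if isinstance(node, ast.Expr):
        return evaluate_node(node.value, state)
    if isinstance(node, ast.Assign):
        value = evaluate_node(node.value, state)
        for target in node.targets:
            state[target.id] = value  # only handles simple `name = ...` targets
        return value
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return evaluate_node(node.left, state) + evaluate_node(node.right, state)
    if isinstance(node, ast.Name):
        return state[node.id]
    if isinstance(node, ast.Constant):
        return node.value
    raise NotImplementedError(f"unsupported node: {type(node).__name__}")

state = {}
print(evaluate_node(ast.parse("x = 1 + 2\nx + 39"), state))  # 42
print(state)  # {'x': 3} -- the state dict records everything assigned along the way
```

The `state` dictionary is what makes tracing easy: every assignment (and, in the real interpreter, every tool call) flows through it, so you can inspect it after the run.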
The LocalPythonInterpreter is a hidden gem inside this project. While it may not be extremely secure, it is a great tool for learning about building interpreters.
The CodeAgent works by defining tools that can be used within the Python code that the LocalPythonInterpreter evaluates. These tools can call any Python code and give the agent access to the real world.
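A tool is essentially a documented Python function. Here is a minimal sketch; the exact imports may differ between smolagents versions (`HfApiModel` was the default model wrapper at the time of writing), and `get_weather` is a made-up stub, not a real tool.

```python
from smolagents import CodeAgent, HfApiModel, tool

@tool
def get_weather(city: str) -> str:
    """Returns a weather report for a city.

    Args:
        city: Name of the city to look up.
    """
    # Stub for illustration; a real tool would call a weather API.
    return f"Sunny and 22°C in {city}"

agent = CodeAgent(tools=[get_weather], model=HfApiModel())
agent.run("Should I bring an umbrella in Paris today?")
```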
Exploring this project raised a few questions for me:
- How well could an agent incorporate feedback from the tools?
- How good are they at problem solving when given incomplete information?
- Can they reach out into the real world to find out what they need to know to complete a task?
- What if we gave them access to all the tools that developers have while they are coding? Can they produce the same or better code?
In future posts I'll explore these questions with tools that interact with real-world systems to see how they fare.