<aside> 🦥

Sources

State of Agent Engineering – LangChain

Effective context engineering for AI agents – Anthropic

Demystifying evals for AI agents – Anthropic

LLM Prompt Injection Prevention – OWASP

</aside>

🦥 Sloth's Simple Version

<aside> 🦥

Making an agent look smart in a demo is easy. Making one that works reliably when real people use it is the actual game. In a survey of 1,300+ teams, the #1 thing blocking agents from production was quality, not cost. These habits are how you get there.

</aside>

1. Start Stupidly Narrow

<aside> 🦥

Give the agent one clear job with clear inputs, outputs, and limits. Agents fall apart when the task is vague: that is when they hallucinate and wander off. Nail one task, then expand.

</aside>
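
One way to make "one clear job" concrete is to write the job down as a small contract and check requests against it before the agent runs. The sketch below is only an illustration: `TaskSpec`, `check_request`, and the refund example are hypothetical names, not part of any framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical contract for a single narrow agent job."""
    name: str
    required_inputs: tuple[str, ...]
    allowed_outputs: tuple[str, ...]
    max_steps: int  # hard limit on how long the agent may loop


def check_request(spec: TaskSpec, request: dict) -> list[str]:
    """Return a list of problems; an empty list means the request is in scope."""
    problems = [
        f"missing input: {key}"
        for key in spec.required_inputs
        if key not in request
    ]
    extra = sorted(set(request) - set(spec.required_inputs))
    problems += [f"unexpected input: {key}" for key in extra]
    return problems


def check_output(spec: TaskSpec, label: str) -> bool:
    """Accept only the outputs the spec lists; everything else is out of bounds."""
    return label in spec.allowed_outputs


# Illustrative example of a narrow job (assumed, not from the sources).
REFUND_TRIAGE = TaskSpec(
    name="refund_triage",
    required_inputs=("order_id", "customer_message"),
    allowed_outputs=("approve", "deny", "escalate"),
    max_steps=5,
)
```

If a request fails `check_request`, the agent never starts, which is usually cheaper than letting it guess.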

2. Context Engineering Is The Whole Job

<aside> 🦥

When an agent fails, it is usually not a dumb model. It is the wrong context. Your job is to put the smallest set of high-signal info in front of the model at each step. Four moves to remember:

Sloth rule: context is like milk, best served fresh and condensed. 🥛

</aside>
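
The "smallest set of high-signal info" idea can be sketched as picking the best-scoring snippets that fit a token budget. Everything here is an assumption for illustration: the relevance scores are supplied by the caller, and `estimate_tokens` uses a rough four-characters-per-token guess rather than a real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    # Swap in your model's real tokenizer for accurate counts.
    return max(1, len(text) // 4)


def build_context(snippets: list[tuple[float, str]], budget: int) -> list[str]:
    """Greedily keep the highest-scoring snippets that still fit the budget.

    `snippets` is a list of (relevance_score, text) pairs. How the scores
    are produced (search, embeddings, rules) is outside this sketch.
    """
    chosen: list[str] = []
    used = 0
    for _score, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost > budget:
            continue  # skip anything that would blow the budget
        chosen.append(text)
        used += cost
    return chosen
```

A greedy pick like this is not optimal packing, but it keeps the behaviour easy to predict: low-signal text is the first thing to be dropped.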

3. Give Fewer, Better Tools

<aside> 🦥

A few well-described tools beat a pile of overlapping ones. Write clear tool names and descriptions, use formats the model already knows, and design tools that are hard to use wrong. If the agent keeps picking the wrong tool, fix the descriptions before you blame the model.

</aside>
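
Here is a rough sketch of a tool that is "hard to use wrong": a specific name, a description that says when not to use it, a closed set of allowed values, and error messages the model can act on. The `search_orders` tool and its fields are hypothetical, and the dictionary follows the general JSON Schema style that many function-calling APIs accept; the exact wrapper keys vary by provider.

```python
from enum import Enum


class OrderStatus(str, Enum):
    OPEN = "open"
    SHIPPED = "shipped"
    REFUNDED = "refunded"


# Hypothetical tool definition in JSON Schema style.
SEARCH_ORDERS_TOOL = {
    "name": "search_orders",
    "description": (
        "Find a customer's orders by status. "
        "Use only for order lookups, not for issuing refunds."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "customer_id": {"type": "string"},
            "status": {"type": "string", "enum": [s.value for s in OrderStatus]},
            "limit": {"type": "integer", "minimum": 1, "maximum": 20},
        },
        "required": ["customer_id", "status"],
    },
}


def search_orders(customer_id: str, status: str, limit: int = 5) -> dict:
    """Validate arguments strictly and return errors the model can fix itself."""
    try:
        parsed = OrderStatus(status)
    except ValueError:
        valid = ", ".join(s.value for s in OrderStatus)
        return {"error": f"status must be one of: {valid}"}
    if not 1 <= limit <= 20:
        return {"error": "limit must be between 1 and 20"}
    # Placeholder: a real implementation would query an order store here.
    return {"customer_id": customer_id, "status": parsed.value, "limit": limit, "orders": []}
```

Returning a plain error message instead of raising gives the agent a chance to correct its own call on the next turn.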

4. Keep A Human In The Loop

<aside> 🦥

5. Evaluate, Do Not Just Vibe-Check

<aside> 🦥

You cannot improve what you do not measure. Build evals (test cases your agent runs against):

Then add observability: trace every step so you can see where it went sideways. Nearly 9 in 10 teams running agents in production do this. It is table stakes.

</aside>
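
A minimal sketch of what an eval run plus a trace can look like. It is deliberately tiny and makes several assumptions: the agent is any function from text to text, a case passes when the expected label appears in the output, and the trace records one entry per case rather than every internal step. `toy_agent` and `CASES` are made up for illustration.

```python
from typing import Callable


def run_evals(agent: Callable[[str], str], cases: list[dict]) -> dict:
    """Run every case, record what happened, and report the pass rate."""
    if not cases:
        raise ValueError("need at least one eval case")
    trace = []
    passed = 0
    for case in cases:
        output = agent(case["input"])
        ok = case["expected"] in output  # simple substring check (assumption)
        passed += ok
        trace.append({"input": case["input"], "output": output, "passed": ok})
    return {"pass_rate": passed / len(cases), "trace": trace}


def toy_agent(text: str) -> str:
    # Stand-in for a real agent call; it never answers "deny".
    return "escalate" if "lawyer" in text else "approve"


CASES = [
    {"input": "Item arrived broken", "expected": "approve"},
    {"input": "My lawyer will call you", "expected": "escalate"},
    {"input": "I never ordered this", "expected": "deny"},
]

report = run_evals(toy_agent, CASES)
failures = [entry["input"] for entry in report["trace"] if not entry["passed"]]
```

The number tells you whether things got better or worse; the trace tells you which case to go look at.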

6. Respect The Token Bill

<aside> 🦥

Agentic loops can burn tokens fast, and plenty of teams have been shocked by the bill. Use cheaper and faster models for easy steps, keep context short, and do not let the agent re-read everything on every single turn.

</aside>
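
Two of these habits, routing easy steps to a cheaper model and not re-sending the whole history, can be sketched in a few lines. The model names and the set of "easy" step kinds below are placeholders, not real model identifiers or a recommended split.

```python
CHEAP_MODEL = "small-model"   # placeholder name, not a real model id
STRONG_MODEL = "large-model"  # placeholder name, not a real model id

# Assumption: which step kinds count as "easy" is your call, not a standard.
EASY_STEPS = {"classify", "extract", "summarize"}


def pick_model(step_kind: str) -> str:
    """Send easy steps to the cheap model and everything else to the strong one."""
    return CHEAP_MODEL if step_kind in EASY_STEPS else STRONG_MODEL


def trim_history(messages: list[str], keep_last: int = 4) -> list[str]:
    """Keep the first message (the task) plus only the most recent turns."""
    if keep_last < 0:
        raise ValueError("keep_last must be non-negative")
    if len(messages) <= keep_last + 1:
        return list(messages)
    return [messages[0]] + messages[len(messages) - keep_last:]
```

Dropping old turns outright is the bluntest option; summarizing them first is a common alternative when earlier steps still matter.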

7. Security Is Not Optional

<aside> 🦥

Prompt injection sits at #1 in the OWASP Top 10 for LLM Applications. Untrusted text (a web page, an email, a file) can carry hidden instructions that your agent then follows. Defend yourself:
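
Two common defences, sketched under loud assumptions: mark untrusted text as data before it reaches the prompt, and refuse side-effecting tools while untrusted text is in context unless a human has approved. The tool names and the `<untrusted>` wrapper are hypothetical, `source` is assumed to be a label set by your own code, and delimiting reduces the risk without removing it.

```python
SAFE_TOOLS = {"search_docs", "read_file"}        # read-only, hypothetical names
SENSITIVE_TOOLS = {"send_email", "delete_file"}  # side effects, hypothetical names


def wrap_untrusted(text: str, source: str) -> str:
    """Label external text as data and stop it from closing the wrapper early."""
    cleaned = text.replace("<", "&lt;")  # neutralize any tags inside the text
    return f'<untrusted source="{source}">\n{cleaned}\n</untrusted>'


def authorize(tool: str, context_has_untrusted: bool, human_approved: bool = False) -> bool:
    """Decide whether a tool call may run right now."""
    if tool in SAFE_TOOLS:
        return True
    if tool in SENSITIVE_TOOLS:
        # Side effects need a human once untrusted text is in play.
        return human_approved or not context_has_untrusted
    return False  # unknown tools are denied by default
```

The point of the second function is that the check lives in your code, not in the prompt, so injected text cannot talk its way past it.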

🦥 Common Mistakes (Save Yourself The Pain)

<aside> 🦥

🦥 The Takeaway