Learn to develop, evaluate, and continuously improve AI agents using coding agents, W&B Weave, and W&B Skills. We call this learning content Agentforge: forging AI agents to make them stronger.
Give an AI agent a real setup and let it experiment autonomously: modify, evaluate, keep or discard, repeat. You wake up to a log of experiments and an improved system. Any metric you can evaluate efficiently can be autoresearched.
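The modify–evaluate–keep-or-discard loop can be sketched in a few lines of plain Python. This is an illustrative toy, not part of any W&B API: `evaluate` stands in for whatever metric you can compute efficiently, and `propose_change` stands in for the agent's modification step.

```python
import random

random.seed(0)  # deterministic for the example

def evaluate(params):
    """Toy metric (higher is better): stands in for any efficiently
    computable evaluation, e.g. an eval-suite pass rate."""
    return -(params["temperature"] - 0.3) ** 2

def propose_change(params):
    """Illustrative modification step: perturb one setting."""
    new = dict(params)
    new["temperature"] = new["temperature"] + random.uniform(-0.2, 0.2)
    return new

# modify -> evaluate -> keep or discard -> repeat
params = {"temperature": 0.9}
best_score = evaluate(params)
log = []
for step in range(50):
    candidate = propose_change(params)
    score = evaluate(candidate)
    kept = score > best_score
    if kept:
        params, best_score = candidate, score
    log.append({"step": step, "score": score, "kept": kept})

# You "wake up" to the experiment log and the best configuration found.
print(len(log), round(best_score, 4))
```

The loop only ever keeps strict improvements, so `best_score` is monotone non-decreasing; in a real setup the same skeleton applies, with the agent proposing code or prompt changes and the evaluation running your actual metric.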
"All LLM frontier labs will do this. It's the final boss battle."
— Andrej Karpathy
Choose a course to begin your learning journey.
Use W&B Weave as the foundation for tracing, evaluation, and monitoring of AI Agents, and learn to improve them with Coding Agents and W&B Skills.
Hands-on exercises and practical assignments for the AI Agent Quality course.
You use coding agents to build and optimize AI agents, and want a structured workflow for it.
You want to learn how to go from human evaluation to building a scalable evaluation framework.
Your team wants a repeatable process for continuously improving agents in production.
You want hands-on experience using Weave for evaluation, monitoring, and labeling of AI agents.
You want to learn implementation and evaluation best practices for Auto Research agents.
This course uses W&B Weave and W&B Skills as core tools throughout every chapter.

The observability and evaluation platform for AI applications.
Reusable agent workflows for optimization and improvement.