Hands-on with W&B and Coding Agents

Agent Auto Improving

Learn to develop, evaluate, and continuously improve AI agents using coding agents, W&B Weave, and W&B Skills. We call this learning content Agentforge: forging AI agents to be stronger.

What is Auto Research?

Give an AI agent a real setup and let it experiment autonomously: modify, evaluate, keep or discard, repeat. You wake up to a log of experiments and a better system. Any metric you can evaluate efficiently can be autoresearched.

"All LLM frontier labs will do this. It's the final boss battle."

— Andrej Karpathy
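
The modify, evaluate, keep-or-discard, repeat cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the course's implementation: the "system" is reduced to a single number, and `auto_research`, `toy_propose`, and `toy_evaluate` are hypothetical names invented for this example. In a real setup, the propose step would be a coding agent editing prompts or code, and the evaluate step would be an evaluation run.

```python
import random
from typing import Callable, Dict, List, Tuple


def auto_research(
    baseline: float,
    propose: Callable[[float, random.Random], float],
    evaluate: Callable[[float], float],
    rounds: int = 20,
    seed: int = 0,
) -> Tuple[float, float, List[Dict]]:
    """Greedy experiment loop: keep a candidate only if it beats the best score."""
    rng = random.Random(seed)
    best = baseline
    best_score = evaluate(best)
    log: List[Dict] = []
    for i in range(rounds):
        candidate = propose(best, rng)   # modify
        score = evaluate(candidate)      # evaluate
        kept = score > best_score        # keep or discard
        if kept:
            best, best_score = candidate, score
        log.append({"round": i, "candidate": candidate, "score": score, "kept": kept})
    return best, best_score, log


# Hypothetical stand-ins: the "system" is one number and the metric
# rewards values close to 3.0 (higher score is better).
def toy_propose(current: float, rng: random.Random) -> float:
    return current + rng.uniform(-0.5, 0.5)


def toy_evaluate(x: float) -> float:
    return -(x - 3.0) ** 2


best, best_score, log = auto_research(0.0, toy_propose, toy_evaluate, rounds=200)
print(f"best={best:.3f} score={best_score:.4f} kept={sum(e['kept'] for e in log)}")
```

The returned log is the "log of experiments" mentioned above: every round is recorded, whether the candidate was kept or discarded. The loop only works as well as the metric it optimizes, which is why the evaluation step has to be cheap and trustworthy.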

Courses

Choose a course to begin your learning journey.

Who is this for?

Agent Optimization with Coding Agents

You use coding agents to build and optimize AI agents, and want a structured workflow for it.

From Human Eval to Automated Systems

You want to learn how to go from human evaluation to building a scalable evaluation framework.

Continuous Agent Improvement Teams

Your team wants a repeatable process for continuously improving agents in production.

W&B Weave Power Users

You want hands-on experience using Weave for evaluation, monitoring, and labeling of AI agents.

Auto Research Best Practices

You want to learn implementation and evaluation best practices for Auto Research agents.

Powered by W&B

This course uses W&B Weave and W&B Skills as core tools throughout every chapter.

W&B Weave

The observability and evaluation platform for AI applications.

  • Tracing: Record every agent action with full context
  • Evaluations: Structured quality assessments with datasets and scorers
  • Labeling: Human review workflows that feed into automated evaluation
  • Monitoring: Production dashboards, alerts, and regression detection
  • Feedback: Capture user signals and route them to improvement pipelines
Learn more about Weave →
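
To make "datasets and scorers" concrete, here is a rough plain-Python sketch of what a structured evaluation does: run an agent over a fixed dataset and average each scorer's result. It deliberately does not use the Weave API. `run_evaluation`, `exact_match`, `is_concise`, `toy_agent`, and the dataset rows are all hypothetical names and data made up for illustration.

```python
from statistics import mean
from typing import Callable, Dict, List

Example = Dict[str, str]
Scorer = Callable[[Example, str], float]


def exact_match(example: Example, output: str) -> float:
    """1.0 if the output equals the expected answer, ignoring case and whitespace."""
    return 1.0 if output.strip().lower() == example["expected"].strip().lower() else 0.0


def is_concise(example: Example, output: str) -> float:
    """1.0 if the output is at most five words long."""
    return 1.0 if len(output.split()) <= 5 else 0.0


def run_evaluation(
    agent: Callable[[str], str],
    dataset: List[Example],
    scorers: Dict[str, Scorer],
) -> Dict[str, float]:
    """Run the agent on every example and return the mean score per scorer."""
    rows = []
    for example in dataset:
        output = agent(example["question"])
        rows.append({name: fn(example, output) for name, fn in scorers.items()})
    return {name: mean([row[name] for row in rows]) for name in scorers}


# Hypothetical dataset and agent, for illustration only.
dataset = [
    {"question": "2 + 2", "expected": "4"},
    {"question": "capital of France", "expected": "Paris"},
    {"question": "largest planet", "expected": "Jupiter"},
]


def toy_agent(question: str) -> str:
    answers = {"2 + 2": "4", "capital of France": "paris"}
    return answers.get(question, "unknown")


summary = run_evaluation(
    toy_agent, dataset, {"exact_match": exact_match, "is_concise": is_concise}
)
print(summary)
```

Keeping the dataset fixed and the scorers as small, separate functions is what makes two runs comparable, so a change to the agent can be judged by the difference in the summary rather than by spot checks.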

W&B Skills

Reusable agent workflows for optimization and improvement.

  • Agent Optimization: Improve prompts, retrieval, and knowledge bases from evaluation data
  • Evaluation Schemes: Create scorers and evaluation pipelines
  • Weave Integration: Add tracing and monitoring to existing agents
Learn more about Skills →