
AI Agent Workflows for Automated Testing

Agentic workflows that automate unit test generation and documentation, reducing manual effort and increasing coverage across large codebases.

Claude API · Agentic Workflows · CI/CD · TypeScript

The Problem

The codebase had grown to hundreds of components across multiple micro-frontends, but test coverage on business logic was low. Engineers spent significant time on mechanical test writing — setting up mocks, writing boilerplate assertions, matching project conventions — rather than thinking about what to test.

Approach

Designed an agentic workflow using the Claude API that operates in four stages:

1. Analyse: build a dependency graph and scan types for the target file.
2. Plan: produce a test plan for human review.
3. Generate: write test code that matches project conventions.
4. Validate: run the tests and self-correct, up to three iterations.

The key design decision was keeping humans in the loop at the planning step: the agent handles the mechanical work, and engineers decide what is worth testing.
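The four stages above can be sketched as a single orchestration function. This is a minimal sketch, assuming the stage implementations are supplied by the caller; every name here (`generateTests`, `Stages`, `reviewPlan`, etc.) is hypothetical, not the actual tool's API.

```typescript
interface TestPlan {
  file: string;
  cases: string[]; // human-readable descriptions of what to test
}

interface RunResult {
  passed: boolean;
  failureLog: string;
}

interface Stages {
  analyse: (file: string) => Promise<string>;             // dependency graph + type scan
  plan: (context: string) => Promise<TestPlan>;           // draft test plan
  reviewPlan: (draft: TestPlan) => Promise<TestPlan>;     // human in the loop
  generate: (plan: TestPlan, feedback?: string) => Promise<string>; // test code
  validate: (code: string) => Promise<RunResult>;         // run the tests
}

const MAX_ITERATIONS = 3;

async function generateTests(sourceFile: string, stages: Stages): Promise<string> {
  const context = await stages.analyse(sourceFile);
  // The plan is reviewed by an engineer before any code is generated.
  const approved = await stages.reviewPlan(await stages.plan(context));

  // Validation loop: regenerate with the failure output as feedback,
  // giving up after MAX_ITERATIONS self-correction attempts.
  let code = await stages.generate(approved);
  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const result = await stages.validate(code);
    if (result.passed) return code;
    code = await stages.generate(approved, result.failureLog);
  }
  throw new Error(`Generated tests still failing after ${MAX_ITERATIONS} attempts`);
}
```

Keeping the review step as a distinct stage, rather than a flag on generation, is what makes the human checkpoint unskippable.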

Architecture

The agent runs as a CLI tool integrated into the development workflow. It reads source files and their imports, builds a context window with project conventions (existing tests, shared utilities, custom matchers), generates a Markdown test plan for review, then produces Vitest + React Testing Library tests. A validation loop runs the tests and iterates on failures. All generated code goes through standard code review before merge.
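The context-building step described above can be sketched as a small recursive collector. This is an illustrative sketch, not the tool's actual code: the `read` callback, the naive import resolution, and all names are assumptions, and real module resolution would need to handle index files, extensions, and aliases.

```typescript
// Returns file contents for a path, or null if the file is missing.
type ReadFile = (path: string) => string | null;

function collectContext(
  entry: string,
  conventionFiles: string[], // existing tests, shared utilities, custom matchers
  read: ReadFile,
): string {
  const seen = new Set<string>();
  const parts: string[] = [];

  const visit = (file: string): void => {
    if (seen.has(file)) return;
    seen.add(file);
    const src = read(file);
    if (src === null) return;
    parts.push(`// file: ${file}\n${src}`);
    // Follow relative imports only, so third-party packages are skipped.
    for (const m of src.matchAll(/from\s+['"](\.[^'"]+)['"]/g)) {
      visit(m[1] + ".ts"); // naive resolution, for the sketch only
    }
  };

  visit(entry);
  conventionFiles.forEach(visit); // conventions go into the same context window
  return parts.join("\n\n");
}
```

Passing the reader in as a callback keeps the collector testable against an in-memory file map, the same seam the validation loop needs to stub out test runs.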

Results

Coverage on business logic improved significantly within the first quarter. Time to write tests dropped substantially: engineers spent their time reviewing and refining instead of writing from scratch. Generated tests surfaced real bugs that the existing suites had missed, and multiple teams adopted the workflow as part of their standard development process.

Lessons Learned

AI-augmented engineering works best when the AI handles mechanical tasks and humans handle judgement calls. The plan step is essential — skipping it leads to tests that pass but don't test the right things. Investing in context (project conventions, test utilities) dramatically improves output quality. Never let the agent commit directly; code review catches the subtle issues automation misses.