Playwright — Ch. 15

I used to skip writing tests. Then a client’s checkout flow broke on a Friday evening. A CSS change had shifted a button off-screen on mobile. Nobody caught it because nobody tested on mobile before deploying. We lost an estimated $3,000 in transactions over the weekend before a customer complained Monday morning. That was the last time I shipped without end-to-end tests.

Playwright changed how testing felt for us. Not because it made tests easy; testing is still work. But because it made tests reliable. And then something unexpected happened: Playwright became the way AI agents interact with browsers. Claude Code doesn’t use a mouse. When it drives a browser, it does so through Playwright tooling. The testing framework became the eyes and hands of AI.

What Playwright is

Playwright is an open-source browser automation framework built by Microsoft. It controls Chromium, Firefox, and WebKit through a single API. The numbers usually quoted for it: about 33 million weekly npm downloads, 91% developer satisfaction in the State of JS survey, and test runs roughly 42% faster than Selenium.

But this chapter isn’t just about testing. Playwright now has three products: Playwright Test (the full test runner), Playwright MCP (giving AI agents browser control through structured accessibility snapshots), and Playwright CLI (a command-line interface designed specifically for coding agents, optimized for token efficiency).

Why Playwright wins

If you’ve written Selenium tests, you know the pain. Flaky tests. Random timeouts. We had a client project with 200 Selenium tests where 5-15 failed randomly on any CI run — not because the code was broken, but because tests raced the browser. Migrating to Playwright cut the flake rate to near zero.

Auto-waiting. Playwright waits for elements to be ready before interacting. Click a button — it waits until it’s visible, enabled, and stable. No manual sleeps.
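
To make the "visible, enabled, and stable" check concrete, here is a minimal TypeScript sketch of the idea. It is not Playwright's implementation: `ElementSample`, `isActionable`, and `firstActionableFrame` are hypothetical names, and the frames are hand-written samples standing in for what a browser would report. "Stable" is modeled here as the bounding box being unchanged between two consecutive frames.

```typescript
// One observation of an element, as a browser might report it on a given frame.
interface ElementSample {
  visible: boolean;
  enabled: boolean;
  box: { x: number; y: number; width: number; height: number };
}

// True when the bounding box did not move or resize between two frames.
function sameBox(a: ElementSample, b: ElementSample): boolean {
  return (
    a.box.x === b.box.x &&
    a.box.y === b.box.y &&
    a.box.width === b.box.width &&
    a.box.height === b.box.height
  );
}

// An element is ready to click when it is visible, enabled, and stable
// (same bounding box as on the previous frame).
function isActionable(previous: ElementSample, current: ElementSample): boolean {
  return current.visible && current.enabled && sameBox(previous, current);
}

// Returns the index of the first frame at which a click would be allowed,
// or -1 if the element never becomes ready (a real runner would time out).
function firstActionableFrame(frames: ElementSample[]): number {
  for (let i = 1; i < frames.length; i++) {
    if (isActionable(frames[i - 1], frames[i])) return i;
  }
  return -1;
}

// A button that fades in, slides into place, then settles.
const frames: ElementSample[] = [
  { visible: false, enabled: true, box: { x: 0, y: 40, width: 120, height: 32 } },
  { visible: true, enabled: true, box: { x: 0, y: 20, width: 120, height: 32 } },
  { visible: true, enabled: true, box: { x: 0, y: 0, width: 120, height: 32 } },
  { visible: true, enabled: true, box: { x: 0, y: 0, width: 120, height: 32 } },
];

const clickAt = firstActionableFrame(frames); // 3
```

A fixed sleep would either click too early (frames 1 or 2, while the button is still moving) or wait longer than needed. Polling for readiness is what removes the race.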

Smart locators. Playwright encourages locators that mirror how users see the page, not how developers built it. Role-based and label-based locators survive CSS refactors. We had a test suite using selectors like .btn-primary.mt-4. The designer updated the spacing system. Thirty-one tests broke overnight, none testing anything related to spacing. We rewrote every locator to use roles. Tests haven’t broken from a CSS change since.
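
The difference between the two locator styles can be shown with a toy model. The sketch below is plain TypeScript, not Playwright: `UiNode`, `findByClass`, and `findByRole` are hypothetical helpers over a hand-written element list, and the "Pay now" button is an invented example. In a real Playwright test, the role-based lookup would be `page.getByRole('button', { name: 'Pay now' })`.

```typescript
// A drastically simplified element: what users perceive (role, name)
// versus how developers styled it (classes).
interface UiNode {
  role: string;
  name: string;
  classes: string[];
}

// CSS-style lookup: every listed class must be present on the element.
function findByClass(nodes: UiNode[], classes: string[]): UiNode | undefined {
  for (const node of nodes) {
    if (classes.every((c) => node.classes.indexOf(c) !== -1)) return node;
  }
  return undefined;
}

// Role-style lookup: match the accessible role and name.
function findByRole(nodes: UiNode[], role: string, name: string): UiNode | undefined {
  for (const node of nodes) {
    if (node.role === role && node.name === name) return node;
  }
  return undefined;
}

const before: UiNode[] = [
  { role: "button", name: "Pay now", classes: ["btn-primary", "mt-4"] },
];

// The designer swaps the spacing utility; nothing the user sees has changed.
const after: UiNode[] = [
  { role: "button", name: "Pay now", classes: ["btn-primary", "mt-6"] },
];

const classBefore = findByClass(before, ["btn-primary", "mt-4"]); // found
const classAfter = findByClass(after, ["btn-primary", "mt-4"]);   // undefined: the test breaks
const roleBefore = findByRole(before, "button", "Pay now");       // found
const roleAfter = findByRole(after, "button", "Pay now");         // still found
```

The class-based lookup fails after a change that has nothing to do with what the test is checking. The role-based lookup depends only on what the user can perceive.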

Test isolation. Each test runs in its own browser context. No shared state. Tests run in parallel without interference.

Playwright for AI agents

Here’s where Playwright’s story diverges from every other testing framework. Microsoft is now explicitly building it out as the browser layer for AI agents.

Let Claude Code write your tests. Point it at your app with user stories and ask it to generate tests. I tested this on a client project with zero coverage. Gave Claude Code the user stories, pointed it at the running app. Forty minutes later: 23 tests covering signup, login, dashboard, invoices, and payments. Seventeen passed immediately. Four needed locator fixes. Two had assertion errors. Under two hours to get 23 working tests. Previous estimate for manual writing: two weeks.

Playwright MCP gives any MCP-compatible agent direct browser control through 34+ tools. The key innovation: it uses accessibility tree snapshots (roughly 2-5KB each) instead of screenshots (around 500KB each), which preserves context window space. Ask Claude to navigate your staging site, log in, and verify a new feature, and it opens a browser, navigates, clicks, and reports back.
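
The size gap compounds over a session. Here is a rough worked calculation using the figures above, taking 5KB as the pessimistic end of the snapshot range. The 30-step session length and the one-capture-per-step model are assumptions for illustration, and `sessionPayloadKb` is a hypothetical helper.

```typescript
// Sizes come from the text above; the session length is an assumption.
const SNAPSHOT_KB = 5; // upper end of the 2-5KB range
const SCREENSHOT_KB = 500;
const STEPS = 30;

// Total payload handed to the agent if every step returns one page capture.
function sessionPayloadKb(perStepKb: number, steps: number): number {
  return perStepKb * steps;
}

const snapshotTotalKb = sessionPayloadKb(SNAPSHOT_KB, STEPS);     // 150
const screenshotTotalKb = sessionPayloadKb(SCREENSHOT_KB, STEPS); // 15000
const ratio = screenshotTotalKb / snapshotTotalKb;                // 100
```

Even at 5KB per snapshot, the session carries about 150KB instead of roughly 15MB. Bytes are not tokens, so treat the 100x ratio as an order-of-magnitude guide, not a measured saving.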

Playwright CLI is the most token-efficient option for coding agents. A benchmark of identical 30-action flows found CLI used 4.6× fewer tokens than MCP. It saves page state to disk instead of flooding the context window. Microsoft’s own README says it: “Modern coding agents increasingly favor CLI-based workflows over MCP because CLI invocations are more token-efficient.”

Use MCP for exploratory automation and persistent browser sessions. Use CLI when a coding agent has to juggle browser automation alongside a large codebase.

Practical workflows

Self-QA during development. Engineers at EltexSoft run Playwright before opening a PR. Claude Code opens the browser, tests the flow, captures screenshots. Takes 30 seconds instead of 5 minutes of manual testing. One developer caught a z-index issue where a dropdown rendered behind a modal — it would have taken a user exactly one click to find it.

Visual regression. Playwright has built-in screenshot comparison. The first run captures a baseline. Subsequent runs compare against it and fail when the image differs beyond the configured tolerance, generating a diff image. This catches CSS regressions that functional tests miss.
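
Conceptually, screenshot comparison is a pixel diff against a stored baseline. The sketch below is a simplified stand-in, not Playwright's comparator: `diffPixels` and `matchesBaseline` are hypothetical helpers, and images are flat arrays of grayscale values instead of PNGs. The `maxDiffPixels` parameter borrows its name from a real `toHaveScreenshot` option. In an actual test, the whole check is one assertion: `await expect(page).toHaveScreenshot()`.

```typescript
// A tiny grayscale "image": one number (0-255) per pixel, row by row.
type Image = number[];

interface DiffResult {
  changed: number;    // how many pixels differ
  diffMask: number[]; // 1 where the pixel changed, 0 elsewhere (the "diff image")
}

// Compare a new capture against the baseline, pixel by pixel.
function diffPixels(baseline: Image, actual: Image): DiffResult {
  if (baseline.length !== actual.length) {
    throw new Error("Images must have the same dimensions");
  }
  const diffMask: number[] = baseline.map((value, i) => (value === actual[i] ? 0 : 1));
  const changed = diffMask.reduce((sum, bit) => sum + bit, 0);
  return { changed, diffMask };
}

// The check passes only if the number of changed pixels is within tolerance.
function matchesBaseline(baseline: Image, actual: Image, maxDiffPixels = 0): boolean {
  return diffPixels(baseline, actual).changed <= maxDiffPixels;
}

// 3x3 baseline with a dark "button" in the middle row.
const baseline: Image = [255, 255, 255, 0, 0, 0, 255, 255, 255];
// A CSS change shifts the button down one row.
const shifted: Image = [255, 255, 255, 255, 255, 255, 0, 0, 0];

const result = diffPixels(baseline, shifted);
// result.changed is 6; result.diffMask is [0, 0, 0, 1, 1, 1, 1, 1, 1]
const passes = matchesBaseline(baseline, shifted); // false
```

A functional test that only clicks the button would still pass here, because the button exists and works. The pixel diff is what flags that it moved.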

API testing. Playwright includes a full API testing client. Mix browser tests and API tests in the same suite. Mock external services to eliminate flakiness and test error scenarios.

The bottom line

Playwright is two things in 2026. The best end-to-end testing framework — faster, more reliable, and more developer-friendly than anything else. And the standard interface for AI agents to interact with browsers.

For testing: install Playwright, write tests using role-based locators, run them in CI on every push. Let Claude Code generate tests from your user stories. You’ll go from zero to meaningful coverage faster than you think.
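
As a starting point for the CI part, a configuration along these lines is a reasonable sketch. The `./tests` directory, the retry count, and the choice of device profiles are assumptions to adapt to your project. The mobile projects are there so that an off-screen button fails a test run instead of a weekend of checkouts.

```typescript
// playwright.config.ts (sketch; paths, retry counts, and device choices are assumptions)
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  testDir: "./tests",
  // Fail the CI run if a stray test.only was committed.
  forbidOnly: !!process.env.CI,
  // Retry only on CI, so local failures stay loud.
  retries: process.env.CI ? 2 : 0,
  use: {
    // Record a trace when a test is retried, to help debug CI-only failures.
    trace: "on-first-retry",
  },
  projects: [
    { name: "chromium", use: { ...devices["Desktop Chrome"] } },
    { name: "firefox", use: { ...devices["Desktop Firefox"] } },
    { name: "webkit", use: { ...devices["Desktop Safari"] } },
    // Mobile viewports, so layout breakage on phones shows up in the suite.
    { name: "mobile-chrome", use: { ...devices["Pixel 5"] } },
    { name: "mobile-safari", use: { ...devices["iPhone 12"] } },
  ],
});
```

With a config like this in place, the CI step itself is `npx playwright test`.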

For AI: install the CLI and skills for the most token-efficient way to give your coding agent browser control. The combination of AI-generated tests on a framework designed for AI is a feedback loop that makes your code more reliable with less effort.

This is the free web edition of Chapter 15. The full text — with Playwright test examples, MCP and CLI setup walkthroughs, Page Object Model patterns, CI/CD configurations, and visual regression guides — is available in 42: The AI Builder’s Stack, coming Q3 2026 on Amazon in hardcover, paperback, and digital.