What remains of quality engineering when AI writes the code?
Fifteen years of test architecture and quality engineering — at insurers, banks, government and national utilities. Since last year I've been applying that discipline to the agentic AI layer on top: agent orchestration, review gates for LLM output, and the audit trail regulated sectors still need. On this site you'll find how I work, the cases where it showed up, four agents running in production, open-source templates on GitHub, and the pieces I publish.
What I do myself is selective. What I share isn't.
Four pillars beneath a software development process where quality stays in hand.
Every engagement touches all four, in different mixtures. My approach is pragmatic: the solution that fits your organisation for the long term.
Test architecture that outlives the team.
Software testing patterns, AC/TS traceability, per-feature coverage matrix, Language-First. The foundation for any quality architecture, with Cypress and/or Playwright. Built so AI coding agents and the surrounding Quality Gates can safeguard quality continuously.
AI coding agents in your release pipeline.
Claude Code, Cursor, Copilot. In-repo subagents, AGENTS.md, slash commands, review gates. The agents run in your repo and your CI; your source never ends up anywhere else.
Audit trail by design.
DORA, GDPR, NIS2, ISO/IEC 25010, TMMi. Every change traces back to an acceptance criterion, every gate is documented. What you do, you can defend — to an inspector or to internal audit.
Quality as a financial story.
A report that both your CFO and your auditor understand. What does a production bug cost? What does a green CI save? I make the invisible costs in the development process visible and actionable.
Where the work has been applied.
A selection. The same discipline, in different contexts — from a worldwide government rollout to a solo SaaS I build myself.
Solo SaaS, four agents.
Founder · Full-stack with Claude Code
A customer-deployed dashboard that turns Jira / Azure DevOps / GitHub / GitLab signals into financial ROI for QA. Four in-repo subagents: release-reviewer, deploy-monitor, onboarding-smoke-tester, requirements-guard.
Their AI test stack, productized.
Architecture · Framework · Claude Code skill
Built an AI-augmented Playwright architecture, framework and reusable Claude Code skill for a Dutch digital agency. Designed to plug-and-play into any current or future client engagement, not for a single project. Codifies project structure, Page Object pattern, AC/TS traceability and per-feature coverage matrix. The agency now ships AI-augmented test suites to clients through a single Claude Code skill — productized AI-testing assets at agency scale.
Quality Framework rollout.
Quality Assurance Manager
TMMi-aligned Quality Framework on top of ISO/IEC 25010, embedded in delivery pipelines for a national utility. Quality maturity expressed in financial impact — defensible to a CFO and an auditor in the same room.
Language-First in gov.
Cypress + Playwright architecture
Test architecture across multiple government departments where different testing tools, specifications, scenarios and tests share one continuous human-readable layer. Presented at CypressConf 2024 — "Beyond the Battle: Empowering Test Automation with a Language-First Approach." The same Language-First approach I now extend into AI-augmented delivery.
Architect for the long run.
Test Automation Architect
Cypress + Lit Elements test architecture with Cucumber traceability, integrated into Azure DevOps. Page Object discipline and spec-to-test traceability that let the team keep the suite maintainable after I left. Every change anchored to a spec, every spec traceable to an acceptance criterion. Built to outlive me; handed back to the team.
Global rollout, audited.
QA Architect & Test Manager
Test management for a worldwide rollout under ISO/IEC 25010 / TMap discipline. Every change traceable, every gate documented, every decision defensible to an inspector. Earlier engagement covered Cypress, Angular and Docker on Azure DevOps.
What I share openly.
Most of what I do at clients can't be shared. This can: two orchestration templates for Playwright and Cypress, the pieces I publish on Medium, and a Playbook landing later this year. Fork what's useful; tell me if it improves.
orchestration-playwright-agents
Drop-in Claude Code orchestration template for Playwright E2E: master prompt as a skill, 8 specialised sub-agents, slash commands, starter e2e/ folder. Adapt it to your repo in a day.
View on GitHub →
orchestration-cypress-agents
Sister template for Cypress: master prompt as a skill, 8 sub-agents, slash commands, starter cypress/ folder. Same pattern, framework-native.
View on GitHub →
Field notes from AI-augmented engineering
Long-form on Language-First test design, in-repo subagents and the messy reality of shipping with LLMs. Picking up cadence in Q3 2026 — see the writing roadmap.
Read on Medium →
The AI-Augmented E2E Playbook
15-page PDF: Language-First architecture, AGENTS.md scaffolds, Page Object pattern with AC/TS traceability, per-feature coverage matrix. Bundles three Medium pieces into one printable artifact.
Get notified →
What turns out to work in CI pipelines.
Four review gates I distilled out of client work over the last two years. They run in my own codebase and at a handful of teams. No catalog, no pre-order — only what I'm willing to show because it makes it to production. Anyone who wants to try one knows where to find me.
release-reviewer
Reviews every push for risk patterns: secrets in the diff, coverage thresholds, destructive migrations, touched auth code. Posts a verdict on the PR with the failing rule IDs. Running on every commit in my own codebase since 2024.
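A minimal sketch of what such a rule pass could look like, in TypeScript. The rule IDs (`SEC-001`, `MIG-001`, `AUTH-001`), the `ChangedFile` shape and the folder conventions are hypothetical and chosen for illustration; they are not the rules the actual agent runs. Coverage thresholds are left out because they need a coverage report as input.

```typescript
// Hypothetical shape of one changed file in a push: its path and added lines.
interface ChangedFile {
  path: string;
  addedLines: string[];
}

interface Finding {
  ruleId: string;
  path: string;
  detail: string;
}

interface Verdict {
  pass: boolean;
  findings: Finding[];
}

// Illustrative rules only. Each check returns a detail string when it fires.
const rules: { id: string; check: (file: ChangedFile) => string | null }[] = [
  {
    id: "SEC-001",
    // Matches the common shape of an AWS access key ID (AKIA + 16 characters).
    check: (f) =>
      f.addedLines.some((l) => /AKIA[0-9A-Z]{16}/.test(l))
        ? "possible access key in diff"
        : null,
  },
  {
    id: "MIG-001",
    // Assumption: migrations live under a migrations/ folder.
    check: (f) =>
      f.path.includes("migrations/") &&
      f.addedLines.some((l) => /\bDROP\s+(TABLE|COLUMN)\b/i.test(l))
        ? "destructive migration statement"
        : null,
  },
  {
    id: "AUTH-001",
    // Assumption: auth code lives under a path segment named auth.
    check: (f) => (/(^|\/)auth\//.test(f.path) ? "auth code touched" : null),
  },
];

// Runs every rule against every changed file and collects the failing rule IDs.
function reviewPush(files: ChangedFile[]): Verdict {
  const findings: Finding[] = [];
  for (const file of files) {
    for (const rule of rules) {
      const detail = rule.check(file);
      if (detail !== null) {
        findings.push({ ruleId: rule.id, path: file.path, detail });
      }
    }
  }
  return { pass: findings.length === 0, findings };
}
```

The verdict carries the rule ID, file and reason for each finding, which is the information a PR comment would need to be reviewable later.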
Email me about it →
deploy-monitor
Verifies container digests on the target VPS match the released artifact. Catches the silent drift between "CI was green" and "what's actually running in production."
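The comparison at the heart of this check can be sketched in a few lines of TypeScript. This assumes the digests have already been collected, one map from the release and one from the host; how they are gathered is out of scope here, and the `DigestMap` and `Drift` names are hypothetical.

```typescript
// Maps a service name to an image digest such as "sha256:ab12...".
type DigestMap = Record<string, string>;

interface Drift {
  service: string;
  released: string;
  running: string | null; // null when the service is not running at all
}

// Compares what the release declared against what the host reports.
function findDigestDrift(released: DigestMap, running: DigestMap): Drift[] {
  const drift: Drift[] = [];
  for (const [service, digest] of Object.entries(released)) {
    const actual = running[service] ?? null;
    if (actual !== digest) {
      drift.push({ service, released: digest, running: actual });
    }
  }
  return drift;
}
```

An empty result means every released service runs the digest that CI produced. Anything else is the drift described above, including a service that is missing from the host entirely.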
Email me about it →
onboarding-smoke-tester
Walks the full onboarding flow end-to-end through the real API on every release. Catches the "registration is broken in prod" class of regression before a customer does. Runs independently; opens an issue on failure.
Email me about it →
requirements-guard
Reconciles the written spec against the live code on every PR. Flags drift between what was promised and what was built — before it reaches an auditor or a customer. The discipline the other three agents lean on.
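One narrow slice of that reconciliation, the ID bookkeeping, can be sketched as follows. It assumes acceptance criteria carry tags of the form `AC-<number>` and that test titles repeat those tags; both conventions, and the `reconcile` function itself, are hypothetical. Comparing a spec against live code involves far more than matching IDs.

```typescript
// Hypothetical convention: acceptance criteria are tagged "AC-<number>".
const AC_ID = /\bAC-\d+\b/g;

// Collects the distinct acceptance-criterion IDs mentioned in a text.
function extractIds(text: string): Set<string> {
  return new Set(text.match(AC_ID) ?? []);
}

interface TraceReport {
  uncovered: string[]; // in the spec, but no test mentions it
  orphaned: string[]; // mentioned by a test, but absent from the spec
}

// Reports drift in both directions between spec IDs and test IDs.
function reconcile(specText: string, testTitles: string[]): TraceReport {
  const specIds = extractIds(specText);
  const testIds = extractIds(testTitles.join("\n"));
  return {
    uncovered: Array.from(specIds).filter((id) => !testIds.has(id)),
    orphaned: Array.from(testIds).filter((id) => !specIds.has(id)),
  };
}
```

Reporting both directions matters: an uncovered criterion is a promise without evidence, and an orphaned test points at a criterion that was removed or renamed without the suite following.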
Email me about it →
Three sectors, one recurring conversation.
The domains I've worked in for fifteen years: insurance, financial services and government. The common question — from auditor, regulator, internal audit — is how AI-augmented delivery stays explainable to someone who doesn't read code.
DORA is here. So is DNB.
Insurers, banks, payment platforms, asset managers. DORA, GDPR, NIS2, internal audit and third-party ICT risk — plus the regulators behind them. For Dutch insurers: DNB and AFM oversight, Wft implications, Solvency II reporting systems, IFRS 17 reconciliation pipelines. The regulator isn't asking whether you use AI any more — they're about to ask how you control it.
Auditable at delivery, by default.
Ministries, public-service implementers, government IT bodies. Algoritmeregister, AVG, BIO, NPR 5326, EU AI Act. AI-assisted delivery that survives both an inspector and a change of administration — with privacy and data residency answered by architecture, not paperwork. The discipline I built at RvO and the Ministry of Foreign Affairs.
Two questions. One answer needed.
CTOs, VPs of Engineering, Heads of QA in regulated organisations. Since Claude Code, Cursor and Copilot accelerated everything, two questions land on your desk: the auditor wants to know how it's controlled, the CFO wants to know what it's worth. One story for both, or you have the conversation twice.
The work doesn't fit everywhere.
A generic AI vendor with no regulatory story, a one-off Cypress audit divorced from architecture, or a pure consumer-internet context where "move fast, break things" is still the operating model — that's a different field. More honest to name it here than discover it in week six.
The three questions your CISO, DPO and auditor ask first.
Honest answers, named risks. The Trust & Data pack (sub-processor list, DPA, regional data-flow diagram, continuity arrangements, security questionnaire) is available to send to your procurement team, DPO and internal auditor before the first POC.
Inside your repo. Inside your CI.
The agents run inside your repository and your CI runners — no proprietary cloud holds your source. LLM access goes through your existing Claude Code, Cursor or Copilot enterprise tenant: your region, your DPA, your training opt-out. Sub-processor list, regional flow diagram and DPA highlights ship with the engagement pack.
Solo founder. Named risk.
Paul is one engineer; pretending otherwise wastes everyone's time. Continuity arrangements — runbooks, named backup contractor, source-code escrow options — are scoped per engagement and signed before kick-off. Request the Trust & Data pack for the specifics that apply to your contract shape.
Dutch-language leaflet: DORA / AVG / Wft.
KvK-registered company, standard DPA, sub-processor list, security questionnaire (CAIQ-lite) and a Dutch-language one-pager covering DORA, AVG and Wft implications, written for procurement, the DPO and the internal auditor. The leaflet is sent on request; ask for it via the button below.
What I bring into your repo.
Pragmatic, opinionated, and chosen for AI extension — not novelty.
15+ years across enterprise & government.
A selection — earlier roles span ING, SBB, Ministry of Foreign Affairs, ZLM, KPN and lecturing at The Hague University of Applied Sciences.
Pragmatic guides on Cypress, testing & automation.
On Medium since 2020, with 10+ deep-dives on Cypress patterns, ROI for testing, and test strategy. New AI-augmented engineering pieces landing on this site through 2026.
Pro-tip: stub the window object. A practical walkthrough for the multi-tab problem Cypress users hit constantly.
Cross-origin testing is finally here. What changed, what to watch for, and how to migrate your auth flows.
And why you should care. Where the lines are, why teams confuse them, and how to pick the right tool for the assertion.
On stage, on a podcast, in your team's Slack.
Cypress.io Ambassador, conference speaker, certified didactical trainer.
Cypress Ambassador
Active community work
Conference Speaker
CypressConf 2024 + 2025 workshops
Certified Trainer
Software testing & QA · Post-HBO didactical
Talks on YouTube
Recorded talks and workshops.
Cypress: The Bad Practices Workshop
Hands-on tour of the Cypress anti-patterns we keep meeting in real codebases — and how to refactor out of them. Co-presented with Frits van der Sloot.
Watch on YouTube →
Beyond the Battle: Empowering Test Automation with a Language-First Approach
How specs, scenarios and tests can share one continuous human-readable layer — and why that shape makes AI extension tractable.
Watch on YouTube →
Effective Test Automation Design
The architecture decisions that make a test suite outlive the team that wrote it — Page Objects, traceability, and the discipline behind a pyramid that holds.
Watch on YouTube →
QualityProfit
My solo SaaS that makes quality costs visible for teams. Same discipline, in product form. The four agents above run inside it today.
An hour, your release pipeline, and honest questions.
No sales call. Write me a few lines about your team — that's enough — and I'll send a short agenda back. If it fits, we go further. If it doesn't, I'll say so honestly.