AI agents

Build an eval harness for 184 AI agent prompts with promptfoo

How to build an LLM-as-judge eval system that scores AI agent prompts on quality, identity, and safety.

Git hooks are your best defense against AI-generated mess

Git hooks have always enforced standards before code enters a repo. With AI agents writing commits autonomously, they’ve become essential.

Skills for applying codified context to your own codebase

Two Claude Code skills for applying and maintaining the three-tier codified context architecture — what they do, how they work, and how to get started.

Cold memory: specs, MCP tools, and on-demand context retrieval

How subsystem specs and MCP retrieval tools handle architectural knowledge too large for hot memory — and why stale specs are worse than no specs.

Domain specialist skills: teaching AI to think like your senior dev

What specialist skills are, why the 50% domain knowledge rule matters, and how waaseyaa’s spec-backed orchestration keeps AI consistent across a 29-package PHP monorepo.

Writing a CLAUDE.md that actually works

How to structure your CLAUDE.md as a routing layer so AI agents always know where to look.

Why AI agents lose their minds in complex codebases

Token limits aren’t the real problem with AI in large codebases — inconsistent context is. Here’s what breaks and why a three-tier architecture fixes it.