You ask Claude Code to build a dashboard. It generates twenty components in minutes. Everything looks good until you notice the primary buttons disagree: bg-blue-500 in one place, bg-indigo-600 in another, #4F46E5 hardcoded in a third.
This isn't a model quality problem. It's a structural problem with how generic UI libraries interact with probabilistic code generation.
Why AI Gets Colors Wrong Consistently
Generic component libraries are built for human readers, not machine consumers. When a human reads an example using bg-blue-500, they understand from context that this is "the primary color in this project." When an AI reads the same example, it records a correlation: buttons sometimes use bg-blue-500. Next time it generates a button, it draws from the probability distribution of values it has seen — which might produce bg-blue-500, bg-blue-600, bg-indigo-500, or #3B82F6.
That's the first problem: no design intent is encoded in the value. The AI cannot distinguish "this is your primary brand color" from "this was the example in the docs."
The second problem is context window limits. Even with fifty existing components in context, an AI might misremember the exact shade used in component twelve, mix patterns from two different sections, or interpolate a value that doesn't exist in your system. The drift is gradual and hard to catch in code review because each individual choice looks plausible.
What Semantic Tokens Change
A semantic token encodes meaning, not just value. bg-primary tells the AI something bg-blue-500 cannot: this is the primary brand color, use it wherever primary brand color is appropriate.
The difference in generated code is categorical, not incremental. When semantic token names appear consistently in the codebase, AI generates consistent names. The output is anchored by the token vocabulary rather than floating across a probability distribution of color names.
// Without semantic tokens — what AI produces across sessions
<Button className="bg-blue-500 hover:bg-blue-600 text-white">Sign In</Button>
<Button className="bg-indigo-600 hover:bg-indigo-700 text-white">Create Account</Button>
<Button className="bg-primary-600 hover:bg-primary-700 text-white">Continue</Button>
// With semantic tokens — what AI produces with a token-aware system
<Button variant="primary">Sign In</Button>
<Button variant="primary">Create Account</Button>
<Button variant="primary">Continue</Button>
The second pattern requires no manual cleanup. The token resolves to whatever your system defines as primary — and changes to that definition propagate everywhere.
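The propagation mechanism can be sketched in a few lines. The token name and hex values below are placeholders for illustration, not FramingUI's actual definitions: the point is that components reference a name, and the value lives in exactly one place.

```typescript
// Hypothetical token table: one definition, referenced everywhere by name.
const tokens: Record<string, string> = {
  "background-accent": "#4f46e5", // assumed value, for illustration only
  "foreground-on-accent": "#ffffff",
};

// Components look up tokens by name instead of embedding a color value.
function resolve(name: string): string {
  const value = tokens[name];
  if (value === undefined) throw new Error(`Unknown token: ${name}`);
  return value;
}

// Every "primary" button reads the same entry...
const before = resolve("background-accent");

// ...so a single redefinition restyles all of them at once.
tokens["background-accent"] = "#2563eb";
const after = resolve("background-accent");
```

In a real system the table is a set of CSS variables rather than a JavaScript object, but the indirection is the same: change the definition once, and every consumer picks it up.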
The Additional Cost That Doesn't Show Up in Estimates
Teams that generate UI code with AI tend to estimate the generation time but not the cleanup time. The cleanup work is always the same: finding hardcoded values, aligning colors back to the system, explaining in PR review why one button is a different shade than the others.
With semantic tokens, the cleanup loop breaks. AI generates variant="primary" because that's what appears in the codebase. The color comes from the token definition. Design system intent transfers from the token to the output without an intermediate human correction step.
The other cost that's easy to miss is theme support. If your tokens map to CSS variables — var(--foreground-accent), var(--background-page) — then dark mode comes from redefining the variables, not from duplicating component logic. AI-generated components inherit this automatically because they use the same token names.
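A minimal sketch of that inheritance, using the two variable names mentioned above (the hex values and the shape of the theme maps are assumptions for illustration): each theme is just a different set of values for the same variable names, so component markup never changes.

```typescript
// Hypothetical per-theme variable maps; components reference the names only.
type Theme = "light" | "dark";

const themes: Record<Theme, Record<string, string>> = {
  light: { "--foreground-accent": "#4f46e5", "--background-page": "#ffffff" },
  dark: { "--foreground-accent": "#818cf8", "--background-page": "#0b0b0f" },
};

// Serialize a theme into a CSS declaration list, e.g. for a :root or .dark rule.
function cssVars(theme: Theme): string {
  return Object.entries(themes[theme])
    .map(([name, value]) => `${name}: ${value};`)
    .join(" ");
}

const light = cssVars("light");
const dark = cssVars("dark");
```

Switching themes swaps the variable definitions, not the components. An AI-generated component that uses `var(--foreground-accent)` is dark-mode-ready without any extra generation work.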
Setting Up Semantic Tokens with FramingUI
FramingUI's components use semantic CSS variables throughout. Install with pnpm:
pnpm add @framingui/ui @framingui/core @framingui/tokens tailwindcss-animate
Button variants map directly to semantic roles:
<Button variant="primary">Primary Action</Button>
<Button variant="secondary">Secondary Action</Button>
<Button variant="destructive">Delete</Button>
Each variant resolves through the token layer — primary uses var(--foreground-accent), destructive uses var(--border-error). AI tools working with FramingUI components generate these variants by name rather than guessing color values.
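Conceptually, that resolution is a lookup from variant name to semantic classes. This is a sketch of the idea, not FramingUI's internals; the exact class strings are assumptions, with the variable names taken from the mapping above:

```typescript
// Hypothetical variant map: each variant name resolves to classes built on
// semantic CSS variables, so a generator only has to pick a variant name.
const buttonVariants: Record<string, string> = {
  primary: "bg-[var(--foreground-accent)] text-white",
  secondary: "bg-[var(--background-page)] border",
  destructive: "border-[var(--border-error)] text-[var(--border-error)]",
};

// Resolve a variant to its class string; unknown names fail loudly instead
// of silently falling back to a hardcoded color.
function classesFor(variant: string): string {
  const classes = buttonVariants[variant];
  if (!classes) throw new Error(`Unknown variant: ${variant}`);
  return classes;
}
```

The vocabulary the AI needs is the three variant names, not the space of all possible color values.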
The MCP server extends this further by making your token definitions queryable at generation time:
npx -y @framingui/mcp-server@latest init
After running init, Claude Code can retrieve your actual token catalog when generating UI, rather than relying on training data patterns.
The Practical Implication
Generic UI libraries work well when humans write code carefully. Careful is not a property of probabilistic generation. The cleanup cost is not a one-time debt — it recurs with every generation session until the underlying signal problem is fixed.
Semantic tokens fix the signal. AI generates consistent names because consistent names are what the codebase contains. That's the mechanism, and it doesn't require prompting discipline or post-generation review to enforce.