I Built the /html Skill Tariq Said Not to Build

Last week I read Tariq’s post on the unreasonable effectiveness of HTML and had that very specific sensation of someone reaching into your skull, pulling out a thought you hadn’t quite finished forming yet, and naming it on a public blog. The man had articulated something I’d been doing accidentally for months — asking Claude for HTML files instead of markdown when I wanted to actually look at the output. Status reports. Plans. Code reviews. Anything I’d want to share with another human and not have to apologize for first.

Halfway down the post, he says this:

I’m a little bit afraid that people will read this article and turn it into a /html skill or something. While there might be some value in that, I want to emphasize that you don’t need to do much to get Claude to do this. You can just ask it to “make a HTML file”.

I read that. I nodded thoughtfully. I made a small considered noise that suggested I agreed with him. Then I spent a weekend building exactly the /html skill he warned against, with the studious focus of a man who’d been asked to do it.

TL;DR. I built a Claude Code skill that turns “make me an HTML status report” prompts into single-page artifacts that look like siblings of Tariq’s twenty demos. Shared styles.css, a patterns.md of HTML building blocks, twenty frozen reference demos, a SKILL.md that routes intents. Verification regenerated one of the demos from scratch using only the skill files and surfaced three gaps in my own design. Externalized the CSS as a <link> to save ~40% in output tokens and stopped pretending the single-shareable-file promise was load-bearing. Tariq was right that you don’t need a skill on day one. I think he’s wrong that you don’t want one by day seven.

This post is three things at once: a trip report from that weekend, a quiet argument for why visually rich notes will rewire how you work with AI, and a small love letter to the moment a workflow folded back on itself and started cross-examining its own output. Stay for at least one of them.

The Soft Default Hits a Wall

Tariq’s argument against the skill is that it would over-constrain something that already works as a soft default. Just prompt. Don’t add ceremony. He’s right about day one. I would have actively resented reading a 200-line SKILL.md before figuring out that I could just ask for an HTML status report and it would come out fine.

But after a few weeks of asking, I had a new problem: the outputs didn’t look consistent. Lined up next to each other, the artifacts I’d generated over a month read like five different consultants had been hired by the same client and quietly stopped speaking to each other halfway through the engagement. Specifically:

One came back with the clay/serif/ivory thing Tariq’s demos use, and I felt very proud of myself for a moment.
The next was Tailwind beige with rounded-2xl shadows, the kind of design that made me feel like I was about to be asked to upgrade to Pro.
The third looked like a Notion export from 2022 that had been forgotten in a junk drawer.
The fourth was, somehow, a midcentury Bootstrap revival with a gradient header that hadn’t been cool since 2014.
The fifth was actively beautiful but in a way that suggested it had been redesigned by a startup founder who’d just discovered Figma and was very excited.

Each artifact, in isolation, was fine. Lined up, they looked like the AI equivalent of a corkboard at a senior project showcase, with every poster solving the same problem in the dialect of a different professor.

The thing I wanted to lock in wasn’t HTML qua HTML — Tariq is right that the model can do that without help. The thing I wanted was a fixed visual identity so every artifact would look like a sibling. His twenty demos already had it: ivory background, serif h1, mono eyebrow, clay accent, oat secondary. I wanted that, every single time, without retyping the palette in every prompt like a peasant.

That’s the argument for a skill that even Tariq might let me have.

The Weekend the Process Skills Earned Their Keep

I work with a stack of process skills that hard-gate me through brainstorming → spec → plan → execution. Most of the time it feels like ceremony I’m paying a small productivity tax to perform. This time it earned its keep with interest.

We landed on a hybrid design: a comprehensive styles.css carrying the shared design system, a patterns.md listing ten recurring HTML building blocks as markup-only snippets, and a frozen references/ folder containing all twenty of Tariq’s demos as structural exemplars. A SKILL.md on top routes intents — “make a status report” → read demo 11; “make a code review” → read demo 03; etc. — and lays down the hard rules.

Writing the spec took half an hour. Writing the plan took twenty minutes. Then I switched to subagent-driven execution, where I dispatch a fresh implementer subagent per task, then a spec-compliance reviewer subagent, then a code-quality reviewer subagent, then move on. Three subagents per task. It is a lot of subagents.

I do not enjoy spec compliance reviewers. They are the kind of pedant who notice that my commit message describes what I did rather than why I did it, and they will tell me. Repeatedly. Across tasks. In a way that makes me reflect on my life choices and resolve to read more Strunk & White. The code-quality reviewers are worse because they’re nicer about it, which I find structurally harder to argue with.

The final implementation was nine commits. None of them were interesting individually. The interesting moment came after.

The Verification Step That Cross-Examined Its Own Maker

The plan’s last task was: regenerate one of the twenty reference demos from scratch, using only the skill files (SKILL.md, styles.css, patterns.md), without reading the original demo. Then compare the regenerated artifact to the real one. If the skill is doing its job, they should look like siblings.

I picked the weekly status report. Dispatched a subagent. Watched it work for five minutes.

Three gaps surfaced.

The first. The original demo used a class called .stat-num for the big 44-pixel hero numbers in the summary strip. My styles.css only had .stat-value at 28px. Without the bigger variant, the regenerated artifact produced a status report with mid-size stats — visually correct, structurally anemic. Like a courtroom drama where everyone’s wearing the right clothes but nobody’s raising their voice.

The second. The original had a tight bullet list under each section with clay-square bullets — .highlights. Not in styles.css. Not in patterns.md. The regenerated artifact had to invent its own bullet styling, like a freshman who’d been told to “match the house style” without being given a copy of the house.

The third. The original had a row pattern for “carryover items” — each row had a mono status tag, an owner, and a short description. Not a documented pattern. I’d read those demos three times by then and not noticed this as a recurring shape, because I was too close to the work to see the shape of the work.

I patched all three. Each one a separate commit. Re-ran the verification (fresh subagent, no memory of the previous run). The artifacts matched. The skill was now actually capable of producing a sibling of the demo it had been built from.

This is the part that surprised me. Verification didn’t just confirm the skill worked — it told me what was missing from my own design. I’d had those demos open for three days. I had not seen .stat-num or .highlights as separate things from .stat-value and a default <ul>. Generating from incomplete context surfaced them in a way that staring at the demos never did.

If you’ve ever written tests for code you didn’t fully understand, you know this feeling. The tests aren’t checking the code. They’re cross-examining your mental model and waiting for you to flinch.

The 250-Line CSS Receipt Claude Slid Across the Table

I was about to call the skill done when I mentioned offhand that some people online were arguing HTML burns more tokens than markdown. (They are. They’re not wrong.) I floated a fix: externalize the CSS to a <link rel="stylesheet" href="./styles.css"> and stop having the LLM emit the stylesheet on every artifact. Roughly 40% token savings for the kinds of documents I generate.

Claude, with the polite throat-clearing of a tutor who has noticed you’ve been carrying the one wrong for several pages but waited until you finished the equation to bring it up, pointed out that my own skill — the one I had just finished building — inlined styles.css into every generated artifact. About 250 lines of CSS per output. Pure repetition, no information added. I was paying a token bill, every artifact, for the privilege of re-reading the stylesheet I’d already read the last time.

So I rebuilt the rule. The new hard rule: the artifact links ./styles.css, and the skill ensures that file exists in the output directory before generating it. The “single shareable HTML file” promise died. Sharing now means zipping the html and the styles.css together, or uploading both to the same folder. I have not yet missed the promise. Two files is one zip away from one file. The token bill is paid every day; the sharing inconvenience is a Tuesday once a month.

I committed the refactor. The skill is now smaller, faster, and produces output that doesn’t carry 250 lines of stylesheet boilerplate like a man dragging a printed PowerPoint deck to a Zoom meeting.

The Loop Closing on Itself

Three things happened in sequence this weekend, and they share a shape I keep noticing in AI tooling but don’t quite have a name for:

I read a post arguing HTML beats markdown for AI-generated documents you want a human to actually read.
I built a skill using a documented workflow — spec, plan, subagent execution — the artifacts of which were a SKILL.md, a stylesheet, a snippets file. All markdown. All read by future Claudes. The skill itself, in other words, is the kind of document the post is arguing we should be writing as HTML.
The skill’s verification step regenerated one of the demos it was built from, using only the skill’s own internal documents, and surfaced gaps in those documents.

There’s a small vertigo in this I want to name. The skill produces HTML. The verification of the skill produced HTML (a regenerated demo) using the skill itself. The post you’re reading right now is markdown, because Hugo wants markdown, but I’m going to ask Claude for a visually rich HTML version of it once I’m done. Inside that HTML version will probably live a card-grid linking to the twenty demos that started this whole thing.

And then there’s the bigger circle, the one that’s been quietly running for forty years:

First we wrote HTML for the browsers. Then we wrote HTML with programming languages — the PHP era, when half the internet was a .php file pretending to be a .html file. Then we invented template engines so we’d stop concatenating strings in production like cavemen. Then we invented React so we could write HTML as JavaScript objects pretending to be HTML. Then we re-invented server-side rendering plus hydration so we could render React back to HTML on the server and re-attach it on the client, like a magician sawing a person in half and putting them back together for applause. Then the agents arrived and we switched to writing markdown files for them, because they were better at reading prose than parsing JSX. And now we’re asking those same agents to make .html files for us again. Forty years of inventory. We end up exactly where we started, except now the computer types it.

When I share that HTML version with three people on Slack, I will get more responses than I would for the markdown version. I know this because I’ve already run that experiment several times this year. The same content, in two formats, gets two completely different reading experiences. The HTML one gets read. The markdown one gets starred for later, which is the polite version of being thrown in a drawer.

What I’m Actually Keeping (The Argument)

Visually rich notes have changed how I work in three ways I didn’t predict:

I read what Claude writes again. For most of last year, I’d skim the first paragraph of a long plan and accept it. The plans were dense, undifferentiated grey text, and reading 200 lines of them felt like a chore I was paying myself to skip. The HTML versions, with section numbers and summary strips and inline SVG diagrams, I actually read. I notice when something is wrong. I push back. The collaboration is real again, because the artifact is readable.

I share. I’d been sitting on a pile of internal notes I never sent to anyone, because nobody wants to scroll a wall of markdown on a Tuesday afternoon. Once those notes became styled HTML pages with hover-linked glossaries and inline diagrams, I sent them. People responded. People who hadn’t responded to anything I’d sent in a quarter responded.

The skill makes the visual richness automatic. This is the part Tariq doesn’t have to think about, because his taste is already calibrated. Mine isn’t, or wasn’t, or — let’s be honest — varies a lot depending on whether I’d had coffee yet. The skill is my taste, externalized. I made a small, opinionated visual identity once, and now every artifact I generate inherits it. I no longer think about whether the heading is serif. It just is.

Where Tariq Was Right (And Where He Was Day-One Right)

He’s right that you don’t need a skill to ask for HTML. Day one, prompt for it. Day three, prompt with one of his examples attached. Day seven, when you find yourself copy-pasting the same palette into every prompt and still getting subtly inconsistent output, build the skill. The graduation is gradual.

The skill I built doesn’t replace the prompt — it lifts the visual identity out of the prompt so I can spend the prompt on what’s actually different between this artifact and the last. The patterns and tokens are the parts of every prompt that should be invariant. Once they are, the prompt itself gets to be smaller, sharper, and more about the content.

(Yes, I’m writing this with Claude. The skill’s first real client was the post about the skill. The recursion is intentional. The grandiosity is probably unearned.)

If you’ve read Tariq’s original post (or his thread on X) and felt the same itch, the skill is at github.com/ekinertac/skills/tree/main/skills/html-effectiveness. It’s portable. Replace styles.css with your own palette and the router and patterns will still work.

Tariq doesn’t have to know.