
HTML Diff: Compare Two HTML Files Online

Paste, format, and compare two HTML documents side-by-side. Tag-aware syntax highlighting and live well-formedness checks built in.

What is the HTML diff tool?

A free, in-browser tool for comparing two HTML documents. Paste the old version of a landing page on the left, the new one on the right, and the changes light up tag by tag and attribute by attribute. Nothing is uploaded.

The diff itself is character-level. The Format button on each pane reflows your HTML with consistent indentation, the way Prettier or HTML Tidy would. Syntax highlighting follows the WHATWG HTML Living Standard, so void elements, boolean attributes, and entity references all render correctly while you read the diff.

If you have ever shipped a copy edit to a marketing page and lost an hour figuring out why a class name on the hero section also changed, this is the tool that surfaces it in seconds. For plain prose, our text diff is the right pick. HTML's strict cousin, XML, has a dedicated XML diff. And because Markdown often compiles to HTML in static-site builds, the Markdown diff is a useful neighbour.

How the diff actually works

This is a text-level diff with HTML-aware syntax highlighting, not a DOM diff. The comparison runs at the character level, then a semantic post-processing pass shifts the highlights so they land on tag names, attribute values, and visible text rather than random punctuation. Insertions show in green on the right pane, deletions in red on the left.
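To make the character-level idea concrete, here is a minimal Python sketch. It uses difflib.SequenceMatcher from the standard library as a stand-in. The tool's own algorithm and its semantic post-processing pass are not reproduced here, and char_diff is a hypothetical helper name.

```python
from difflib import SequenceMatcher


def char_diff(old: str, new: str) -> list[tuple[str, str, str]]:
    """Return (operation, old_text, new_text) for every changed span."""
    # autojunk=False stops difflib from skipping very common characters
    # in long inputs, which would make the character diff less exact.
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            changes.append((op, old[i1:i2], new[j1:j2]))
    return changes


old_html = '<a class="btn" href="/x">Buy</a>'
new_html = '<a class="btn" href="/y">Buy</a>'
for op, removed, added in char_diff(old_html, new_html):
    print(op, repr(removed), repr(added))
```

A "replace" span would be painted red on the left and green on the right, a "delete" only on the left, and an "insert" only on the right.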

HTML has more parsing quirks than people remember, and most of them only matter when you compare two files as text. Per the parsing algorithm, attribute order is not significant, attribute values can be double-quoted, single-quoted, or unquoted for some values, and tag and attribute names are case-insensitive. A text diff still flags every reorder, quote change, and case swap. The practical workaround is to format both sides with the same tool (Prettier with parser: "html" works well) so indentation and quoting are normalised before you read the diff. Prettier leaves attribute order as written, so a reorder will still show up unless you sort attributes in a separate step.
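If attribute order, quoting, and case are pure noise for your review, one option is a small normalisation pass before pasting. The sketch below is an assumption about how you might do that with Python's standard html.parser module; TagNormaliser and normalise are hypothetical names, and html.parser is a lenient tokenizer, not a full implementation of the WHATWG parsing algorithm.

```python
from html import escape
from html.parser import HTMLParser


class TagNormaliser(HTMLParser):
    """Re-emit HTML with lowercase names, sorted attributes, double quotes."""

    def __init__(self) -> None:
        # Keep entity references in text as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out: list[str] = []

    def _emit_start(self, tag, attrs):
        parts = [tag]  # html.parser already lowercases tag and attribute names
        for name, value in sorted(attrs, key=lambda pair: pair[0]):
            if value is None:
                parts.append(name)  # bare boolean attribute, e.g. disabled
            else:
                parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + ">")

    def handle_starttag(self, tag, attrs):
        self._emit_start(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        # The trailing slash in <br /> carries no meaning, so it is dropped.
        self._emit_start(tag, attrs)

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def normalise(source: str) -> str:
    parser = TagNormaliser()
    parser.feed(source)
    parser.close()
    return "".join(parser.out)


print(normalise("<A HREF='/x' CLASS=btn>Go</A>"))
```

Run both sides through the same pass and the diff only lights up where names, values, or text really differ. Attribute values are decoded and re-escaped along the way, so treat the output as a comparison aid, not as a replacement for the source file.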

This tool does not parse the document into a DOM tree and compare nodes. That means two files that render identically in a browser, for example one with <img src="a.png"> and one with <img src='a.png' />, will still appear different here. If you need a true structural compare, parse both sides with DOMParser and walk the trees yourself, or run both through Tidy first. For most code-review use cases (a copy edit, an attribute swap, an aria-label addition), the text diff is faster and clearer.
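For readers who want to see what a structural compare looks like, here is a rough Python sketch of the same idea as the DOMParser approach. It builds a very simplified tree with the standard html.parser module and compares the two trees for equality. TreeBuilder, parse_tree, and same_structure are hypothetical names, and the builder ignores most of the real parsing rules (implied end tags, <pre> whitespace, foreign content), so read it as an illustration only.

```python
from html.parser import HTMLParser

# Void elements never have children or a closing tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TreeBuilder(HTMLParser):
    """Build nested (tag, attributes, children) tuples from HTML text."""

    def __init__(self) -> None:
        super().__init__()
        self.root = ("#root", {}, [])
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        # A dict makes attribute order irrelevant when trees are compared.
        node = (tag, dict(attrs), [])
        self.stack[-1][2].append(node)
        if tag not in VOID:
            self.stack.append(node)

    def handle_startendtag(self, tag, attrs):
        # In HTML the trailing slash is ignored, so treat it as a start tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray tags.
        for index in range(len(self.stack) - 1, 0, -1):
            if self.stack[index][0] == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        # Simplification: collapse whitespace everywhere, even inside <pre>.
        text = " ".join(data.split())
        if text:
            self.stack[-1][2].append(text)


def parse_tree(source: str):
    builder = TreeBuilder()
    builder.feed(source)
    builder.close()
    return builder.root


def same_structure(old: str, new: str) -> bool:
    return parse_tree(old) == parse_tree(new)


print(same_structure('<p><img src="a.png"></p>', "<p><IMG src='a.png' /></p>"))
print(same_structure('<p class="a">x</p>', '<p class="b">x</p>'))
```

The first call compares the two <img> spellings from the paragraph above and treats them as the same tree; the second reports a real attribute change.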

How to compare HTML in three steps

Two text panes, one diff. No signup, no upload, no server round-trip.

  1. Paste or upload your HTML

    Paste the old HTML on the left, the new HTML on the right. Or click Upload on either side to load an .html, .htm, or .eml file directly. The Sample button fills both panes with a small landing-page snippet if you want to see the tool in action first.

  2. Format both sides for a fair comparison

    Click Format on each pane to pretty-print with consistent indentation. This normalises whitespace and line breaks so the diff focuses on real content changes, not formatting noise from a Windows CRLF file vs a Unix LF file. The validation badge turns green when the document parses cleanly.

  3. Read the diff

    Deletions appear with a red highlight on the left, insertions with a green highlight on the right. The change counts in each header tell you how many distinct edits the diff found across tag names, attribute values, and visible text. Scroll either pane and the other follows in lockstep.
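The three steps above can be approximated in a few lines of Python. This is a sketch under assumptions: it only normalises line endings (a real formatter does much more), it uses difflib as a stand-in for the tool's diff, and the way it counts edits is a guess at what a per-pane change count could mean. normalise_newlines and change_counts are hypothetical names.

```python
from difflib import SequenceMatcher


def normalise_newlines(text: str) -> str:
    """Turn Windows CRLF and old Mac CR line endings into Unix LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def change_counts(old: str, new: str) -> tuple[int, int]:
    """Return (deletions, insertions) counted as changed spans."""
    old, new = normalise_newlines(old), normalise_newlines(new)
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    deletions = insertions = 0
    for op, *_ in matcher.get_opcodes():
        if op in ("delete", "replace"):
            deletions += 1   # would be highlighted red on the left
        if op in ("insert", "replace"):
            insertions += 1  # would be highlighted green on the right
    return deletions, insertions


print(change_counts("<p>Hi</p>\r\n", "<p>Hi</p>\n"))
print(change_counts("<h1>Old</h1>", "<h1>New</h1>"))
```

With line endings normalised, the CRLF and LF versions of the same file produce no edits, while the headline change counts as one deletion and one insertion.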

When HTML diff is the right tool

Comparing two server-rendered template outputs

A CMS template change goes out and a downstream consumer reports broken hero markup. Curl the rendered HTML before and after the deploy, paste both into the diff, and the offending wrapper <div> or class swap is obvious. Useful for WordPress theme upgrades, Hugo and Jekyll layout edits, and any server-rendered output where the source template is not directly readable.

Diffing email-template HTML before an A/B test

Email HTML is unforgiving. Classic Outlook on Windows still uses the Word rendering engine, so a changed <table> nesting or an edited inline style attribute can shatter the layout. Diff variant A against variant B before you push the campaign, then run both through Litmus or Email on Acid. Catching a missing cellpadding="0" in the diff is cheaper than catching it in a thousand inboxes.

Reviewing accessibility-attribute changes in a PR

A PR adds aria-label, role, and aria-describedby attributes across half the components. The text diff makes it trivial to confirm every interactive element gained the right attribute, no role="presentation" was applied to a real button, and no focusable element lost its accessible name. Pair this with axe DevTools or Lighthouse for the actual audit.
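When a PR touches many components, it can help to list only the accessibility attributes that changed before reading the full diff. The following Python sketch is one possible way to do that with the standard html.parser module; AriaCollector, aria_attributes, and aria_changes are hypothetical names, and the script is not a substitute for an audit.

```python
from collections import Counter
from html.parser import HTMLParser


class AriaCollector(HTMLParser):
    """Count every role and aria-* attribute, keyed by element name."""

    def __init__(self) -> None:
        super().__init__()
        self.found: Counter = Counter()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags reach this method too (default parser behaviour).
        for name, value in attrs:
            if name == "role" or name.startswith("aria-"):
                # A bare attribute has value None; store "" so keys sort.
                self.found[(tag, name, value or "")] += 1


def aria_attributes(source: str) -> Counter:
    collector = AriaCollector()
    collector.feed(source)
    collector.close()
    return collector.found


def aria_changes(old: str, new: str):
    """Return (added, removed) lists of (tag, attribute, value) tuples."""
    before, after = aria_attributes(old), aria_attributes(new)
    added = sorted((after - before).elements())
    removed = sorted((before - after).elements())
    return added, removed


old_html = '<button>Save</button><div role="presentation"></div>'
new_html = '<button aria-label="Save changes">Save</button><div></div>'
added, removed = aria_changes(old_html, new_html)
print("added:", added)
print("removed:", removed)
```

The output is a short list of gained and lost attributes, which makes a misplaced role="presentation" easy to spot before you hand over to axe DevTools or Lighthouse.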

Comparing a marketing landing page after a copy edit

A copywriter sends back the homepage with revised headlines and a new CTA. Paste the live HTML against the proposed HTML and the diff tells you exactly which <h1>, <p>, and button text changed, plus any class or href edits that came along for the ride. Faster than a tracked-changes Word doc and it survives the round-trip into the codebase intact.

Auditing an HTML sanitizer's output for XSS regressions

After a DOMPurify or sanitize-html version bump, regression-test the sanitizer with known-bad payloads and diff the output against a previous good baseline. A sanitizer that suddenly preserves <svg onload="..."> or javascript: URLs in href attributes is exactly the kind of regression you want to catch in CI before it ships. The diff makes single-character changes (a missing escape) jump out.
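The baseline check described above can be sketched as a small test helper. Everything here is illustrative: baseline_regressions is a hypothetical function, and the two stand-in sanitizers exist only so the example runs. html.escape is not an HTML sanitizer, and a real setup would call DOMPurify or sanitize-html from your own test runner.

```python
import html
from difflib import unified_diff
from typing import Callable


def baseline_regressions(
    sanitize: Callable[[str], str], baseline: dict[str, str]
) -> dict[str, list[str]]:
    """Map each payload whose output drifted from the baseline to a diff."""
    failures = {}
    for payload, expected in baseline.items():
        actual = sanitize(payload)
        if actual != expected:
            failures[payload] = list(unified_diff(
                expected.splitlines(), actual.splitlines(),
                fromfile="baseline", tofile="current", lineterm="",
            ))
    return failures


PAYLOADS = [
    '<svg onload="alert(1)">',
    '<a href="javascript:alert(1)">x</a>',
]


def strict_stand_in(text: str) -> str:
    # Stand-in for the previous, known-good sanitizer version.
    return html.escape(text)


def leaky_stand_in(text: str) -> str:
    # Stand-in for a regressed version that lets everything through.
    return text


baseline = {payload: strict_stand_in(payload) for payload in PAYLOADS}
print(len(baseline_regressions(strict_stand_in, baseline)))
print(len(baseline_regressions(leaky_stand_in, baseline)))
```

In CI, a non-empty result would fail the build, and the stored diff lines show which characters the new version stopped escaping or removing.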

HTML quick reference

A short cheat sheet for the parsing details this tool surfaces most often. All of it is grounded in the WHATWG HTML Living Standard.

Topic | How HTML handles it
Void elements | <img>, <br>, <input>, <meta>, <link>, <hr> and friends have no closing tag. The XHTML-style trailing slash <br /> is allowed on void elements but has no effect; the parser ignores it.
Attribute order | Not significant per the HTML spec. <a href="/x" class="btn"> and <a class="btn" href="/x"> are identical to a parser. A text diff flags the swap, and formatters such as Prettier leave attribute order as written, so sort attributes in a separate step if reorders are noise.
Attribute quoting | Double quotes, single quotes, or no quotes at all for values without spaces or special characters. id=hero, id='hero', and id="hero" are equivalent. Most style guides require double quotes for consistency.
Boolean attributes | Presence is what matters. <input disabled>, <input disabled="">, and <input disabled="disabled"> all disable the input. HTML accepts both the bare and the verbose form; XHTML required the verbose form.
Case sensitivity | Tag names and attribute names are case-insensitive in HTML (<DIV> equals <div>). Attribute values are generally case-sensitive (id="Hero" differs from id="hero"). The convention since HTML5 is lowercase tags and attributes.
Character entities | The five XML predefined entities (&amp; &lt; &gt; &quot; &apos;) all work in HTML5, alongside HTML's much larger set of named references such as &nbsp; and numeric references such as &#233; for accented letters. CDATA sections are only valid in foreign content (SVG, MathML), not regular HTML.
DOCTYPE | HTML5 uses the short form <!DOCTYPE html>. Older doctypes are still parsed: XHTML 1.0 Strict triggers no-quirks mode the same way, while XHTML 1.0 Transitional and Frameset trigger limited-quirks mode. A missing DOCTYPE drops the page into quirks mode, which changes box-model behaviour.
Whitespace | Runs of spaces, tabs, and newlines collapse to a single space when rendered, except inside <pre>, <textarea>, and elements with white-space: pre CSS. The source text is preserved, so a text diff sees every byte.

HTML diff: frequently asked questions

Why use the HTML diff view instead of plain text for markup?

Under the hood it is the same character-level algorithm, but the editor uses HTML mode highlighting so tags, attributes, and entities render with familiar syntax colouring while you read the diff. The Format button uses an HTML-aware pretty-printer, not a generic text reflow, which keeps element nesting intact. For prose without markup our text diff is fine; for any file where tag boundaries matter, this view is easier to scan.

Does the order of attributes on an element matter?

Not to a browser. The HTML Living Standard says attribute order on a start tag is not significant, so <a href="/x" class="btn"> and <a class="btn" href="/x"> render identically. A text diff will still flag the swap because it sees the raw characters. Formatting alone does not fix this: Prettier leaves attribute order as written. If reorders are noise in your review, sort the attributes on both sides with the same script or editor plugin before you compare, so the order is stable.

Why does whitespace between elements show up in the diff?

HTML collapses most runs of whitespace into a single space when rendering, so two files that look identical in a browser can have wildly different source text. Inside <pre> and <textarea>, whitespace is preserved. The diff is text-level, so every space, tab, and newline counts. Format both sides first to normalise indentation; only the meaningful changes will remain highlighted.
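As a quick way to check whether two snippets differ only in whitespace, you can collapse whitespace runs yourself before comparing. The Python sketch below is a deliberate simplification: collapse_whitespace is a hypothetical helper, it collapses runs everywhere, and it is therefore not safe for content inside <pre> or <textarea>.

```python
import re

# The five ASCII whitespace characters HTML recognises.
ASCII_WHITESPACE = re.compile(r"[ \t\n\f\r]+")


def collapse_whitespace(text: str) -> str:
    """Replace every whitespace run with one space and trim the ends."""
    return ASCII_WHITESPACE.sub(" ", text).strip()


indented = "<ul>\n  <li>One</li>\n  <li>Two</li>\n</ul>"
one_line = "<ul> <li>One</li> <li>Two</li> </ul>"

print(indented == one_line)
print(collapse_whitespace(indented) == collapse_whitespace(one_line))
```

The raw strings differ, which is what the text diff reports, while the collapsed strings match. The helper never removes whitespace entirely, so <li>One</li><li>Two</li> with no gap between the tags still counts as a difference.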

Can I compare DOM structure while ignoring formatting?

This tool does a text diff, not a DOM diff, so the most reliable way to ignore formatting noise is to format both sides with the same pretty-printer (Prettier or HTML Tidy, for example) before pasting. That normalises indentation, attribute quoting, and trailing whitespace. For a true tree-level compare, parse each side with DOMParser and walk the nodes; for code review and copy-edit work, the format-then-diff workflow is faster and catches the same bugs.

How are inline JavaScript and CSS handled?

The contents of <script> and <style> are diffed as plain text, character by character. That means a one-line CSS change inside a <style> block or a renamed function inside an inline <script> will show up exactly where you expect. For larger script blocks consider extracting to .js files and diffing those separately. The HTML mode in the editor still highlights script and style boundaries so the surrounding markup stays readable.
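If you want to diff inline scripts and styles on their own, one approach is to pull them out of the page first. This Python sketch uses the standard html.parser module; InlineBlockExtractor and extract_inline_blocks are hypothetical names, and the parser's handling of unusual script content may differ from a browser's.

```python
from html.parser import HTMLParser


class InlineBlockExtractor(HTMLParser):
    """Collect the raw text of every <script> and <style> element."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks: dict[str, list[str]] = {"script": [], "style": []}
        self._current: str | None = None

    def handle_starttag(self, tag, attrs):
        if tag in self.blocks:
            self._current = tag
            self.blocks[tag].append("")  # start a new, empty block

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        # Text may arrive in several pieces, so append to the open block.
        if self._current is not None:
            self.blocks[self._current][-1] += data


def extract_inline_blocks(source: str) -> dict[str, list[str]]:
    extractor = InlineBlockExtractor()
    extractor.feed(source)
    extractor.close()
    return extractor.blocks


page = "<style>p{color:red}</style><p>Hi</p><script>let a = 1;</script>"
print(extract_inline_blocks(page))
```

Each extracted string can then be pasted into a JavaScript or CSS diff, leaving the HTML diff to cover the surrounding markup.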

What about character encoding and entities?

Files are read as UTF-8, which is the default for HTML5 and what the <meta charset="UTF-8"> tag declares. Named entities like &amp;, &lt;, &gt;, and &nbsp; diff as their literal source text, not their decoded form, so a swap from &nbsp; to a real space character will show as a change. If two files render identically in the browser but the diff lights up at the start, suspect a UTF-8 byte order mark on one of them.
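To check for a byte order mark before you paste, a few lines of Python are enough. The helper names has_utf8_bom and read_html are hypothetical; codecs.BOM_UTF8, the "utf-8-sig" codec, and html.unescape are part of the standard library.

```python
import codecs
import html


def has_utf8_bom(raw: bytes) -> bool:
    """True when the byte string starts with the UTF-8 byte order mark."""
    return raw.startswith(codecs.BOM_UTF8)


def read_html(raw: bytes) -> str:
    """Decode as UTF-8, dropping a leading BOM if one is present."""
    return raw.decode("utf-8-sig")


with_bom = codecs.BOM_UTF8 + "<p>caf\u00e9</p>".encode("utf-8")
without_bom = "<p>caf\u00e9</p>".encode("utf-8")

print(has_utf8_bom(with_bom), has_utf8_bom(without_bom))
print(read_html(with_bom) == read_html(without_bom))

# Even when decoded, &nbsp; is U+00A0, which is not an ordinary space.
print(html.unescape("a&nbsp;b") == "a b")
```

The last line is a reminder that a swap between &nbsp; and a plain space is a real content change: the two are different characters in the rendered page as well as in the source.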

Privacy and how this works

Your HTML never leaves your browser. The formatter, validator, and diff all run locally on your machine. No analytics on your input, no logs, no "helpful" cloud round-trip. Syntax highlighting follows the WHATWG HTML Living Standard; the older W3C HTML 5.2 recommendation has since been superseded by the Living Standard but remains a useful historical cross-reference. Element-level reference docs are on MDN, and accessibility patterns come from the ARIA Authoring Practices Guide. Background reading on the format itself is on Wikipedia.