A "soft 404" is the mismatch between an HTTP status code and the page content the server actually rendered. The server returns HTTP 200 (success) but the body is a "Page Not Found" / "We can't find that" / "404 Error" template. From the protocol layer, the response looks like a successful page; from the user / search engine perspective, the page is broken.
Why it matters:
- Search engine indexing: Google and Bing treat HTTP 200 responses as indexable content by default. A soft-404 page can be indexed, appear in search results, and consume "crawl budget" (the rate at which crawlers visit your site). Real 404s are crawled less often once detected, whereas soft-404s may keep being revisited as if they were live pages.
- Lost link equity: Inbound links to a missing URL (from another site, an old social share, etc.) carry SEO value. Returning a real 404 lets the linking site's owner or link checker detect the dead link and remove or correct it; returning a 200 leaves the link broken with no signal that anything is wrong.
- Analytics distortion: Page-view metrics include soft-404 visits as "real" traffic.
- User experience: Users who shared a URL that has since been removed see a "not found" page that the browser thinks is fine (no native 404 styling, no warning).
The fix is usually simple, but the details depend on the framework:
- Django: ensure `Http404` exceptions propagate to the 404 handler with `raise Http404` instead of rendering a template directly with status 200
- Express/Node: `res.status(404).send(...)` not `res.send(...)`
- Rails: `head :not_found` or `raise ActiveRecord::RecordNotFound`
- WordPress: ensure the theme's `404.php` is invoked via `is_404()` rather than just rendering inside `index.php`
- Static sites: many CDN / hosting providers treat `/404.html` as the not-found page automatically; verify the response code in the browser dev tools network tab.
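The same principle can be shown without any framework. The following is a minimal sketch using a plain WSGI callable from the Python standard library conventions; the `PAGES` dictionary, the `app` function, and the `call` helper are hypothetical names invented for illustration, not part of any framework listed above. The point is only that the not-found body and the 404 status line are sent together.

```python
from http import HTTPStatus

# Hypothetical in-memory "site": path -> HTML body (illustration only).
PAGES = {"/": "<h1>Home</h1>"}
NOT_FOUND_BODY = "<h1>Page Not Found</h1>"


def app(environ, start_response):
    """Minimal WSGI app that returns a real 404 for unknown paths."""
    path = environ.get("PATH_INFO", "/")
    body = PAGES.get(path)
    if body is None:
        # The soft-404 bug would be to keep "200 OK" here while
        # still sending the not-found template as the body.
        status = HTTPStatus.NOT_FOUND
        body = NOT_FOUND_BODY
    else:
        status = HTTPStatus.OK
    payload = body.encode("utf-8")
    start_response(
        f"{status.value} {status.phrase}",
        [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(payload))),
        ],
    )
    return [payload]


def call(path):
    """Invoke the app in-process and return (status_line, body_text)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    chunks = app({"PATH_INFO": path, "REQUEST_METHOD": "GET"}, start_response)
    return captured["status"], b"".join(chunks).decode("utf-8")
```

Calling `call("/missing")` returns the not-found template together with a `404 Not Found` status line, while `call("/")` returns `200 OK`. Checking the status line this way is the in-process equivalent of inspecting the response code in the browser's network tab.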
The BeaverCheck soft-404 analyzer detects the mismatch by looking for "page not found" / "404" template phrases in the page title or H1 when the HTTP status was 2xx. It deliberately ignores generic "not found" wording in body text (would false-positive on tutorial pages explaining how 404s work) and ignores 4xx status responses (those are working correctly).
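The detection rule described above can be approximated in a few lines. This is a rough sketch under stated assumptions, not the BeaverCheck implementation: the phrase list, the `looks_like_soft_404` function name, and the use of naive substring matching are all assumptions made for illustration. It inspects only `<title>` and `<h1>` text and skips anything that is not a 2xx response.

```python
from html.parser import HTMLParser

# Assumed phrase list; the analyzer's actual phrases are not given here.
NOT_FOUND_PHRASES = ("page not found", "404")


class _TitleH1Collector(HTMLParser):
    """Collects text that appears inside <title> and <h1> elements."""

    def __init__(self):
        super().__init__()
        self._depth = 0   # how many title/h1 elements are currently open
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in ("title", "h1") and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth > 0:
            self.texts.append(data)


def looks_like_soft_404(status, html):
    """Return True when a 2xx response has a not-found title or H1."""
    if not 200 <= status < 300:
        # 4xx and other non-2xx responses already report the problem.
        return False
    collector = _TitleH1Collector()
    collector.feed(html)
    collector.close()
    heading_text = " ".join(collector.texts).lower()
    # Body paragraphs are never collected, so tutorial prose is ignored.
    return any(phrase in heading_text for phrase in NOT_FOUND_PHRASES)
```

Because the match is a plain substring test, a title such as "Model 4040 specifications" would also trigger it; a production check would likely need word boundaries or a stricter phrase list.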
Related: soft-500 (200 + "internal server error" template) and soft-redirect (200 + "you are being redirected" page) are similar anti-patterns where the status code disagrees with the body content, but soft-404 is by far the most common.