SEO patterns for headless WordPress: the seven things most migrations break
Headless WordPress sells on Core Web Vitals, content reuse, and editorial speed. It buries seven specific SEO signals if no one is paying attention. We have shipped enough of these migrations to know which ones cost weeks of recovery and which ones are mechanical to keep right.
This article is the checklist. It is not a substitute for the headless WordPress service pillar, which makes the architectural case. It is what we run before, during, and after every migration, in that order.
TL;DR
- Preserve canonical URLs at the URL level, not the slug level.
- Preserve hreflang in HTML, not just in the sitemap.
- Render meta tags and JSON-LD on the server, not on the client.
- Migrate redirect history before changing URLs, not after.
- Keep one sitemap as the source of truth, not two.
- Block the WordPress origin from search index visibility.
- Carry existing structured data and image alt text forward instead of rewriting them.
Pattern one: canonical URLs survive the migration
A canonical URL is a promise. Every external link, every Google index entry, every social share counts on it. A headless migration that trims a path segment, changes case, or reorders query parameters breaks every one of those promises without raising a single error.
Two rules. First, capture the full canonical URL set from the legacy WordPress build before touching the front end. We export every published post, page, and term page with the URL Google has indexed; that is the source of truth. Second, write the canonical URL into the headless front end’s HTML response, not into a client-side <head> mutation. Generative engines and answer engines parse the initial HTML; client-side meta updates do not exist for them.
If you must change a URL, serve a 301 redirect from the old URL to the new one and keep it in place for at least a year.
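The first rule can be turned into a build-time check. The following is a minimal sketch, assuming the legacy export and the new build can each be reduced to a flat list of absolute URLs; the function name, the report shape, and the example URLs are all hypothetical.

```typescript
// Illustrative only: none of these names come from a framework or plugin.
interface CanonicalReport {
  missing: string[];    // legacy URLs with no exact match and no redirect
  redirected: string[]; // legacy URLs covered by an explicit redirect
}

function checkCanonicals(
  legacyUrls: string[],
  builtUrls: string[],
  redirects: Map<string, string>,
): CanonicalReport {
  const built = new Set(builtUrls);
  const report: CanonicalReport = { missing: [], redirected: [] };
  for (const url of legacyUrls) {
    // Exact string comparison on purpose: a case change or a dropped
    // trailing slash is a different URL and has to be caught.
    if (built.has(url)) continue;
    const target = redirects.get(url);
    if (target !== undefined && built.has(target)) {
      report.redirected.push(url);
    } else {
      report.missing.push(url);
    }
  }
  return report;
}

// Example: the new build lower-cased one path and nobody added a redirect.
const canonicalReport = checkCanonicals(
  ["https://example.com/Blog/Post-One/", "https://example.com/about/"],
  ["https://example.com/blog/post-one/", "https://example.com/about/"],
  new Map<string, string>(),
);
// canonicalReport.missing now holds the mixed-case URL.
```

A build could fail whenever `missing` is non-empty, which forces every URL change to arrive together with its redirect.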
Pattern two: hreflang is HTML, not JSON
Multilingual WordPress sites use WPML, Polylang, or a custom solution to manage translations. The mapping ends up correct in the database. The headless front then has to render <link rel="alternate" hreflang="..."> for every language variant in the HTML response.
The pattern most agencies miss: hreflang must be self-referential and reciprocal. The English page lists itself plus all translated alternates. The Polish page lists itself plus all alternates. The two lists agree. Hreflang audit tools flag the mismatch when one side forgets.
We treat hreflang generation as part of the build, not as a runtime decision. The path map is computed at build, hashed, and any drift fails the build.
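One way such a build check might look, as a sketch: it assumes the path map has already been flattened into a record keyed by absolute URL, and the `PageAlternates` shape and example URLs are hypothetical. It checks only the two properties named above, self-reference and agreement between the lists.

```typescript
// Hypothetical shape of one entry in the build's path map.
interface PageAlternates {
  lang: string;                       // hreflang code of this page
  alternates: Record<string, string>; // hreflang code -> absolute URL
}

function findHreflangErrors(pages: Record<string, PageAlternates>): string[] {
  const errors: string[] = [];
  for (const [url, page] of Object.entries(pages)) {
    // Self-reference: the page must list itself under its own language.
    if (page.alternates[page.lang] !== url) {
      errors.push(`${url}: missing self-referential hreflang="${page.lang}"`);
    }
    for (const [lang, altUrl] of Object.entries(page.alternates)) {
      if (altUrl === url) continue;
      const alt: PageAlternates | undefined = pages[altUrl];
      if (alt === undefined) {
        errors.push(`${url}: hreflang="${lang}" points at unknown page ${altUrl}`);
        continue;
      }
      // Reciprocity: the alternate must point back at this page.
      if (alt.alternates[page.lang] !== url) {
        errors.push(`${url}: ${altUrl} (hreflang="${lang}") does not list it back`);
      }
    }
  }
  return errors;
}

const enUrl = "https://example.com/en/pricing/";
const plUrl = "https://example.com/pl/cennik/";
const hreflangErrors = findHreflangErrors({
  [enUrl]: { lang: "en", alternates: { en: enUrl, pl: plUrl } },
  [plUrl]: { lang: "pl", alternates: { pl: plUrl } }, // forgot the English page
});
// A build step could throw when hreflangErrors is non-empty.
```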
Pattern three: meta tags and JSON-LD render on the server
The most common SEO regression we have seen in headless migrations: meta tags and JSON-LD inserted via JavaScript after the page loads. The browser sees them. Googlebot can sometimes see them. Generative engines, voice assistants, and most LLM crawlers usually do not.
The rule: render the meta tags, the canonical, the Open Graph tags, and every Schema.org JSON-LD block in the initial HTML response. If you use Astro, that is the default. If you use Next.js, that means rendering on the server (the App Router metadata API, or getServerSideProps in the Pages Router) and not relying on next/head updates in the client.
The same applies to images: an <img> element with alt and src in HTML is indexable. An <img> injected after a client effect is invisible to most crawlers and to AI training data pipelines.
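Independent of framework, the rule amounts to producing the head markup as a string on the server. The sketch below is a framework-agnostic illustration, not the Astro or Next.js API; the `SeoData` shape and function names are assumptions made for the example.

```typescript
// Hypothetical per-page SEO payload assembled on the server.
interface SeoData {
  title: string;
  description: string;
  canonical: string;
  jsonLd: object[];
}

function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Returns the tags that must already be present in the initial HTML response.
function renderHead(seo: SeoData): string {
  const tags = [
    `<title>${escapeHtml(seo.title)}</title>`,
    `<meta name="description" content="${escapeHtml(seo.description)}">`,
    `<link rel="canonical" href="${escapeHtml(seo.canonical)}">`,
    `<meta property="og:title" content="${escapeHtml(seo.title)}">`,
    `<meta property="og:url" content="${escapeHtml(seo.canonical)}">`,
  ];
  for (const block of seo.jsonLd) {
    // Escape "<" so a string containing "</script>" cannot close the tag early.
    const json = JSON.stringify(block).replace(/</g, "\\u003c");
    tags.push(`<script type="application/ld+json">${json}</script>`);
  }
  return tags.join("\n");
}

const headHtml = renderHead({
  title: "Pricing & plans",
  description: "What each plan includes.",
  canonical: "https://example.com/pricing/?ref=a&lang=en",
  jsonLd: [{ "@context": "https://schema.org", "@type": "Article", headline: "A </script> edge case" }],
});
```

A quick way to verify the result on a deployed page is to fetch the URL without executing JavaScript and confirm these tags are in the response body.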
Pattern four: structured data is inherited, not rewritten
WordPress with Yoast SEO or Rank Math already produces good Article, Product, and Organization JSON-LD. The temptation in a headless migration is to rewrite it from scratch on the front end. Resist.
Read the existing JSON-LD from the WordPress origin via the REST or GraphQL endpoint. Pass it through. Add only what the front end legitimately knows that WordPress does not (for example, build timestamps for dateModified if your editorial workflow does not touch dates). Two systems generating overlapping JSON-LD is how Search Console’s rich-result reports start failing.
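The pass-through rule can be enforced in a few lines. This is a sketch under the assumption that the origin's JSON-LD node has already been fetched and parsed, by whatever endpoint your setup exposes; `addMissingFields` is a hypothetical helper, and the field values are made up.

```typescript
type JsonLdNode = Record<string, unknown>;

// Returns the origin's node unchanged except for keys it does not define.
// Anything WordPress already emitted wins over the front end's value.
function addMissingFields(origin: JsonLdNode, frontEnd: JsonLdNode): JsonLdNode {
  const merged: JsonLdNode = { ...origin };
  for (const [key, value] of Object.entries(frontEnd)) {
    if (!(key in merged)) merged[key] = value;
  }
  return merged;
}

const originArticle: JsonLdNode = {
  "@context": "https://schema.org",
  "@type": "Article",
  headline: "Headline from WordPress",
  datePublished: "2024-01-10T09:00:00Z",
};

const mergedArticle = addMissingFields(originArticle, {
  headline: "Headline the front end would have invented", // ignored
  dateModified: "2024-03-01T12:00:00Z",                   // added
});
```

Because the merge is one-directional, the front end cannot produce a second, competing version of a field the SEO plugin already owns.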
For our own pages we use the Phase 0 components in src/components/seo/: DirectAnswer, FAQ, and Quote. Each emits its own minimal JSON-LD without overlapping with the page-level Article schema.
Pattern five: redirects move before URLs do
Order matters. The pre-migration checklist captures every internal and external redirect, including the silent ones (/wp-content/... to /uploads/..., country-code redirects, AMP variants). The new front-end ships those redirects on day zero, before the public switch to the new URLs.
On Cloudflare Pages we keep the _redirects file under the platform’s 2000-rule cap. A build that would exceed the cap fails. Anything that needs more than 2000 rules ends up in a Worker for parameterised redirect logic instead.
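A build guard for the cap might look like the sketch below. It assumes one rule per non-empty line and `#` for comments in `_redirects`; the function names are hypothetical, and the 2000 figure is simply the cap quoted above.

```typescript
const MAX_REDIRECT_RULES = 2000; // cap quoted in the text above

// Counts rule lines, skipping blank lines and "#" comments (assumed syntax).
function countRedirectRules(fileContents: string): number {
  return fileContents
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith("#"))
    .length;
}

// Throws so that the build fails when the file would exceed the cap.
function assertUnderCap(fileContents: string, cap: number = MAX_REDIRECT_RULES): number {
  const count = countRedirectRules(fileContents);
  if (count > cap) {
    throw new Error(`_redirects has ${count} rules, cap is ${cap}`);
  }
  return count;
}

const sampleRedirects = [
  "# legacy upload paths",
  "/wp-content/uploads/* /uploads/:splat 301",
  "",
  "/old-page/ /new-page/ 301",
].join("\n");

const redirectRuleCount = assertUnderCap(sampleRedirects);
```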
When the public DNS finally cuts over to the new front, no redirect is ever new in production: every rule was tested as part of the build for weeks before the cutover.
Pattern six: only one sitemap
WordPress 5.5 added a default /wp-sitemap.xml. Yoast SEO and Rank Math add their own sitemaps. The headless front-end framework produces a sitemap as well. Three sitemaps on the same domain is a recipe for Search Console pulling its hair out.
The rule: pick one canonical sitemap and disable or redirect the others. We typically generate the sitemap from the front-end framework so URLs match the actual public site exactly, then 301 the WordPress origin sitemap to the front-end one. That removes one of the routes by which search engines discover the WordPress origin; pattern seven covers the rest.
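Most front-end frameworks ship a sitemap integration, so the following is only a sketch of what "generated from the front end" means: the sitemap is a pure function of the URL list the build actually produced. The function names are hypothetical.

```typescript
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Builds a minimal sitemap from the URLs the front end actually serves.
function buildSitemap(urls: string[]): string {
  const unique = Array.from(new Set(urls)).sort(); // no duplicates, stable order
  const entries = unique.map((url) => `  <url><loc>${escapeXml(url)}</loc></url>`);
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}

const sitemapXml = buildSitemap([
  "https://example.com/pricing/",
  "https://example.com/about/",
  "https://example.com/pricing/", // duplicate, emitted once
]);
```

Feeding this function the same URL list used for the canonical check keeps the sitemap and the canonicals from drifting apart.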
Pattern seven: the WordPress origin must be invisible to search
A headless migration leaves the WordPress origin running, usually at a subdomain or a private hostname. It still serves rendered HTML, has a working sitemap, and answers REST queries. Search engines that find that origin will index it as a duplicate of the public site, and the duplicate will not be the one that ranks.
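Three controls. The origin’s robots.txt blocks all paths except the REST and GraphQL endpoints. The origin sends X-Robots-Tag: noindex, nofollow in HTTP headers for every HTML response. The origin’s sitemap is removed, returns a 410, or redirects to the front-end sitemap as described in pattern six. One caveat: a crawler that obeys the robots.txt block never fetches the page and so never sees the header, so treat the header as the backstop for crawlers that ignore robots.txt rather than as a second guarantee.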
If your origin is on the same domain as the public site under a path prefix (for example, /wp-admin/ or /wp/), the same controls apply scoped to those paths.
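The first two controls can be sketched as two small functions. The prefixes below assume WordPress's default /wp-json/ REST path and a GraphQL endpoint at /graphql; adjust them to your setup. How the header is attached (web server config, a plugin, or an edge function) is left open, and the function names are hypothetical.

```typescript
// Assumed endpoint prefixes; change these if your origin uses different paths.
const CRAWLABLE_PREFIXES = ["/wp-json/", "/graphql"];

// robots.txt for the origin: block everything, then allow the API prefixes.
function buildOriginRobotsTxt(allowedPrefixes: string[]): string {
  return [
    "User-agent: *",
    "Disallow: /",
    ...allowedPrefixes.map((prefix) => `Allow: ${prefix}`),
  ].join("\n") + "\n";
}

// Value for the X-Robots-Tag header, or null when no header is needed.
function originRobotsHeader(contentType: string): string | null {
  const mediaType = (contentType.split(";")[0] ?? "").trim().toLowerCase();
  return mediaType === "text/html" ? "noindex, nofollow" : null;
}

const originRobotsTxt = buildOriginRobotsTxt(CRAWLABLE_PREFIXES);
const htmlHeader = originRobotsHeader("text/html; charset=UTF-8");
const jsonHeader = originRobotsHeader("application/json");
```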
Where this fits in the cluster
This article supports the Headless WordPress service pillar. For decision-time framing, see Headless WordPress, Next.js vs Astro 2026. For the broader visibility story including LLM citations, the AI and LLM visibility playbook is the canonical statement of what we ship for AEO and GEO on top of these SEO foundations.

