Jono is a digital strategist, marketing technologist, and full stack developer. He's into technical SEO, emerging technologies, and brand strategy.
Front-end vs back-end technologies
Traditionally, most websites have a distinct ‘back-end’ (or ‘server-side’) and ‘front-end’ (or ‘client-side’). Note that this isn’t the same as “the public or user-facing parts of my website” or “my WordPress admin area”; we’re talking about the technologies which power the website, not the different parts of a WordPress website.
In WordPress, the back-end is usually a combination of PHP (a scripting language) and MySQL (a database technology). These store and process your data, and run the logic which constructs your front-end.
As you navigate through a site, this process repeats each time you visit a different page. Whenever you click a link, the page you’re on unloads, and the new page is sent from the server to your browser.
If you’d like to learn more about the difference between back-end and front-end development, we recommend this guide by Chris Castiglione on One Month.
[Image: photo by Greg Rakozy on Unsplash]
For more complex websites, some user interactions (or other scenarios) might mean that you want to update parts of a page without having to reload it.
For example, we might want to let a user submit a form and see a message based on the outcome, without reloading the page.
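As a minimal sketch of that form example, assume a hypothetical endpoint which answers a submission with an HTTP status code. Neither the endpoint nor the function names below come from a real plugin, and the status-to-message mapping is invented for illustration. The send parameter stands in for the browser’s fetch function, so that the sketch doesn’t depend on any particular environment.

```typescript
// The parts of a fetch-style function this sketch relies on.
type Sender = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ status: number }>;

// Hypothetical mapping from the server's status code to a user-facing message.
function messageFor(status: number): string {
  if (status >= 200 && status < 300) return "Thanks! Your message was sent.";
  if (status === 422) return "Please check the highlighted fields.";
  return "Something went wrong. Please try again.";
}

// Send the form fields in the background and resolve with the message to show.
// The page itself never unloads; the caller decides where to display the text.
async function submitForm(
  send: Sender,
  endpoint: string,
  fields: Record<string, string>
): Promise<string> {
  try {
    const response = await send(endpoint, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(fields),
    });
    return messageFor(response.status);
  } catch {
    // Network failures reject the promise rather than returning a status.
    return messageFor(0);
  }
}
```

In a browser, you’d pass fetch as the send argument and write the returned message into the relevant part of the page, leaving everything else untouched.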
This kind of functionality powers many of the interactive tools and processes we’ve all become familiar with as the web has evolved; like editing content, exploring maps, submitting forms, and using live chat.
From an SEO perspective, it’s important to remember that any content which is only loaded after user interaction might be undiscoverable by search engines and other systems. In general, you shouldn’t load critical page content via Ajax.
If you’d like to learn more about Ajax, we recommend starting with this W3Schools tutorial.
[Image: a schematic of how Ajax works by Daniel Haischt, via Wikimedia Commons]
Ajax in WordPress
In WordPress, many plugins and themes use Ajax to dynamically get, set, and update the contents of pages based on user interactions. They typically do this in one of two ways: by requesting a WP REST API endpoint, or by requesting /wp-admin/admin-ajax.php (with an action).
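To make the two request styles concrete, here’s a rough sketch of how their URLs are put together. It assumes the default /wp-json/ REST prefix (which sites can change), and the helper names restUrl and adminAjaxUrl are our own, not part of WordPress.

```typescript
// Build the URL for a WP REST API route, e.g. "wp/v2/posts".
// Assumes the site uses the default "/wp-json/" prefix.
function restUrl(site: string, route: string): string {
  return `${site}/wp-json/${route}`;
}

// Build the URL for an admin-ajax.php request. The `action` value tells
// WordPress which registered handler should respond; any extra params
// are passed along in the query string.
function adminAjaxUrl(
  site: string,
  action: string,
  params: Record<string, string> = {}
): string {
  const pairs: Record<string, string> = { action, ...params };
  const query = Object.keys(pairs)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(pairs[key])}`)
    .join("&");
  return `${site}/wp-admin/admin-ajax.php?${query}`;
}
```

For example, restUrl("https://example.com", "wp/v2/posts") points at the core posts collection, whilst adminAjaxUrl("https://example.com", "load_more", { page: "2" }) targets a hypothetical load_more action which a theme or plugin would need to have registered.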
If you recognize the admin-ajax.php filename, that might be because you’ve seen it show up in your robots.txt file.
WordPress’ default robots.txt file contains an instruction to explicitly allow crawling of this URL, even though its parent folder (/wp-admin/) is blocked:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
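To see why the Allow line wins for that one file, it helps to know that Google resolves conflicts between rules by picking the most specific (longest) matching path, and prefers Allow when two matches are equally long. Here’s a simplified sketch of that logic. It only handles plain path prefixes (no wildcards or $ anchors), and the Rule type and isAllowed function are our own names for illustration.

```typescript
type Rule = { allow: boolean; path: string };

// The two rules from WordPress' default robots.txt.
const defaultRules: Rule[] = [
  { allow: false, path: "/wp-admin/" },
  { allow: true, path: "/wp-admin/admin-ajax.php" },
];

// Decide whether a crawler may fetch `urlPath`.
// The longest matching rule wins; on a tie, Allow beats Disallow.
// If no rule matches, crawling is allowed.
function isAllowed(urlPath: string, rules: Rule[] = defaultRules): boolean {
  let best: Rule | undefined;
  for (const rule of rules) {
    if (!urlPath.startsWith(rule.path)) continue;
    const longer = best === undefined || rule.path.length > best.path.length;
    const tieButAllow =
      best !== undefined && rule.path.length === best.path.length && rule.allow;
    if (longer || tieButAllow) best = rule;
  }
  return best === undefined ? true : best.allow;
}
```

With the default rules, /wp-admin/admin-ajax.php?action=get-some-critical-content matches both lines, but the Allow path is longer, so it can be crawled. Everything else under /wp-admin/ only matches the Disallow line.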
This is because many themes and plugins relied so heavily on sending data to, and retrieving data and content from, this endpoint that much of their content wasn’t discoverable. By permitting crawlers to follow links to admin-ajax.php endpoints (like admin-ajax.php?action=get-some-critical-content), search engines could crawl and index otherwise ‘invisible’ content.
Well-developed plugins and websites shouldn’t need to rely on exposing content in this way. Instead, they should consider how their content is discovered, crawled and indexed regardless of which of the Ajax approaches they use.
This article won’t compare the technical differences between these approaches in depth, other than to highlight that the WP REST API is far superior to the (legacy) admin-ajax.php approach. It’s easier to develop with, provides a better framework for managing permissions, and is significantly faster to respond.
The admin area
Single page applications
Over the past few years, a new type of website has emerged. As users expect websites to behave more and more like apps, single page applications (or ‘SPAs’) have become increasingly common.
All of the logic for how the app works is stored and executed client-side. The app only communicates with the back-end (via Ajax) when it needs to retrieve new content or data.
This makes it much easier to build responsive, fast experiences, where the page doesn’t need to reload when there are updates to content, layouts and components.
Websites which have lots of moving, changing parts, which update as you interact with them are well-suited for being built as SPAs. In fact, many of the ‘app-like’ websites which you use in daily life – like Gmail, Facebook, Twitter and even PayPal – use SPAs in their main web services. You’ll notice that as you browse through their views, and interact with their tools, the page doesn’t ‘reload’.
One of the key concepts in SEO, and in how Google works, is that a webpage should be about a thing. A given page should have a clear topic, and a clear focus. That page resides on a URL, which identifies it uniquely, and which shouldn’t change or be removed.
This is how most conventional websites, and certainly most WordPress websites, work. You write pages about topics.
But for websites which behave more like apps, it isn’t always easy to answer the question, “what is a page?”. As parts of pages change and transition as the user interacts with them (with or without changing the URL), it becomes difficult to know which state of each ‘page’ is about which topic.
There are performance problems, too. Because your browser needs to download a (potentially large, complex) application before it can construct the webpage, that can take some time.
Even if the app is split into smaller pieces, and only the necessary ‘route’ is downloaded for the requested page, that can still take much longer to load than a conventional ‘static’ website might (though clever techniques like using WPGraphQL can make passing back-end data to the front-end much simpler and faster, and even more so when you integrate it with Yoast SEO via this plugin).
When Google (and others) can’t reliably access and assess your content, that can lead to serious SEO problems.
To avoid some of these problems, you can use different approaches to rendering your content, and ensure that search engines and social networks can see and digest your content.
If your back-end is smart, it can run a process which loads the website application, and stores the rendered HTML output. We can build up a cache for our pages, and serve that “pre-rendered” content to users or search engines. Pre-rendering is often done on demand (when a page is first loaded), but more sophisticated approaches might run on a schedule, or when content is changed or updated.
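A pre-rendering cache like this can be sketched in a few lines. The example below is illustrative only: the PrerenderCache class is our own invention, and the render function is a stand-in for whatever process actually loads the application and captures its HTML (a headless browser, for instance).

```typescript
// Stand-in for the process which runs the app and returns its rendered HTML.
type Renderer = (path: string) => string;

class PrerenderCache {
  private pages = new Map<string, string>();
  private render: Renderer;

  constructor(render: Renderer) {
    this.render = render;
  }

  // On demand: render the first time a path is requested, then reuse the copy.
  get(path: string): string {
    let html = this.pages.get(path);
    if (html === undefined) {
      html = this.render(path);
      this.pages.set(path, html);
    }
    return html;
  }

  // When content changes: drop the stored copy so the next request re-renders.
  invalidate(path: string): void {
    this.pages.delete(path);
  }
}
```

A scheduled approach would simply call get for each known path ahead of time, and a ‘content updated’ hook on the back-end would call invalidate for the affected paths.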
One possible solution is to use a technique called hydration. In this approach, the back-end still responds with a pre-rendered, static version of the page. But it also responds with a version of the application code. Once this code loads, the static page (or, parts of it) is seamlessly transformed into the app.
This kind of approach can be extremely effective, but can also be more complex to configure and maintain. It’s also tricky to avoid performance pitfalls, where the app (re)loads content which was already downloaded in the initial response.
Whilst this could technically be considered a form of cloaking by Google’s definitions, they go to lengths in their documentation to explain that dynamic rendering is not the same as cloaking, and is their preferred solution.
Dynamic rendering isn’t a perfect solution, though. It still means that you’re essentially managing two versions of the website (which must be ‘kept in sync’). You’re also micromanaging which types of users and systems see which version. That’s potentially a lot of micromanagement and a lot of things which could go wrong.
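The ‘which visitor sees which version’ decision usually comes down to inspecting the User-Agent header of each request. Here’s a deliberately small sketch of that check. The list of crawler names is a short, illustrative sample rather than a complete or recommended one, and chooseVersion is a hypothetical helper.

```typescript
// A few well-known crawler tokens; real setups maintain a much longer list.
const BOT_PATTERN = /googlebot|bingbot|facebookexternalhit|twitterbot/i;

type Version = "prerendered" | "app";

// Crawlers get the static, pre-rendered HTML; everyone else gets the app.
function chooseVersion(userAgent: string): Version {
  return BOT_PATTERN.test(userAgent) ? "prerendered" : "app";
}
```

Even in this tiny form, the maintenance burden is visible: every new crawler or social network which needs the pre-rendered version means another entry in the pattern, and any mistake in the list changes what that system sees.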
In an ideal world, you wouldn’t need to maintain two different versions of your website, or to rely on complex hydration techniques. You’d just have one set of code and logic, and the site would deliver server-side generated content, but also allow app-like interactions.
Surprisingly, relatively few WordPress websites today are deployed as Single Page Applications. We can speculate that this may be due to a combination of factors – including differing developer skillsets (WordPress is heavily biased towards PHP & MySQL), and a lack of broad theme/plugin support.
That’s not to say that it’s not possible, however. In fact, WordPress’ REST API (and/or solutions like WPGraphQL) can be used to provide the data and back-end connectivity for a SPA. This Medium article by Brijesh Dhanani shows how easy it is to create a simple SPA using WordPress as the back-end and React for the front-end.
Challenges with headless SEO
One of the drawbacks to headless sites is that it’s hard for the front end to know how the ‘SEO stuff’ on the page should work. On even the simplest of sites, complex logic is often required to determine how a page should behave from an SEO perspective. Crawling controls, indexing directives, meta tags and structured data must all interact correctly, without error.
For simple posts and pages, this logic isn’t overly complex. But when you need to start considering post archives, pagination, indexing controls and other edge-cases, it becomes much more complex. When there’s no ‘back end’, the SPA must define and manage all of these rules from scratch. Even the logic for ‘simple’ components like canonical URL tags is surprisingly complex when defined and built from scratch. Many headless sites and SPAs fall afoul of this complexity and fail to achieve basic SEO standards.
This has meant that, despite being on the cutting-edge of performance and user experience, many headless sites have performed poorly when it comes to SEO.
Headless SEO in WordPress, with Yoast SEO
That’s why, in Yoast SEO 14.0 (in April 2020), we added a headless SEO API. Since then, headless WordPress sites running our plugin automatically get all of the appropriate <head> content, for whatever type of request/page, appended to their standard WordPress API requests. Developers who want even more customization can request the <head> output for any URL by using our special REST API syntax. E.g., this URL returns all of the meta tags we output for our Yoast SEO plugin page.
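As a rough sketch of how a headless front-end might use this, the helpers below read the appended <head> markup from a REST API response and build a request for the <head> output of an arbitrary URL. The text above doesn’t spell out the details, so two things here are assumptions based on our reading of the Yoast developer documentation, and worth checking against your installed version: that the markup arrives in a yoast_head field, and that the dedicated endpoint lives at /wp-json/yoast/v1/get_head. The function names are our own.

```typescript
// The one field of a REST API post object which this sketch reads.
// Assumption: Yoast SEO adds its <head> markup under `yoast_head`.
type PostWithHead = { yoast_head?: string };

// Return the <head> markup attached to a post, or an empty string if absent.
function headFromPost(post: PostWithHead): string {
  return post.yoast_head ?? "";
}

// Build a request URL for the <head> output of any page on the site.
// Assumption: the endpoint is /wp-json/yoast/v1/get_head?url=...
function yoastHeadUrl(site: string, pageUrl: string): string {
  return `${site}/wp-json/yoast/v1/get_head?url=${encodeURIComponent(pageUrl)}`;
}
```

The front-end can then inject the returned markup into its own <head>, rather than re-implementing the rules for canonicals, robots directives and structured data.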
Static HTML sites
One of the advantages of headless websites is that they can be completely de-coupled from the back-end. Because the front-end only communicates with the back-end via APIs to fetch content and data, it doesn’t matter where those individual pieces of the site live.
This has given rise to a trend for people to develop static websites.
Frameworks like Gatsby (who talk specifically about headless WordPress here) generate the HTML and content as part of a ‘build’ process (which runs initially, then whenever a post is updated or site setting is changed). That HTML can be hosted and accessed somewhere separate from any back-end components like databases and admin interfaces. That can provide significant security and performance benefits.
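Conceptually, a ‘build’ step takes the content from the back-end and turns each item into a finished HTML file. The sketch below shows that idea in its simplest form. It is not how Gatsby (or any particular framework) works internally, and the Post shape, file layout and function names are all hypothetical.

```typescript
// Hypothetical shape of a post fetched from the back-end at build time.
type Post = { slug: string; title: string; bodyHtml: string };
type OutputFile = { path: string; html: string };

// Escape characters which would otherwise be read as markup in the title.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Turn every post into a static HTML file, ready to upload to any host.
function buildSite(posts: Post[]): OutputFile[] {
  return posts.map((post) => ({
    path: `${post.slug}/index.html`,
    html:
      `<!doctype html><html><head><title>${escapeHtml(post.title)}</title></head>` +
      `<body>${post.bodyHtml}</body></html>`,
  }));
}
```

Re-running buildSite whenever a post is updated gives you a fresh set of files, and nothing about serving them requires the database or admin interface to be reachable.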
Increasingly, there’s an overlap between isomorphic SPAs, static HTML sites, and headless sites. These technologies are converging, and some of the most flexible, powerful and performant websites use a blend of each approach.
This is a distinctly different technical and conceptual approach to how a conventional WordPress website might operate, which is often on a LAMP stack (Linux, Apache, MySQL and PHP). That said, a JAMstack website might rely on a WordPress site’s REST API for content and ‘back-end’ requests, and that WordPress site may itself use a LAMP stack.
Building and managing any flavour of headless, static WordPress SPA from scratch is a huge amount of work, and getting it wrong can be disastrous for SEO. Thankfully, there are a range of popular frameworks and tools which make it much easier, and we have some suggestions on what best practice looks like.
What does ‘best practice’ look like?
This can be achieved via any of the rendering solutions we’d identified, but is most reliable when using an isomorphic approach.
Of the many tools, libraries and frameworks available, we recommend using those with more mature and fully-featured support for SEO and rendering management. In practice, depending on your technical environment, that tends to mean React (or its lightweight sibling, Preact), Next, Nuxt (which uses Vue.js), or Angular (using Angular Universal).
Let us know if you have any questions, or if we’ve missed anything!