Published on November 11, 2024
JavaScript might be the least discussed, yet most critical, topic in the SEO community. It’s the bedrock of most frontend web frameworks, and it can have a monumental impact on the indexation of your content and thus its performance.
While the specific JavaScript configuration might be out of your control due to the chosen tech stack for your website, there are many checks and steps you can take to ensure your content’s visibility in organic search. We’ll focus much of this article on how to perform these checks and diagnose any issues, but first, let’s explore Google’s role in determining the visibility of your content.
Google wasn’t always able to crawl JavaScript. Less than a decade ago, Google’s web crawler, Googlebot, was not able to understand or render pages that use JavaScript. In 2015, Google announced its ability to crawl and render pages just like a modern web browser, as long as you weren’t blocking JavaScript or CSS resources (more on this later). This was revolutionary at the time because it effectively ended the need for webmasters to serve two different versions of a page, one for human users and one for search engine crawlers like Googlebot.
Fast forward to the present, and Google now has a two-step process for rendering and indexing content that uses JavaScript. First, Google crawls the static HTML layer of a webpage. When it discovers JavaScript and other resources, it adds the page to a rendering queue; once the page is rendered, Google can index its content and links.
While a page’s time in the rendering queue will vary, this recent study by Vercel and MERJ revealed that most pages spend less than 20 seconds in the rendering queue. Their highly recommended study also debunked a series of myths related to JavaScript and SEO, including:
Myth 1: “Google can't render client-side JavaScript.”
Reality: False, Google renders 100% of HTML pages, including those with complex JavaScript.
Myth 2: “Google treats JavaScript pages differently.”
Reality: False, there was no significant difference in Google’s success rate in rendering pages, regardless of JavaScript complexity.
Myth 3: “Rendering queue and timing significantly impact SEO.”
Reality: Most pages are rendered within minutes, not days or weeks, with a majority of pages spending less than 20 seconds in the rendering queue.
Myth 4: “JavaScript-heavy sites have slower page discovery.”
Reality: Google successfully discovers links regardless of rendering method, while XML sitemaps eliminate any discovery differences across rendering methods.
JavaScript rendering types include Client-Side Rendering (CSR), Server-Side Rendering (SSR), Static Site Generation (SSG), Incremental Static Regeneration (ISR), and Rehydration. Dive deeper here, directly from the horse’s mouth. A typical configuration these days may utilize a combination of the aforementioned rendering methods to deliver content.
Google hasn’t always been crystal clear about which method to use, but it has been relatively clear that CSR, in which rendering is executed by the browser, is the most disadvantaged rendering type for SEO. Historically, CSR posed accuracy issues because the rendered content could differ from the static HTML initially delivered, and CSR can still be a challenge for search engines other than Google to render.
To stay on the safe side, that leaves us with any of the remaining JavaScript rendering options, or combinations thereof, all of which deliver some form of static HTML to the browser so that Googlebot can more easily crawl and render a page’s content, and thus index it more efficiently.
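To make that concrete, here’s a minimal sketch of SSG combined with ISR, assuming the Next.js Pages Router; the route, API URL, and revalidation interval are illustrative placeholders, and other frameworks expose equivalent APIs:

```jsx
// pages/products/[slug].js (hypothetical route)
export async function getStaticProps({ params }) {
  // Fetch data at build time (SSG); the API URL is a placeholder.
  const res = await fetch(`https://api.example.com/products/${params.slug}`);
  const product = await res.json();

  return {
    props: { product },
    revalidate: 60, // ISR: regenerate the static page at most once per minute
  };
}

export async function getStaticPaths() {
  // Pre-render nothing at build time; generate each page on its first request.
  return { paths: [], fallback: 'blocking' };
}

// Crawlers receive fully rendered HTML on first load, with no need to
// execute client-side JavaScript to see the content.
export default function ProductPage({ product }) {
  return (
    <main>
      <h1>{product.name}</h1>
      <p>{product.description}</p>
    </main>
  );
}
```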
Let’s look at ways you can improve crawl efficiency and ensure your JavaScript follows SEO best practices.
One thing Google has been pretty clear about in recent years is what’s called “progressive enhancement.” This is the process of building the core elements of your pages, such as their structure and navigation, in static HTML, and then layering in JavaScript to improve each page’s appearance and functionality.
In this static HTML layer, you’ll want to include any and all elements you deem essential to your page, such as on-page text, as well as critical SEO elements like the page title and meta tags. This is highly recommended to ensure your content is properly indexed and earns its full weight for ranking purposes.
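As a rough illustration, the static HTML your server returns might look something like this before any JavaScript enhancement runs; the URLs and copy are placeholders:

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <!-- Critical SEO elements served in the static HTML, not injected by JS -->
    <title>Blue Widgets | Example Store</title>
    <meta name="description" content="Shop our range of blue widgets." />
    <link rel="canonical" href="https://www.example.com/blue-widgets" />
  </head>
  <body>
    <!-- Core structure, navigation, and on-page text in static HTML -->
    <nav>
      <a href="/">Home</a>
      <a href="/widgets">Widgets</a>
    </nav>
    <main>
      <h1>Blue Widgets</h1>
      <p>This copy is crawlable without executing any JavaScript.</p>
    </main>
    <!-- JavaScript layered on top for appearance and interactivity -->
    <script src="/assets/enhancements.js" defer></script>
  </body>
</html>
```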
At all costs, avoid injecting any critical elements through JavaScript, such as canonical link elements, meta robots tags, hreflang markup, rel="nofollow" attributes, etc., as this can confuse search engines if there’s any difference between the rendered and non-rendered versions.
If you use a “noindex” tag or the page is canonicalized, Googlebot will not return to render the page’s content. It sees the “noindex” meta robots tag or the canonicalization during the first (crawl) phase and does not come back for rendering. This is why differences between what’s served in your static HTML layer and what’s rendered are critical to indexation.
To get a better understanding of the visibility of your critical content, use any combination of Google Search Console, your browser, or Screaming Frog’s SEO Spider ($) to test how Googlebot is rendering your content.
GSC’s URL Inspection Tool can help you understand if Google can properly render your page’s content.
1. Select a page to test and enter its URL in GSC’s search bar.
2. Click the “View crawled page” button.
3. In the HTML tab, click the search icon and enter the text or element you want to test.
4. If it doesn’t turn up, then you know Google can’t render your page’s content as desired.
5. If your page is not on Google, or you’d like to test changes you’ve made, click “Test live URL” on the main URL result.
6. Once the test is complete, click “View tested page” and repeat steps 2 through 4.
To better understand the visibility of your content, comparing what’s delivered in the raw HTML with what’s rendered in the browser can help you diagnose where JavaScript is updating your content.
Using the Chrome browser, you can see the difference between the raw HTML using “View Page Source” versus the DOM using “Inspect.”
“View Page Source” – The option available in web browsers such as Chrome that lets you see the original HTML source code of a web page as it was delivered by the server. This includes the static HTML markup and any embedded JavaScript or CSS.
“Inspect” – The dynamic representation of the Document Object Model (DOM) of a web page. It includes the HTML delivered by the server, as well as any modifications made by JavaScript or other client-side scripts, which is closer to the content that Googlebot sees.
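A quick way to compare the two in Chrome is to copy the rendered DOM from the DevTools console and diff it against the raw “View Page Source” output; copy() is a built-in DevTools console utility:

```javascript
// Run in the Chrome DevTools console on the page you're testing.
// Copies the current, JavaScript-modified DOM to your clipboard so you can
// paste it into a diff tool next to the raw "View Page Source" output.
copy(document.documentElement.outerHTML);
```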
While Screaming Frog’s SEO Spider does require a paid license to enable JavaScript rendering, crawling in “Text Only” mode is included in the free version and will reveal how heavily your website relies on client-side rendering. If it relies on CSR heavily, the results in this mode will likely be limited to the start page (e.g., the homepage) and a few JavaScript and CSS files; no other pages or links will be included, since they can’t be rendered or seen.
To confirm that your critical SEO elements are present on the initial page load, run the crawl in JavaScript rendering mode. This will allow you to see where JavaScript is updating critical elements such as page titles or meta descriptions. Once your crawl has completed, navigate to the “JavaScript” results tab and, starting with the “Noindex Only in Original HTML” filter, review each of the subsequent filters highlighted below to ensure none of your pages have these critical elements altered by JavaScript, as this could prevent proper indexation or cause ranking issues.
This is the simplest but most important step of all. Hands down.
Less than a decade ago, it was still commonplace to block resources such as JavaScript and CSS files, often by disallowing the folders that contained them in the robots.txt file. Before Google’s 2015 announcement referenced earlier about its ability to render JavaScript, blocking these resources was even considered best practice to prevent their indexation and improve crawl efficiency.
As noted in our technical SEO guide’s section on Robots.txt files, you do not want to block any resources such as JavaScript or CSS files, as they are now critical to search engines’ ability to render a page’s content and links properly, and excluding them can prevent indexation. Therefore, your robots.txt file should not block anything and should just include the location of your XML sitemap(s).
Furthermore, including much of anything in the robots.txt file is a bit of a competitive risk. Everyone knows where this file is, so if you’re trying to hide things, this is definitely not the best way to do it. Instead, don’t block anything here; manage indexation of pages or site sections at the page level with meta robots tags.
Also check that you’re not making resource requests to subdomains or additional domains you own that could be blocked by their own robots.txt files. These are easily overlooked.
Navigate to your domain’s root URL and append /robots.txt to the path. In most cases, it’s recommended for your robots.txt file to look something like ours:
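A minimal robots.txt along those lines blocks nothing and simply lists the sitemap location; the sitemap URL below is a placeholder for your own:

```
User-agent: *
Disallow:

Sitemap: https://www.example.com/sitemap.xml
```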
To check if any critical resources or pages are blocked by your robots.txt file:
In Google Search Console, navigate to “Indexing > Pages” and look for the “Blocked by robots.txt” reason.
To check a specific page (a recommended exercise for your critical pages), use GSC’s URL Inspection Tool:
1. Select a page to test and enter its URL in GSC’s search bar.
2. Click the “View crawled page” button.
3. Click “More info” in the sidebar and look for JavaScript errors in the “JavaScript console messages” section.
4. If your page is not on Google, or you’d like to test changes you’ve made, click “Test live URL” on the main URL result.
5. Once the test is complete, click “View tested page” and repeat the above steps.
First, ensure JavaScript rendering is enabled in the crawl settings.
Once the crawl is completed, navigate to the “JavaScript” tab and select “Pages with Blocked Resources” from the drop down menu.
Then select a URL from the list and click the “Rendered Page” tab in the bottom window.
This will display all the blocked resources along with why they are blocked.
In our example below, you can see that the majority of blocked resources are on third-party domains and thus out of our control. However, there is one image from a domain we own, ctfassets.net, that is returning a 404 error and needs to be fixed.
Googlebot may use the Chrome browser to render web pages, but it doesn’t behave exactly like users do on those pages. It won’t wait for additional links to load, and it doesn’t typically click buttons. Let’s break this down and look at each type of interaction more closely.
Search engines don’t tend to click buttons, but they will see the content as long as there’s no action needed in order to view it. So, if your content requires a click before it appears in the DOM, it’s likely invisible to search engines. Instead, use internal links to help search engines discover your pages, and avoid using “onclick” attributes where possible, which instruct the browser to run a script when the visitor clicks a link.
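For example, a link wired up only through an onclick handler gives crawlers nothing to follow, while a plain anchor does; the URLs below are placeholders:

```html
<!-- Hard for crawlers: no crawlable URL, navigation only happens via JavaScript -->
<span onclick="window.location.href = '/category/widgets'">Widgets</span>

<!-- Crawler-friendly: a standard internal link with a real href -->
<a href="/category/widgets">Widgets</a>
```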
While tabbed content may be necessary and helpful for UX to group similar content and prevent long pages or heavy scrolling, it may be a challenge for search engines to see.
If possible, it’s ideal not to hide this content behind tabs, especially if it’s important to the context of the page in terms of keywords. Make sure the most important content sits in the visible, default tab. You can also include all of the tabbed content in the initial static HTML response so that it doesn’t require JavaScript to deliver it to search engines.
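One way to do that is to ship every tab panel in the initial HTML and let JavaScript only toggle visibility, rather than fetching panel content on click. A minimal sketch with placeholder content:

```html
<div class="tabs">
  <button data-tab="description">Description</button>
  <button data-tab="specs">Specifications</button>

  <!-- All tab content is present in the static HTML response -->
  <section id="description">Full product description lives here.</section>
  <section id="specs" hidden>Specifications live here; they are only hidden, not absent.</section>
</div>
<script>
  // JavaScript only toggles visibility; it never injects the content itself.
  document.querySelectorAll('[data-tab]').forEach((button) => {
    button.addEventListener('click', () => {
      document.querySelectorAll('.tabs section').forEach((panel) => {
        panel.hidden = panel.id !== button.dataset.tab;
      });
    });
  });
</script>
```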
Scrolling used to be a bit of a mystery, but we’ve learned in recent years that Google has a very clever way of seeing all of a page’s content. On mobile, Googlebot loads the page with a screen size of 411 x 12,140 pixels, making it a very long page. On desktop, it does much the same, loading the page with a screen size of 1024 x 9,307 pixels.
This is particularly useful if you have a page that loads more content on scroll, such as blog posts or product pages. The good news is that if the checks below do not reveal this content, you can rest assured that including these posts or pages in your XML sitemap will ensure they are discovered and indexed.
GSC’s URL Inspection Tool can help you understand if Google can properly render your page’s content.
1. Select a page to test and enter its URL in GSC’s search bar.
2. Click the “View crawled page” button.
3. Search for your tabbed content or content that is meant to load on scroll or with a button click.
4. You can also click “Test live URL” and then “View tested page” and repeat the above steps to see if your content appears in the rendered HTML.
There are many ways you can use Screaming Frog’s SEO Spider to check this content:
Compare static HTML to rendered HTML: First, enable the tool to capture HTML in the “Configuration > Spider > Extraction” settings by selecting both “Store HTML” and “Store Rendered HTML,” then compare the differences between the static HTML source and the JavaScript-rendered HTML.
(source)
Identify JavaScript links: Select “Contains JavaScript Links” and sort by the “Unique JS Inlinks” or “Unique JS Outlinks” columns to find links that only appear in the JavaScript-rendered HTML rather than in the raw HTML.
Identify JavaScript content: View the “JavaScript” results tab and select “Contains JavaScript Content.” Sort by the “JS Word Count %” column to see which pages have content that is only found in the JavaScript-rendered HTML.
While both of the above may look like issues if many links are identified or pages show high percentages, remember that Google can render content effectively, as previously outlined. It’s best to use GSC’s URL Inspection Tool to check these pages manually so you can see exactly what Googlebot sees, and to make sure any of this linked content also appears in your XML sitemap, as outlined next.
While it’s our last best practice in the list, this is undoubtedly one of the most important things you can do to ensure all of your pages are properly indexed. A dynamic XML sitemap solution that automatically maintains your URLs will ensure your site’s footprint is up-to-date in organic search results.
Here’s which values to include for each URL in your XML sitemap:
Include <loc> and <lastmod> values only for each URL.
Don’t include <priority> and <changefreq> values, as Google ignores these.
Here’s Google’s example to show which values to include in your XML sitemap:
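The example in Google’s documentation follows this shape; the URL and date below are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/foo.html</loc>
    <lastmod>2024-06-04</lastmod>
  </url>
</urlset>
```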
Here are some additional optimization steps for dynamic sitemaps from our XML sitemap guidance (a sketch of this exclusion logic follows the list):
The XML sitemap file should contain all indexable pages.
If a page is set to “noindex” with the meta robots page-level element, then the URL should be automatically excluded from the XML sitemap.
If a page is set to redirect (with a 301 or 302 status code), then the URL should be automatically excluded from the XML sitemap.
If a page is “canonicalized,” meaning the page specifies a canonical link element different from the page’s URL, then the URL should be automatically excluded from the XML sitemap.
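Here is a minimal sketch of that exclusion logic in JavaScript; the page objects and their fields are hypothetical and not tied to any specific CMS or framework:

```javascript
// Decide whether a page belongs in the XML sitemap.
function isIndexable(page) {
  if (page.noindex) return false; // excluded: page-level meta robots "noindex"
  if (page.statusCode === 301 || page.statusCode === 302) return false; // excluded: redirects
  if (page.canonicalUrl && page.canonicalUrl !== page.url) return false; // excluded: canonicalized elsewhere
  return true;
}

// Build the sitemap with <loc> and <lastmod> values only.
function buildSitemap(pages) {
  const entries = pages
    .filter(isIndexable)
    .map((p) => `  <url>\n    <loc>${p.url}</loc>\n    <lastmod>${p.lastModified}</lastmod>\n  </url>`)
    .join('\n');

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    '</urlset>',
  ].join('\n');
}
```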
If you haven’t already, submit your XML sitemap to Google in GSC. Once your XML sitemap is processed successfully, click “See page indexing” and review all issues under “Why pages aren’t indexed.” These should be considered high priority issues to investigate as they might carry implications for indexation of important content or issues with the dynamic functionality of your XML sitemap, or both.
Ensure in the “Crawl Config” settings that you have selected all the options in the XML Sitemaps section and listed all of your XML sitemaps to crawl.
Once the crawl is complete, navigate to the “Sitemaps” tab and troubleshoot any issues surfaced in the various filters. There may be legitimate reasons why certain URLs don’t appear in your XML sitemap, but if you’re not sure why a filter is populated, treat it as a critical issue to investigate thoroughly, particularly if it’s happening at scale, as it may point to systemic problems with the dynamic functionality and optimization of your XML sitemap.
So, what should you take away from this post? That JavaScript is the web’s most popular programming language, used on over 95% of websites. More importantly for SEO, it can have a monumental impact on the indexation and visibility of your content, and thus your performance in organic search.
There are several best practices to ensure your JavaScript is SEO friendly. Like assembling a pizza pie, you want to start with a solid base for your webpages, which, in our case, is a static HTML layer with all your most important elements included. Then you can layer in your fancy toppings powered by JavaScript to enhance the experience for users. Finally, make sure your XML sitemap is dynamic and follows SEO best practices to ensure visibility and indexation of your content.