What is an Orphan Page?
An orphan page is any page on a website that has no internal links pointing to it. Because crawlers and users rely on internal links to move through your website, they cannot reach an orphan page by navigating the site.
Site audit reports often display orphan pages with a “notice” tag instead of an “error” tag, since websites sometimes leave landing pages unlinked on purpose.
Search engines struggle to find orphan pages, which is why it is crucial to analyze your website to detect them. This comes down to the two ways Google discovers new webpages on a website:
- Crawlers read the URLs of webpages listed in your website’s XML sitemap
- Crawlers follow a link to the URL from another page, either on your website or on an external one
If you need a webpage to be discovered and indexed by search engines, you must find the orphan pages on your website and take the necessary measures.
Are Orphan Pages an SEO issue?
If a search engine isn’t able to find a page through links, it usually goes unindexed. Even if an orphan page is listed in your website’s XML sitemap, it can still be an SEO issue for the following reasons:
- Orphan pages might contain old and outdated content that might reduce your domain authority.
- Most commonly, pages get orphaned during website migration. This is a problem because the orphaned pages might contain valuable content that could help boost your rankings.
- A large number of orphan pages on your website can send mixed signals to search engines about the context of your content, and might lower your rankings on SERPs.
Orphan Pages vs. Dead End Pages
It is important to clarify that dead-end pages and orphan pages are two different things.
Orphan pages have no links pointing to them from any other page, which is why they are termed “orphan”. Dead-end pages, by contrast, contain no links to external websites or internal pages for crawlers or users to follow. This creates a “dead end”, hence the name.
When a user reaches a dead-end page, they have two options: abandon the website or hit back. Similarly, search engine crawlers have nowhere to go from a dead-end page, so the page cannot pass link equity on to other pages.
While it is easy to remedy a dead-end page by adding links to its content, sidebar, or footer navigation, orphan pages work differently. Let’s see how to find orphan pages and fix them.
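To make the distinction concrete, here is a minimal Python sketch. It assumes you already have a map of which pages link to which; the `links` dictionary and the paths in it are hypothetical. A page with no inbound links is treated as an orphan page, and a page with no outbound links as a dead-end page.

```python
def classify_pages(links):
    """Split pages into orphan pages and dead-end pages.

    `links` maps each known page to the list of pages it links to.
    Pages that only appear as link targets are treated as having
    no outbound links.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    # Pages that receive at least one link from a *different* page.
    has_inbound = {
        target
        for source, targets in links.items()
        for target in targets
        if target != source
    }

    orphans = sorted(page for page in pages if page not in has_inbound)
    dead_ends = sorted(page for page in pages if not links.get(page))
    return orphans, dead_ends


# Hypothetical site structure for illustration only.
site = {
    "/home": ["/about", "/blog"],
    "/about": ["/home"],
    "/blog": [],               # links nowhere: dead end
    "/old-landing": ["/home"],  # nothing links here: orphan
}

orphans, dead_ends = classify_pages(site)
print("Orphan pages:", orphans)
print("Dead-end pages:", dead_ends)
```

Note that a home page with no inbound links would also be reported as an orphan by this sketch, so in practice you would likely exclude your crawl entry point.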
How to find Orphan Pages on a Website?
Get a list of your website URLs
It’s a tedious and sometimes impossible task for crawlers to find orphan pages. Hence, relying on an SEO tool alone can be difficult, since such tools operate on data collected by crawlers.
The best way to find an orphan page is to collect a list of all the URLs on your website through a Google Analytics report. You can also use any other preferred analytics package for this.
If the page has ever been visited, it will show up in the Analytics report. There is a record of the URL somewhere, and you can easily detect it in the report if you check the pageviews section.
Resolve Page Duplicate issues
The most common cause of pages becoming orphaned might not even be something you would think about. Page duplication is a commonly overlooked issue that should be dealt with immediately. Every duplicate version of a page should redirect to a single URL. If it does not, the other versions of that page will not be linked from anywhere, and they can end up as orphan pages.
In this situation, the fact that these pages are duplicates is the main issue. This is the first thing that you should check when looking for orphan pages on your website while conducting a site audit. There are two types of page duplicates you should keep an eye on:
1. Non Canonical Pages
Every page on your website should consistently use one protocol (https or http) and one host form (www or non-www) in its URL.
Hence, you must check each of your public pages by typing all the variations of its URL into the browser. For a site on the placeholder domain xyz.com, these would be:
- http://xyz.com
- https://xyz.com
- http://www.xyz.com
- https://www.xyz.com
All these variations should take users to the exact same page with a consistent URL. This makes the webpages canonical to themselves. If any variation does not redirect to the desired webpage, the problem may be widespread, so test that same variation on your other webpages as well.
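If you have many pages to check, listing the variations by hand gets tedious. The following Python sketch generates the four protocol and host variations for any URL so you can test each one. The function name `url_variants` and the sample URL are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def url_variants(url):
    """Return the http/https and www/non-www variants of a URL."""
    parts = urlsplit(url)
    host = parts.netloc
    # Strip a leading "www." so both host forms can be rebuilt.
    bare_host = host[4:] if host.startswith("www.") else host

    variants = []
    for scheme in ("http", "https"):
        for netloc in (bare_host, "www." + bare_host):
            variants.append(
                urlunsplit((scheme, netloc, parts.path, parts.query, ""))
            )
    return variants


# Hypothetical page used only as an example.
for variant in url_variants("https://www.xyz.com/about"):
    print(variant)
```

Each printed URL can then be opened in a browser (or requested with an HTTP client) to confirm that it ends up at the same final address.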
2. Trailing Slashes
This is another minor thing to look out for, which can make a major impact. If your website isn’t using trailing slashes consistently, it might end up orphaning some of your pages. Let’s take another example, using a hypothetical page:
- https://xyz.com/blog
- https://xyz.com/blog/
These URLs may provide the same content to the users, but their URLs are different from each other.
Check your webpages with both of these variations to see if they redirect users to the same page. Ensure that all of your webpages do this consistently. You can also enforce this with redirect rules in “.htaccess” so that all these variations redirect users to the same URL.
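When you later compare URL lists, it helps to treat the slash and non-slash versions as one page. Here is a small Python sketch that normalizes URLs to the no-trailing-slash form. That convention is an assumption made for the example; your website may prefer the opposite one. The function name is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_trailing_slash(url):
    """Return the URL with trailing slashes removed from its path.

    The root path is kept as "/" so the home page stays valid.
    """
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


# Both versions of this hypothetical page normalize to the same URL.
print(strip_trailing_slash("https://xyz.com/blog/"))
print(strip_trailing_slash("https://xyz.com/blog"))
```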
Compare the list of Crawlable URLs and Analytics URLs using Google Analytics
If you’re wondering how to find orphan pages on a website, this is the easiest way. You just need to collect all the URLs of your website by visiting the “Site Content” section, and clicking on “All Pages”.
The list will be displayed with the following sections:
- Page (URL)
- Unique Pageviews
- Average Time on Page
- Date Range
Focus on the Date Range and Pageviews settings to separate the normal pages from the orphan pages.
Since orphan pages are not reachable through site navigation, they tend to have the lowest pageviews. Click on “Pageviews” to sort the lowest-viewed pages to the top; your orphan pages will most likely be among them.
Another option is to click on “Date Range” and set the start date back to when Google Analytics was first set up on your website. Google Analytics can only show you up to 5,000 URLs at a time, so select the highest number of rows in the “Show Rows” section at the bottom. In all likelihood, all your orphan pages will be covered by this.
Once all the URLs have finished loading in Google Analytics, click on export to get a CSV or Excel file of your URLs. You can even use the Google Analytics API here to speed up the process.
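Once you have the export, the same lowest-pageviews sort can be done outside the Analytics interface. The Python sketch below reads a CSV and returns the least-viewed pages. The column names `Page` and `Pageviews` and the sample rows are assumptions; a real export may use different headers.

```python
import csv
import io

# Hypothetical export contents; real column names may differ.
EXPORT = """Page,Pageviews
/,1200
/pricing,340
/old-landing,2
/blog/post-1,57
"""


def lowest_viewed(csv_text, limit=2):
    """Return the `limit` pages with the fewest pageviews, lowest first."""
    rows = [
        (row["Page"], int(row["Pageviews"].replace(",", "")))
        for row in csv.DictReader(io.StringIO(csv_text))
    ]
    rows.sort(key=lambda item: item[1])
    return rows[:limit]


for page, views in lowest_viewed(EXPORT):
    print(page, views)
```

Low pageviews alone do not prove a page is orphaned, so treat the result as a shortlist to check against your crawl data.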
Once you have this list, place it in a spreadsheet next to the list of crawlable URLs, with one column for Crawlable URLs and one column for Analytics URLs.
Next, you need to identify the orphan URLs by comparing the list of Analytics URLs with the Crawlable URLs. For example, if the Analytics column contains “https://xyz.com/7” but the Crawlable column does not, that URL is an obvious orphan page. In reality, this list is going to be very long, and you will have to sift through many more URLs to find the orphan pages.
You can easily automate this mechanical process. Use a MATCH formula to automatically check whether each URL in the Analytics list is also in the Crawlable list. Assuming, for illustration, that the Crawlable URLs sit in cells A2 to A100 and the first Analytics URL is in B2, the formula would take the form =MATCH(B2, $A$2:$A$100, 0).
The dollar signs tell the sheet not to change the range when the formula is dragged down the column. The value “0” tells Google Sheets to look for an exact match, which is needed when the list is not sorted.
After running this formula, each match returns its position within the range. The URLs that don’t match return a “#N/A” error since they were not found in the Crawlable List column. So, in our example, “https://xyz.com/7” would be shown with “#N/A”.
This flags all the orphan pages in the list automatically. You just need to filter the column for the #N/A results to collect them.
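If your lists are too long for a spreadsheet, the same comparison can be sketched in Python. The function below returns every Analytics URL that is missing from the Crawlable list, which is the equivalent of the #N/A rows. Apart from “https://xyz.com/7”, the sample URLs are hypothetical.

```python
def find_orphan_urls(crawlable_urls, analytics_urls):
    """Return Analytics URLs that do not appear in the crawlable list."""
    crawlable = set(crawlable_urls)  # set lookup keeps long lists fast
    return [url for url in analytics_urls if url not in crawlable]


# Hypothetical lists: the crawler found pages 1-6, Analytics recorded 1-7.
crawlable = [f"https://xyz.com/{n}" for n in range(1, 7)]
analytics = [f"https://xyz.com/{n}" for n in range(1, 8)]

print(find_orphan_urls(crawlable, analytics))
```

This does an exact string comparison, so it works best after both lists have been normalized for protocol, www, and trailing slashes.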
Take the help of other tools to discover your Orphan URLs
Once you know how to find orphan pages on the website, there are a multitude of tools out there that can help you streamline the process.
The following tools provide the best configurations and functions for this purpose:
- Ahrefs
- Moz Link Explorer
- SEMrush
- Raven Tools
All of these tools come with a plethora of features that help with much more than finding orphan pages. Of these, Ahrefs, Moz, and SEMrush provide specific tools that can help you discover orphaned pages a lot faster.
Another advantage is that these tools will also find some pages on your website that were not being crawled directly and are not necessarily orphan pages. This can help you in fixing these pages and generating value from them.
Your development team can conveniently assemble a list of URLs of your entire website from the server. All you need to do is take a peek into the log files to find information about:
- Who visits your website
- Where they visit the website from
- Which pages they visit
This information can significantly help you in performing a second crawl of your complete website. Run this crawl with directives like “noindex” and “nofollow” ignored, then compare the new data with the original crawl data to find the orphaned pages you missed. The reason is that some pages can only be reached by crawlers that ignore these directives, so a crawl that obeys them never sees those pages.
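As a rough illustration of the log-file step, the Python sketch below pulls the successfully requested paths out of access log lines and subtracts the paths your crawler already found. It assumes the server writes the Common Log Format; the sample lines, IP addresses, and paths are made up for the example.

```python
import re

# Matches the request and status fields of a Common Log Format line.
REQUEST = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def requested_paths(log_lines):
    """Return the set of paths that were served with status 200."""
    paths = set()
    for line in log_lines:
        match = REQUEST.search(line)
        if match and match.group("status") == "200":
            # Drop the query string so /page?x=1 and /page count as one.
            paths.add(match.group("path").split("?", 1)[0])
    return paths


# Hypothetical log lines for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /old-landing?ref=mail HTTP/1.1" 200 512',
    '203.0.113.9 - - [10/Oct/2023:13:56:07 +0000] "GET /missing HTTP/1.1" 404 128',
]

crawled_paths = {"/pricing"}  # paths your crawler already found
print(sorted(requested_paths(SAMPLE_LOG) - crawled_paths))
```

Paths that visitors requested but the crawler never reached are candidates for orphan pages and are worth checking by hand.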
After completing this activity, take a look at GSC’s Search Analytics report to get another list of URLs. You might be wondering why, since these URLs have already been indexed. They have, but some of these pages still might not be reachable through the internal links of your website. Such pages have a high potential of becoming orphan pages in the future, and you can fix that before it happens.
Fixing Orphan Pages – Get Ahead in the Game
Orphan pages can become a big issue for your website, especially in terms of SEO. Now that you know how to find orphan pages, let’s take a look at the next step – fixing them.
Once you have identified all the orphan pages on your website, your next step should be assessing which ones are worth resolving and which ones should be removed. Here are the questions you should ask to make this decision:
- Where does the page currently exist in the taxonomy of your website?
- Is the page offering value to the users? If yes, where should it be integrated within your website architecture?
- Can the page rank for any keywords? Can it be optimized to boost the SEO of your website?
- Is there scope for the page to earn backlinks? Does it have the potential to be linked externally from different sources?
- Is the content on the page very similar to any of the other pages?
The answers to these questions will help you in making a definitive decision about keeping or removing the orphan pages. You can also use this information to identify the amount of work needed to fix the pages you keep and what kind of value you see coming out of them.
This might seem like a tedious task but taking the help of a professional SEO company like Infidigit can take you a long way in optimizing this process. Infidigit offers comprehensive SEO audit services with a plethora of factors, which helps clients in seamlessly identifying and fixing their orphan pages. Get in touch today to learn more.