What is an Orphan Page?
An orphan page is a page on a website that has no internal links pointing to it. Such pages are effectively unreachable, because neither crawlers nor users can follow a link to them while navigating your website.
Because some websites keep their landing pages unlinked on purpose, SEO audit tools frequently flag orphan pages with a “notice” tag rather than an “error” tag.
Search engines are unlikely to find orphan pages, which is why it is critical to check your website for them. This follows from the way Google discovers new pages on a website:
- Crawlers recognise the URLs of pages mentioned in your XML sitemap.
- Crawlers follow links that point to a URL from other pages, whether those links are internal or external.
If you want every webpage to be indexed and found by search engines, you’ll need to look for orphan pages on your site and take the appropriate steps to fix them.
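The first discovery route above, the XML sitemap, is also a convenient source of URLs for your own audit. The following is a minimal sketch in Python that reads the URLs out of a sitemap. It assumes the standard sitemaps.org namespace; the function name and the sample URLs are illustrative, not part of any tool mentioned here.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace used by standard XML sitemaps (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> List[str]:
    """Return every <loc> URL listed in a sitemap <urlset> document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical two-page sitemap, for illustration only.
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://xyz.com/1</loc></url>"
    "<url><loc>https://xyz.com/2</loc></url>"
    "</urlset>"
)

print(sitemap_urls(sample_sitemap))
```

A sitemap index file (one that lists other sitemaps) would need an extra step that this sketch does not cover.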
Are Orphan Pages an SEO issue?
When a search engine can’t reach a page through links, the page is usually ignored. Even if the page is included in your website’s XML sitemap, it can still cause SEO problems:
- Orphan pages may contain out-of-date information, which can lower your domain authority.
- Pages are frequently orphaned during website migrations. This is a problem because orphan pages may have useful content that could help you improve your rankings.
- A large number of orphan pages can confuse search engines about the context of your content, which can lower your SERP rankings.
Orphan Pages vs. Dead End Pages
It’s crucial to understand the difference between dead-end pages and orphan pages.
The term “orphan” refers to pages that no other page links to, so they cannot be reached from elsewhere on the site. Dead-end pages, on the other hand, can be reached but contain no links to other internal or external pages for crawlers or people to follow. The visitor arrives at a “dead end”, hence the term.
When users hit a dead-end page, they have two choices: leave the website or go back. Search engine crawlers, likewise, cannot pass on any link equity, because they have nowhere to go from a dead-end page.
While any dead-end page can be fixed simply by adding links to the content or adding sidebar or footer navigation, orphan pages are different. Let’s look at how to locate and fix orphan pages.
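The distinction is easy to see on a small link graph. The sketch below is a simplified illustration in Python: it assumes you already have a mapping from each page to the pages it links to (the `site` dictionary and its paths are made up), and it treats the homepage as a known entry point rather than an orphan.

```python
from typing import Dict, List, Set


def classify_pages(links: Dict[str, List[str]], home: str = "/") -> Dict[str, Set[str]]:
    """Split pages into orphans (no inbound links) and dead ends (no outbound links).

    `links` maps each page to the list of pages it links to.
    """
    # Every page we know about, whether it appears as a source or a target.
    all_pages = set(links)
    for targets in links.values():
        all_pages.update(targets)

    # Pages that receive at least one link from a different page.
    linked_to = {
        target
        for source, targets in links.items()
        for target in targets
        if target != source
    }

    orphans = all_pages - linked_to - {home}
    dead_ends = {page for page in all_pages if not links.get(page)}
    return {"orphans": orphans, "dead_ends": dead_ends}


# Hypothetical site structure, for illustration only.
site = {
    "/": ["/blog", "/about"],
    "/blog": ["/", "/post-1"],
    "/about": ["/"],
    "/post-1": [],            # reachable from /blog, but links nowhere
    "/old-landing": ["/"],    # links out, but nothing links to it
}

result = classify_pages(site)
print(result["orphans"])    # {'/old-landing'}
print(result["dead_ends"])  # {'/post-1'}
```

Note that a crawler starting from the homepage would never build the `/old-landing` entry on its own, which is why the URL list has to come from somewhere other than a crawl.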
How to find Orphan Pages on a Website?
Get a list of your website URLs
Crawlers find orphan pages only with great difficulty, and sometimes not at all, because they discover pages by following links. SEO tools that rely solely on crawl data therefore struggle to surface them.
The most reliable way to detect orphan pages is to use a Google Analytics report to compile a list of all the URLs on your website. Any other analytics software you prefer works just as well.
A page appears in the Analytics report if it has ever been viewed (provided the tracking code is installed on it). Its URL is therefore on record, and you can find it in the pageviews section of the report.
Resolve Page Duplication Issues
One of the most common causes of orphaned pages is one you might not even consider: page duplication. It is frequently neglected and should be addressed right away. All duplicate versions of a page should redirect to a single URL. If they do not, the stray versions will almost certainly have no links pointing to them, and they may become orphan pages.
The fact that these pages are duplicates is the fundamental concern in this circumstance. When looking for orphan pages on your website as part of a site audit, this should be the first place you examine. There are two types of page duplication to watch out for:
1. Non-Canonical Pages
Every page on your website should consistently use one protocol (http or https) and one host form (www or non-www).
To verify this, check each of your public pages by entering every variation of its URL in the browser: http and https, each with and without www.
All of these versions should lead visitors to the same page, with the same URL, so that each page is canonical to itself. If any variation fails to redirect to the correct page, you are probably dealing with a site-wide issue, so check your other pages for that same variation as well.
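Typing every variation by hand gets tedious quickly. As a rough sketch, the four protocol and host combinations for a page can be generated in Python with the standard library. The function name and the example domain are illustrative; the code only builds the list and does not request the URLs, so checking where each one redirects is still up to you or a separate script.

```python
from typing import List
from urllib.parse import urlsplit, urlunsplit


def url_variants(url: str) -> List[str]:
    """Return the http/https and www/non-www variants of a URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Strip a leading "www." so both host forms can be rebuilt.
    bare_host = host[4:] if host.startswith("www.") else host

    variants = []
    for scheme in ("http", "https"):
        for netloc in (bare_host, "www." + bare_host):
            variants.append(
                urlunsplit((scheme, netloc, parts.path, parts.query, ""))
            )
    return variants


for variant in url_variants("https://www.example.com/pricing"):
    print(variant)
# http://example.com/pricing
# http://www.example.com/pricing
# https://example.com/pricing
# https://www.example.com/pricing
```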
2. Trailing Slashes
This is another small detail that can have a significant impact. If you don’t use trailing slashes consistently on your website, some of your pages may become orphaned. Consider the same page address written once with a trailing slash and once without.
The two versions may deliver the same content to visitors, yet they are distinct URLs.
Check both versions of each page to confirm that users end up at the same URL, and make sure this holds uniformly across your website. On an Apache server, redirect rules in the “.htaccess” file can enforce this automatically, so that all variations lead to the same URL.
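If you already have a list of URLs, a few lines of Python can flag the pairs that differ only by a trailing slash. This is a minimal sketch under the assumption that the URLs carry no query strings; the function name and sample URLs are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def trailing_slash_duplicates(urls: List[str]) -> Dict[str, List[str]]:
    """Group URLs that are identical apart from trailing slashes.

    Only groups containing more than one distinct URL are returned.
    """
    groups = defaultdict(set)
    for url in urls:
        # rstrip removes every trailing "/" so both forms share one key.
        groups[url.rstrip("/")].add(url)
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


urls = [
    "https://example.com/blog",
    "https://example.com/blog/",
    "https://example.com/about/",
]

print(trailing_slash_duplicates(urls))
# {'https://example.com/blog': ['https://example.com/blog', 'https://example.com/blog/']}
```

Each group it returns is a candidate for a redirect to a single preferred version.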
Compare the list of Crawlable URLs and Analytics URLs using Google Analytics
This is the most straightforward method for locating orphan pages on a website. In Google Analytics, go to the “Site Content” area (under “Behavior”) and click “All Pages” to collect all of your website’s URLs.
The following sections will appear in the list:
- Page (URL)
- Unique Pageviews
- Average Time on Page
- Date Range
To distinguish between normal and orphan pages, pay attention to the Date Range and Pageviews sections.
Orphan pages tend to have the lowest pageviews because users cannot navigate to them. Click the “Pageviews” column header to sort the least-visited pages to the top, and your orphan pages will most likely be among them.
Another option is to select “Date Range” and set the start date as far back as the day Google Analytics was installed. Google Analytics can display only 5,000 URLs at a time, so choose the highest number of rows in the “Show Rows” area at the bottom. In all likelihood, this will cover all of your orphan pages.
After all of your URLs have loaded in Google Analytics, click export to get a CSV or Excel file of them. You can also use the Google Analytics API to speed things up.
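Once you have the export, the same “sort by fewest pageviews” step can be repeated offline. The sketch below assumes a CSV with columns named “Page” and “Pageviews”; those headers, the function name, and the sample rows are assumptions, and a real export may include extra header lines that need trimming first.

```python
import csv
import io
from typing import List, Tuple


def least_viewed_pages(csv_text: str, limit: int = 10) -> List[Tuple[str, int]]:
    """Return the `limit` pages with the fewest pageviews, lowest first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        # Exports often format numbers with thousands separators.
        views = int(row["Pageviews"].replace(",", ""))
        rows.append((row["Page"], views))
    rows.sort(key=lambda item: item[1])
    return rows[:limit]


# Hypothetical export, for illustration only.
sample_export = """Page,Pageviews
/,5400
/blog,"1,250"
/old-landing,3
/pricing,870
"""

print(least_viewed_pages(sample_export, limit=2))
# [('/old-landing', 3), ('/pricing', 870)]
```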
Once you have this list, you only need a spreadsheet function or two to compare the Analytics URLs against the crawlable URLs. Picture the two lists side by side in adjacent columns of a sheet.
Identify the orphan URLs by comparing the list of Analytics URLs with the list of Crawlable URLs. For example, if “https://xyz.com/7” appears in the Analytics list but not in the Crawlable list, it is an obvious orphan page. In practice, the lists will be quite large, and you’ll have to search through many more URLs to discover the orphan pages.
This mechanical process is simple to automate. To check whether each URL in the Analytics list also appears in the Crawlable list, use the spreadsheet’s MATCH function, with a formula along these lines (the cell references are illustrative): =MATCH(A2, $B$2:$B$8, 0)
When the formula is dragged down the relevant column, the dollar signs tell the sheet not to change the range. The final value “0” tells the sheet that the range isn’t sorted, so it looks for an exact match.
For each URL that matches, the function returns the position of that URL within the range. URLs that are not found in the Crawlable list are returned with a “#N/A” error. In the example, “https://xyz.com/7” would be displayed with “#N/A”.
This automatically marks all orphan pages in the list for you. All you have to do now is filter the column to show only the #N/A results.
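The same comparison can be sketched outside a spreadsheet. The Python below is one possible version, not the only way to do it: `match_position` imitates the lookup by returning a 1-based position or `None` in place of the spreadsheet’s not-found error, and `find_orphans` keeps the Analytics URLs that are missing from the crawlable list. The function names are hypothetical, and apart from “https://xyz.com/7” the sample URLs are made up.

```python
from typing import List, Optional, Sequence


def match_position(value: str, candidates: Sequence[str]) -> Optional[int]:
    """Return the 1-based position of `value` in `candidates`, or None if absent."""
    try:
        return list(candidates).index(value) + 1
    except ValueError:
        return None


def find_orphans(analytics_urls: Sequence[str], crawlable_urls: Sequence[str]) -> List[str]:
    """Return URLs that analytics has recorded but the crawl never reached."""
    crawlable = set(crawlable_urls)  # set lookup keeps large lists fast
    return [url for url in analytics_urls if url not in crawlable]


analytics_urls = [f"https://xyz.com/{n}" for n in range(1, 8)]  # pages 1 to 7
crawlable_urls = [f"https://xyz.com/{n}" for n in range(1, 7)]  # pages 1 to 6

print(match_position("https://xyz.com/3", crawlable_urls))  # 3
print(match_position("https://xyz.com/7", crawlable_urls))  # None
print(find_orphans(analytics_urls, crawlable_urls))         # ['https://xyz.com/7']
```

Both lists should be normalised the same way (protocol, www, trailing slash) before comparing, otherwise duplicates of a linked page will show up as false orphans.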
Use other tools to discover your Orphan URLs
Once you understand how the process works, a variety of tools can help you locate orphan pages on your website.
Tools with suitable features for this purpose include:
- Moz Link Explorer
- Raven Tools
Apart from discovering orphan pages, these tools offer many other features that can help you with a variety of tasks. Ahrefs, Moz, and SEMrush are three tools that can help you find orphaned pages much more quickly.
Another benefit is that these tools can uncover pages on your website that aren’t being crawled directly but aren’t necessarily orphans. This can help you improve those pages and get more value from them.
Your development team can also quickly compile a list of all of your website’s URLs from the server. Look through the server log files for information on:
- Who visits your website?
- Where do they come from when they arrive?
- Which pages do they visit?
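As a rough illustration, the three questions above map onto three fields of a typical access log entry: the client address, the referrer, and the requested path. The Python sketch below assumes the server writes the widely used “combined” log format; your server’s format may differ, and the sample log lines and function name are made up.

```python
import re
from typing import Dict, Iterable, List

# Pattern for the "combined" access log format (an assumption about your server).
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_log(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Extract visitor, referrer and requested path from each parsable line."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not follow the expected format
        hits.append(
            {
                "visitor": match.group("ip"),
                "referrer": match.group("referrer"),
                "path": match.group("path"),
            }
        )
    return hits


# Hypothetical log lines, for illustration only.
sample_log = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /old-landing HTTP/1.1" '
    '200 2326 "https://www.google.com/" "Mozilla/5.0"',
    '198.51.100.4 - - [10/Oct/2023:13:56:01 +0000] "GET /blog HTTP/1.1" '
    '200 5120 "https://xyz.com/" "Mozilla/5.0"',
    "not a log line",
]

hits = parse_access_log(sample_log)
print(sorted({hit["path"] for hit in hits}))  # ['/blog', '/old-landing']
```

The set of requested paths gives you another URL list to compare against your crawl data.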
This information will help you greatly when you run a second crawl of your entire website. Configure this crawl to ignore directives such as “noindex” and “nofollow”, then compare the new data with the earlier crawl data to locate orphaned pages that were missed. The reason is that a crawler that obeys these directives may skip pages that can only be reached through the affected links, so those pages can look orphaned in the first crawl.
After completing this step, look up the list of URLs in the Search Analytics report in Google Search Console (GSC). You might be wondering whether these URLs have already been indexed. They have, although some of these pages may still be unreachable through your website’s internal links. Such pages are at risk of becoming orphan pages in the future, but you can prevent this from happening.
Fixing Orphan Pages – Get Ahead in the Game
Orphan pages can be a significant problem for your website, particularly in terms of SEO. Now that you know how to find them, the next step is to fix them.
Once you’ve found all of your website’s orphan pages, decide which ones are worth fixing and which ones should be removed. Ask yourself the following questions to make this decision:
- Where does the page currently sit in your website’s taxonomy?
- Is the page useful to the visitors? If so, where in your website’s architecture should it be placed?
- Is it possible for the page to rank for any keywords? Is it possible to optimise it to improve your website’s SEO?
- Is the page likely to attract backlinks, that is, links from other websites?
- Is the content on this page similar to the content on any other page?
The answers to these questions will help you decide whether to keep or delete each orphan page. You can also use them to estimate how much work it will take to fix the pages you keep and how much value they will provide.