Every Web page on the Internet returns a HyperText Transfer Protocol (HTTP) response code when a browser or search-engine crawler requests it. The response code tells us the status of the page. There is a long list of HTTP status codes, and each one denotes a different condition. In this blog post, we explain the difference between 404 and soft 404 errors. Read on to find out!
What is a 404 Error?
A 404 error or status code simply denotes that the requested Web page could not be found or that the page is no longer available.
Generally, 404 errors occur for one of two reasons:
An error in the URL
An error in the URL can occur when a user types the address incorrectly or when a page links to an incorrect URL. Incorrect URLs serve a 404 status code because they point to a page that doesn’t exist at all.
A removed page
A page that used to exist but was taken down from the server, intentionally or unintentionally, will serve a 404 status code.
How to Fix 404 Errors
The first step is to identify every URL on the website that a search engine can discover and that returns a 404 error. You can do this in two ways: check the Coverage report in Google Search Console, and run a crawling tool such as DeepCrawl or Screaming Frog over your website. Once you have compiled a list of such URLs, work out whether each error is caused by a mistake in the URL or by a page that was taken down. This step is very important because it tells you what has to be fixed.
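If you already have a list of URLs (from a sitemap export, for instance), a quick check can be scripted. The snippet below is a minimal sketch, not a replacement for a crawler: the names `fetch_status` and `find_404s` are our own, redirects are followed automatically, and connection failures are not handled.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url (redirects are followed)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is on the exception.
        return error.code


def find_404s(urls, fetch=fetch_status):
    """Return the URLs whose status code is 404, in their original order."""
    return [url for url in urls if fetch(url) == 404]
```

Calling `find_404s(["https://www.example.com/category/product-name"])` would request the URL and return it only if the server answers with a 404.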
Fixing Linking Errors
Some of your Web pages may be linking to an incorrect URL by mistake. Finding broken links across a whole website by hand is tedious, so use a crawling tool such as DeepCrawl or Screaming Frog to identify them. Once they are identified, the linking errors are easy to fix.
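To give a rough idea of what such a tool does on each page, here is a small sketch that collects the links from a page's HTML using only the Python standard library. The class and function names are hypothetical; each collected link would then be requested to see whether it returns a 404.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links such as "/category" into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs linked from the given HTML."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```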
Fixing Missing Pages
In case some pages were removed from the website, either erroneously or intentionally because they no longer serve a purpose, they can be fixed in two ways:
1) Restore the pages
If you think that an important page was removed by mistake, you should restore the page and submit it in Google Search Console for re-indexing. Alternatively, you could update the sitemap. Once this is done, start validating the 404 URLs in Search Console.
2) Redirect to the most relevant page
If the pages that throw the 404 error are no longer important, redirect them to the most relevant page on your website. Let’s say you have an e-commerce website. If a product is no longer available and its page has been removed, the product URL should be redirected to the product’s category page.
For example, the 404 product URL (https://www.example.com/category/product-name) should be redirected to the category URL (https://www.example.com/category) and not the homepage.
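If your URLs follow the same structure as this example, where the category is the parent path of the product, the redirect target can be derived automatically. This is only a sketch under that assumption; `category_url` is a name we made up, and a real site may need a lookup table instead.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def category_url(product_url):
    """Return the parent path of a product URL as the redirect target.

    Assumes URLs shaped like /category/product-name. A product that sits
    directly under the root falls back to "/", i.e. the homepage.
    """
    parts = urlsplit(product_url)
    parent = posixpath.dirname(parts.path.rstrip("/"))
    return urlunsplit((parts.scheme, parts.netloc, parent or "/", "", ""))
```

For the example above, `category_url("https://www.example.com/category/product-name")` gives `https://www.example.com/category`.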
What Is A Soft 404 Error?
A soft 404 is a page that is missing from the server but serves a 200 status code instead of a 404 when requested. The 200 tells search engine crawlers that the page exists, so they keep crawling it even though there is nothing there. In the worst-case scenario, such pages might get indexed as well. This should be strictly avoided because it wastes crawl budget.
Note that a soft 404 is not an official HTTP response code sent by a server. It is a label that Google applies to Web pages it has discovered. You can find the soft 404 pages on your website in Google Search Console’s Coverage report.
Soft 404s occur for the following reasons:
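One common way to test a site for this behaviour is to request a URL that cannot possibly exist and see what comes back. The sketch below is an assumption-laden illustration: `returns_200_for_missing_pages` is our own name, and `fetch` stands for any function you supply that takes a URL and returns its status code.

```python
import uuid
from urllib.parse import urljoin


def returns_200_for_missing_pages(site_url, fetch):
    """Probe a random, non-existent path and report whether it serves 200.

    fetch is a caller-supplied function: fetch(url) -> HTTP status code.
    """
    # A random 32-character hex path is extremely unlikely to be a real page.
    probe_url = urljoin(site_url, "/" + uuid.uuid4().hex)
    return fetch(probe_url) == 200
```

If this returns `True`, the server is answering 200 for a page that does not exist.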
Poor server configuration
When a server is poorly configured, even missing pages serve a 200 status code, which misleads crawlers. Servers should be configured so that missing pages always serve a 404 status code when requested.
Pages with very little or no content
Sometimes live pages with very little or no content are also flagged as soft 404s, because the lack of content suggests to Google that the page has no value and is probably a 404. As Google cannot be sure, it categorises such pages as soft 404s.
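What "correctly configured" means depends on your server software, but the principle is the same everywhere. Here is a minimal sketch using Python's built-in `http.server`; the `PAGES` dictionary and the function names are hypothetical stand-ins for a real site.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content, keyed by path.
PAGES = {"/": "<h1>Home</h1>", "/category": "<h1>Category</h1>"}


def resolve(path):
    """Return (status, html): 200 for known pages, 404 for everything else."""
    if path in PAGES:
        return 200, PAGES[path]
    # The visitor still sees a friendly page, but the status code is honest.
    return 404, "<h1>Page not found</h1>"


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path would include any query string; this sketch ignores that.
        status, html = resolve(self.path)
        payload = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Run the sketch locally until interrupted."""
    HTTPServer(("127.0.0.1", port), PageHandler).serve_forever()
```

The mistake that produces a soft 404 would be sending the "Page not found" HTML with status 200.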
Issues with page rendering
If your rendered page is blank or nearly blank, it is likely that Googlebot is not able to load the page’s resources. This can happen if the resources are very large or if crawlers are blocked from accessing them. Such pages are also marked as soft 404s since Google is not sure whether they are actually 404 pages or not.
How To Fix Soft 404 Errors
To start with, extract all the URLs from the Soft 404 section of Search Console’s Coverage report. Run them through a crawling tool and identify the URLs that genuinely return a 404. Fix these URLs with the methods described in the previous section. You can then fix the remaining soft 404s with these steps:
Serve correct status codes
As the title suggests, ensure that your server returns the correct status code for every URL. A valid page should serve a 200 status code, a missing page should serve a 404, and a redirected page should serve either a 301 or a 302. Do not mislead Googlebot!
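One way to keep an eye on this is to write down the status code you expect for a handful of URLs and compare it with what the server actually returns. The sketch below uses `http.client` because it does not follow redirects, so a 301 or 302 is reported as such. The function names are ours and error handling is omitted.

```python
import http.client
from urllib.parse import urlsplit


def raw_status(url, timeout=10.0):
    """Return the status code of the first response, without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        connection = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        connection.request("GET", path)
        return connection.getresponse().status
    finally:
        connection.close()


def audit_status_codes(expected, fetch=raw_status):
    """Return {url: (expected, actual)} for every URL whose status differs."""
    mismatches = {}
    for url, wanted in expected.items():
        actual = fetch(url)
        if actual != wanted:
            mismatches[url] = (wanted, actual)
    return mismatches
```

An empty result from `audit_status_codes` means every listed URL returned the status you expected.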
Find and fix pages with duplicate or thin content
You can run the soft 404 URLs through the Screaming Frog tool and extract the word count of the content on each page. This will give you an idea of pages with thin content on your website. Screaming Frog also helps you identify pages with Near Duplicate and Exact Duplicate content. This should help you fix the duplicate content. You can either consolidate pages with similar topics together into a single page or add unique content to them.
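If you want a rough word count without a dedicated tool, the idea can be sketched in a few lines. This is an approximation and will not match Screaming Frog's numbers exactly; the 200-word threshold is an arbitrary assumption of ours, not a figure published by Google.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def word_count(html):
    """Count whitespace-separated words in the visible text of a page."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return len(" ".join(extractor.chunks).split())


def thin_pages(pages, min_words=200):
    """Return the URLs in {url: html} whose word count is below min_words."""
    return [url for url, html in pages.items() if word_count(html) < min_words]
```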
Along with this, you should also look at technical issues that cause duplication. Check whether the variants of each URL (with or without a trailing slash, www or non-www, https or http, with or without “.html”) resolve to the main version of the URL. If they do not, check that proper canonicals are defined for the duplicate URLs. Left unaddressed, these issues can cause large-scale duplication.
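To check this systematically, it helps to generate every variant of a URL and then test each one. The following sketch only builds the list of variants; `url_variants` is a hypothetical helper, and you would still need to request each variant and confirm that it redirects to the main version or declares it as canonical.

```python
from urllib.parse import urlsplit, urlunsplit


def url_variants(canonical_url):
    """Return the common duplicate forms of a URL, excluding the URL itself.

    Covers http/https, www/non-www, trailing slash and ".html" suffix.
    Query strings and fragments are dropped in this sketch.
    """
    parts = urlsplit(canonical_url)
    host = parts.netloc[4:] if parts.netloc.startswith("www.") else parts.netloc
    path = parts.path.rstrip("/")
    if path.endswith(".html"):
        path = path[: -len(".html")]
    variants = set()
    for scheme in ("http", "https"):
        for netloc in (host, "www." + host):
            for suffix in ("", "/", ".html"):
                if not path and suffix == ".html":
                    continue  # "example.com.html" is not a meaningful variant
                variants.add(urlunsplit((scheme, netloc, path + suffix, "", "")))
    variants.discard(canonical_url)
    return sorted(variants)
```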
Ensure Googlebot is able to render your pages
If the pages have enough content and are still marked as soft 404s, it is possible that crawlers are not able to crawl or render them properly. For such URLs, check the rendered screenshot and HTML in Search Console. If the screenshot is blank or nearly blank, the pages almost certainly have a rendering problem. You can analyse the rendered HTML to find out which resources are causing issues. Do not block crawlers from any resources the page needs, and make sure those resources are not excessively large.
A soft 404 error is not the same as a 404 page. It is an indication that something is wrong with the page and that crawlers do not consider it a legitimate page. However, just as with 404 pages, if you do not fix soft 404 errors quickly, Google might start deindexing the affected pages, which will hurt your website traffic if they are important. The best practice is to run your website through a crawling tool regularly and check for 404 pages and thin pages. Having access to a crawling tool is essential for fixing these errors, and Screaming Frog is one of the most highly recommended tools available.
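One thing you can check yourself is whether robots.txt blocks any of the resources a page depends on. Below is a small sketch using Python's `urllib.robotparser`. Keep in mind that this parser's matching rules may differ from Googlebot's (wildcard patterns in particular), so treat the result as a first hint only; `blocked_resources` is a name we made up.

```python
from urllib.robotparser import RobotFileParser


def blocked_resources(robots_txt, resource_urls, user_agent="Googlebot"):
    """Return the resource URLs that robots_txt disallows for user_agent.

    robots_txt is the text content of the site's robots.txt file.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in resource_urls if not parser.can_fetch(user_agent, url)]
```

Pass in the contents of your robots.txt file along with the script, stylesheet and image URLs that the page loads.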
Have you encountered these errors? How did you fix them? Let us know in the comments section below.