Google: Pages with Similar URL Structure May Be Considered Duplicate

Ashwin is an SEO professional who focuses on technical and on-page SEO. He graduated with a B.Sc. in Computer Science. In his free time, he loves to read about science, technology, and business.

Reviewed By

Growth Team

This post is contributed by the Growth Team, dedicated to providing insights and updates on the latest trends and best practices.


Ashwin Kumar


Google’s John Mueller has said that pages with similar URL structures may be treated as duplicates. The topic came up in the Google Search Central hangout on March 5, 2021, when participant Ruchit Patel told Mueller that he manages an event website where thousands of URLs are not being indexed correctly.

Google uses a predictive method to identify duplicate content based on URL patterns on the web while crawling websites. Such assumptions can lead to pages wrongly being tagged as duplicates.

Google needs to crawl the web in order to index content and serve it to users. It uses several methods to make that crawl more efficient, and one of them is predicting from the URL structure whether pages are likely to be duplicates.

If Google finds that several URLs following one pattern carry the same content, it assumes that other URLs following that pattern carry the same content too. This is efficient for Google. For site owners, however, it means unique content may be classified as duplicate simply because it shares a URL pattern with pages that really are duplicates. Such pages may be left out of Google’s index.
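To make the idea concrete, here is a minimal Python sketch of pattern-based prediction. It is an illustration only and not Google's actual system: the helper names, the "{var}" placeholder, the example.com URLs, and the set of learned patterns are all assumptions made for this example.

```python
from urllib.parse import urlparse


def url_pattern(url: str, variable_index: int) -> str:
    """Replace one path segment with a placeholder to get a coarse pattern."""
    segments = urlparse(url).path.strip("/").split("/")
    if variable_index < len(segments):
        segments[variable_index] = "{var}"
    return "/" + "/".join(segments)


# Hypothetical: patterns that earlier crawls found to serve the same content.
learned_duplicate_patterns = {"/events/{var}"}


def predicted_duplicate(url: str, variable_index: int = 1) -> bool:
    """Guess, without fetching the page, whether the URL is likely a duplicate."""
    return url_pattern(url, variable_index) in learned_duplicate_patterns
```

In this sketch, a URL such as https://example.com/events/shelbyville is predicted to be a duplicate purely because it matches a learned pattern, even if its content is in fact unique. That is the risk described above.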

Mueller on Predicting Duplicate Content

Google has more than one way of determining whether pages have duplicate content. One is to analyze the content of the pages directly, and the other is to predict duplicate pages from URL patterns.

“What tends to happen on our side is we have multiple levels of trying to understand when there is duplicate content on a site. And one is when we look at the page’s content directly and we kind of see, well, this page has this content, this page has different content, we should treat them as separate pages.

The other thing is kind of a broader predictive approach that we have where we look at the URL structure of a website where we see, well, in the past, when we’ve looked at URLs that look like this, we’ve seen they have the same content as URLs like this. And then we’ll essentially learn that pattern and say, URLs that look like this are the same as URLs that look like this.”

Mueller added that Google does this to save resources while crawling and indexing. If Google predicts that a page duplicates another page with a similar URL structure, it may not crawl that page at all to check its content.

“Even without looking at the individual URLs, we can sometimes say, well, we’ll save ourselves some crawling and indexing and just focus on these assumed or very likely duplication cases. And I have seen that happen with things like cities.

I have seen that happen with things like, I don’t know, automobiles is another one where we saw that happen, where essentially our systems recognize that what you specify as a city name is something that is not so relevant for the actual URLs. And usually we learn that kind of pattern when a site provides a lot of the same content with alternate names.”

Mueller’s Answer for Event Websites

Mueller explained how Google’s predictive method could have affected the event website.

“So with an event site, I don’t know if this is the case for your website, with an event site it could happen that you take one city, and you take a city that is maybe one kilometer away, and the events pages that you show there are exactly the same because the same events are relevant for both of those places.

And you take a city maybe five kilometers away and you show exactly the same events again. And from our side, that could easily end up in a situation where we say, well, we checked 10 event URLs, and this parameter that looks like a city name is actually irrelevant because we checked 10 of them and it showed the same content.

And that’s something where our systems can then say, well, maybe the city name overall is irrelevant and we can just ignore it.”
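Site owners can run a rough version of this check on their own pages. The Python sketch below assumes a hypothetical mapping from city name to page text; it fingerprints a sample of pages and reports whether they are all identical. The sample size of 10 echoes Mueller's example and is not a documented Google threshold.

```python
import hashlib


def fingerprint(page_text: str) -> str:
    """Hash the page text after lowercasing and collapsing whitespace."""
    normalized = " ".join(page_text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def city_looks_irrelevant(pages: dict[str, str], sample_size: int = 10) -> bool:
    """Return True if the first `sample_size` city pages all share one fingerprint.

    `pages` maps a city name to the text of that city's events page.
    """
    sample = list(pages.values())[:sample_size]
    if len(sample) < 2:
        return False  # not enough pages to compare
    return len({fingerprint(text) for text in sample}) == 1
```

If this returns True for a sample of city pages, a crawler that samples the same URLs could plausibly draw the conclusion Mueller describes, namely that the city name in the URL does not change the content.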

How to Fix the Problem?

Mueller advised webmasters to limit overlapping content on the website and to fix the cases where pages really are duplicates.

“So what I would try to do in a case like this is to see if you have this kind of situations where you have strong overlaps of content and to try to find ways to limit that as much as possible.

And that could be by using something like a rel canonical on the page and saying, well, this small city that is right outside the big city, I’ll set the canonical to the big city because it shows exactly the same content.

So that really every URL that we crawl on your website and index, we can see, well, this URL and its content are unique and it’s important for us to keep all of these URLs indexed.

Or we see clear information that this URL you know is supposed to be the same as this other one, you have maybe set up a redirect or you have a rel canonical set up there, and we can just focus on those main URLs and still understand that the city aspect there is critical for your individual pages.”
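The canonical approach Mueller describes can be sketched in Python as follows. This assumes a hypothetical data model in which each city maps to the set of event IDs shown on its page, plus a priority list that puts big cities first; the /events/ URL layout and the function names are also assumptions for illustration.

```python
from html import escape


def canonical_targets(
    listings: dict[str, frozenset[str]], priority: list[str]
) -> dict[str, str]:
    """Map each city to the preferred city that shows exactly the same events."""
    ordered = priority + [city for city in listings if city not in priority]
    first_for_listing: dict[frozenset[str], str] = {}
    for city in ordered:
        # The first city seen for a given set of events becomes its canonical.
        first_for_listing.setdefault(listings[city], city)
    return {city: first_for_listing[listings[city]] for city in listings}


def canonical_tag(base_url: str, city: str) -> str:
    """Render the link element to place in the <head> of a city's page."""
    href = escape(f"{base_url}/events/{city}", quote=True)
    return f'<link rel="canonical" href="{href}">'
```

A small town whose listing is identical to the big city's would then carry a canonical tag pointing at the big city's page, while a city with its own events keeps a self-referencing canonical. Pages that only partly overlap are not handled here and need a judgment call.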

Reducing duplicate content on a website is always good practice. That said, Google does not penalize websites for duplicate content or apply a negative ranking signal because of it.
