
Duplicate content & thin content - find and avoid [Guide]

updated on February 18, 2025
Author: Sebastian Prohaska

Owner & Managing Director of ithelps Digital. Since 2013, he has been deeply engaged in SEO and online marketing.

Beware of duplicate content and thin content - the invisible dangers for your SEO! The proof is in the pudding: only content that really offers value and fulfils the search intent will make it to the top.

In this article, I'll show you how to avoid thin content and duplicates while increasing the value of your site.


Duplicate content

Duplicate content means content that appears on more than one web page.

First things first: Google does not penalise you for duplicate content as such. It still affects your SEO, though. With internal duplicate content, the wrong page may end up ranking. With external duplicate content (text copied from another website), the original page ranks in most cases, while the copy is usually not ranked or not indexed at all. This may look like a penalty from Google, but it is only an effect of the Panda update.


How is duplicate content created?

(Source: SEO-Summery de)

The most common reasons for internal duplicate content are

  • Website is accessible with and without www
  • Website is accessible via http and https
  • Archive and category pages
  • Overview pages for filters or tags
  • Internal search results pages
  • Pages or posts that are assigned to several categories and/or tags
  • Pagination (page numbering) e.g. of comments
  • URL parameters and session IDs
  • Print versions of page content
  • Identical or very similar product descriptions
  • Mobile website with identical content
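
Several of the causes above (www and non-www, http and https, URL parameters and session IDs) come down to one page being reachable under several addresses. The following is a minimal sketch of how such variants could be grouped from a crawl list. It assumes that the https, non-www variant is the preferred one and that the parameter names in `IGNORED_PARAMS` do not change the page content; both are illustrative assumptions, not rules from this article.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of parameters that do not change the page content.
IGNORED_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium", "utm_campaign"}


def normalize_url(url: str) -> str:
    """Map URL variants of the same page onto one preferred form."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Assumption: the https, non-www variant is the preferred one.
    if host.startswith("www."):
        host = host[4:]
    # Drop session and tracking parameters, keep the rest in a stable order.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    path = parts.path or "/"
    # Simplification: ports and fragments are discarded.
    return urlunsplit(("https", host, path, urlencode(query), ""))


def group_duplicates(urls):
    """Return only those normalized URLs that more than one variant maps to."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize_url(url), []).append(url)
    return {key: value for key, value in groups.items() if len(value) > 1}
```

Every group returned by `group_duplicates` is a candidate for a redirect or a canonical link to the preferred address.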

The most common reasons for external duplicate content are

  • Web projects on which partly similar or identical content is published
  • Content theft (plagiarism)
  • Content scraping (scraped content)
  • Distribution / publication of press releases
  • Taking over manufacturer product or article descriptions
  • Publishing your own content on news portals or in forums
  • Content insertion from newsletters or via RSS feeds

Finding duplicate content

Finding internal duplicate content with Siteliner

To identify internal duplicate content on your own website, I can recommend siteliner.com. The online tool is free and you can use it to check your entire website for duplicate content. The analysis includes a detailed list of each individual page with information on the content match (matching words, page content match in per cent, number of pages with similar content, relevance of the page for search engines).

Result of a duplicate content check with Siteliner

Siteliner is provided by the developers of copyscape.com, the leading service for online plagiarism and duplicate content detection. And we also use this online service to find external duplicate content.
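
If you want to get a feeling for what a "content match in per cent" means, you can compare two texts yourself. The sketch below uses word shingles and Jaccard similarity, a common textbook approach. It is not a description of how Siteliner or Copyscape work internally, and the shingle size of 5 words is an arbitrary assumption.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences ("shingles") in a text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Very short texts are treated as one single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str, size: int = 5) -> float:
    """Jaccard similarity of two texts: 0.0 means no overlap, 1.0 means identical."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Multiply the result by 100 to get a percentage. Which value counts as "too similar" depends on your site, so treat high scores as a prompt to look at the two pages, not as a verdict.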

Finding external duplicate content with Copyscape

Copyscape is the counterpart to Siteliner and helps you search for external copies and plagiarised content. To do this, enter your URL in the search box and click GO.

Input field of the Copyscape duplicate content checker

The online tool then searches its own database and external data sources for similar or identical content and displays the matching text excerpts together with the websites on which the duplicates were found.

Duplicate content analysis with Google Search

If you want to check individual pages for duplicate content, you can also use Google Search.

Simply enter a passage of your text that you think is unique into the search box. The snippet should be no longer than 32 words and should be enclosed in inverted commas.

Duplicate content analysis with Google Search: this is what it looks like when no duplicate content is found.
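
If you want to check several passages, you can build the search links with a few lines of code. This is a minimal sketch: it cuts the passage to the 32 words mentioned above, puts it in inverted commas and creates a normal Google search link that you open in your browser. The function name is made up for this example.

```python
from urllib.parse import urlencode

MAX_WORDS = 32  # limit mentioned in the text above


def exact_match_search_url(passage: str) -> str:
    """Build a Google search link for an exact-phrase query."""
    # Remove inverted commas inside the passage so they do not break the phrase.
    words = passage.replace('"', "").split()[:MAX_WORDS]
    phrase = '"' + " ".join(words) + '"'
    return "https://www.google.com/search?" + urlencode({"q": phrase})


print(exact_match_search_url("only content that really offers value"))
```

If the only result is your own page, the passage is most likely unique.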

What to do about existing duplicate content?

You deal with internal duplicate content in the same way as with thin content.

  • change it
  • expand it
  • delete it
  • add to it
  • turn it into unique content in its own right

A print version of your content may also result in a duplicate content message. In this case, you should set a "NoIndex" meta tag for the print version.

For similar product descriptions, set a canonical link to the original page.
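
To check whether a page really carries the "NoIndex" meta tag or a canonical link, you can inspect its HTML. The sketch below uses only Python's standard library. The two example pages and the example.com addresses are invented for illustration, and the code only looks at tags in the HTML, not at HTTP headers.

```python
from html.parser import HTMLParser


class IndexingTagParser(HTMLParser):
    """Collect the robots meta directives and the canonical URL of a page."""

    def __init__(self):
        super().__init__()
        self.robots = []
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attributes.get("name", "").lower() == "robots":
            content = attributes.get("content", "")
            self.robots += [part.strip().lower() for part in content.split(",")]
        elif tag == "link" and "canonical" in attributes.get("rel", "").lower().split():
            self.canonical = attributes.get("href")


def indexing_tags(html: str) -> dict:
    """Report whether a page is set to noindex and which canonical URL it names."""
    parser = IndexingTagParser()
    parser.feed(html)
    return {"noindex": "noindex" in parser.robots, "canonical": parser.canonical}


# Invented example pages: a print version and a product variant.
PRINT_PAGE = '<head><meta name="robots" content="noindex, follow"></head>'
PRODUCT_VARIANT = '<head><link rel="canonical" href="https://example.com/product/original"></head>'

print(indexing_tags(PRINT_PAGE))
print(indexing_tags(PRODUCT_VARIANT))
```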

For external duplicate content, proceed as follows:

If, for any reason, you have included text passages from other websites in your content, you must first mark this as external text. Indicate the source.

In addition, you must not publish exclusively third-party content on your website. Short passages to support your own content should not be a problem.

You can also comment on or discuss short passages of other people's text. But always remember to cite the source.

If you find your own texts on another site, I would write to the owner of the website and ask for the "stolen content" to be removed. You can usually find the contact details in the legal notice. If there is no legal notice, you can use various services to find out who the website operator is.

Examples of such services are

In most cases, this step is sufficient and the duplicate content will be removed. If this is not the case, you can submit a report via the Google Search Console. Here you can read what Google says about duplicate content.

In addition to duplicate content, Google is also targeting thin content. So let's talk about that too.

What is thin content?

Thin content refers to website content that is of little or no value to the user in terms of information content, solutions or learning content.

Google and the other search engines (Bing, Yahoo, etc.) only want to provide their users with high-quality search results. You can't get round that. And it's basically quite logical.

Google is not a charity organisation, but wants to expand its market leadership and its business model. Yes, the search engine giant wants to earn money. The more, the better. And that only works if users are satisfied with the search engine.

If Google presents its users with content without added value, they are dissatisfied. Dissatisfied users do not click on adverts. Moreover, how can Google display relevant adverts if the content doesn't offer anything? Thin content is therefore useless for the user and for Google. This is why it is ranked low or not indexed at all.

Thin content does not necessarily mean short texts (under 300 words)

Many SEOs recommend not writing texts that are too short. We at ithelps are among them. Our guideline is that a text should not have fewer than 300 words.

But: short texts, even those with fewer than 300 words, are not automatically worthless content.

A short list post, a list of important points on a particular topic, can be of great benefit to the website user if that is exactly what they were looking for.

In such a case, you don't need to artificially stretch the text to over 300 words. Write a short, meaningful introduction so that the reader and the search engine recognise what it is about. Then link to it from relevant pages and share your text on social media platforms. Google will recognise from the social signals and the relevance of the text that it is not thin content, even if it has fewer than 300 words.
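
The 300-word guideline is easy to check automatically. The sketch below lists pages that fall below it, assuming you already have the visible text of each page in a dictionary (how you extract that text is up to you). As explained above, the result is only a list of candidates for a manual look, not a list of thin content.

```python
GUIDELINE_WORDS = 300  # guideline from the text above, not a hard rule


def word_count(text: str) -> int:
    """Count words by splitting on whitespace."""
    return len(text.split())


def short_pages(pages: dict) -> list:
    """Return (url, word_count) pairs below the guideline, shortest first."""
    counts = ((url, word_count(text)) for url, text in pages.items())
    below = (item for item in counts if item[1] < GUIDELINE_WORDS)
    return sorted(below, key=lambda item: item[1])
```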


Why is thin content hurting my website?

To be a little more precise: worthless content doesn't hurt your website, it hurts your rankings. Your website will continue to function, look good and consume your time and money. However, it will not rank in the search engine. It may not even be indexed. And you don't want that. Right?

So you need to do something about useless text and all the other quality issues.

Let's start by finding the useless, thin content.

How do I identify thin content?

To identify worthless content, you can proceed as follows:

Go to your Google Analytics account and find the pages that have a high bounce rate and a short dwell time. Bounce rate and dwell time are good indicators of whether the content is useful to the reader or not.

If your content is good, the dwell time will be correspondingly high and the bounce rate low.

With thin content, the reader leaves your site very quickly and has no reason to visit other pages on your website.
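
If you export these figures from your analytics tool, you can filter them with a short script. This is a minimal sketch: the field names `bounce_rate` and `avg_seconds` and both thresholds are assumptions made for this example, so adapt them to your own export and to what is normal for your site.

```python
# Hypothetical thresholds; tune them to your own site and page types.
MAX_BOUNCE_RATE = 0.80     # share of visits that leave without a second page view
MIN_DWELL_SECONDS = 20.0   # average time spent on the page


def thin_content_candidates(rows):
    """Return pages with a high bounce rate and a short dwell time, shortest dwell time first."""
    flagged = [
        row for row in rows
        if row["bounce_rate"] > MAX_BOUNCE_RATE and row["avg_seconds"] < MIN_DWELL_SECONDS
    ]
    return sorted(flagged, key=lambda row: row["avg_seconds"])
```

Both conditions have to be met here, because a high bounce rate alone can also mean that readers found exactly what they needed on the first page.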

The Google Search Console is another good way to find thin content. Look under Crawling errors > Soft 404 to see if there are any entries. These pages usually have thin content as well.

What to do about existing thin content?

If you have found thin content on your website, you should do something about it to improve your rankings. You have three options:

  1. Enhance content
  2. Delete content
  3. Set content to NoFollow

How to turn thin content into rich content

You can enhance your thin content with storytelling, relevant information, lists, comparisons, case studies, graphics, infographics and explanatory videos.

Storytelling:

Write a story around the topic. For example, as an introduction, tell how you came up with the topic:

"I recently saw a report on [keyword/keyword combination]. It mentioned that [long-tail keyword]... I thought about it and did some research. Here are my findings and experiences."

Then add to the existing text and remove the excess keywords to avoid keyword stuffing.

Finish your story with a conclusion in your own words.

Add relevant information

Thin content is called that for a reason. It lacks relevant information. You need to add this missing information to your text.

If you have nothing more to say on the topic or know nothing more about it, do some research. Read through the texts of the websites ranked on page 1 of Google. What content can you find that is missing from your page? Add this information to your text. But be careful: Do not copy 1 to 1. Write in your own words. Write the way you understand the matter.

Copy and paste leads to duplicate content, which is just as bad as thin content, as we saw in the first part of this article.

Lists and comparisons

Supplement your text with lists.

  • List the advantages and disadvantages of a product, a method, etc.
  • Compare product A with product B.
  • Create a pros and cons list for your topic.

Case studies

Have you or someone you know gained experience with your topic? Write about it.

Graphics and other media

Support your text with relevant media. These can be:

  • Infographics
  • Images and graphics
  • Screenshots
  • Videos
  • etc.

Make sure that you give the files meaningful names (with the keyword or topic-related words) and add alt attributes (alt text). The image description is also a good place for relevant content.

Delete thin content

In some cases, it can also be useful in terms of SEO to delete thin content or content that is no longer relevant.

Set thin content to NoFollow

In principle, I don't believe in blocking Google from content. But there are exceptions.

  • You have identified a text as worthless. For some reason, however, it delivers a lot of traffic via social media.
  • It's a landing page to collect email addresses for your newsletter. It usually only contains a headline, a few bullet points and a registration form.
  • Libraries or download pages, which often only contain download links.

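In such cases, you obviously don't want to delete the page. If you don't want to or can't add any valuable content, it makes sense to set it to NoFollow. Keep in mind that NoFollow on its own only tells search engines not to follow the links on the page. To keep the page itself out of the search results, it also needs a "NoIndex" meta tag.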

Conclusion

Duplicate content and thin content are two of the biggest hurdles on the road to success. But as we have seen, they are not insurmountable. With the right tools and strategies, you can ensure that your content is not only unique, but also valuable and relevant to your target group. Avoiding duplicate and thin content is not just a matter of SEO ethics, it's an absolute necessity to score highly with Google and co.

By using the methods presented to identify and correct duplicate and thin content, you are sending a clear signal against mediocrity and in favour of quality. It shows that you are prepared to take the extra step to offer your readers added value. And that's exactly what search engines reward. Ultimately, it's an investment in the future of your online presence and in the trust of your visitors.

So take from this article the lesson that originality and depth are not just SEO strategies, but the cornerstones of any successful website. Don't get carried away by the flood of duplicate and thin content. Instead, be the beacon of uniqueness and quality in a sea of uniformity. Because at the end of the day, the quality of your content will determine how high you rank in search results and how visible your brand is in the digital world.

 


Any questions?

If you have any further questions on the topic or would like professional support, feel free to get in touch with us. Send an email to office@ithelps-digital.com, call us at +43 1 353 2 353, or reach out to us via our contact page.


