How To Find Duplicate Content On Your Site (Choice of 2 Tools)
July 30, 2016 by Alex Miller
Google doesn’t want to populate its SERPs with the same content. It wants to show searchers 10 unique results per page. Otherwise, there wouldn’t really be a “choice” of results to click on 🙂
Whether it’s fixing duplicate title tags, meta tags, chunks of content or even an entire page, it’s gotta be done, folks. Track your rankings and results when you fix these problems and you should see a noticeable impact.
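If you want a quick feel for the “duplicate title tags” problem before reaching for a tool, here is a minimal sketch in Python. It assumes you already have the HTML of each page in a dictionary (the `pages` sample and the `example.com` URLs are made up for illustration), and it only groups pages whose `<title>` text matches after trimming whitespace and ignoring case.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the page title with runs of whitespace collapsed."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.title.split())


def find_duplicate_titles(pages):
    """pages maps URL -> HTML. Returns {title: [urls]} for titles used more than once."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        title = extract_title(html).lower()
        if title:
            by_title[title].append(url)
    return {title: urls for title, urls in by_title.items() if len(urls) > 1}


# Hypothetical sample pages, for illustration only.
pages = {
    "https://example.com/": "<html><head><title>Acme Plumbing</title></head></html>",
    "https://example.com/about": "<html><head><title>Acme  Plumbing </title></head></html>",
    "https://example.com/contact": "<html><head><title>Contact Us</title></head></html>",
}
duplicates = find_duplicate_titles(pages)
print(duplicates)
```

Here the home page and the about page end up in the same group, because their titles only differ in spacing.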
The most accurate way of doing this is to follow the “snippet search” method in our Avoid Panda PDF, but here are 2 tools you can use to make your life a little easier, even if the results aren’t quite as accurate.
Method #1 – SiteLiner.com
Pros: Nice interface; you only need to type in the root domain for a full scan; includes information on “broken links”, which is very useful.
Cons: Harder to see all the data / duplicate content areas. You need to go through it page by page.
Step 1: Go to SiteLiner.com and type in the domain that you want to analyze.
(I just picked on a random plumbing company)
I used a free account here to show you its limitations. (I recommend upgrading to a paid account, which works on a credit system.)
Below you can see they have a significant amount of duplicate content. Yikes.
Step 2: Time to dig deeper. Click the “Duplicate Content” link shown below:
Step 3: Start clicking on each page to see even more detail (where SiteLiner shows you exactly where the duplications are).
As you can see, there’s a nice breakdown of each page that has some duplication issues.
Siteliner.com does a pretty nice job of highlighting the duplicate content.
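To make “duplicate content” a bit more concrete, here is a rough Python sketch of one common way to measure overlap between two pages: break each page’s text into overlapping 5-word sequences (“shingles”) and count how many are shared. This is an illustration only, not how SiteLiner works internally; the function names and the sample text are made up.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences ("shingles") of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def duplicate_ratio(page_text, other_text, size=5):
    """Share of page_text's shingles that also appear in other_text (0.0 to 1.0)."""
    own = shingles(page_text, size)
    if not own:
        return 0.0
    return len(own & shingles(other_text, size)) / len(own)


# Hypothetical text from two pages that share a call-to-action sentence.
page_a = "We fix leaky pipes fast. Call our licensed plumbers today for a free quote."
page_b = "Need a new water heater? Call our licensed plumbers today for a free quote."
ratio = duplicate_ratio(page_a, page_b)
print(f"{ratio:.0%} of page A's 5-word sequences also appear on page B")
```

In this sample, half of page A’s shingles come from the shared sentence, so the ratio is 0.5. The higher that number, the more of the page is repeated elsewhere.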
Once you find content that is duplicated, you need to take action and clean up the mess.
The most likely course of action is to remove the duplicate content and replace it with high-quality, extensive, unique content to boost the quality score of that page. If you don’t want the page at all, just de-index it!
Method #2 – URLProfiler.com
Pros: This is a more accurate method as it excludes “common content”; you can see all the data in a spreadsheet instead of clicking from window to window. Essentially, the data is easier to see and manipulate.
Cons: A more time-consuming process.
We’ve recently started using this tool in our site audits and find it does a great job identifying both internal and external duplicate content issues.
If you like to work with spreadsheets, then you may favor this tool over our 1st recommendation.
(You will have to work a bit to get the information you’re looking for, but the end results are excellent).
- You’ll need a current list of proxies to load into the software. The duplicate content feature of URL Profiler works by searching Google for exact match snippets of content from the pages you are testing. If you’re only testing a few URLs, you can probably get by without proxies, but for testing URLs in bulk, proxies are the only way to go, unless you enjoy manually entering captchas into Google for the rest of the day!
- Also, make sure you have the list of URLs that you want to analyze ready to paste into the tool. (You can use Screaming Frog to get that list of URLs for any domain. If you don’t have that tool, use this free one, which crawls your site and returns a list of URLs that you can download into Excel: 100 URLs without registering and up to 1,000 if you do register.)
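To picture what “searching Google for exact match snippets” means, here is a small Python sketch. It is not URL Profiler’s code, just an assumption-laden illustration: `pick_snippet` (a made-up helper) grabs the first reasonably long sentence from a page’s text, and `exact_match_search_url` wraps it in double quotes and builds the search URL. It only builds the URL; it doesn’t send any requests.

```python
from urllib.parse import quote_plus


def pick_snippet(text, min_words=8, max_words=12):
    """Return the first sentence with at least min_words words, trimmed to max_words."""
    for sentence in text.replace("!", ".").replace("?", ".").split("."):
        words = sentence.split()
        if len(words) >= min_words:
            return " ".join(words[:max_words])
    return None


def exact_match_search_url(snippet):
    """Build a Google search URL that looks for the snippet as an exact phrase."""
    # The double quotes around the snippet are what make it an exact-match search.
    return "https://www.google.com/search?q=" + quote_plus(f'"{snippet}"')


# Hypothetical page text, for illustration only.
text = (
    "Welcome! Our licensed plumbers have served the greater metro area "
    "for over twenty years. Call now."
)
snippet = pick_snippet(text)
url = exact_match_search_url(snippet)
print(snippet)
print(url)
```

Run a handful of these searches by hand and you’ll quickly see why proxies matter for bulk checks.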
Now that we’ve got that out of the way, here’s our process!
Step 1: Paste the URLs you want to analyze (for duplicate content) into URL Profiler by right clicking in the URL List area and selecting “Paste from clipboard.”
Step 2: Under “Content Analysis” select the “Duplicate Content” checkbox.
You can increase the accuracy of the results by specifying a “CSS Selector”. This isolates the HTML element containing the main content of the page, thereby leaving out content in the sidebar, header, footer, etc.
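Here is a rough Python sketch of what “isolating the main content” looks like in practice. It is a simplified stand-in, not URL Profiler’s implementation: it only understands a single-class selector such as `.entry-content` (a made-up class name for this example), whereas real CSS selectors can be far more specific.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class MainContentParser(HTMLParser):
    """Collects text inside the first element whose class list contains target_class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0        # greater than 0 while inside the target element
        self.found = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif not self.found:
            classes = (dict(attrs).get("class") or "").split()
            if self.target_class in classes:
                self.depth = 1
                self.found = True

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_main_text(html, selector):
    """Return the text of the element matched by a single-class selector."""
    if not selector.startswith("."):
        raise ValueError("this sketch only supports selectors like '.entry-content'")
    parser = MainContentParser(selector[1:])
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


# Hypothetical page markup, for illustration only.
html = """
<header>Acme Plumbing | Home | Contact</header>
<div class="entry-content post">
  <p>We repair burst pipes<br>within the hour.</p>
</div>
<footer>Copyright Acme Plumbing</footer>
"""
main_text = extract_main_text(html, ".entry-content")
print(main_text)
```

The header and footer text is skipped, which is exactly the kind of “common content” you don’t want flagged as duplication.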
Step 3: To find the CSS selector, right click on your web page and select “Inspect” (See the example below):
Step 4: Next, highlight the content on the web page. The matching element, along with its CSS selector, will become highlighted in the Inspect pane.
Step 5: Copy this selector and paste it into the content area in URL Profiler:
Step 6: Click “Apply” and then click “Run Profiler.”
Step 7: Save the file and, once the results are ready, simply click “Open.”
Step 8: Once you have your spreadsheet open, expand columns S, T and V to see your results.
- Column S will contain the first snippet of text that URL Profiler scraped from your site.
- Column T will contain the second snippet of text the software scraped.
- Column V contains the URL that Google returned as the first result when searching for “snippet 1” + “snippet 2”. If this URL is the same as your original URL (located in Column A), then there are no duplicate content issues!
However, if there is a different URL in Column V, you have a duplicate content issue.
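If you have a lot of rows, comparing Column A against Column V by eye gets old fast. Here is a hedged Python sketch of that check. It assumes you have saved the results as a CSV file with a header row and the same column layout described above (A is the first column, V is the 22nd); the `normalize` rules and the sample rows are my own assumptions, so adjust them to fit your export.

```python
import csv
import io
from urllib.parse import urlsplit

ORIGINAL_URL_COL = 0   # column A
FIRST_RESULT_COL = 21  # column V


def normalize(url):
    """Ignore scheme, host case, a leading "www." and a trailing slash when comparing."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def flag_duplicates(csv_file):
    """Return (original_url, first_result_url) pairs where the two URLs differ."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row
    flagged = []
    for row in reader:
        if len(row) <= FIRST_RESULT_COL:
            continue  # row is too short to have a column V
        original, first_result = row[ORIGINAL_URL_COL], row[FIRST_RESULT_COL]
        # An empty column V means no result was recorded, so there is nothing to compare.
        if first_result and normalize(original) != normalize(first_result):
            flagged.append((original, first_result))
    return flagged


def make_row(original, first_result):
    """Build a 22-column sample row with only columns A and V filled in."""
    row = [""] * 22
    row[ORIGINAL_URL_COL] = original
    row[FIRST_RESULT_COL] = first_result
    return row


# Hypothetical sample data standing in for an exported results file.
sample = io.StringIO()
writer = csv.writer(sample)
writer.writerow(["header"] * 22)
writer.writerow(make_row("https://example.com/services/", "https://www.example.com/services"))
writer.writerow(make_row("https://example.com/about", "https://other-site.example/about-us"))
sample.seek(0)

flagged = flag_duplicates(sample)
for original, first_result in flagged:
    print(f"{original} -> first result was {first_result}")
```

With a real file, you would pass `open("results.csv", newline="", encoding="utf-8")` (a placeholder filename) instead of the in-memory sample.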
Once you’ve found duplicate content issues, how do you fix them?
The solutions are as follows:
- No-index the page
- Remove the duplicate content and add in as much unique content as you can.
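If you go the no-index route, the usual way to do it is a robots meta tag in the page’s `<head>`, e.g. `<meta name="robots" content="noindex">`. Here is a small Python sketch for double-checking that a page actually carries that tag. It assumes you already have the page’s HTML as a string (the sample pages are made up) and it only looks at the meta tag, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives listed in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def is_noindexed(html):
    """Return True if the page has a robots meta tag that includes "noindex"."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return "noindex" in parser.directives


# Hypothetical pages, for illustration only.
noindexed_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
indexed_page = '<html><head><meta name="description" content="Plumbing tips"></head></html>'
print(is_noindexed(noindexed_page), is_noindexed(indexed_page))
```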
Conclusion – Which Tool is Best To Use?
I personally recommend SiteLiner.com if you want to work on fixing a smaller number of pages that have duplicate content issues (e.g. 20-50 pages or under). You can go through each one and it’ll effortlessly show you the duplications, which you can then go about fixing.
Any more than 20-50 pages and I’d personally go a little crazy flicking back and forth between the windows; having it all in a spreadsheet might be easier for some (i.e. use URLProfiler.com).
And so, for larger sites and a more accurate “deep dive”, try URLProfiler.com. Ultimately, it’s up to you, and I encourage you to test both!
To wrap up: the whole focus of this article was to help you fix your duplicate content issues – and to give you a range of options.
If this is all too much for you, we offer a full-spectrum onsite audit (wherein we do all of this and more), either for your own sites or your clients’ sites. You’ll find it by logging into PosiRank and going to Order Services > Onsite SEO > Full-Scale, Comprehensive Site Audits.
But nonetheless, now you know how the pros do it 🙂