Have you found duplicate URLs in Google Analytics (GA) for what appear to be the same pages?
On the face of it, you may think that the URLs are the same, but on closer inspection, you’ll notice that one URL has a trailing forward slash and the other doesn’t. This is actually quite a common issue.
Let’s dive into the reasons why and learn how to fix it.
Why are duplicate URLs within Google Analytics an issue?
The main frustration with duplicate URLs is that any GA report which identifies pages by the URL will split data depending on the actual URL accessed, despite the fact that the page content remains unchanged.
In the example below, because one of the page URLs has a trailing slash, there are two results for the same page instead of one.
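To make the split concrete, here is a minimal Python sketch. The page paths and pageview counts are made up for illustration (they are not real GA data), and `merge_trailing_slash` is a hypothetical helper, not part of Google Analytics. It shows how two rows for one page would combine into a single total once the URLs are consistent.

```python
from collections import Counter

# Hypothetical report rows: (page path, pageviews). Illustrative numbers only.
report_rows = [
    ("/services", 120),
    ("/services/", 80),
    ("/contact/", 45),
]


def merge_trailing_slash(rows):
    """Sum pageviews per page, treating '/x' and '/x/' as the same page."""
    totals = Counter()
    for path, views in rows:
        # Use the trailing-slash form as the single key for each page.
        # (This simplified version ignores file-style URLs such as .html.)
        key = path if path.endswith("/") else path + "/"
        totals[key] += views
    return dict(totals)


merged = merge_trailing_slash(report_rows)
print(merged)
```

In this sketch the two `/services` rows collapse into one entry, which is the effect you want to see in your reports.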
Can you fix duplicate URLs manually?
To determine whether you can fix duplicate URLs in Google Analytics manually, you first need to know where the destination URL originates from.
Let’s take a quick look at what you can do:
- Amend all internal links to your website to use the same URL (with or without the trailing slash).
- Update any external links under your control (directory listings, guest blog posts, social media links, etc.).
- Supply the correct URL to all future partners, referrers and affiliates.
- Reach out and ask existing partners and referrers to update their URLs.
When you can fix a URL, you should do so for the sake of consistency. However, you can’t stop anyone linking to your site with whatever URL they choose, so a purely reactive approach will be a never-ending task.
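If you want to audit your own pages for inconsistent internal links, something like the following Python sketch could help. It is only a starting point: `LinkCollector` and `links_missing_slash` are hypothetical names, the host `www.example.com` in the usage lines is a placeholder, and the sketch assumes you prefer the trailing-slash form and that any path ending in a file extension should be left alone.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit
import posixpath


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_missing_slash(html, site_host):
    """Return internal links whose path lacks a trailing slash.

    Paths that end in a file extension (e.g. .html, .pdf) are skipped.
    """
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href in collector.hrefs:
        parts = urlsplit(href)
        # Skip non-web links such as mailto: or tel:.
        if parts.scheme not in ("", "http", "https"):
            continue
        # Relative links have no host; treat them as internal.
        if parts.netloc and parts.netloc != site_host:
            continue
        path = parts.path
        if not path or path.endswith("/"):
            continue
        if posixpath.splitext(path)[1]:
            continue  # looks like a file, e.g. /guide.pdf
        flagged.append(href)
    return flagged


# Usage with a placeholder page fragment and host name.
sample_html = (
    '<a href="/about">About</a>'
    '<a href="/about/">About</a>'
    '<a href="/guide.pdf">Guide</a>'
    '<a href="https://www.example.com/blog">Blog</a>'
)
print(links_missing_slash(sample_html, "www.example.com"))
```

The sketch only reports links; deciding which form to standardise on, and updating the pages, is still up to you.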
Can you fix duplicate URLs in reports automatically?
Yes, you can. The most effective way to fix this issue is to use a Google Analytics filter. (Note that this doesn’t change the actual URL or the data sent to GA, but it does fix the data shown in your GA reports.)
Google Analytics filters work by following a set of pre-determined rules to process the data before it appears in your reports. In this case, what you need to do is create a filter in GA which adds a forward slash to the end of any target URL which does not have one.
Note that not all URLs require a trailing forward slash: file-style URLs such as /xyz.html or /guides/best-guide.pdf should be left as they are.
How to fix duplicate URLs in Google Analytics reports with a filter:
In order to fix duplicate URLs in Google Analytics, we need to create a custom filter. This is done under Admin > Account > Property > View.
Warning: You should never create custom filters in your master view. Once GA has processed the data through a filter, you cannot revert it; there is no ‘undo’.
As best practice, you should always have a master (unfiltered) view, and preferably two other views, such as live and test. If you only have one view, I strongly recommend creating the other views before performing this task.
Once you have created your test view, navigate to it and select ‘Filters’.
Create a new filter with the following settings:
- Filter Name: Pages Add Trailing Slash
- Filter Type: Custom > Advanced
- Field A: -> Extract A: Request URI
- Copy & Paste the Following: ^(/[a-z0-9/_\-]*[^/])$
- Output To -> Constructor: Request URI
- Copy & Paste the following: $A1/
- Leave all other settings as default
- Press Save
Your new filter should look like this:
Your custom filter is now ready. From this point on, all new data in the test view will be consistent.
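If you would like to check the pattern before relying on it, the following Python sketch mimics what the filter does to a Request URI. It is an approximation rather than GA's own implementation: `apply_filter` is a hypothetical helper, Python's `re` module stands in for GA's regex engine, and the sketch assumes case-insensitive matching (the filter's case-sensitive option left switched off).

```python
import re

# The Field A pattern from the filter settings, written with plain ASCII hyphens.
# Assumption: matching is case-insensitive, so re.IGNORECASE is used here.
PATTERN = re.compile(r"^(/[a-z0-9/_\-]*[^/])$", re.IGNORECASE)


def apply_filter(request_uri):
    """Append '/' when the pattern matches; otherwise return the URI unchanged."""
    match = PATTERN.match(request_uri)
    if match:
        # Equivalent of the constructor "$A1/": first captured group plus a slash.
        return match.group(1) + "/"
    return request_uri


# Try a few paths, including the file-style URLs mentioned earlier.
for uri in ["/2020_news", "/blog/", "/xyz.html", "/guides/best-guide.pdf", "/"]:
    print(uri, "->", apply_filter(uri))
```

Paths containing a dot fall outside the character class, which is why file-style URLs are left untouched. Check that the hyphen in `0-9` is a plain hyphen: if a long dash is pasted in its place, the class only contains the literal characters 0 and 9, so paths containing other digits will not match.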
How can you test your GA filter?
Navigate to your Real-Time reports, where you will see the live URLs being accessed on your website.
If you open two variations of the same URL in separate browser windows or tabs, you will notice that only one version of the URL appears in the report, but with 2 active pages. If other users are on your website (hopefully) whilst you test, bear in mind that you may see more than 2 active pages.
Make your new filter live!
Now that you’ve tested that the filter works correctly, you can safely re-create the filter in your live view, being careful to copy the setup exactly. Again, don’t add the filter to your master view.
I would also recommend leaving the test filter in your test view, as you may want to check that it does not conflict with any tests you perform in future. But if you’re completely happy, you can of course remove it; it’s a test view after all!
This is working for urls with only letters but it does not work with any url that has numbers in it. EG: /2020_news. We have news pages and the parent folder does not have the trailing slash and neither do the pages below.
Hi Michelle, my apologies, when I prepared the blog it looks like Microsoft Word auto-formatted a portion of the code within the Request URI field. The second hyphen was converted to a long dash which meant the code wouldn’t work on pages which start with numbers… I have updated the code in the post and it now works correctly!