How To Remove Referral Spam From Your Google Analytics Reporting

By Simon

Unless you have been very, very lucky, you will have been seeing an increasing amount of ‘referral spam’ in your Google Analytics data recently.

This has been around for a while, but over the last few months it seems to have been getting much worse – one of our clients had over 2,500 ‘visits’ in a single day this week from one of the worst culprits, “4webmasters.org”!

This amount of fake traffic can seriously distort the trends of genuine traffic, skewing your site’s bounce rate and generally making it much harder to see genuine referral activity. The best solution is to filter out this kind of traffic so your reporting data stays clean. However, this is easier said than done, and some of the solutions being suggested are not all that robust or future-proof.

While doing some advanced Analytics work recently (cross domain tracking over multiple sites – video and blog post coming soon!) we came across a highly effective solution. Rather than continually try to exclude traffic from spammy sites, instead we now only include traffic that is intended for our sites. Let me explain:

Most spam is of a ‘spray and pray’ nature – i.e. the spammers are not usually selective, and use software to spam at high volume. They therefore don’t actually know your website domain. For genuine incoming traffic, the recorded domain is the one the Google Analytics code resides on. This is commonly just your own website domain, but it may also be one or two others where your tracking code runs, such as:

  • Google user content caching
  • Google Translate
  • YouTube
  • Other websites you own that use the same UA Profile ID

Junk traffic from spammy referrers will not know the right domain (the ‘hostname’), so it will either fake it with a large, popular site or, more often than not, simply leave it blank. The examples below show this – green lines are good traffic, red lines are spam:

[Image: referral spam hostnames]
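If you want to see which hostnames your own data contains before deciding what to include, a quick tally can help. The following is only a minimal sketch: the rows are made-up sample data standing in for an exported Hostname report, the function name is our own, and we are assuming a blank hostname shows up as "(not set)".

```python
from collections import Counter

# Hypothetical (hostname, sessions) rows, standing in for an exported
# Hostname report. All values are invented for illustration.
rows = [
    ("www.yourdomain.com", 1200),
    ("yourdomain.com", 300),
    ("(not set)", 2500),  # assumption: a blank hostname is reported like this
    ("translate.googleusercontent.com", 12),
    ("www.google.com", 40),  # a faked 'large popular site' hostname
    ("www.yourdomain.com", 150),
]


def sessions_by_hostname(rows):
    """Total the sessions recorded against each hostname."""
    totals = Counter()
    for hostname, sessions in rows:
        totals[hostname] += sessions
    return totals


totals = sessions_by_hostname(rows)

# List hostnames from most to least sessions, so anything that is not one
# of your own domains stands out.
for hostname, sessions in totals.most_common():
    print(f"{hostname:35} {sessions}")
```

Any hostname in the list that you do not recognise as one of your own, or as a service that legitimately carries your tracking code, is a candidate for being left out of the include filter.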

So the best way to keep the fake traffic out of your reports is to set up a custom include filter on a suitable view. To include multiple domains, you just need to separate them with a ‘pipe’ (|), as shown in the example below:

.*yourdomain\.com|.*youtube\.com|.*yourotherdomain\.com|.*googleusercontent\.com
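To get a feel for how a pipe-separated pattern like this behaves, you can try it outside of Google Analytics first. The sketch below uses Python's re module as a stand-in, assuming the filter matches the pattern anywhere in the hostname; the helper names and sample hostnames are our own, not anything provided by GA. It also escapes each dot, because an unescaped dot in a regular expression matches any character.

```python
import re


def build_include_pattern(domains):
    """Join domains into one pipe-separated pattern, escaping each dot."""
    return "|".join(domain.replace(".", r"\.") for domain in domains)


# Placeholder domains taken from the example pattern in this post.
domains = [
    "yourdomain.com",
    "youtube.com",
    "yourotherdomain.com",
    "googleusercontent.com",
]
pattern = build_include_pattern(domains)


def is_included(hostname, pattern=pattern):
    """Return True if the hostname would pass the include pattern.

    re.search looks for the pattern anywhere in the string, which is our
    assumption about how the filter treats the hostname field.
    """
    return re.search(pattern, hostname) is not None


for hostname in ["www.yourdomain.com", "(not set)", "www.google.com"]:
    print(hostname, "->", "keep" if is_included(hostname) else "drop")
```

Because the match can occur anywhere in the hostname, subdomains such as www.yourdomain.com pass without needing a leading wildcard in this sketch.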

Filters are created at the ‘View’ level in GA, so once you are logged in, go to Admin, choose the view you want to filter*, then click Filters and click the + NEW FILTER button. Choose the Custom filter type, select Include, and in the Filter Field drop-down choose Hostname.

Here is an example of what it should look like:

[Image: referral spam filter settings]

Once you have carefully entered the ‘filter pattern’, i.e. the list of hostnames you want to include, you can scroll down and click Verify Filter to see a snapshot of the last 7 days’ worth of data and how the filter you just created would affect it. If everything looks as expected (the fake traffic is filtered out!), hit SAVE and you are done.
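If you would like a rough idea of the numbers to expect from that check, you can run the same comparison yourself on exported data. This is only a sketch of the idea, not how Verify Filter works internally: the rows are invented, the pattern is a shortened placeholder, and preview_filter is a hypothetical helper of our own.

```python
import re

# Shortened placeholder pattern; swap in your own include pattern.
INCLUDE_PATTERN = r"yourdomain\.com|googleusercontent\.com"

# Hypothetical (hostname, sessions) totals for a recent period.
rows = [
    ("www.yourdomain.com", 1350),
    ("(not set)", 2500),
    ("translate.googleusercontent.com", 12),
    ("www.google.com", 40),
]


def preview_filter(rows, pattern):
    """Return (kept, dropped) session totals for an include pattern."""
    kept = dropped = 0
    for hostname, sessions in rows:
        if re.search(pattern, hostname):
            kept += sessions
        else:
            dropped += sessions
    return kept, dropped


kept, dropped = preview_filter(rows, INCLUDE_PATTERN)
print(f"kept: {kept} sessions, dropped: {dropped} sessions")
```

If the dropped total includes hostnames you recognise as your own, the pattern is too narrow and needs another domain added before you save the filter.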

PLEASE NOTE!

It is best practice to have one view that is completely unfiltered, so be sure to create a new view to apply this filter to, or apply it to a relevant view if you already have more than one. Filters remove the data at processing time, so by keeping an unfiltered view you always have a reference copy of the ‘raw data’ in case it is ever needed.

Also, be aware that filters work on traffic data as it gets recorded, so your filter will not apply to historic data. So get it set up as soon as possible. What are you waiting for?

If you need assistance or have questions, please let us know in the comments, and if you don’t have the time or skills to do this, we offer a Google Analytics consultancy service and can help you get this and other Google Analytics enhancements set up too. Contact us to find out more!