Filtering Out ‘Working From Home’ Traffic in Google Analytics

Josh Berry-Jenkins - Technical Director
Written on July 31st, 2020. Last updated on October 10th, 2023.

In June 2020, the ONS reported that 49% of workers had either “exclusively worked from home or had worked from home alongside travelling to work”. Companies like Google and Twitter have suggested that working from home may become the new normal in a post-COVID world.

Traditionally, internal traffic was most often filtered out through an office IP filter within your Google Analytics view. With the introduction of GDPR, we saw those IPs being anonymised before the hit reached Analytics, which caused issues with the filter.

Now with the recent pandemic, we’ve seen the efficacy of those filters obliterated as the workforce shifted towards a ‘working from home’ approach. So where does this leave your data?

[Image: Working from home. Remote workforce, or legitimate traffic?]

Whether you have filters that have been broken by GDPR or simply by moving out of the office space, the best solution to the problem at the moment is using cookies to distinguish internal traffic.

In a nutshell, rather than filtering traffic based on IP, you use a first-party cookie to determine whether traffic is internal or external. The cookie’s value then either populates a custom dimension, such as traffic source, or sets the utm_source directly to internal. That value drives a new filter for separating traffic into views as required.

Limitations

As amazing and magical as cookies are, it’s worth quickly pointing out the limitations of this method.

Firstly, cookies don’t set themselves. We will cover some different methods of getting them in place, but if the steps aren’t followed then the traffic won’t be magically tagged.

Secondly, cookies are device-specific. If your staff work on multiple devices, encourage them to get all devices ‘cookied’ via these methods, not just one.

Thirdly, these are first-party cookies, which means they can only be read on the same domain and won’t work if you have some crazy cross-domain setup with linked analytics accounts. This shouldn’t matter for most, but it’s worth mentioning.

Custom Dimension or utm_source?

Which is best?

In this case, it boils down to two things:

  • How many Custom Definitions you have left
  • How often you use Default Channel Grouping reports

If you make use of the Default Channel Grouping and have plenty of custom definitions to spare, then go ahead and create one for traffic source or however you want to name the attribute that determines whether traffic is counted as internal or external. This is the route I take personally, as I prefer to keep that source visibility.

Used almost all your custom dimensions and never bother with Default Channel Grouping? Then another possible route is to set the utm_source directly based on the cookie value. The misattribution in channels probably won’t bother you, and you don’t have the custom definitions to spare. Keep in mind this isn’t the route I take, but it is a theoretical alternative nonetheless.

What’s needed?

There are several different ways to go about this, but they all revolve around setting a first-party cookie and either setting a custom dimension or making use of a custom source.

Before getting started, you first want to decide how you qualify traffic as internal or not. Here are some common methods:

  • An internal-only page – e.g. a post-login internal dashboard page
  • An email – this could contain a link to a page on your site that is otherwise unreachable, or set certain parameters that define your cookie
  • Backend user IDs & dataLayer – got a sophisticated app backend that knows when internal traffic is around? Pass the knowledge on to the dataLayer and set your cookie accordingly

However, don’t qualify your traffic on ambiguous sources. A login page, for example, is a bad idea: any traffic could navigate there, regardless of whether it is internal or external.

Once you have an idea of how you would like to determine your traffic types, let’s look at getting the tracking set up!

The GTM Setup

Regardless of methodology, you’re going to want to start by getting the cookie set up, so let’s first look at achieving this through GTM. You will need to create three main things within GTM:

  1. Custom HTML cookie tag (or cookies set by devs)
  2. First Party Cookie variable in GTM (for reading the cookie value)
  3. Internal only page trigger (or custom event if you went the dataLayer route)

Your Custom HTML tag exists to do the actual placing of the cookie and should contain this:

<script>
  (function() {

    // Sets a first-party cookie on the root domain so every subdomain can read it.
    function setCookie(name, value, days) {
      var expires = "";
      // {{Const - mainDomain}} needs changing to your root domain!
      var domain = "; domain=." + {{Const - mainDomain}};
      if (days) {
        var date = new Date();
        date.setTime(date.getTime() + (days * 24 * 60 * 60 * 1000));
        expires = "; expires=" + date.toUTCString();
      }
      document.cookie = name + "=" + (value || "") + expires + domain + "; path=/";
    }

    // Mark this browser as internal traffic for 365 days.
    setCookie("trafficSource", "Internal", 365);
  })();
</script>

Key things to note here: we have used setCookie to name our cookie trafficSource (this name has to match with our next step!), we’ve set a value of “Internal” indicating that this session is a staff session, and we’ve passed a number, 365, which indicates how many days the cookie lasts.

Before we continue there is one very important edit for you to make regarding domains!

To allow the cookie to work on subdomains we need to manually set the root domain in the tag. I did this by referencing a constant variable for the root domain in GTM. You can either do the same or just enter the root domain manually in its place.

Not sure what I’m talking about? Let’s say your website is www.example.com, so your root domain is example.com. Some pages on your site, however, sit on a subdomain such as sub.example.com. We need the cookie set at example.com so that it can be read on all subdomains and the root domain, so in this example {{Const - mainDomain}} would simply return example.com. Why not use the built-in Page Hostname variable, I hear you ask? Because if this tag fires on a subdomain, the cookie is set at the wrong level!
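To make the subdomain point concrete, here is a minimal sketch of the usual cookie domain-matching rule (a host sees the cookie if it equals the cookie’s domain or ends with a dot plus that domain). The helper name cookieVisibleOn is made up for illustration and is not part of GTM or any browser API.

```javascript
// Hypothetical helper: returns true if a cookie set with the given domain
// attribute would be sent to (and readable on) the given hostname.
function cookieVisibleOn(cookieDomain, hostname) {
  // A leading dot on the domain attribute is ignored when matching.
  var domain = cookieDomain.replace(/^\./, "").toLowerCase();
  var host = hostname.toLowerCase();
  return host === domain || host.slice(-(domain.length + 1)) === "." + domain;
}

// Cookie set at the root domain: readable everywhere on the site.
console.log(cookieVisibleOn(".example.com", "www.example.com"));
console.log(cookieVisibleOn(".example.com", "sub.example.com"));

// Cookie set at a subdomain (what Page Hostname would give you if the tag
// fired on sub.example.com): not readable on www.example.com.
console.log(cookieVisibleOn(".sub.example.com", "www.example.com"));
```

The first two calls should report a match and the last one should not, which is why the tag uses a fixed root domain rather than whatever hostname the page happens to be on.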

Any other edits are purely at your discretion, such as the cookie’s name, the value of Internal or how many days the cookie should last. If you do change them, however, make sure you update the following steps to match!

Now that we have set the cookie, we need an easy way of checking its value within GTM. We achieve this by creating a new custom variable:

[Screenshot: First Party Cookie variable configuration in GTM]

Ensure the cookie name matches exactly with the name used in the Custom HTML tag. Notice I’ve also set a default value for when the cookie isn’t set (achieved via the Convert undefined to… option), meaning I can safely pass this value even if the cookie isn’t defined yet (as would be the case for most traffic). If the cookie we created isn’t present, the variable returns External rather than Internal.
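If it helps to see the logic spelled out, the variable behaves roughly like the sketch below: look the cookie up by name and fall back to a default when it is missing. This is an illustration only, not GTM’s actual implementation, and readCookieValue is a hypothetical name.

```javascript
// Hypothetical helper mirroring a cookie lookup with a default value.
// cookieString has the same "a=1; b=2" shape as document.cookie.
function readCookieValue(cookieString, name, fallback) {
  var pairs = cookieString ? cookieString.split(";") : [];
  for (var i = 0; i < pairs.length; i++) {
    var pair = pairs[i].replace(/^\s+/, ""); // trim the space after each ";"
    var eq = pair.indexOf("=");
    if (eq !== -1 && pair.substring(0, eq) === name) {
      return pair.substring(eq + 1);
    }
  }
  return fallback; // cookie not present: treat the visitor as external
}

// In a browser you would pass document.cookie as the first argument.
var staffBrowser = "_ga=GA1.2.123.456; trafficSource=Internal";
var visitorBrowser = "_ga=GA1.2.789.012";
console.log(readCookieValue(staffBrowser, "trafficSource", "External"));
console.log(readCookieValue(visitorBrowser, "trafficSource", "External"));
```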

Creating The Trigger

Finally, we need a trigger for when to set the cookie that marks traffic as internal. What form this takes depends on how you decide to qualify your traffic. For most cases this will simply be a page URL trigger such as this:

[Screenshot: Page URL trigger configuration in GTM]

Where page URL is your chosen full destination, or you could regex match a particular page path etc.

Prefer the email with certain parameters? You can instead create a URL variable, select query, input your query parameter name “trafficSource” or whatever you’ve named it. You then use a pageview trigger > some pageviews and then the URL variable you created must equal “internal” or whatever value you are passing. That way when a user follows a link containing this parameter your cookie will be set.
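As a rough illustration of what that URL variable plus trigger condition amounts to, the sketch below pulls a query parameter out of a link and compares it to the expected value. The function name isInternalLink and the example URLs are assumptions for demonstration, and the comparison is an exact, case-sensitive match to mirror an “equals” condition.

```javascript
// Hypothetical helper: true when the link carries ?trafficSource=internal
// (or whatever parameter name and value you chose).
function isInternalLink(url, paramName, expectedValue) {
  var value = new URL(url).searchParams.get(paramName);
  return value === expectedValue;
}

// A link you might include in a staff-only email.
console.log(isInternalLink("https://www.example.com/?trafficSource=internal", "trafficSource", "internal"));

// An ordinary visitor URL without the parameter.
console.log(isInternalLink("https://www.example.com/pricing", "trafficSource", "internal"));
```

Because the match is exact, the value in your email links has to be typed the same way as the value in the trigger.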

Actually going down the dataLayer route? No worries. Just create a custom event trigger based on the name of the event that identifies the user as internal traffic, and use it to fire the cookie tag for that user.
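For the dataLayer route, the push from your site could look something like the sketch below. The event name internalTraffic and the helper flagInternalUser are placeholders I’ve made up; use whatever event name your custom event trigger listens for, and let your own backend logic decide the isInternal flag.

```javascript
// Hypothetical helper: pushes an event to the dataLayer only for staff.
// "internalTraffic" is an assumed event name, not a GTM built-in.
function flagInternalUser(dataLayer, isInternal) {
  if (isInternal) {
    dataLayer.push({ event: "internalTraffic" });
  }
  return dataLayer;
}

// In a browser you would use the real dataLayer:
//   window.dataLayer = window.dataLayer || [];
//   flagInternalUser(window.dataLayer, true);
var demoLayer = [];
flagInternalUser(demoLayer, true);
console.log(demoLayer);
```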

The Analytics Setup

Onto the analytics setup required to make use of the new data we can send! This comes down to two main things:

  • Custom Dimension In Google Analytics
  • Analytics Filter Based on Custom Dimension

My preferred method is the use of a custom dimension, so if you’re following along this route then you need to create your trafficSource dimension within analytics via Admin > Custom Definitions > Custom Dimension. When choosing the scope of the dimension there are two viable options, either session or user scoped depending on how you would like traffic to be dealt with.

Session scoped would mean once a user hits the cookie page (or returns at all with the cookie present) that session will be counted as internal. If they returned without the cookie and browsed the site, then that visit would be counted as external.

[Image: Eating a cookie. Only give a cookie to the users you wish to exclude!]

User scoped, on the other hand, will mean that once they are tagged as internal, any sessions Google Analytics can tie together as that user (including multiple sessions in which the cookie was NOT present) will be set as internal.

Once you’ve created your custom dimension, make sure it is set as active and take note of the index number associated with it.

Finally, make sure to set this custom dimension in your analytics setup. If you’re using GTM, this means adding the index and the cookie value as a custom dimension field on your Google Analytics Settings variable:

[Screenshot: custom dimension field on the Google Analytics Settings variable]

For the above example, this was index 4, and I passed the value returned from my first-party cookie variable which I named Cookie – Traffic Type.
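If you send hits with analytics.js directly rather than through GTM, the equivalent is to set a dimensionN field, where N is the index. The sketch below only builds that fields object; buildDimensionFields is a hypothetical helper, and index 4 is simply the index from my example.

```javascript
// Hypothetical helper: builds the field object for a custom dimension,
// e.g. index 4 becomes the field name "dimension4".
function buildDimensionFields(index, value) {
  var fields = {};
  fields["dimension" + index] = value;
  return fields;
}

var fields = buildDimensionFields(4, "Internal");
console.log(fields);

// With analytics.js loaded on the page, you would set it before sending
// the pageview:
//   ga("set", fields);
//   ga("send", "pageview");
```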

If you are using utm_source directly then I imagine you can ignore this custom dimension section as utm_source is already directly sent along to Analytics.

View Filters

Hopefully, you now have a custom dimension that represents traffic source and is populating with the value of either internal or external based on whether or not the cookie you made is present.

Next up we need to actually ensure we filter our data within our views correctly. If you prefer you can of course not set a filter and only use segments based on your custom dimension, but when it comes to analytics, why limit yourself? You already have an unfiltered data view anyway right? RIGHT? Good. So there is no harm in separating your internal/external traffic into corresponding new views.

For your internal traffic view you will simply create a new INCLUDE filter based on the custom dimension value of Internal. This means the view will only ever be populated with data that is classed as internal.

[Screenshot: INCLUDE filter based on the custom dimension]

For an ‘external traffic only’ view you would do the same except you would set the value as External instead.

For your Main/Master view you will probably just want to add an EXCLUDE filter so that you aren’t reporting on any internal traffic. The advantage over an INCLUDE filter is that if something is wrong with your setup, such as undefined or (not set) values appearing, those hits still show up in the view, so you can notice and fix the problem rather than being oblivious to it as you would be in the INCLUDE filter view.

If you are using utm_source, then use that as the dimension you are checking against in your filters instead.
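To see why the EXCLUDE filter is the safer choice for the main view, here is a small simulation of the two filter types over some made-up hits. It is a sketch of the filtering idea only, not how Google Analytics processes data, and applyViewFilter and the sample hits are invented for the example.

```javascript
// Hypothetical simulation of an INCLUDE or EXCLUDE view filter on the
// trafficSource custom dimension.
function applyViewFilter(hits, type, value) {
  var kept = [];
  for (var i = 0; i < hits.length; i++) {
    var matches = hits[i].trafficSource === value;
    if ((type === "include" && matches) || (type === "exclude" && !matches)) {
      kept.push(hits[i]);
    }
  }
  return kept;
}

var hits = [
  { page: "/dashboard", trafficSource: "Internal" },
  { page: "/pricing", trafficSource: "External" },
  { page: "/blog", trafficSource: "(not set)" } // a hit where tracking misfired
];

// INCLUDE External silently drops the broken hit.
var externalOnly = applyViewFilter(hits, "include", "External");
// EXCLUDE Internal keeps it, so the problem stays visible in reports.
var mainView = applyViewFilter(hits, "exclude", "Internal");
console.log(externalOnly.length, mainView.length);
```

In this sample the INCLUDE view ends up with one hit and the EXCLUDE view with two, the extra one being the (not set) hit you would want to investigate.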

Summary

To recap you should now have:

  • A cookie setup tag being fired on your desired trigger
  • A variable for reading the cookie value
  • An analytics filter to make use of trafficSource or utm_source

Then, depending on your implementation, you should also have a custom dimension set up and an updated Analytics Settings variable to pass the value at session or user scope.

There you have it, folks. As always, don’t forget to test and make sure everything is working exactly as expected before rolling it out to production.

Finally, don’t forget to give the tag some time to do its work, or even speed things along by asking all staff to visit the page across all their work devices and spread those cookies!

I hope this post offered some insights into why your IP filters may not be working and offered up some potential solutions for keeping your data clean during the current pandemic.

Let us know in the comments if you have a preferred implementation method. If you’ve been dealing with working from home traffic in your own creative way, feel free to reach out to us. We would love to hear it!

Josh Berry-Jenkins - Technical Director
I’m Josh and I fill the role of Technical Director at Bind Media. I spend an ungodly amount of time tangled in deep analytical webs using Google’s suite of web analytics tools such as Google Tag Manager, Google Analytics and Google Apps Script (to name a few). You’ll generally find me being drip-fed copious amounts of coffee in a dark room, face brightly illuminated by multiple screens.