Imagine having an e-commerce website through which your company sells its products, and a substantial workforce regularly interacts with the site as part of their roles. At the end of the day, your Google Analytics 4 (GA4) reports will include information about website traffic from both your staff (referred to as 'internal' traffic) and potential customers.
The above setup is not ideal. You target your paid marketing campaigns (Facebook Ads, Google Ads, etc.) at potential customers, not your employees. As such, it is useful to exclude the latter from your reports.
Excluding internal traffic in GA4 is a two-step process.
First, we identify internal traffic by instructing GA4 to append a parameter called ‘traffic_type’ with a defined value to events generated by users interacting with the site from specific IP addresses. Then, we create data filters that exclude events with that parameter and value combination. Let’s explore this process in greater detail!
Identifying IP addresses whose traffic should be marked as internal / Setting up traffic rules
I will be using Nuna Baby for this tutorial, as they asked us to audit their GA4 setup. I have not configured anything yet. To demonstrate this, let's generate a test 'page_view' request and examine the details it sends to GA4.
The following passage describes how to explore events sent to GA4 using the Developer Tools.
I accessed the Nuna Baby website in my Chrome browser, hit F12 to open the Developer Tools, and selected the Network tab. With the Developer Tools open, I hit F5 to refresh the page. The Network tab shows all requests the website makes. I am only interested in requests that are made to GA4 servers. To isolate these requests, we can filter them by entering '/collect?v=2&tid=G-' into the URL filter field. This will help us focus on GA4's data. Here is how it looks:
We can then click on one of these remaining requests (that have not been filtered out) and open the Payload tab to reveal what is being sent:
Each parameter is a key/value combination. The Developer Tools parse what would otherwise be an incredibly long POST request URL into this table format, where each key is bolded, and the value follows the colon character.
If appended, the ‘traffic_type’ parameter will appear under the key ‘tt’. As you can see, that key does not appear in the above screenshot, so the parameter was not sent with the ‘page_view’ event we generated.
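The same check can be scripted once a request URL has been copied out of the Network tab. The following is a minimal sketch, not part of any GA4 tooling: the helper names are hypothetical, and the two sample URLs (including the measurement ID 'G-EXAMPLE123') are made up for illustration. It only relies on the two facts described above: GA4 requests contain '/collect?v=2&tid=G-', and the ‘traffic_type’ value travels under the key ‘tt’.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Same substring we typed into the URL filter field of the Network tab.
GA4_FILTER = "/collect?v=2&tid=G-"


def is_ga4_request(request_url: str) -> bool:
    """Return True if the URL looks like a GA4 collect request."""
    return GA4_FILTER in request_url


def traffic_type_of(request_url: str) -> Optional[str]:
    """Return the value of the 'tt' key, or None if it was not sent."""
    query = parse_qs(urlsplit(request_url).query)
    values = query.get("tt")
    return values[0] if values else None


# Hypothetical request URLs, for illustration only.
untagged = "https://www.google-analytics.com/g/collect?v=2&tid=G-EXAMPLE123&en=page_view"
tagged = untagged + "&tt=internal_alexsb"

for url in (untagged, tagged):
    if is_ga4_request(url):
        print(traffic_type_of(url))
```

For the untagged URL the helper returns None, which corresponds to what the screenshot shows before any traffic rule exists.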
Now, let’s head over to https://analytics.google.com/, select our Nuna Baby account, and navigate to a screen that allows us to define internal traffic:
Property > Data Streams > Web > Click on your web stream > Configure tag settings > Show all > Define internal traffic.
Nuna Baby mentioned that they have offices at two locations that can be identified with the following IP addresses:
- Morgantown, PA : 22.214.171.124
- Elverson, PA : 126.96.36.199
I also want to exclude my own IP address, since I plan to keep testing their site quite a bit. As such, I will define three traffic rules in total.
I will click on the ‘Create’ button to create a new internal rule. Here is my traffic rule setup:
It means that interactions (read ‘events’) with the Nuna Baby website coming from my IP address (188.8.131.52) should now have an extra parameter ‘traffic_type’ with a value of ‘internal_alexsb.’
The default ‘internal’ value
It's worth mentioning that the default value for this parameter is 'internal,' and you can retain it as such. That being said, I encourage you to come up with a unique value for each office location that has a unique IP address. Since Nuna Baby has two office locations, each with their own IP address, I will have a total of three traffic rules, each with its own unique ‘traffic_type’ parameter. Here is how those traffic rules look once created:
With the above traffic rules saved, let’s refresh the Nuna Baby homepage one more time and see if we can find that new ‘traffic_type’ parameter as part of the event:
There you have it! The internal traffic rule works. Every event (a user interaction on the Nuna Baby website) coming from my IP address will now include that extra ‘traffic_type’ parameter.
Theoretically, if someone at Nuna Baby’s Elverson office were to visit their site, their events would be tagged with the ‘internal_elverson’ parameter value.
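As a mental model, the three traffic rules behave like a lookup from IP address to ‘traffic_type’ value. The sketch below is only an illustration of that idea, not how GA4 implements it. The IP addresses and the values ‘internal_alexsb’ and ‘internal_elverson’ come from this tutorial; the value ‘internal_morgantown’ and the function name are assumptions made for the example.

```python
import ipaddress
from typing import Optional

# traffic_type value -> IP addresses covered by that traffic rule.
# 'internal_morgantown' is an assumed value name; the other two appear above.
TRAFFIC_RULES = {
    "internal_morgantown": ["22.214.171.124"],
    "internal_elverson": ["126.96.36.199"],
    "internal_alexsb": ["188.8.131.52"],
}


def traffic_type_for(ip: str) -> Optional[str]:
    """Return the traffic_type value for an IP, or None if no rule matches."""
    visitor = ipaddress.ip_address(ip)
    for value, rule_ips in TRAFFIC_RULES.items():
        if any(visitor == ipaddress.ip_address(rule_ip) for rule_ip in rule_ips):
            return value
    return None  # regular visitor: no 'traffic_type' parameter is appended


print(traffic_type_for("188.8.131.52"))   # my own IP address
print(traffic_type_for("126.96.36.199"))  # Elverson office
print(traffic_type_for("8.8.8.8"))        # an unrelated address
```

An address that matches no rule gets no value at all, which is why regular visitors' events carry no ‘traffic_type’ parameter.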
So why not just use the default ‘internal’ parameter value for everyone, you might ask? It is a great question and I will tackle it in a bit.
Creating data filters / Excluding my own traffic
If only there were a way to tell GA4 not to record events that have a specific value of the ‘traffic_type’ parameter... Fortunately, there is one! Let’s see how that works.
In the same Nuna Baby GA4 account I will head over to:
Property > Data settings > Data filters
Once there, I will create a new data filter by clicking on the ‘Create’ button and selecting the ‘Internal traffic’ type. The other option is ‘Developer traffic’. The difference between the two is described at the very end. Here is my data filter setup:
The important settings are the ‘Filter operation,’ which I have set to ‘Exclude’ (since my intention is to exclude my own - internal - traffic), and the ‘Parameter value’, which must match the ‘traffic_type’ value we have set on the ‘Define internal traffic’ screen.
Highlighted below are the three data filters I set up:
It is a good idea to be mindful of the data filter names (first column), as those will appear as values of the default ‘Test data filter name’ dimension in Exploration reports.
Another thing to keep in mind is that GA4 imposes a limit of 10 data filters per property!
Note that a data filter can be in one of three states: Testing, Active, or Inactive. In the ‘Testing’ state, event data is not yet filtered out of reports, but all events generated by traffic from the listed IP addresses still include the ‘traffic_type’ parameter. The Testing state allows us to get a feel for the filter before permanently activating it.
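One way to picture the three states is the small simulation below. It is a simplified model based only on the behavior described in this tutorial, not on GA4's internals: the event dictionaries, the 'test_data_filter_name' key, and the function name are all hypothetical.

```python
from enum import Enum
from typing import Dict, List


class FilterState(Enum):
    TESTING = "testing"
    ACTIVE = "active"
    INACTIVE = "inactive"


def apply_exclude_filter(
    events: List[Dict[str, str]],
    filter_name: str,
    parameter_value: str,
    state: FilterState,
) -> List[Dict[str, str]]:
    """Model an 'Exclude' data filter acting on a list of events."""
    reported = []
    for event in events:
        matches = event.get("traffic_type") == parameter_value
        if matches and state is FilterState.ACTIVE:
            continue  # Active: the matching event never reaches the reports
        labelled = dict(event)
        if matches and state is FilterState.TESTING:
            # Testing: the event is kept, but labelled with the filter name
            labelled["test_data_filter_name"] = filter_name
        reported.append(labelled)  # Inactive: the event passes through untouched
    return reported


events = [
    {"name": "page_view"},  # regular visitor
    {"name": "page_view", "traffic_type": "internal_alexsb"},  # my own visit
]

for state in FilterState:
    print(state.value, apply_exclude_filter(events, "Alex Sb", "internal_alexsb", state))
```

In this model, only the Active state changes how many events are reported; Testing merely adds a label that can be inspected later.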
Seeing the results in reports
Testing a filter before activation ensures that it successfully filters out traffic from the designated IP addresses. Traffic from filtered IP addresses is added to the "Test data filter name" dimension with the filter name as the value.
To find events triggered by a filtered IP address, you can build a free-form exploration with these settings:
- Technique: Free form
- Rows: Test data filter name
- Values: Event count
A data filter can take between 24 and 36 hours to apply.
Here is how my exploration report looks:
Hold on, did we not set up rules and filters for the Elverson office? We sure did! Here are a couple of reasons why none of the events had the ‘internal_elverson’ parameter value added:
- Elverson office personnel could be on holiday and may not have visited the site yet.
- Or, more likely, the IP address referenced in the traffic rule is incorrect.
While the Elverson record is absent, it's evident that a total of 340,835 events were recorded. Of those events, 415 were attributed to the Morgantown office and 101 events were attributed to my own IP address. So what is that (not set)? In simple terms, it means that these events were fired without the ‘traffic_type’ parameter. They are your real website users (potential customers).
What is the benefit of giving each office location a unique ‘traffic_type’ value? You can set up unique data filters to distinguish between different offices (one data filter per traffic rule), so you can have separate event buckets of internal traffic. If we were to have just one ‘internal’ traffic type value and, as such, just one data filter, the above report would look like this:
- (not set): 340,319
- Internal: 516 (415 + 101)
Morgantown and Alex Sb (which is the traffic rule for my IP address) would be merged together and displayed as one bucket called Internal (or whatever name we were to give the data filter). We would not know that something might be wrong with the Elverson filter, because there would be no way to distinguish one internal traffic source (Elverson) from the others (Alex Sb and Morgantown). Also, if these filters had been activated immediately (without first running them in the Testing state), we would not be able to see them in the Exploration reports.
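The arithmetic behind the two versions of the report can be checked with a few lines of Python. The counts are the ones from the exploration report above; the dictionary layout and the helper function are just an illustrative sketch.

```python
from typing import Dict

# Event counts per 'Test data filter name' value, as seen in the exploration.
per_filter = {"(not set)": 340_319, "Morgantown": 415, "Alex Sb": 101}


def merge_internal(counts: Dict[str, int], merged_name: str = "Internal") -> Dict[str, int]:
    """Collapse every bucket except '(not set)' into a single bucket."""
    merged = {"(not set)": counts.get("(not set)", 0), merged_name: 0}
    for name, count in counts.items():
        if name != "(not set)":
            merged[merged_name] += count
    return merged


total = sum(per_filter.values())
single_filter = merge_internal(per_filter)

print(total)          # all recorded events
print(single_filter)  # what a single generic filter would show
```

Note that Elverson is missing from both views, but only the per-office view makes its absence visible, since the merged ‘Internal’ bucket would look plausible on its own.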
Another advantage of using multiple filters, as opposed to a single generic filter, is the ability to selectively activate those proven to be effective:
I will go ahead and change their state from Testing to Active. You do this by clicking the Activate link in the three-dot menu.
With the filter active, the Developer Tools still show the ‘traffic_type’ parameter appended as before. Yet, the event no longer comes through in the DebugView, as if it never took place:
It means we can continue testing the GA4 event setup using the Developer Tools, but the DebugView feature of GA4 is no longer available for troubleshooting. ‘What a bummer,’ you will say. Can’t we have the best of both worlds? Yes, we can! As a tracking specialist, my role is to debug GA4 issues, so it is essential that I have access to the DebugView feature. At the time of creating our data filters, we had to choose between two types, internal and developer:
Since I tackle tracking issues and want to continue using the DebugView, I should have selected the ‘Developer traffic’ type. And that’s the difference between the two! To remedy the setup, I removed the ‘Alex Sb’ traffic rule, removed the ‘Alex Sb’ data filter, and created a new data filter to exclude developer traffic:
I gave it the name ‘Filter out developer traffic,’ and set it to run in the Testing state for now. Once the filter is activated, GA4 will exclude my event data from the final reports every time I explore the site in debug mode. The ‘traffic_type’ parameter will no longer be appended to events from my IP address, but I will continue to be able to use the DebugView feature. The best of both worlds, that is!
In conclusion, by meticulously identifying internal traffic, setting up tailored traffic rules, and employing data filters, you can ensure that your Google Analytics 4 reports accurately reflect customer interactions and provide valuable insights for optimizing your marketing strategies.