Data discrepancies are one of the most frequently discussed topics for digital advertisers and publishers. They come up a lot because they relate to one of the most fundamental aspects of any ad serving business. Data is right at the heart of what we do, and making sure that data is as accurate as possible is incredibly important.
We make every effort to make sure our data is 100% accurate. Despite this, however, some things are beyond our control.
Some examples of these are:
- Outdated browser software
- Privacy-protection software
- Spiders and bots that can throw off counts
- One person using multiple computers or browsers
- More than one person using a computer or browser
- User web browser settings or modifications
- DNS issues
- Hardware failures
- Poor or slow Internet connections
Based on these variables, we have an acceptable margin for error of 5-10%.
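To check a pair of numbers against this margin, the discrepancy can be expressed as a percentage. The snippet below is a minimal sketch, not an official formula: the function names are hypothetical, and it assumes the difference is measured relative to the count shown in our system and that the upper bound of 10% is the cut-off.

```python
def discrepancy_pct(our_count: int, third_party_count: int) -> float:
    """Absolute difference between the two counts, as a percentage of our count.

    Assumption: our count is used as the baseline (denominator).
    """
    if our_count <= 0:
        raise ValueError("our_count must be greater than zero")
    return abs(third_party_count - our_count) * 100 / our_count


def within_margin(our_count: int, third_party_count: int,
                  margin_pct: float = 10.0) -> bool:
    """True if the discrepancy is inside the acceptable margin (assumed 10%)."""
    return discrepancy_pct(our_count, third_party_count) <= margin_pct


print(discrepancy_pct(1000, 930))  # 7.0
print(within_margin(1000, 930))    # True
print(within_margin(1000, 700))    # False
```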
With these things in mind, let’s look at a few areas where data discrepancies may occur and what you can do to fix them.
Bigger is better
When using third-party tracking tools, like Voluum or Google Analytics (GA), a relatively small data pool can produce a very high margin of error, mainly because of the issues listed above. The smaller the data sample, the more impact a technical glitch or browser issue (for example) will have overall. Conversely, the bigger the pool of data you have to analyse, the smaller the impact these issues will have on it.
Keep this in mind. Obviously, the best thing is to make sure everything is set up correctly in the first place, but if you do see a big difference, wait a bit. It may be that the data pool is being impacted heavily by some outside force. It’s very hard to put a precise figure on how large your data sample should be before you start making changes to your setup, but if you’re seeing error rates around 30% just after launching an ad on your site, it may be best to leave it a while and see if that comes down.
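To illustrate why sample size matters, the sketch below applies the same number of stray events (bot hits or blocked requests, for example) to data pools of different sizes. The figure of 30 stray events and the function name are made up purely for illustration.

```python
def noise_impact_pct(sample_size: int, stray_events: int) -> float:
    """Share of the sample, in percent, affected by a fixed number of stray events."""
    if sample_size <= 0:
        raise ValueError("sample_size must be greater than zero")
    return stray_events * 100 / sample_size


STRAY_EVENTS = 30  # hypothetical: events miscounted because of bots, glitches, etc.

for sample_size in (100, 1_000, 100_000):
    impact = noise_impact_pct(sample_size, STRAY_EVENTS)
    print(f"{sample_size:>7} events: {impact:.2f}% affected")
```

The same 30 stray events account for 30% of a 100-event sample but only 0.03% of a 100,000-event sample, which is why an early discrepancy often shrinks as more data comes in.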
Tracking codes, tracking codes everywhere
Although it may seem a bit obvious, this is something that’s often overlooked, and it should be one of the first things to investigate if you notice data discrepancies. Making sure that your tracking tool of choice has its code implemented on the same pages as your ad code is very important.
If you fail to do this, you are guaranteed to see a significant difference in results. If you see a high error rate (anything above 50%), double-check that your tracking code appears wherever your ads appear.
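One way to run this check is to scan the HTML of each page for both snippets. The following is a rough sketch under stated assumptions: the two marker strings are placeholders (not real script URLs), and the page HTML is assumed to have been fetched already and passed in as plain strings.

```python
# Hypothetical markers: replace with a string that uniquely identifies
# your real ad code and your real tracking code.
AD_MARKER = "ads.example.com/ad.js"
TRACKER_MARKER = "tracker.example.com/t.js"


def pages_missing_tracker(pages: dict[str, str]) -> list[str]:
    """Return the pages that contain the ad code but not the tracking code."""
    return sorted(
        url
        for url, html in pages.items()
        if AD_MARKER in html and TRACKER_MARKER not in html
    )


# Made-up example pages, keyed by path.
pages = {
    "/home": '<script src="//ads.example.com/ad.js"></script>'
             '<script src="//tracker.example.com/t.js"></script>',
    "/landing": '<script src="//ads.example.com/ad.js"></script>',
    "/about": "<p>No ads here</p>",
}

print(pages_missing_tracker(pages))  # ['/landing']
```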
Comparing like for like
Another important thing to watch out for is which metrics you are comparing. If you have an ad that is set up for CPM, then your analytics tool needs to be monitoring impressions for that ad specifically. You may have a tool that is tracking overall page views, but a page view is different to an impression.
Typically, a page view is counted every time a page is loaded by a user’s web browser, but an impression is only registered when a specific area of a page is viewed (in this case, an ad). If your tracking system is registering page views then, most likely, that software will register more events than ours.
Data discrepancies can also result from problems with third-party tracking codes, such as not having the code on the right pages (see above) or incorrect settings in need of an update. To help debug these kinds of problems, you may be able to use tools like Firebug or Web Inspector to make sure the requests are sent to the tracker.
Also, watch out for multiple tracking codes for the same software appearing on the same page, as this can frequently lead to over- and under-reporting.
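A quick way to spot duplicated tracking code is to count how many times the snippet occurs in a page's HTML. This is only a sketch: the marker string is a placeholder, and a plain substring count assumes the marker appears exactly once per copy of the snippet.

```python
# Hypothetical marker: a string that appears once in each copy of your tracking snippet.
TRACKER_MARKER = "tracker.example.com/t.js"


def count_tracker_snippets(html: str) -> int:
    """Number of times the tracking marker occurs in the page HTML."""
    return html.count(TRACKER_MARKER)


# Made-up page where the same tracker has been pasted in twice.
html = (
    '<head><script src="//tracker.example.com/t.js"></script></head>'
    '<body><script src="//tracker.example.com/t.js"></script></body>'
)

copies = count_tracker_snippets(html)
if copies > 1:
    print(f"Tracking code appears {copies} times on this page")
```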
Watch those timezones
One more thing to watch out for, if you’re seeing discrepancies in your tracking data, is reporting timezones. Our system operates on Paris time, so if you’re using a third-party tracking tool that uses a different timezone, you will most likely see a difference day-to-day. As an example, let’s say your tracking tool registers 1000 clicks in a day, but only 700 are shown for the same day in our system. This could be because the missing 300 are appearing under a different date because of the timezone difference.
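The effect can be reproduced by grouping the same click timestamps by date in two different timezones. The sketch below assumes the third-party tool reports in UTC and uses made-up timestamps; the function name is hypothetical. It relies on Python's standard `zoneinfo` module and falls back to a fixed UTC+1 offset (valid for Paris winter time only) if no timezone database is installed.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

try:
    PARIS = ZoneInfo("Europe/Paris")
except ZoneInfoNotFoundError:
    # Fallback for systems without a timezone database; ignores daylight saving.
    PARIS = timezone(timedelta(hours=1))


def clicks_per_day(timestamps, tz):
    """Count clicks per calendar date as seen in the given timezone."""
    return Counter(ts.astimezone(tz).date().isoformat() for ts in timestamps)


# Made-up clicks recorded late in the evening, UTC.
clicks = [
    datetime(2024, 3, 1, 21, 30, tzinfo=timezone.utc),
    datetime(2024, 3, 1, 22, 45, tzinfo=timezone.utc),
    datetime(2024, 3, 1, 23, 15, tzinfo=timezone.utc),
]

print(dict(clicks_per_day(clicks, timezone.utc)))  # {'2024-03-01': 3}
print(dict(clicks_per_day(clicks, PARIS)))         # {'2024-03-01': 2, '2024-03-02': 1}
```

The totals match across both reports; only the day each click is attributed to changes. Comparing totals over a longer date range, or converting one report to the other's timezone first, should make this kind of discrepancy disappear.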