Variations between Google and Zeald Stats

Why Do the Google Analytics Stats Not Match the Zeald.com Stats?


Some Zeald.com website owners have asked why the stats at the backend of their website do not match up with the stats they get from Google (Analytics or Webmaster Tools). Brent Kelly, co-founder of Zeald.com, answers that question here:
Looking into the differences gets rather technical as the analytics geekery level increases. I'll try & give a basic overview below:

There are traditionally 2 main ways that web analytics providers track website statistics: log file analysis or embedded javascript.

Log file analysis uses 'log files' that reside on the web server & record every single file that gets loaded from it. Software reads through these logs & follows a set of rules to build up summarised data, which is presented in the reports.
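
To make the log file approach a little more concrete, here is a minimal Python sketch. It assumes the server writes Apache-style Common Log Format lines. The sample lines, the summarise function & the rule 'only count requests answered with status 200' are made up for illustration and are not taken from any real stats package.

```python
import re
from collections import Counter

# Assumed format: Apache-style Common Log Format. Real servers may log differently.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def summarise(lines):
    """Count successfully served requests per path, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Illustrative rule: only a 200 response counts as a page load.
        if match and match.group("status") == "200":
            counts[match.group("path")] += 1
    return counts


# Made-up sample lines, for illustration only.
SAMPLE = [
    '203.0.113.5 - - [01/Jan/2024:09:00:01 +1300] "GET /index.html HTTP/1.1" 200 5120',
    '203.0.113.5 - - [01/Jan/2024:09:00:40 +1300] "GET /products.html HTTP/1.1" 200 7300',
    '198.51.100.7 - - [01/Jan/2024:09:02:10 +1300] "GET /index.html HTTP/1.1" 200 5120',
    '198.51.100.7 - - [01/Jan/2024:09:02:15 +1300] "GET /missing.html HTTP/1.1" 404 310',
]

page_loads = summarise(SAMPLE)
print(page_loads)
```

Changing the rule (for example, also counting redirects) would change the totals, which is exactly why two log analysers can disagree over the same log file.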

Embedded javascript systems rely on a javascript file being embedded into each page of the user's website. The file is downloaded each time a visitor loads a page, allowing it to track various data about the visitor & the page they are accessing.

Each of these methods has pros & cons, but the important thing to realise is that NEITHER is perfect. Due to the nature of the web, we can never know for sure how many different people actually see your website, and therefore every reporting package has to have its own set of rules to decide things such as 'how do we know when one visit ends & a new one starts', or 'how do we determine a visitor is unique'.

Because of this, NO two stats packages will ever report the same figures & there may well be significant variations in the figures they present.

I found a reasonably useful blog post on developmentseed.org that does a bit of a 'face off' between the two methods, detailing a few of their pros and cons: http://www.developmentseed.org/blog/analytics/faceoff_google_serverlogs.

To relate this back to ZES vs GA, GA is an 'embedded javascript' variation. The system we use in ZES is neither. We have a 3rd way that is probably closest to the 'Enterprise Solutions' mentioned in the blog post, which the author doesn't bother covering due to their unaffordability for most clients ;).

Basically because of the way we display our websites, we have access to a range of information that is available to neither of the above methods, as well as a lot of information that allows us to offset the 'cons' of each of the above methods.

For example, one of the major problems with Log File analysis is having to come up with formulas to determine what collection of page loads constitutes a 'visit'.

Our system gives us access to information that is not available to log file analysis. It lets us use both methods that log file analysis would usually use, plus a few others based on this additional information, to achieve a much more accurate count of visits to the website.

One of the major problems with javascript-based methods is that they track no data about robots that visit the website, or about any 'security conscious' users who have javascript and/or cookies disabled in their browser. Therefore you lose the ability to investigate search engine spider related issues with your site. Your visitor numbers can also be reported as lower than they really are, because some visitors are excluded. Our method allows us to capture and report on all this information.

This is obviously a very quick & simplified overview of some of the differences between the different approaches to web analytics. However, I hope it illustrates the broad array of possibilities that exist when it comes to determining what exactly is causing variations between different analytics packages.

Such variation is not at all uncommon - in fact the first search I performed on Google on the topic revealed some interesting feedback from a Google Analytics user. They were comparing the existing javascript-based GA to an earlier version (called Urchin) which worked more like the ZES system does. They were experiencing on average a 20% variation in their visitor data between the two systems, with the old one consistently reporting higher levels of visitors. They went on to try & figure out why exactly this would be and made various broad sweeping statements - however I believe they are missing the point that in ALL cases, different software will report different figures.

This is why web analytics is not so useful as a tool to determine the exact number of visitors a site is receiving, but more as a tool to determine trends - is the site improving? Have my promotions resulted in increased traffic? And so on.

Anyway in summary, here are a few ideas that pop into my mind that could be causing the variations in visitor counts:

* The 'half day' of stats. Can probably rule these out by setting filters on each system to start from the 28th March.
* The ZES system includes 'tricky' robots. While we do our best to filter out robots from the visitor stats, often robots can be tricky and disguise themselves as a normal user, making it near impossible for us to tell they are not. A javascript-based system such as GA by default excludes this data, as robots are generally not using a browser & therefore the javascript file that tracks the visit does not download.
* ZES records a page load as soon as the server receives a request for the page. GA only records the page load if the server then responds with the full page & the HTML fully downloads, causing the javascript file to fully download. Only at this point is the page load recorded. Therefore, if a user hits stop, quickly closes the page before it loads, or clicks through several pages quite quickly, page loads could theoretically be missed by GA.
* GA will have a completely different set of rules for determining what collection of page loads constitutes a visit. For example, we have a rule that states if a user doesn't access the site for so-many-minutes (can't remember off the top of my head how many, but let's say 30 minutes for argument's sake), then their visit has ended, and if they start up again, this is a new visit. For GA, this may be 60 minutes, which would cause some visits to be amalgamated, thus reporting a lower number. Or perhaps they don't even use such a rule & use another set of rules to determine what constitutes a visit. Even the smallest change in this area could have a large impact on the statistics.
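
The effect of the visit timeout rule in the last point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 30 and 60 minute figures are the 'for argument's sake' values from above, not the real settings of either system, and the count_visits function & the timestamps are made up.

```python
from datetime import datetime, timedelta


def count_visits(timestamps, timeout_minutes):
    """Count visits for one visitor: a gap longer than the timeout starts a new visit."""
    timeout = timedelta(minutes=timeout_minutes)
    visits = 0
    previous = None
    for current in sorted(timestamps):
        if previous is None or current - previous > timeout:
            visits += 1
        previous = current
    return visits


# Made-up page loads from a single visitor over one morning.
loads = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 10),
    datetime(2024, 1, 1, 9, 55),   # 45 minutes after the previous load
    datetime(2024, 1, 1, 11, 30),  # 95 minutes after the previous load
]

visits_30 = count_visits(loads, 30)  # hypothetical 30 minute rule
visits_60 = count_visits(loads, 60)  # hypothetical 60 minute rule
print(visits_30, visits_60)  # prints "3 2"
```

The same four page loads count as 3 visits under the 30 minute rule but only 2 under the 60 minute rule, because the 45 minute gap is amalgamated into one visit.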


One other area where I noticed more significant differences between GA & ZES is the individual page loads on the more detailed traffic reports. ZES is significantly higher, and as far as I can tell, one of the main contributing factors to this is that the 'page load' statistic in ZES does not attempt to differentiate between real users and robots. It merely reports the number of times a page was downloaded. GA, as stated above, doesn't have the ability to track page loads by robots, and therefore excludes any such page loads from its statistics.
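
As a rough Python sketch of why the two page load figures differ, the example below counts the same requests once with robots included and once with them filtered out. The marker list, the is_robot function & the requests are hypothetical; this is not how ZES or GA actually identifies robots, and real detection is far more involved.

```python
# Hypothetical markers; real robot detection is far more involved.
ROBOT_MARKERS = ("bot", "spider", "crawler")


def is_robot(user_agent):
    """Naive check: does the user agent string contain a known robot marker?"""
    agent = user_agent.lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


# Made-up requests: (path, user agent).
REQUESTS = [
    ("/index.html", "Mozilla/5.0 (Windows NT 10.0) Firefox/120.0"),
    ("/index.html", "Googlebot/2.1 (+http://www.google.com/bot.html)"),
    ("/index.html", "Mozilla/5.0 (compatible; bingbot/2.0)"),
    ("/products.html", "Mozilla/5.0 (Macintosh) Safari/605.1.15"),
]

all_loads = len(REQUESTS)  # every download counts, robots included
human_loads = sum(1 for _, agent in REQUESTS if not is_robot(agent))
print(all_loads, human_loads)  # prints "4 2"
```

A 'tricky' robot that sends a normal browser user agent would pass this check and be counted as a real user, which is the problem described in the list above.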

Anyway, I hope that novel covers any questions. It's quite difficult to go into too much detail with some of these reasons without getting incredibly technical, down to how things all go on behind the scenes of the internet.


Brent Kelly.