Four Types of Web Analytic Data

So did you know that there are different ways analytics data about your website gets collected?  Of course you did; you just might not have really thought about it before.  I mean, who stays up thinking about this sort of stuff besides Shelby Thayer, myself, and a few other people I will leave unnamed to protect their identities?  So instead of making you stay up all night trying to come up with four different types, I'll just write them out here.  You can thank me for the extra sleep later.

So web analytics data is also known as clickstream data.  There are four main ways this sort of data is captured: web logs, web beacons, JavaScript tags, and packet sniffing.  Let's see if I can briefly describe each.

1. Web Logs

Web logs are the original form of web analytics.  This is the data collected in the logs of your web server.  Logging was originally used to analyze errors (yes, error logs) and has since been expanded to track file calls and downloads.  Every time a file on your web server is requested and delivered, it gets logged, so a typical page can generate tens or even hundreds of log entries depending on how the page is set up.  File calls include CSS, images, JavaScript, and of course the actual page file.  So even if you don't know it, your web server is collecting and logging data about your site.

Parsing web log data can be tricky, but there are lots of free solutions that will do it for you.  Web logs are also the best method for seeing robot behavior: when a search engine bot crawls your site it shows up in your web logs, but not in most of the other methods.

Web logs can be difficult to translate into actual business data because what they capture is technical data.  For understanding problems with your site, like page issues, bandwidth usage, server load, and other technical details, they can be very valuable.  And, like I said, you have them whether or not you ever choose to look at them.
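
If you want a feel for what's in those logs, here's a quick sketch of parsing them yourself.  This is just my own illustration, not from any particular tool: it assumes the standard Apache/NGINX "combined" log format, a file called access.log, and a made-up list of bot names.  It counts page views, skips the CSS/image/JavaScript calls mentioned above, and tallies bot hits separately.

```javascript
// Minimal sketch: count page views and bot hits from a combined-format access log.
// Assumptions: Node.js, a log file at ./access.log, and the standard
// Apache/NGINX "combined" format. The bot list is illustrative, not exhaustive.
const fs = require('fs');
const readline = require('readline');

const LINE = /^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$/;
const STATIC = /\.(css|js|png|gif|jpe?g|ico)(\?|$)/i;   // file calls that aren't page views
const BOTS = /(googlebot|bingbot|slurp|crawler|spider)/i;

let pageViews = 0;
let botHits = 0;

const rl = readline.createInterface({ input: fs.createReadStream('access.log') });
rl.on('line', (line) => {
  const m = LINE.exec(line);
  if (!m) return;                                        // skip malformed lines
  const [, , , method, path, status, , , userAgent] = m;
  if (BOTS.test(userAgent)) { botHits += 1; return; }    // crawlers show up here, not in JS tags
  if (method === 'GET' && status === '200' && !STATIC.test(path)) pageViews += 1;
});
rl.on('close', () => console.log({ pageViews, botHits }));
```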

2. Web Beacons

Remember in the late '90s when banner ads and pop-ups seemed to be everywhere?  Web beacons were behind a lot of that.  Beacons measure hits; they were originally created as a smart way to serve dynamic graphics (and to do more sinister things).  Today web beacons are still quite popular and are the main way that email marketing solutions track open and view rates.  So if you get an email that asks you to enable images and you choose not to, you are not recorded as opening the email.  Of course, the links in the message commonly use methods similar to destination URL builders to track click-throughs, and this can help monitor open rates more accurately for users who choose not to “enable images”.

Web beacons are easy to implement; they can capture data about users clicking on banner graphics and store that data on a remote server.  A beacon can be configured to record pretty much any sort of data you might be interested in collecting, including setting cookies.  Because beacons track through images, and robots do not request images, you cannot track bot behavior with them.  They are especially valuable for easily tracking data across multiple domains.  Web beacons were an early attempt to grab business intelligence and have been widely adopted by marketing and advertising companies.  They have also been given a black eye by their association with spyware and spammers.
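
If you're curious what a beacon actually looks like, here's a bare-bones sketch.  The collector domain and parameter names are made up for illustration; the real trick is simply that a tiny invisible image request carries the data off to a remote server.

```javascript
// Minimal web beacon sketch: request a 1x1 image from a remote collector.
// The collector at tracker.example.com is hypothetical; it would log the
// query string (and could set a cookie) when it serves the transparent GIF.
function fireBeacon(data) {
  var params = Object.keys(data)
    .map(function (k) { return encodeURIComponent(k) + '=' + encodeURIComponent(data[k]); })
    .join('&');
  // Creating the Image fires the request; nothing visible is added to the page.
  new Image().src = 'https://tracker.example.com/pixel.gif?' + params;
}

// e.g. record that a banner was displayed on this page
fireBeacon({ campaign: 'spring-open-house', page: location.pathname });
```

The email open tracking mentioned above works the same way, except the image tag is baked right into the HTML of the message, which is exactly why blocking images blocks the tracking.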

3. Packet Sniffing

Packet sniffing is probably the method that everyone skips or doesn't think about.  It is more of a network administrator function: the sniffer can be either a piece of hardware that sits between your server and the web or software installed on your web server.  All traffic passes through it, which basically lets you see exactly what is happening and being surfed on your site, in a linear fashion.

The biggest problems with packet sniffing revolve around having the additional resources to ensure that the user experience is not harmed by this extra step, and of course the privacy concerns.  On the plus side, packet sniffing lets you track and monitor robot data while also capturing the few users who have JavaScript disabled (best estimates say slightly less than 5%).  Finally, even though packet sniffing does a good job of gathering lots of raw data, it is not always so great at providing aggregated business intelligence.  It can be used for individual usability testing if you have the time to crawl through the data.
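
To give a rough idea of the "software installed on your web server" flavor, here's a sketch of my own (it assumes tcpdump is installed, the script runs with capture privileges, and the traffic is plain HTTP on port 80): it watches the wire and prints request lines as they go by.

```javascript
// Bare-bones packet-sniffing sketch: watch port 80 and print HTTP request lines.
// Assumes tcpdump is installed and this script runs with capture privileges.
const { spawn } = require('child_process');

// -l line-buffered output, -n skip DNS lookups, -A dump packet payloads as ASCII
const sniffer = spawn('tcpdump', ['-l', '-n', '-A', 'tcp port 80']);

sniffer.stdout.on('data', (chunk) => {
  // Pull out request lines such as "GET /admissions/ HTTP/1.1"
  const requests = chunk.toString().match(/^(GET|POST) \S+ HTTP\/1\.[01]/gm);
  if (requests) {
    requests.forEach((line) => console.log(new Date().toISOString(), line));
  }
});

sniffer.stderr.on('data', (err) => console.error(err.toString()));
```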

4. JavaScript Tags

JavaScript tags have become kind of the de facto standard of the web and are where most of the development continues today.  A while back, when I wrote my post comparing web analytics solutions, all of them were JavaScript based.  If you don't know the difference, most likely you are using JavaScript tags to track your website.

JavaScript tags work by installing a small snippet of code on your web pages that calls a JavaScript function to collect and gather the data.  The script can be called from a remote location, and the data can also be stored remotely so it doesn't add load to your web server.  Because bots don't read JavaScript, you also won't collect their crawling data.
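
Here's roughly what that snippet kicks off behind the scenes.  This is only a sketch with a made-up collector URL (real tools like Google Analytics do a lot more), but it shows the idea: gather some client-side details, tie them to a visitor cookie, and ship them off.

```javascript
// Minimal JavaScript-tag sketch: the snippet dropped onto every page gathers
// client-side details and sends them to a hypothetical remote collector.
(function () {
  // Reuse (or set) a first-party visitor id cookie so repeat visits can be tied together.
  var match = document.cookie.match(/(?:^|; )visitor_id=([^;]+)/);
  var visitorId = match ? match[1] : Math.random().toString(36).slice(2);
  document.cookie = 'visitor_id=' + visitorId + '; path=/; max-age=' + (60 * 60 * 24 * 365);

  var data = {
    vid: visitorId,
    page: location.pathname,            // which page was viewed
    title: document.title,
    referrer: document.referrer,        // where the visitor came from
    screen: screen.width + 'x' + screen.height
  };
  var params = Object.keys(data)
    .map(function (k) { return encodeURIComponent(k) + '=' + encodeURIComponent(data[k]); })
    .join('&');

  // Same image-request trick as a web beacon, but carrying much richer data.
  new Image().src = 'https://stats.example.com/collect?' + params;
})();
```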

So Which One, Guru?

Avinash Kaushik gives JavaScript tags his recommendation in Web Analytics: An Hour a Day if you have to choose only one method.  I 100% agree with this.  They are very easy to install, gather lots of data, and there are many services out there that offer them, many for free.  Also, most of the information in this post came from this wonderful book, so if you want to learn more, GO READ IT!  Book review coming soon… I promise.

3 Responses to “Four Types of Web Analytic Data”

  1. Says:

    Thanks for the linkety link, Kyle! :)

    If you have to choose only one, javascript is, by far, the best. Of course, a combination of logs and js would be best if you had the capability of doing both. Everyone has the *capability* of using logs as well, it’s the analysis part that’s hard. :)

    I also don’t see web beacons as a sole data source for WA, but rather something to be used in combination with your WA package.

    We also have to take into consideration how young WA is in our industry.

    I don’t know how many times I’ve read debates over which type of data to use when and where in WA circles. For higher ed (at least right now) it doesn’t need to get that complicated.

    Focus on three things:

    Implement a javascript WA tool.
    Implement it *correctly*.
    Analyze, don’t report.

    If we can get higher ed site owners to do that, we’re golden … for now … :)

  2. Says:

    “Web logs can be difficult to translate into actual business data as they capture technical data.”

    Try using a non-free solution. Even mid-level weblog solutions blow GA away when it comes to business data and reporting.

  3. Says:

    Less than a week later and Google adds advanced segments to GA! Even the default ones they added are really nice. And being able to compare 3 segments at a time is pretty awesome.

    You should work on a post showing the advanced reports you can get by using multiple AND conditions and OR conditions when creating segments. Go go go!