Demystifying Digital Analytics

Originally published 15 June 2010

Consider the following facts:

1. In 2009, nearly 4 out of 5 U.S. Internet users visited a social networking site on a monthly basis.1
2. According to Forrester, interactive marketing will near $55 billion and represent 21 percent of all marketing spend in 2014.2

These market statistics demonstrate that the area of digital platforms is exploding. Consumers perform multiple tasks on digital platforms such as browsing content on websites, posting comments in response to an in-store experience, referring a friend on a social networking website, accessing articles by key opinion leaders on product review websites, etc. Analyzing the content generated from these activities on digital platforms offers huge potential to gain insight into the consumer’s psyche.

Figure 1: Digital Platforms are Exploding

Digital platforms can be classified into six broad categories in which customers engage:

  1. Social Networking Platforms: Here, users have a “friends list” and they offer those friends a peek into their interests in a controlled fashion.
  2. Opinion Platforms: Here customers congregate to discuss specific areas of interest. For example, is a trusted source of restaurant information for food lovers in their respective areas. 
  3. Neutral Niche Interest Sites: These serve specific areas of interest; e.g., Flickr is a photography forum and Goodreads is a place to review and recommend books.
  4. Real-Time Notifications: Twitter is one place where abbreviated messages from users are posted to subscribers in real time. This is a one-of-a-kind platform where short bursts of messages give a peek into a customer’s feelings and perceptions. Companies also use Twitter to announce current news and promotions.
  5. Company Created Platforms: Many CPG companies create niche platforms, e.g., targeting soccer moms and teenage kids to engage with each other.
  6. Key Opinion Leader (KOL) Platforms: Oprah Winfrey is a highly trusted celebrity whose product reviews on television and online are valued heavily by consumers who share similar values and interests. Thus, reviews by such an opinion leader can have proportionate impact, for example, on the success of a new product being launched.

Figure 2: Six Categories of Digital Platforms

Depending upon the digital platform and the kind of engagement that happens, multiple flavors of digital data points are generated and can be analyzed. This data, which was previously unavailable to organizations, can now be tapped for consumer intelligence from the various types of digital platforms. Data classifications can include:

  • Clickstream data
  • User registration data
  • Consumer comments and complaints
  • KOL reviews of products such as electronics and automobiles
  • Email referrals to friends advocating a visit to a website or product referral
  • “Build your own product” configuration data on a website
  • Click events on gaming applications.

Figure 3: Digital Data Collection Points

Now that we have established the breadth of data points generated on various digital platforms, here are some of the analytical processes and applications that can be created and executed.

Opinion Platforms - Sentiment Analysis
  • What is the sentiment index (ratio of positive to negative sentiments) for my brand?
  • Is there a correlation between buzz velocity on (measured by number of posts regarding my product) and my in-store sales?
  • What is the rate at which positive entries regarding my brand are emerging on
  • Theme extraction
  • Correlation analysis
  • Unstructured text mining
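
The sentiment index question above can be illustrated with a minimal sketch. It assumes each entry has already been labeled "positive", "negative" or "neutral" by an upstream text mining step; the sample labels and the function name sentiment_index are hypothetical and not taken from any particular tool.

```python
from collections import Counter

def sentiment_index(labels):
    """Return the ratio of positive to negative entries.

    Returns None when there are no negative entries, because the
    ratio is undefined in that case.
    """
    counts = Counter(labels)
    if counts["negative"] == 0:
        return None
    return counts["positive"] / counts["negative"]

# Hypothetical labels produced by an upstream sentiment classifier.
labels = ["positive", "negative", "positive", "neutral", "positive", "negative"]
index = sentiment_index(labels)  # 3 positive entries / 2 negative entries
```

Tracking this value per week or per month is one simple way to see whether brand sentiment is improving or deteriorating.
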
Key Opinion Leader (KOL) Keyword Analysis
  • What are the top 5 keywords used by KOLs to express and describe features of products and their evaluation of them?
  • Keyword extraction
  • Sentiment inference
  • Theme extraction
  • Unstructured text mining
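
As a rough sketch of the keyword extraction step, the top keywords in a set of KOL reviews can be found with a simple word count. The stopword list, the sample reviews and the function name top_keywords are assumptions made for illustration; a production system would likely use a fuller text mining pipeline.

```python
import re
from collections import Counter

# A tiny, illustrative stopword list (assumption, far from complete).
STOPWORDS = {"the", "a", "an", "and", "is", "it", "but", "of", "to", "with", "this"}

def top_keywords(reviews, n=5):
    """Return the n most frequent non-stopword tokens across all reviews."""
    words = []
    for text in reviews:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS:
                words.append(word)
    return Counter(words).most_common(n)

# Hypothetical review snippets from key opinion leaders.
reviews = [
    "The battery is great",
    "Great battery, great screen",
]
keywords = top_keywords(reviews, n=2)
```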

Online Product Recommendation Engine 

  • Given past history of keyword searches, product photo clicks and purchases, what is the next likely purchase to be recommended to the registered user?
  • Collaborative filtering
  • Market basket analysis
  • Logistic regression 

Twitter Buzz Analysis

  • Is there a correlation between number of tweets regarding a newly launched product and its success in the market?
  • Correlation analysis
  • Unstructured text mining
  • Logistic regression
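
The correlation question above can be sketched with a hand-written Pearson correlation coefficient. The weekly figures below are invented for illustration, and a high coefficient would indicate association only, not that tweets cause sales.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical weekly tweet counts and unit sales for a new product.
weekly_tweets = [120, 340, 560, 410, 800]
weekly_units = [1000, 1900, 3100, 2500, 4200]
r = pearson(weekly_tweets, weekly_units)
```

A value of r close to 1 would suggest that weeks with more tweets also tend to be weeks with more sales.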

Social Network Analysis (SNA) - Viral Index

  • What is the average number of people being referred to for the gaming application designed to increase brand recall value on Facebook/Orkut?
  • Content forwarding link analysis
  • Network link analysis
  • Geospatial modeling

User-Generated Content (UGC) Analysis of Microsites

  • What are the most common themes in the user forums on football, baseball or cricket microsites?
  • Theme extraction
  • Unstructured text mining

Ad Gaming Analysis Quizzes/Puzzles


  • What is the time of day at which most gamers come to solve puzzles?
  • Which geography accounts for most people playing quiz games?
  • Visual exploratory data analysis using box plot
  • Geospatial modeling

Online Product Configuration Analysis

  • Which piece of artwork is the most popularly configured on a "Customize your own T-shirt" website?
  • Is red more popular than green for a medium T-shirt?
  • Hypothesis testing
  • Visual data distribution analysis using box plots
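
One way to approach the "red versus green" question is a simple two-sided test of whether the two colors are equally popular. The sketch below uses the normal approximation to the binomial distribution and assumes only these two colors are being compared and that each configuration is independent; the counts are made up.

```python
from math import erfc, sqrt

def preference_test(count_a, count_b):
    """Test the null hypothesis that options A and B are equally popular.

    Uses the normal approximation to the binomial with p = 0.5.
    Returns the z statistic and the two-sided p-value.
    """
    n = count_a + count_b
    z = (count_a - n / 2) / sqrt(n / 4)
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Hypothetical counts of medium T-shirts configured in each color.
z, p_value = preference_test(60, 40)
```

A small p-value (for example, below 0.05) would suggest that the observed preference for red is unlikely to be due to chance alone.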

Multichannel Online/Offline Analysis

  • What % of customized online coupons are redeemed in nearby stores?
  • Redemption analysis
  • Exploratory data analysis
  • Hypothesis testing

Digital Brand Health Monitoring

  • Can I create an alerting mechanism when the number of entries containing the keyword "allergy" regarding a soap crosses more than 30 entries per month?
  • Text mining
  • Rule-based configurable alerts
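
The alerting rule described above can be sketched as a configurable keyword count per month. The entry format (a month string plus the text), the sample data and the function name monthly_keyword_alerts are assumptions for illustration.

```python
from collections import Counter

def monthly_keyword_alerts(entries, keyword, threshold):
    """Return the months in which more than `threshold` entries mention `keyword`.

    `entries` is an iterable of (month, text) pairs, with month as "YYYY-MM".
    Matching is a case-insensitive substring check.
    """
    keyword = keyword.lower()
    counts = Counter(month for month, text in entries if keyword in text.lower())
    return sorted(month for month, count in counts.items() if count > threshold)

# Hypothetical feedback entries: 31 mentions in May, 30 in June.
entries = [("2010-05", "Got an Allergy after using the soap")] * 31
entries += [("2010-06", "possible allergy?")] * 30
alerts = monthly_keyword_alerts(entries, "allergy", threshold=30)
```

Only May triggers an alert here, because the rule fires when the count exceeds 30 rather than when it reaches 30.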

Clickstream Segmentation

  • What are the definable segments that display clickthrough and visit behavior of interest at a different rate from others as a whole?
  • What is the rate of change of clickstream segment membership over time?
  • What does the change in clickstream segment membership rate change tell us about the website's content and visitors' engagement levels?
  • Clustering
  • Discriminant analysis

SEO/Keyword Search Analysis

  • Which are the most frequently used keywords driving traffic to my site?
  • Once within my site, what are the most frequently used keywords to search for content?
  • Exploratory data analysis
  • Hypothesis testing
  • Keyword affinity analysis

Personalization of Content on Niche Websites

  • Given past clickthrough behavior and product viewership behavior, which give a clue to the registered user's interests?
  • How can I offer personalized content which resonates with a user when he logs in?
  • Persona building
  • A/B testing
  • Clustering
  • Design of experiments

Engagement Ladder Analysis

  • With every visit, is the online visitor engaging deeper with an increase in comments, purchases or product reviews?
  • Box plots
  • Heat maps
  • Scatter plots
  • Sequence analysis
  • Page path analysis

Product Recommender Engines. If a user has registered, it should be possible to create a behavioral profile based on the key product pages he or she touched; the number of times a product page was touched; average dwell time on the page; the number of online inquiries; average online purchase value; breadth of products purchased; number of online coupons redeemed, etc. Once this behavioral profile is established, rule-based constructs or collaborative filtering applications can be built to recommend the next best purchase given past behavior.
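
A heavily simplified form of user-based collaborative filtering might look like the sketch below. It scores products bought by other users according to how many purchases those users share with the target user; the purchase data and the function name recommend are hypothetical, and real recommenders typically use richer behavioral signals such as dwell time or page touches.

```python
from collections import Counter

def recommend(purchases, user, n=1):
    """Recommend up to n products the user has not bought yet.

    `purchases` maps each user to the set of products bought. Each
    candidate product is scored by the purchase overlap between the
    target user and the users who bought it.
    """
    own = purchases[user]
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        overlap = len(own & items)
        if overlap == 0:
            continue  # no shared purchases, so no evidence of similar taste
        for item in items - own:
            scores[item] += overlap
    return [item for item, _ in scores.most_common(n)]

# Hypothetical purchase histories of registered users.
purchases = {
    "u1": {"tea", "mug"},
    "u2": {"tea", "mug", "kettle"},
    "u3": {"tea", "biscuits"},
    "u4": {"coffee", "grinder"},
}
next_best = recommend(purchases, "u1")
```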

Path Analysis and Extreme Personalization through Content Customization. The path a user has taken through a website during various sessions can be analyzed, and that information can be used to create a customized webpage that personalizes his experience by surfacing content that closely matches his past browsing behavior.
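
As a starting point for path analysis, page-to-page transitions can be counted across sessions. The session data below is invented and the function name transition_counts is an assumption; each session is taken to be an ordered list of page names.

```python
from collections import Counter

def transition_counts(sessions):
    """Count how often each (from_page, to_page) step occurs across sessions."""
    counts = Counter()
    for pages in sessions:
        # Pair each page with the page visited immediately after it.
        counts.update(zip(pages, pages[1:]))
    return counts

# Hypothetical sessions for one registered user.
sessions = [
    ["home", "shoes", "cart"],
    ["home", "shoes", "reviews"],
    ["home", "bags"],
]
counts = transition_counts(sessions)
most_common_step = counts.most_common(1)[0]
```

The most frequent steps hint at which content to surface first on a personalized page.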

Online Product Configuration Analysis. Many product companies, such as automobile companies, offer the ability to customize their product online using a product configuration engine. This data can be used to identify:

  • The product attribute combinations most heavily configured (where the default option is overridden).
  • The fastest growing combinations of product attributes being customized.
  • Any affinities between car configurations and demographic profiles.

Sentiment Analysis Using Text Mining of KOL. Cult figures and programs like Oprah Winfrey and NPR’s ‘Car Talk’ command huge audiences. The opinions these KOLs express about products influence the purchase habits of their audiences. One can use unstructured text mining to map important themes and use sentiment analysis and keyword frequencies from the opinions expressed by KOLs to discern which attributes of a product or service are most heavily talked about and to track buzz for them. This can be a good early warning indicator in categories where KOLs have huge influence on their target audience.

Viral Network Link Analysis. Many websites have a “refer a friend” link, which allows anonymous surfers and registered users to refer a favorite product, article or service to a friend. This, in turn, creates a viral effect as each person cascades this across friend circles. One can build a viral tracker application to see how this network builds across time and identify mavens in the network who are capable of tipping the buzz.
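
A viral tracker of this kind could be sketched as a directed referral graph, where the size of each person's downstream cascade is one possible proxy for a "maven". The referral pairs and function names below are hypothetical.

```python
from collections import defaultdict, deque

def build_graph(referrals):
    """Build a mapping from each referrer to the friends they referred."""
    graph = defaultdict(list)
    for referrer, friend in referrals:
        graph[referrer].append(friend)
    return graph

def cascade_size(graph, start):
    """Count everyone reached directly or indirectly from `start` (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for friend in graph.get(person, []):
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    return len(seen) - 1  # exclude the starting person

# Hypothetical "refer a friend" events as (referrer, friend) pairs.
referrals = [("ann", "bob"), ("ann", "cat"), ("bob", "dan"), ("cat", "eve"), ("eve", "fay")]
graph = build_graph(referrals)
maven = max(graph, key=lambda person: cascade_size(graph, person))
```

Recomputing cascade sizes at regular intervals shows how the network builds over time.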

Keyword Search Analysis of Microsite Content. Analysis of the keywords used to search microsites can give clues for the branding and product messaging strategy, ensuring that it aligns with the interests people express online through the most searched content on the microsite.

Segmentation of Clickstream Data. Clickstream data can be summarized at a user level and used to create clusters of users who exhibit similar online behavior. Discriminant analysis can be used to identify which factors distinguish users exhibiting certain clickstream behavior from the others. We can use this data for the targeting of content, messages and product recommendations.
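
The clustering step can be illustrated with a bare-bones k-means sketch over user-level summaries. The two features (visits per month, average pages per visit), the sample values and the fixed starting centroids are all assumptions chosen to keep the example deterministic; it does not cover the discriminant analysis step.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical user summaries: (visits per month, average pages per visit).
users = [(1, 2), (2, 1), (1, 1), (10, 12), (11, 10), (12, 11)]
centroids, clusters = kmeans(users, centroids=[(1, 2), (10, 12)])
```

Here the two clusters separate occasional visitors from heavily engaged ones, which could then receive different content or offers.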

Design of Experiments. Design of experiments can aid in understanding which variations in customized content page layout appeal most to registered online users. This information can be used to create a basic template to serve customized content to end users. For example, one can experiment with a purchase button and product layout placements at various locations within a webpage to understand which combination of location placements triggers maximum purchase activity.
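
The purchase button example can be sketched as a small full factorial design: every combination of button position and product layout is one test cell. The factor levels, visitor counts and function name best_cell are invented for illustration, and the sketch compares raw purchase rates only; it does not test whether the differences are statistically significant.

```python
from itertools import product

# Two hypothetical factors with two levels each give four test cells.
button_positions = ["top", "bottom"]
layouts = ["grid", "list"]
design = list(product(button_positions, layouts))

def best_cell(results):
    """Return the cell with the highest purchase rate.

    `results` maps (button_position, layout) to (visitors, purchases).
    """
    return max(results, key=lambda cell: results[cell][1] / results[cell][0])

# Hypothetical outcomes after serving each variant to 1,000 visitors.
results = {
    ("top", "grid"): (1000, 50),
    ("top", "list"): (1000, 40),
    ("bottom", "grid"): (1000, 30),
    ("bottom", "list"): (1000, 35),
}
winner = best_cell(results)
```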

Gaming/Quiz/Online Puzzle Analysis. Once all click events regarding online gaming data, online quiz and puzzle data are collected, one can perform an analysis to answer basic questions: What is the growth rate of the number of people using the gaming applications? Which gaming application themes resonate better with target audiences, and do they vary by profile?

Digital Brand Health Monitor. Another interesting way to track brand health online is to create a “brand keyword watch list.” For example, a leading shampoo manufacturer can actively keep track of the number of times the word “allergy” occurs in user feedback and in complaints on the brand’s microsite and opinion platforms. This can quickly clue companies in to customer sentiment so that they can carry out messaging and marketing interventions to prevent a downward spiral of brand buzz online.

Multichannel Analysis on Coupon Redemption. Online coupon redemption analysis can examine all “print” events and “refer a coupon to friend” events to gauge the level of a product’s online engagement. Also, if possible, one can overlay offline purchase data from stores on top of online print events to determine what percentage of printed coupons get redeemed in the neighborhood store.
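
Overlaying store data on print events amounts to joining the two data sets on a shared coupon identifier. The sketch below assumes such an identifier exists in both the online print log and the in-store redemption log; the IDs and the function name redemption_rate are hypothetical.

```python
def redemption_rate(printed_ids, redeemed_ids):
    """Share of printed coupons that also appear in the store redemption data."""
    printed = set(printed_ids)
    if not printed:
        return 0.0
    redeemed = printed & set(redeemed_ids)  # ignore redemptions with no matching print event
    return len(redeemed) / len(printed)

# Hypothetical coupon IDs from the online print log and from in-store scans.
printed_ids = ["c1", "c2", "c3", "c4"]
redeemed_ids = ["c2", "c4", "c9"]
rate = redemption_rate(printed_ids, redeemed_ids)
```

Computing the same rate per store or per neighborhood would show where online coupons translate most readily into in-store purchases.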

As customers engage more online, many analytical scenarios can be used to understand their behavior. The following success stories illustrate practical, real-world applications of digital analytics.

Entertainment Industry: In the entertainment industry, it was found that the number of tweets regarding a movie was a statistically significant predictor of theater attendance.

Baking Industry: A leading biscuit manufacturer created a “build your own biscuit” gaming application which allowed young kids to configure:

  • The shape of the biscuit
  • The number of biscuits in a packet
  • The artwork for packaging (color of artwork, cartoon characters, etc.)

This was then posted on popular cartoon-related websites and the information was mined for re-launching a biscuit with a different shape and packaging, which resonated more with kids.

PC Industry: A well-known manufacturer of PCs and laptops used Twitter to broadcast weekly products which were on sale at specific stores. The real-time access to promo information increased this manufacturer’s number of followers on Twitter eager to be alerted to the new promotions each week.

Auto Industry: A leading auto manufacturer created a pre-launch blog for the next version of a particular model. The blog provided links to reviews from key opinion leaders in the auto industry who were perceived as trusted advisers.

Consumer Product Companies: A leading CPG company created an online application that allowed registered users to create their own customized coupons, which they could redeem at specific online outlets.

  • Based on past online purchase behavior, each registered shopper was allocated a promotional dollar value ranging from $2 to $15.
  • The customized application allowed the registered shopper to choose the stores where he could redeem the coupon.
  • Once the customer selected a store in his neighborhood, the CPG company mapped a set of slow moving products which had to be moved from the shelf. These items were displayed as candidates for promotion.
  • The customer chose his product and printed a coupon on his printer to be used in store.

This allowed the CPG company to analyze coupon redemptions and their effect on the overall objective of increasing the rate at which products moved off the shelf.

In a separate consumer product company example, a well-known manufacturer gleaned insight from research which showed a strong preference for puzzles among coffee and tea drinkers. As a result, they created an online jigsaw puzzle with dynamic content, which netted 190,000 registered users in the very first month of launch and is spreading virally across friend lists on Facebook and Orkut.

As demonstrated above, we have just started touching the tip of the iceberg in terms of what the possibilities are in digital analytics. As Louis Pasteur said, “Chance favors the prepared mind.” As the digital channel explodes around us, chances of success are higher if the organization is prepared to deal with the breadth and depth of digital data.


  1. comScore, Inc., "The comScore 2009 U.S. Digital Year in Review," February 9, 2010.
  2. Forrester Research, "U.S. Interactive Marketing Forecast, 2009 to 2014," July 6, 2009.



  • Derick Jose

    Derick Jose is the vice president of Advanced Analytics/Research within MindTree's Data & Analytic Solutions (DAS) Group, one of the world’s largest information management practices, which offers customers a one-stop-shop to capture, analyze, enhance, and view their business information. The DAS practice combines MindTree’s proven analytics, business intelligence, information management and research services for customers in the consumer packaged goods (CPG), retail, financial services, insurance, travel and media markets. Derick has 20 years of experience spanning consulting, advanced analytics and business intelligence solutions. He has worked extensively in the CPG, banking, telecom and retail industries. Derick can be contacted at
