SOCIAL MEDIA DATA
Most early marketing analytics efforts grew out of studies of unstructured text from public websites. There are two strong reasons why this data fueled early research. First, data on public websites is easily available to the academic community. While the early work started with research publications and search engines, it rapidly branched into many aspects of marketing analytics. Second, the data was rich enough to support powerful hypotheses about social behavior, which could be studied and refined. Unlike structured data, unstructured text has many nuances. Researchers have to build techniques for understanding and translating human writing, for example, converting unstructured text into structured sentiment scores. A blog post or tweet may carry sarcasm or emoticons, which computer programs must interpret correctly.5
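To make the idea of converting unstructured text into a structured sentiment score concrete, here is a minimal sketch of a lexicon-based scorer. The word list, the weights, and the function name `sentiment_score` are hypothetical choices for illustration, not a method described in the cited work; production systems rely on much larger vocabularies and trained models.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration only).
# Emoticons are scored alongside words.
LEXICON = {
    "love": 1.0, "great": 1.0, "good": 0.5,
    "sad": -0.5, "bad": -0.5, "terrible": -1.0,
    ":)": 1.0, ":(": -1.0,
}

# Try emoticons first so they are kept whole, then fall back to plain words.
TOKEN_PATTERN = re.compile(r"[:;]-?[()]|[a-z']+")


def sentiment_score(text: str) -> float:
    """Return the mean lexicon weight of the matched tokens.

    The result lies in [-1, 1]; it is 0.0 when no token is in the lexicon.
    """
    tokens = TOKEN_PATTERN.findall(text.lower())
    weights = [LEXICON[t] for t in tokens if t in LEXICON]
    return sum(weights) / len(weights) if weights else 0.0


print(sentiment_score("I love this movie :)"))      # positive
print(sentiment_score("terrible ending :("))        # negative
print(sentiment_score("Oh great, another sequel"))  # sarcasm is misread as positive
```

The last call shows why sarcasm is hard: a word-level lexicon sees only "great" and scores a sarcastic remark as positive.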
Jonathan Taplin is a seasoned businessman in the entertainment industry who has produced successful concerts and movies. He is now the director of the University of Southern California’s Annenberg Innovation Lab and has been working closely with IBM to study how unstructured data can be analyzed for Hollywood marketing.
Our results demonstrate not just the usefulness of monitoring social sentiment but the importance of deeply analyzing the raw results so marketing leaders come away with a precise understanding of what consumers think and want. For example, before the mid-November release of Twilight: Breaking Dawn Part 2 our index showed positive sentiment toward the movie of 90%. Yet on Saturday, Nov. 24, in the midst of the Thanksgiving holiday weekend, the positive sentiment dipped to 75%. Did that mean consumers were disappointed with the film? Actually, no. We discovered on close examination that many of the people who used words in their Tweets signaling sadness or disappointment were reacting to the emotional moments in the film or to the fact that their beloved series is ending with this installment.6
Sentiment analysis of publicly available data is making inroads into marketing organizations in a variety of industries. For example, bing.com and the Fox Network covered President Barack Obama’s State of the Union address, combining the live broadcast with sentiment analytics of opinions and an online voting tool from bing.com. Nearly 12.5 million voters got a chance to express their opinions. These opinions were analyzed and sorted into a number of categories, providing a detailed view of public response to the speech.7 Gatorade has built a social media command center, where it collects and analyzes feedback on its brand.8
While this data is extremely useful for detecting trends and patterns, we must be careful when combining it with other big data. For example, census data provides an accurate count of the population within a geopolitical area. Can we use social media data from that area to represent the opinion of its entire population? Although the number of observation points far exceeds what a statistically significant sample requires, the sample is heavily biased toward those who like to express their opinions. It is a highly accurate representation of the subset of the population that uses social media to express opinions, but it does not represent the rest of the population.
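A small worked example can illustrate the size of this bias. The group sizes and approval rates below are invented for illustration, and `population_share` is a hypothetical helper; the point is only that a large sample of vocal users can differ sharply from the population-weighted figure.

```python
def population_share(groups):
    """Return the population-weighted share of positive opinion.

    `groups` is a list of (size, positive_rate) pairs.
    """
    total = sum(size for size, _ in groups)
    return sum(size * rate for size, rate in groups) / total


# Hypothetical numbers: 200,000 residents post opinions, 800,000 do not.
posters = (200_000, 0.80)  # 80% positive among those who post
silent = (800_000, 0.50)   # 50% positive among those who stay silent

naive_estimate = posters[1]                        # what social media alone shows
true_share = population_share([posters, silent])   # what the census-weighted view shows

print(f"social media estimate: {naive_estimate:.2f}")
print(f"whole population:      {true_share:.2f}")
```

Under these assumed numbers, the social media figure of 0.80 overstates the population-wide share of 0.56, even though 200,000 observations is a very large sample.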
Beyond quantifying unstructured text, posted messages can also be used to discover underlying patterns and graphs. One such pattern is influence, which can be measured by the amount of interest a particular blog post or tweet generates in its community. There are a variety of ways to measure and analyze the impact of an expressed opinion, thereby providing an overall score for its author. For example, Amazon measures the impact a reviewer has had on product purchases and rank-orders reviewers by their relative contribution to past purchase decisions.
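One simple way to turn this idea into an author-level score is to add up the engagement each author's posts receive and then rank the authors. The sketch below assumes replies and shares as the measure of interest, with arbitrary weights; the `Post` type, the weights, and `rank_authors` are hypothetical and do not describe Amazon's or any other company's actual method.

```python
from collections import defaultdict
from typing import NamedTuple


class Post(NamedTuple):
    author: str
    replies: int
    shares: int


# Hypothetical weights: a share is assumed to signal more interest than a reply.
REPLY_WEIGHT = 1.0
SHARE_WEIGHT = 2.0


def rank_authors(posts):
    """Sum weighted engagement per author and return (author, score) pairs,
    highest score first; ties are broken alphabetically by author."""
    scores = defaultdict(float)
    for post in posts:
        scores[post.author] += REPLY_WEIGHT * post.replies + SHARE_WEIGHT * post.shares
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


posts = [
    Post("ana", replies=3, shares=1),
    Post("ben", replies=10, shares=0),
    Post("ana", replies=2, shares=4),
]
for author, score in rank_authors(posts):
    print(author, score)
```

Here "ana" outranks "ben" because her two posts together attract more weighted engagement, even though his single post drew the most replies.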