
The basic understanding you need for big data analysis (Part 1)

Source: Elizabeth Wong Kit-wai, Senior media specialist and counselor in big data analysis at Wisers    2021.03.08

How to understand “buzz” and “popularity” in big data?

First, you should understand that they are two different concepts. “Buzz” (heated discussion) is a neutral indicator reflecting the degree of attention and discussion. And yes, as you may have noticed, “attention” and “discussion” are also two different concepts. Suppose I share pictures of myself and a steak I cooked on social media and receive 50 likes and 70 applause reactions. That means I have drawn “attention” from my friends, expressed as their feelings at that moment. “Discussion”, however, involves content: people take action (commenting) to express ideas when emoticons are not enough to convey how they feel, so it carries more engagement and richer meaning. Many people merely “like” a post without commenting. Likewise, people who comment on a post may not add emoticons. A third group shares the post (including the comments) as a way of participating in the discussion.

Therefore, buzz is more than the number of likes, teases and applause reactions; it is the overall interaction among netizens on the Internet. It is also inaccurate to treat the three kinds of users, those who post an emoji, those who comment and those who repost, in the same way, because their degree of attention varies. For instance, celebrity A posted a sexy photo, which received 1,500 likes, 150 comments and 50 reposts, for a total of 1,700 engagements, while official B posted a sentence, which received 600 emojis, 480 comments and 200 reposts, for a total of 1,280 engagements. We should not make a simplistic comparison and say that celebrity A’s sexy photo is more popular than official B’s remarks. In academic terms, the larger number of emojis posted on the celebrity’s photo represents only “weak ties”: those users are less engaged than the ones who spent time writing a comment.

That is why it is improper to calculate “buzz” by simply summing up the engagements. Doing so distorts what the data mean and leads to errors when the figure is used for comparison or as an indicator. The solution is to have a statistician build an algorithm, based on historical data, that more accurately reflects the overall degree of heated discussion expressed through the three different forms of engagement.
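To make the point concrete, here is a minimal sketch in Python that compares a plain sum with a weighted score for the two posts in the example above. The weights (1, 3 and 5) and the names `Engagement`, `raw_total` and `weighted_buzz` are assumptions made up for illustration; they are not the algorithm a statistician would fit from historical data, and they are not Wisers’ method.

```python
from dataclasses import dataclass


@dataclass
class Engagement:
    emojis: int    # likes and other one-click reactions
    comments: int
    reposts: int


# Hypothetical weights: a comment or repost is assumed to signal more
# engagement than a one-click reaction. Real weights would have to be
# estimated from historical data.
WEIGHTS = {"emojis": 1.0, "comments": 3.0, "reposts": 5.0}


def raw_total(e: Engagement) -> int:
    """Plain sum of all engagements (the approach the article warns against)."""
    return e.emojis + e.comments + e.reposts


def weighted_buzz(e: Engagement, weights: dict = WEIGHTS) -> float:
    """Weighted sum that treats the three forms of engagement differently."""
    return (weights["emojis"] * e.emojis
            + weights["comments"] * e.comments
            + weights["reposts"] * e.reposts)


celebrity_a = Engagement(emojis=1500, comments=150, reposts=50)
official_b = Engagement(emojis=600, comments=480, reposts=200)

print(raw_total(celebrity_a), weighted_buzz(celebrity_a))  # 1700 2200.0
print(raw_total(official_b), weighted_buzz(official_b))    # 1280 3040.0
```

Under these assumed weights the ranking flips: celebrity A leads on the plain sum, but official B leads once comments and reposts count for more. A different set of weights could give a different result, which is exactly why the weighting should come from data rather than guesswork.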

 

This differs from the usual understanding of big data as merely a pile of raw numbers, characterized mostly by huge volume and “big” size. With that view, we may overlook the distinct behaviors of social media users and the meaning underlying them, and misunderstandings may follow. Here is an example.

 

In recent years, digital news has become an emerging news format, thanks to the gradual popularization of big data applications and editors’ greater access to open resources on the platforms. One widely used type of data is the number of searches made on the Internet. At the end of July 2020, the Hong Kong SAR government introduced a dine-in ban to cope with the COVID-19 pandemic, which triggered heated discussion. Some netizens felt that the measure put workers in a dilemma, joking that office workers needed to “learn photosynthesis before making a living”. Some media outlets picked up on such comments, searched for “dine-in” and “photosynthesis” online, and concluded that buzz around the news was on the rise.

 

On the surface, the conclusion was in line with public opinion; social platforms were indeed flooded with news about the dine-in ban. However, according to Wisers Big Data Analytics, while “dine-in ban” was indeed a hot topic, “photosynthesis” was not a hot word. Does this mean that either Wisers or the web search engines provided wrong data?

 

Not exactly. This highlights the difference between the levels at which the two sets of data operate. Web search engines show the popularity of search terms: if a user has never heard of a term, or does not want to learn about the topic, he will not search for it. Wisers Big Data Analytics counts all interactions on social media: if a user notices “photosynthesis” on his social media feed and joins the discussion, this is recorded in Wisers’ big data database. However, if no one cares about the topic or posts about it, the user is less likely to come across “photosynthesis” at all.

In other words, online searches reflect hot words, whereas Wisers Big Data Analytics shows a specific breakdown of hot topics among all social media users. For instance, people who opposed the dine-in ban were particularly concerned about, and sensitive to, the hot word “photosynthesis”. Nevertheless, the word did not resonate with the larger body of users.
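One way to picture such a breakdown is to ask what share of a topic’s social media interactions falls on posts that also mention a given word. The Python sketch below does that under loud assumptions: the `Post` type, the `keyword_share` function and the sample posts with their interaction counts are all invented for illustration and do not come from Wisers or from any real data set.

```python
from dataclasses import dataclass


@dataclass
class Post:
    text: str
    interactions: int  # reactions + comments + reposts on this post


def keyword_share(posts: list, topic: str, keyword: str) -> float:
    """Share of a topic's interactions that fall on posts mentioning the keyword."""
    topic_posts = [p for p in posts if topic in p.text.lower()]
    total = sum(p.interactions for p in topic_posts)
    if total == 0:
        return 0.0  # nobody interacted with the topic at all
    with_keyword = sum(p.interactions for p in topic_posts
                       if keyword in p.text.lower())
    return with_keyword / total


# Invented sample posts, for illustration only.
posts = [
    Post("Dine-in ban starts tomorrow", 400),
    Post("Dine-in ban: workers must learn photosynthesis", 100),
    Post("Where to eat lunch under the dine-in ban?", 500),
    Post("Nice weather today", 50),
]

share = keyword_share(posts, "dine-in ban", "photosynthesis")
print(f"{share:.0%}")  # 10%
```

In this toy data the topic itself is busy, yet only a small share of its interactions involves the word, which mirrors the situation described above: a word can be hot among one group without resonating with most users.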

Do the analysis results have any practical significance? Yes. Suppose public communication is needed on that very day: knowing the hot words identified through big data helps in choosing appropriate and effective key points and wording for communicating with specific target audiences. As the renowned political analyst Frank Luntz points out in his best-selling book Words That Work: It's Not What You Say, It's What People Hear, knowing both ourselves and our adversaries enables efficient communication. In the era of social media, everyone is eager to express his or her feelings, for fear of being isolated for not paying attention. This makes social media a perfect platform for getting to know our adversaries. However, a thorough understanding of the various functions of social media and the characteristics of each platform is necessary to ensure accurate analysis and minimal deviation.

 

Well, I have just spent more than a thousand words on “buzz”. What about “popularity”? Are the two sets of data interchangeable? If not, how can big data be used to interpret popularity? And what insights can they offer to people in political and business circles? I will talk about that next time.
