Dhaoui, C., Webster, C. and Tan, L. (2017). Social media sentiment analysis: lexicon versus machine learning. Journal of Consumer Marketing, 34(6), pp.480-488.

From Digital Culture & Society


The article chosen for this review, “Social media sentiment analysis: lexicon versus machine learning”, aims to help digital marketers gauge consumers’ opinions of different products, services and brands in the online sphere. To do so, the authors compared two techniques for classifying the sentiment of social media comments, a lexicon based approach and a machine learning based approach, using comments taken from the Facebook pages of brands. A lexicon based approach scores a comment by the semantic orientation (the positive or negative connotation) of the words it contains, while a machine learning based approach uses algorithms created for language processing, with data that was originally classified by a human serving as the base point for the analysis. The authors asked three research questions at the beginning of the article:

  1. The first research question asked whether the two sentiment analysis techniques outlined above are appropriate for analyzing social media conversations; in other words, whether the two techniques are reliable in the first place. The authors found that both techniques produced the same classification for positive comments, and that their classifications of negative comments were also positively correlated, though less strongly than for positive comments.
  2. The second research question asked how the two techniques differ in their approach to analyzing social media conversations, and the authors provided details on how to decide which approach to use in a particular situation. Machine learning approaches tend to be stronger in their identification abilities, but they cannot work without previously classified data to train on. The authors proposed a combination of the two approaches as ideal, with manually captured data from a lexicon approach used to begin training the machine learning data set, as best suited to general use of the sentiment analysis techniques.
  3. Finally, the authors asked whether the combined approach actually improves the overall accuracy of sentiment analysis of social media conversations. The results indicated that the two techniques perform similarly, and that both are stronger at identifying and classifying positive comments than negative comments. This is because sarcasm is difficult to analyze, not only for machines but also for the humans who manually classify the data. Even so, the authors found that the two approaches are stronger when used together, and they propose a combined approach as the ideal way to do sentiment analysis.
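
The contrast between the two techniques, and the idea of combining them, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word list `LEXICON`, the four training comments and the fallback rule in `combined_label` are hypothetical and are not taken from the article, and the small naive Bayes classifier simply stands in for whichever machine learning algorithm the authors used.

```python
import math
from collections import Counter

# Hypothetical toy lexicon: word -> polarity. Not the article's word list.
LEXICON = {"love": 1, "great": 1, "happy": 1, "hate": -1, "awful": -1, "broken": -1}


def tokenize(text):
    """Lowercase the comment and strip basic punctuation from each word."""
    return [w.strip(".,!?").lower() for w in text.split()]


def lexicon_label(text):
    """Lexicon approach: sum word polarities and read the sign of the total."""
    score = sum(LEXICON.get(w, 0) for w in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


class NaiveBayes:
    """Machine learning approach: multinomial naive Bayes with add-one
    smoothing, trained on comments that a human has already labelled."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in sorted(set(labels))}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_logp = None, -math.inf
        for label, counts in self.word_counts.items():
            logp = math.log(self.label_counts[label] / total)
            denom = sum(counts.values()) + len(self.vocab)
            for w in tokenize(text):
                logp += math.log((counts[w] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


def combined_label(text, model):
    """One possible combination rule (an assumption, not the authors' method):
    trust the lexicon when it finds polarity words, otherwise ask the model."""
    lex = lexicon_label(text)
    return lex if lex != "neutral" else model.predict(text)


# Invented, human-labelled training comments for illustration only.
train_texts = [
    "I love this brand",
    "great service, very happy",
    "this is awful",
    "my order arrived broken and late",
]
train_labels = ["positive", "positive", "negative", "negative"]
model = NaiveBayes().fit(train_texts, train_labels)

print(lexicon_label("arrived late again"))
print(combined_label("arrived late again", model))
```

The comment "arrived late again" contains no word from the toy lexicon, so the lexicon alone cannot assign it a polarity, while the trained model can still make a guess from the words it saw in labelled examples. That is the kind of gap a combined approach is meant to cover.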

The paper makes several contributions, beginning with empirical tests of two sentiment analysis approaches that are prominent within the field. The two approaches, lexicon based and machine learning based, work differently but perform similarly. The authors also provided evidence that a combination of the two approaches is more precise than either on its own. In doing so, they lent credence to tools that offer ways to analyze the sheer amount of data posted on social media sites. These findings will help digital marketers define their target markets and target their advertising more accurately, leading to more personalized and targeted messages for consumers.
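
For readers unfamiliar with the term, precision can be computed as in the sketch below. It assumes the usual definition (of the comments a technique labels as a given class, the share that a human coder also placed in that class). The function name and the two label lists are invented for illustration and are not results from the article.

```python
def precision(predicted, actual, label):
    """Share of comments predicted as `label` that truly carry that label."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == label and a == label)
    flagged = sum(1 for p in predicted if p == label)
    return hits / flagged if flagged else 0.0


# Invented labels: what a human coder assigned vs. what a classifier predicted.
actual = ["positive", "positive", "negative", "negative", "positive"]
predicted = ["positive", "negative", "negative", "positive", "positive"]

for label in ("positive", "negative"):
    print(label, round(precision(predicted, actual, label), 2))
```

Computing precision separately for each class is what makes it possible to say that a technique is stronger on positive comments than on negative ones.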

The study also showed significant improvements in these two classification techniques, which previous literature had not supported. The authors reported that this study, like earlier sentiment analysis research, is limited because automated sentiment analysis assesses only text-based messages. They propose further research into the sentiment analysis of other content, such as images and videos. That is the future of sentiment analysis: these techniques, or others that grow out of them, are likely to be applied to far more of the data moving through the internet. Comments, videos, pictures and other forms of communication will be analyzed thoroughly in order to provide you with advertisements and other content that a computer algorithm predicts you will enjoy.



Megan McGuire
