Social Sentiment Analysis with Hadoop
Sentiment analysis is arguably the most overrated of the Hadoop applications,
which should be no surprise, given that we live in a world with a constantly
connected and expressive population. This use of Hadoop draws on all kinds of
content from content management systems, blogs, forums, and other social media
tools to get a sense of what individuals are doing (for instance, life events)
and how they react to the people around them (sentiment). Since text-based data
doesn’t usually fit neatly into a relational database management system
(RDBMS), Hadoop is a natural place to explore and analyse this kind of data.

Language is difficult to interpret, even for human beings at times, especially
if we are reading text written by people in a social group that’s different
from our own. These people may be speaking our language, but their expressions
and style can be so foreign that we have no idea whether they’re talking
about a good experience or a bad one. For example, if we heard the word “bomb”
in reference to a movie, we might conclude that the movie was not good (or
good, if we are part of the youth movement that recognizes “it’s the bomb” as a
compliment); and if we are in the airline security business, the word “bomb”
would lead us to a very different interpretation. The point is that language is
used in a variety of distinct ways and is constantly evolving.

When we analyse sentiment on social media, we can choose from multiple
approaches. The basic method programmatically parses the text, extracts
strings, and applies rules. In simple cases, this approach is practical and
reasonable. But as requirements change and rules get more complicated, manually
coding text extraction quickly becomes infeasible from the point of view of
code maintenance, and especially of performance optimization. Grammar- and
rules-based approaches to text processing are computationally expensive, which
is an important constraint in large-scale extraction in Hadoop. The more
involved the rules (which is inevitable for complex purposes such as sentiment
extraction), the more processing that’s required.
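
The rule-based method described above can be sketched in a few lines of Python.
This is only a minimal illustration, not a production design: the word lists,
the single negation rule, the tab-separated "topic<TAB>text" record layout, and
the function names (score, label, map_record, reduce_scores) are all
assumptions made for the example. The map and reduce functions are written as
plain Python functions to show the shape of the job, not as calls into any
Hadoop API.

```python
import re
from collections import defaultdict

# Hypothetical word lists; a real lexicon would be far larger and domain-specific.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "awful", "hate", "terrible"}
NEGATORS = {"not", "never", "no"}

TOKEN = re.compile(r"[a-z']+")


def score(text):
    """Return +1 per positive word and -1 per negative word in the text.

    Rule: a negator directly before a sentiment word flips its sign.
    """
    tokens = TOKEN.findall(text.lower())
    total = 0
    for i, tok in enumerate(tokens):
        value = (tok in POSITIVE) - (tok in NEGATIVE)
        if value and i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        total += value
    return total


def label(text):
    """Turn the numeric score into a coarse sentiment label."""
    s = score(text)
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"


def map_record(line):
    """Map step: emit (topic, score) for one 'topic<TAB>text' record."""
    topic, _, text = line.partition("\t")
    return topic, score(text)


def reduce_scores(pairs):
    """Reduce step: sum the scores emitted for each topic."""
    totals = defaultdict(int)
    for topic, value in pairs:
        totals[topic] += value
    return dict(totals)
```

With these rules, label("this movie is the bomb") comes back as "neutral",
because no rule covers that slang. Handling it means adding another rule, and
that is how such rule sets grow and become hard to maintain.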

Alternatively, a statistics-based approach is becoming increasingly common for
sentiment analysis. Rather than manually writing complex rules, we can use the
classification-oriented machine-learning models in Apache Mahout. The catch is
that we need to train our models with examples of positive and negative
sentiment. The more training data we provide (for example, text from tweets
together with our classification of it), the more accurate our results.
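
To make the idea of training on labelled examples concrete, here is a small
from-scratch naive Bayes text classifier in Python. It is a stand-in for the
kind of classification model the text refers to and does not use the Mahout
API. The training sentences are made up, and the function names (tokenize,
train, classify) are assumptions made for the example.

```python
import math
import re
from collections import Counter, defaultdict

TOKEN = re.compile(r"[a-z']+")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN.findall(text.lower())


def train(examples):
    """Build per-label word counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability under the model."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        # Log prior: share of training documents carrying this label.
        log_prob = math.log(doc_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                # Add-one (Laplace) smoothing avoids zero probabilities.
                log_prob += math.log((word_counts[label][word] + 1) / denom)
        if log_prob > best_score:
            best_label, best_score = label, log_prob
    return best_label


# Made-up training data, for illustration only.
TRAINING = [
    ("that movie was the bomb", "positive"),
    ("loved every minute of it", "positive"),
    ("great cast and a great story", "positive"),
    ("the movie was a bomb", "negative"),
    ("boring plot and terrible acting", "negative"),
    ("what a waste of time", "negative"),
]

model = train(TRAINING)
```

Calling classify("great story", model) returns "positive" and
classify("terrible boring plot", model) returns "negative". With only six
training sentences the model is fragile; its behaviour depends entirely on the
labelled examples it is given, which is the catch the paragraph above points
out.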

Social sentiment analysis can be applied across a wide range of industries,
for example food safety and health care.