Sentiment Analysis of Twitter Data
Social networks are a primary resources to gather information about people’s opinions and sentiments towards different topics as they spend hours daily on social media and share their opinion. In this article, we shall discuss the applications of sentiment analysis and how to connect to Twitter and run sentiment analysis queries. Basic knowledge of Python is required for understanding the code. <!--more-->
Sentiment analysis is the process of extracting the sentiment from a piece of text and classifying it as positive, negative, or neutral accordingly.
Why Sentiment Analysis on Twitter?
Sentiment analysis has many applications for different domains. For example, companies can learn about users' feedback and reviews via social media - and get actual feedback about their products. A social network is a rich platform to learn about people's opinions and sentiment regarding different topics as they communicate and share their opinions.
Twitter has 1.3 billion accounts with 330 million monthly active users and 145 million daily users. Twitter data is the most comprehensible source of live, public conversations worldwide and therefor can serve as a valuable tool for understanding customer sentiment as people and markets respond to product and business decisions.
Sentiment analysis can predict the outcome of upcoming events, evaluate the impact of a recent product launch, pivot the direction or content of an ad campaign, and more.
Sentiment analysis with Python
We will be using the Twitter API to get real-time tweets and perform sentiment analysis and visualize our findings.
Setup development environment
Install the required libraries using pip
. Make sure you have Python 3 installed.
pip install tweepy
pip install textblob
pip install nltk
You should have a Twitter account. Apply for a developer account. Fill out the details asked in the subsequent steps.
Submit the application and wait for developer access.
After getting access, we need to create an app and API key in order to authenticate and integrate with most Twitter developer products.
Fill out the required details. Ignore the fields you don't need (these are used for authenticating with Twitter and other use cases.)
Go to the Keys and Tokens
tab under your app to get the API key and API secret key. (Do not share with others.)
Lets get started
Let's start coding. We use a Python library called tweepy for authenticating and getting information from the Twitter API. You can also use twitter.
import tweepy
access_token = "xxxx"
access_token_secret = "xxxx"
consumer_key = "xxxx"
consumer_secret = "xxxx"
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
Tweepy's API class provides access to the entire twitter RESTful API methods. Each method can accept various parameters and return responses.
We will use API.search which returns a collection of relevant tweets matching a specified query. A raw tweet may contain many unwanted characters and information which may not be necessary like emoji, "@" mentions, "#" hashtags, etc. These may be useful in some other scenario.
query=""
count=""
tweets = api.search(q=query, count=count)
text = []
for i in tweets:
text.append(i.text)
#we are extracting text and excluding metadata from the tweet
#this method is useful for getting tweets related to a particular topic
We can also use API.mentions_timeline
to get the most recent tweets where your organizations has been tweeted about.
Next, we clean the text before using sentiment analysis.
import re
text = re.sub('@[A-Za-z0–9]+', '', text)
text = re.sub('#', '', text)
text = re.sub('RT[\s]+', '', text)
text = re.sub('https?:\/\/\S+', '', text)
#removed @mentions, #hastags and URLs
We will use the TextBlob library to get the sentiment of the text.
from textblob import TextBlob
text = cleaned_tweet
blob = TextBlob(text)
sentiment = blob.sentiment.polarity
#blob.sentiment returns (polarity, subjectivity)
#polarity is within the range [-1.0, 1.0]. -1 very negative. 0 neutral. 1 very positive
#The subjectivity is a float within the range [0.0, 1.0] where 0.0 is very objective and 1.0 is very subjective.
TextBlob has Naive-Bayes classifier which we can use to classify text.
How Sentiment Analysis works
Let's examine two statements:
- This is a beautiful day, I am very happy.
- What a horrible person.
When we read these sentences, the words happy
, beautiful
describe the positivity of sentence 1 and the word horrible
describes the negativity of sentence 2. Words like is
, a
don't convey any particular sentiment and are neutral.
We can have rule-based systems that perform sentiment analysis based on a set of manually crafted rules or automatic systems that rely on machine learning techniques to learn from data. These systems learn which words represent a positive sentiment and which represent a negative.
Of course, words can have different meanings in different contexts and to different persons. Therefore - we can train a supervised machine learning algorithm to perform sentiment analysis. If you are interested you can look at my code here, where I implemented Sentiment Analysis using neural networks.
Sample Results and Conclusion
As a sample use case, I made an analysis on tweets related to JavaScript. The following are the results.
- An example of a raw tweet.
RT @kennyrecruiter: Microsoft has released tutorials on building #Javascript applications using #nodejs. They look good and are easy to fol…
- After cleaning the text
: Microsoft has released tutorials on building Javascript applications using nodejs. They look good and are easy to fol…
-
Our algorithm predicts this as
1
, meaning positive. -
The sentiment of 125 random tweets.
- Word Clouds are used to visually represent words in a text, the bigger the font size, the more a word is repeated.
Without cleaning:
With cleaning:
As you can see, cleaning text is a very important step, because it allows us to get the main information the text is trying to convey.
Sentiment analysis can be useful for a team in a similar manner to find out how their users are liking their product.