Twitter Sentiment Analysis


\section{Introduction}

Many new forms of communication, such as text messaging, have emerged in the past few decades and have become popular and important. These new forms of communication convey a huge range of information and are widely used to share sentiments and opinions about different events and topics. We worked on the following task:

\begin{itemize}

\item Given a message, classify whether the message is of positive, negative, or neutral sentiment. For messages conveying both a positive and negative sentiment, whichever is the stronger sentiment should be chosen.

\end{itemize}
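The decision rule in this task statement can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the classifier used in this work: the toy lexicon, its weights, and the choice to return neutral when both strengths are equal are all hypothetical.

```python
# Hypothetical toy lexicon: the words and weights are illustrative
# assumptions, not values taken from this work.
TOY_LEXICON = {
    "great": 2.0, "love": 2.0, "good": 1.0,
    "bad": -1.0, "awful": -2.0, "hate": -2.0,
}


def classify(message, lexicon=TOY_LEXICON):
    """Return 'positive', 'negative', or 'neutral' for a message."""
    tokens = message.lower().split()
    # Accumulate the positive and negative strengths separately.
    pos = sum(lexicon[t] for t in tokens if lexicon.get(t, 0.0) > 0)
    neg = sum(-lexicon[t] for t in tokens if lexicon.get(t, 0.0) < 0)
    # When both polarities are present, the stronger one wins.
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    # Assumption: equal strengths (including no sentiment words) are neutral.
    return "neutral"
```

For a mixed message such as "good camera but awful battery", the negative strength outweighs the positive one under these toy weights, so the sketch labels it negative.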

\pagebreak

\section{Motivation}

We encounter more challenges when we work with informal texts such as tweets than when we work with traditional texts such as newswire data. Tweets are generally short and crisp: they have to be concluded in a sentence or two. This makes the language very informal, with many newly created spellings, slang, and new abbreviations such as tc for ``take care'' and gr8 for ``great''. In addition, hashtags perform a task equivalent to tagging for Twitter messages. Recently, the task of handling such challenges and automatically understanding the opinions conveyed by tweets has become a popular subject of research.
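One common way to cope with such abbreviations is to normalise them before classification. The Python sketch below is a minimal illustration, assuming a hand-built lookup table; only the two entries mentioned above come from the text, and the function name is hypothetical.

```python
# Hypothetical lookup table: only "tc" and "gr8" come from the text;
# a real system would need a much larger, curated dictionary.
ABBREVIATIONS = {"tc": "take care", "gr8": "great"}


def expand_abbreviations(tweet, table=ABBREVIATIONS):
    """Replace whitespace-separated tokens that appear in the table."""
    # Lookup is case-insensitive; unknown tokens are kept unchanged.
    return " ".join(table.get(tok.lower(), tok) for tok in tweet.split())
```

The sketch splits on whitespace only, so an abbreviation followed directly by punctuation (for example "gr8!") would not be expanded without a proper tokeniser.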

One important aspect of tweets is that they carry highly structured data about different aspects of the actual communication, such as location, language, individuals, and time. Twitter keeps track of these pieces of relevant information in JSON format, and we can model such information to our advantage. This associated information is useful for a variety of purposes, including but not ...

... middle of paper ...

...on tweets for training. Our method achieves good accuracy with a relatively small data set.

\pagebreak

\section{Future Work}

\begin{itemize}

\item We have covered most of the features in our classification, but we did not measure the effect of the following features on classification accuracy:

\begin{itemize}

\item Handling emotions conveyed by abbreviations.

\item Analysing whether later sentences in a tweet are more important (for example, giving greater weight to the $2^{nd}$ line of a two-line tweet).

\end{itemize}

\item Although work done by others on the same problem indicates that SVM tends to perform better than other classifiers, it would be interesting to see how a hybrid of SVM with other classifiers (such as a Naive Bayes classifier) would perform. (In our work, we tried a hybrid of bag of words with SVM, which improved the accuracy.)

\end{itemize}
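The idea of giving later sentences a greater weight could be explored along the following lines. This Python sketch is an assumption-laden illustration rather than part of the reported system: the toy lexicon, the sentence splitting on terminal punctuation, and the default weight of 2.0 are all hypothetical choices.

```python
import re

# Hypothetical toy lexicon; the entries are illustrative assumptions.
TOY_LEXICON = {"great": 1.0, "love": 1.0, "bad": -1.0, "hate": -1.0}


def weighted_score(tweet, later_weight=2.0, lexicon=TOY_LEXICON):
    """Sum lexicon scores, weighting every sentence after the first."""
    # Naive sentence split on ., ! and ?; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", tweet) if s.strip()]
    total = 0.0
    for index, sentence in enumerate(sentences):
        # Assumption: the first sentence has weight 1, later ones more.
        weight = 1.0 if index == 0 else later_weight
        words = re.findall(r"[a-z0-9']+", sentence.lower())
        total += weight * sum(lexicon.get(w, 0.0) for w in words)
    return total
```

With these toy values, swapping the order of a positive and a negative sentence flips the sign of the score, which is the effect one would want to measure against classification accuracy.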
