Every movie buff has a list of favorite movies jotted down in their memory. Have you ever wondered what other people think about your favorite movie? If you have, you should definitely consider analyzing tweets about the movie on Twitter. One of the latest movies I watched is an Indian movie named 'Rocketry'. I absolutely loved it. Here is some analysis of the movie 'Rocketry', done using the Twitter API and some interesting Python libraries.
Disclaimer: The results and text that you see in the output are not my personal opinion. I am merely mining data.
Note: In some cases, I have imported a library right where it is meant to be used. This is for ease of understanding.
Prerequisite: Refer to the Twitter API Documentation | Docs | Twitter Developer Platform to get your API keys set up.
Note: If you get the error 'Read-only application cannot POST.', do the following:
1. Go to http://dev.twitter.com/apps and log in
2. In the Settings tab, change the Application type to Read, Write and Direct messages
3. In the Reset keys tab, press the Reset button, then update the consumer key and secret in your application accordingly.
If you get the error message 'ValueError: Expected object or value', it may mean that your JSON is empty. To avoid this, a special handling condition has been added: os.path.getsize(path1+""+file) > 0
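The size check above can be sketched as follows. This is a minimal illustration, not the original loading code: the helper name load_non_empty_json, the demo file names, and the use of json.load (rather than whichever reader raised the ValueError) are all assumptions made for the example.

```python
import json
import os
import tempfile


def load_non_empty_json(folder):
    """Parse every .json file in `folder`, skipping files that are 0 bytes long."""
    loaded = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        # Same guard as in the text: only read files that actually contain data.
        if name.endswith(".json") and os.path.getsize(path) > 0:
            with open(path, encoding="utf-8") as handle:
                loaded[name] = json.load(handle)
    return loaded


# Demo with two throwaway files: one holds a made-up record, the other is empty.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "tweets.json"), "w", encoding="utf-8") as f:
        json.dump([{"tweet": "placeholder text", "retweets": 3}], f)
    open(os.path.join(demo_dir, "empty.json"), "w").close()
    demo_result = load_non_empty_json(demo_dir)

print(sorted(demo_result))
```

Only the file that contains data is parsed. The empty file is skipped silently, so it never reaches the JSON parser.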
Which of the tweets have the highest number of retweets? (Top 10)
As you can see below, duplicates are removed using the column 'clean_tweet'. You might wonder why. It is because there are instances where two tweets have exactly the same text but contain a URL in a different format. For instance, let's assume that we have the tweet 'Rocketry: The Nambi Effect'\n\nhttps://t.co/lxpRjAONFa
Let's say there is another tweet with the same text but a different string in the URL: 'Rocketry: The Nambi Effect'\n\nhttps://t.co/4C8KGt1k88. Applied to the raw tweet text, drop_duplicates will consider these two rows unique. However, clean_tweet strips punctuation and URL strings, so it is more effective in removing duplicates.
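To make the point concrete, here is a small sketch of the kind of cleaning described above. The function below is a hypothetical stand-in for whatever produced the 'clean_tweet' column. The regular expression and the whitespace handling are assumptions, and straight quotes are used in the sample strings.

```python
import re
import string

# Assumed URL pattern: anything starting with http:// or https:// up to the next space.
URL_PATTERN = re.compile(r"https?://\S+")
PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation)


def clean_tweet(text):
    """Strip URLs and ASCII punctuation, then collapse runs of whitespace."""
    without_urls = URL_PATTERN.sub("", text)
    without_punctuation = without_urls.translate(PUNCTUATION_TABLE)
    return " ".join(without_punctuation.split())


tweet_a = "'Rocketry: The Nambi Effect'\n\nhttps://t.co/lxpRjAONFa"
tweet_b = "'Rocketry: The Nambi Effect'\n\nhttps://t.co/4C8KGt1k88"

# The raw strings differ, so de-duplicating on them keeps both tweets.
raw_unique = list(dict.fromkeys([tweet_a, tweet_b]))
# The cleaned strings are identical, so only one survives.
clean_unique = list(dict.fromkeys(clean_tweet(t) for t in [tweet_a, tweet_b]))

print(len(raw_unique), len(clean_unique))
```

With a DataFrame, the same idea amounts to calling drop_duplicates with subset='clean_tweet' instead of comparing the raw tweet text.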
The above question is going to be answered by sentiment analysis. As you might have noticed, sentiment analysis has been explained in many blogs and tutorials and is used in many different ways. There is a plethora of libraries available for sentiment analysis, and each of them has its own algorithm and corresponding accuracy. Having used different libraries for sentiment analysis, I have learned that combining several libraries into an ensemble is often more effective than using a single one. However, in this case, I am going to go ahead with just one library in the interest of time.
In the statement below, a new column named 'Final Sentiment' is created using the polarity values obtained in the previous step.
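The mapping from polarity to a label is not shown here, so the following is only a sketch of how such a column could be derived. The function name, the 0.05 threshold, and the three label strings are assumptions for illustration. Substitute the labels and cut-offs that your sentiment library and dataset call for.

```python
def label_sentiment(polarity, threshold=0.05):
    """Map a polarity score in [-1, 1] to one of three labels.

    The threshold and the label strings are hypothetical; they are not
    taken from the original analysis.
    """
    if polarity > threshold:
        return "Happy/Enthralled"
    if polarity < -threshold:
        return "Sad/Pensive"
    return "Mixed Emotions"


# Made-up polarity scores, one per tweet.
sample_polarities = [0.8, -0.4, 0.0]
sample_labels = [label_sentiment(p) for p in sample_polarities]
print(sample_labels)
```

With pandas, the same function could be applied to a polarity column with Series.apply to produce the new column.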
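# Count how many tweets fall under each sentiment label
tweet_counts = final_df['Final Sentiment'].value_counts()
tweet_counts
# Plot the most frequent sentiment labels (at most 10) as a bar chart
final_df['Final Sentiment'].value_counts(sort=True).nlargest(10).plot.bar()
# Limit the displayed column width (None would show the full tweet text)
pd.set_option('display.max_colwidth', 3)
# First 10 tweets labelled 'Happy/Enthralled'
happy_df = final_df[final_df['Final Sentiment'] == 'Happy/Enthralled']
happy_df['tweet'].head(n=10)
# First 10 tweets labelled 'Sad/Pensive'
sad_df = final_df[final_df['Final Sentiment'] == 'Sad/Pensive']
sad_df['tweet'].head(n=10)
# First 10 tweets labelled 'Mixed Emotions'
Mixed_Emotions_df = final_df[final_df['Final Sentiment'] == 'Mixed Emotions']
Mixed_Emotions_df['tweet'].head(n=10)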
An interesting and powerful library that can be used for text containing hashtags and emoticons is 'Advertools'. Below, we use its powerful and fun modules to uncover some pretty cool insights.
What are some emoticons (emojis) that have been used in the tweets?
import advertools as adv
# Most frequently used emojis
emoji_summary = adv.extract_emoji(final_df['tweet'])
Emojis_used_in_tweets = pd.DataFrame.from_dict(emoji_summary['top_emoji'])
pd.set_option('display.width', 100)
Emojis_used_in_tweets.columns = ['Emoticon', 'Count']
Emojis_used_in_tweets.style.set_table_attributes('style="font-size: 50px"')
Emojis_used_in_tweets[Emojis_used_in_tweets['Count'] > 35]  # Only display emoticons that have a count greater than 35
# Most frequently used hashtags
pd.set_option('display.width', 200)
hashtag_summary = adv.extract_hashtags(final_df['tweet'])
Hashtag_used_in_tweets = pd.DataFrame(hashtag_summary['top_hashtags'])
Hashtag_used_in_tweets.columns = ['Hashtag', 'Count']
Hashtag_used_in_tweets.style.set_table_attributes('style="font-size: 11px"')
Hashtag_used_in_tweets[Hashtag_used_in_tweets['Count'] > 35]  # Only display hashtags that have a count greater than 35
# Most frequently mentioned accounts
pd.set_option('display.width', 200)
mentions_summary = adv.extract_mentions(final_df['tweet'])
Top_Mentions_in_tweets = pd.DataFrame(mentions_summary['top_mentions'])
Top_Mentions_in_tweets.columns = ['Mention', 'Count']
Top_Mentions_in_tweets.style.set_table_attributes('style="font-size: 11px"')
Top_Mentions_in_tweets[Top_Mentions_in_tweets['Count'] > 30]  # Only display mentions that have a count greater than 30
A look at the Mentions Summary
The output below provides the number of posts, the number of mentions, the mentions per post, and the number of unique mentions.
mention_summary = adv.extract_mentions(final_df['tweet'])
mention_summary['overview']
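To clarify what those four figures mean, here is a small standard-library sketch that computes comparable numbers by hand. It is not how Advertools implements its overview: the @\w+ pattern, the case-insensitive counting of unique mentions, the dictionary keys, and the sample posts with their handles are all assumptions made for illustration.

```python
import re

# Assumed mention pattern: an @ sign followed by word characters.
MENTION_PATTERN = re.compile(r"@\w+")


def mention_overview(posts):
    """Return the number of posts, mentions, mentions per post, and unique mentions."""
    mentions = [m.lower() for post in posts for m in MENTION_PATTERN.findall(post)]
    num_posts = len(posts)
    return {
        "num_posts": num_posts,
        "num_mentions": len(mentions),
        "mentions_per_post": len(mentions) / num_posts if num_posts else 0.0,
        "unique_mentions": len(set(mentions)),
    }


# Made-up posts with hypothetical handles.
sample_posts = [
    "Loved it @user_a",
    "@user_a and @user_b should watch this",
    "No mention here",
]
overview = mention_overview(sample_posts)
print(overview)
```

Here three posts contain three mentions in total, which gives one mention per post on average, and two of the handles are distinct.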
What are some questions that have been posted in the tweets?
question_summary = adv.extract_questions(final_df['tweet'])
# Keep only the tweets that contain at least one question
final_questions_list = [q for q in question_summary['question_text'] if q != []]
# Flatten the per-tweet lists into one row per question
questions_posed_in_tweets = pd.DataFrame(
    [question for questions in final_questions_list for question in questions],
    columns=['Question'],
)
questions_posed_in_tweets = questions_posed_in_tweets.drop_duplicates(keep='first')  # After removing duplicates, there are 3220 rows
# Print every row as a Markdown table
with pd.option_context('display.max_rows', None):
    print(questions_posed_in_tweets.to_markdown())
What are some exclamatory remarks that have been posted in the tweets?
exclamations_summary = adv.extract_exclamations(final_df['tweet'])
# Keep only the tweets that contain at least one exclamation
exclamations_list = [q for q in exclamations_summary['exclamation_text'] if q != []]
# Flatten the per-tweet lists into one row per exclamation
exclamations_posed_in_tweets = pd.DataFrame(
    [exclamation for exclamations in exclamations_list for exclamation in exclamations],
    columns=['Exclamation'],
)
exclamations_posed_in_tweets = exclamations_posed_in_tweets.drop_duplicates(keep='first')  # After removing duplicates, there are 3220 rows
# Print every row as a Markdown table
with pd.option_context('display.max_rows', None):
    print(exclamations_posed_in_tweets.to_markdown())
Twitter analysis has a multitude of uses. It can be used just for fun or for a very specific purpose, say to improve your business. Let's say you have a business (product/service) that you promote using Twitter. It really helps to mine and analyze opinions to understand your customers better. It will also help you understand the shortcomings of your product and thereby provide actionable insights to improve its quality.