8.6.3. Demo: Sentiment Analysis on Bluesky#
Now let’s try using sentiment analysis (and loop variables) on Bluesky:
Normal Bluesky Setup#
Now we can continue logging in to Bluesky and look through multiple posts.
load atproto library#
# Load some code called "Client" from the "atproto" library that will help us work with Bluesky
from atproto import Client
(optional) make a fake Bluesky connection with the fake_atproto library#
For testing purposes, we’ve added this line of code, which loads a fake version of atproto so that it won’t actually connect to Bluesky. If you want to connect to the real Bluesky, don’t run this line of code.
%run ../../fake_apis/fake_atproto.ipynb
login to Bluesky#
# Login to Bluesky
# TODO: put your account name and password below
client = Client(base_url="https://bsky.social")
client.login("your_account_name.bsky.social", "m#5@_fake_bsky_password_$%Ds")
Do a search for posts on Bluesky#
We’ll now search for posts on Bluesky.
Note: If you run this on real Bluesky, we can’t guarantee anything about how offensive the posts you find might be.
search_query = "news"
search_results = client.app.bsky.feed.search_posts({'q': search_query}).posts
Sentiment Analysis#
load sentiment analysis library and make analyzer#
import nltk
nltk.download(["vader_lexicon"])
from nltk.sentiment import SentimentIntensityAnalyzer
sia = SentimentIntensityAnalyzer()
[nltk_data] Downloading package vader_lexicon to
[nltk_data] C:\Users\kmthayer\AppData\Roaming\nltk_data...
[nltk_data] Package vader_lexicon is already up-to-date!
loop through posts, finding average sentiment#
We can now combine our previous examples of looping through posts with what we just learned about sentiment analysis and loop variables to find the average sentiment of a set of posts.
num_posts = 0
total_sentiment = 0
for post in search_results:
    # calculate sentiment
    post_sentiment = sia.polarity_scores(post.record.text)["compound"]
    num_posts += 1
    total_sentiment += post_sentiment
    print("Sentiment: " + str(post_sentiment))
    print(" post text: " + post.record.text)
    print()
average_sentiment = total_sentiment / num_posts
print("Average sentiment was " + str(average_sentiment))
Sentiment: 0.784
post text: Breaking news: A lovely cat took a nice long nap today!
Sentiment: 0.0
post text: Breaking news: Someone said a really mean thing on the internet today!
Sentiment: 0.7088
post text: Breaking news: Some grandparents made some yummy cookies for all the kids to share!
Sentiment: -0.6114
post text: Breaking news: All the horrors of the universe revealed at last!
Average sentiment was 0.22034999999999996
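The loop above divides by num_posts, so it would stop with a ZeroDivisionError if a search returned no posts. As a minimal sketch, the same running-total pattern can be wrapped in a function with a guard for that case. The function name average_sentiment_of is made up for this illustration (it is not part of nltk or atproto), and the scores are copied from the example output above so the code runs without connecting to Bluesky.

```python
# Hypothetical helper (not part of nltk or atproto): averages a list of
# compound sentiment scores using the same loop variables as the demo.
def average_sentiment_of(scores):
    num_posts = 0
    total_sentiment = 0
    for score in scores:
        num_posts += 1
        total_sentiment += score
    # Guard against an empty list, which would otherwise divide by zero
    if num_posts == 0:
        return None
    return total_sentiment / num_posts

# Compound scores copied from the example output above
example_scores = [0.784, 0.0, 0.7088, -0.6114]
print("Average sentiment was " + str(average_sentiment_of(example_scores)))
print("Average of no posts: " + str(average_sentiment_of([])))
```

Returning None for an empty list is one possible design choice; printing a message or skipping the average entirely would work just as well.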
We can now see the average sentiment of a set of Bluesky posts!
If you log in with a real Bluesky account, you can change the search_query to whatever you want and see whether people are posting positively or negatively about it.
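To turn an average compound score into a plain answer of "positive" or "negative", we need to pick cutoff values. The sketch below assumes a commonly used convention for VADER compound scores: 0.05 or higher counts as positive, -0.05 or lower counts as negative, and anything in between counts as neutral. Both the cutoff and the function name describe_sentiment are assumptions for illustration, not something defined in this chapter.

```python
# Hypothetical helper: converts a compound score (between -1 and 1) into a label.
# The default threshold of 0.05 is an assumed convention, not a fixed rule.
def describe_sentiment(compound_score, threshold=0.05):
    if compound_score >= threshold:
        return "positive"
    if compound_score <= -threshold:
        return "negative"
    return "neutral"

# Scores taken from the example output in this section
print(describe_sentiment(0.22034999999999996))
print(describe_sentiment(-0.6114))
print(describe_sentiment(0.0))
```

A different threshold would change which posts count as neutral, so the label is only as meaningful as the cutoff you choose.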
Alternatively, look up a feed instead of running a search#
We can also look up a feed instead of running a search.
There are some subtle differences in how the results come back, and therefore in how we get the text out of each post: we get back a list of post_info objects, each of which wraps a post, instead of a list of posts.
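The difference comes down to one extra step when reading the text. The sketch below uses stand-in objects built with Python's SimpleNamespace that only mimic the attributes used in this section. They are not real atproto objects, and the variable names search_post and feed_item are made up for illustration.

```python
from types import SimpleNamespace

# Stand-in for one item from a search: the item is the post itself
search_post = SimpleNamespace(
    record=SimpleNamespace(text="I like lizards")
)

# Stand-in for one item from a feed: the item wraps the post in a .post field
feed_item = SimpleNamespace(
    post=SimpleNamespace(record=SimpleNamespace(text="Look at my cute dog!"))
)

# From a search result, the text is at .record.text
search_text = search_post.record.text
# From a feed item, there is one extra step: .post.record.text
feed_text = feed_item.post.record.text

print(search_text)
print(feed_text)
```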
Before we begin though, we need a helper function.
helper function for atproto feed links#
NOTE: You don’t need to worry about the details of how this works; it is just here to make the later code easier to use.
import re # load a "regular expression" library for helping to parse text
from atproto import IdResolver # Load the atproto IdResolver library to get official ATProto user IDs

# function to convert a feed from a weblink url to the special atproto "at" URI
def get_at_feed_link_from_url(url):
    # Get the user handle and feed id from the weblink url
    match = re.search(r'https://bsky\.app/profile/([^/]+)/feed/([^/]+)', url)
    if not match:
        raise ValueError("Invalid Bluesky feed URL format.")
    user_handle, feed_id = match.groups()

    # Get the official atproto user ID (did) from the handle
    resolver = IdResolver()
    did = resolver.handle.resolve(user_handle)
    if not did:
        raise ValueError(f'Could not resolve DID for handle "{user_handle}".')

    # Construct the at:// URI for the feed
    feed_uri = f"at://{did}/app.bsky.feed.generator/{feed_id}"
    return feed_uri
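If you are curious, the first step of the helper can be tried on its own without contacting Bluesky. The sketch below runs only the regular expression on an example feed link and then builds the "at" URI using a placeholder ID. The value did:plc:example123 is invented for illustration; a real DID comes from the IdResolver lookup in the helper.

```python
import re

# An example feed weblink (the Animals feed used in this section)
feed_url = "https://bsky.app/profile/shouldhaveanimal.bsky.social/feed/aaab56iiatpdo"

# Pull the user handle and feed id out of the weblink
match = re.search(r'https://bsky\.app/profile/([^/]+)/feed/([^/]+)', feed_url)
user_handle, feed_id = match.groups()
print(user_handle)
print(feed_id)

# Placeholder DID, for illustration only (NOT the real ID for this handle)
example_did = "did:plc:example123"

# Build the at:// URI in the same shape the helper function returns
feed_uri = f"at://{example_did}/app.bsky.feed.generator/{feed_id}"
print(feed_uri)
```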
load posts from a feed#
Now we can put in a feed url weblink and load the recent posts from that feed. Below we have a link to an Animals feed. You can change the feed_url to another feed if you want.
Note: If you run this on real Bluesky, we can’t guarantee anything about how offensive the posts you find might be.
feed_url = "https://bsky.app/profile/shouldhaveanimal.bsky.social/feed/aaab56iiatpdo"
at_feed_link = get_at_feed_link_from_url(feed_url)
post_info_list = client.app.bsky.feed.get_feed({'feed': at_feed_link}).feed
load sentiment analysis library and make analyzer#
import nltk
nltk.download(["vader_lexicon"])
from nltk.sentiment import SentimentIntensityAnalyzer
sia = SentimentIntensityAnalyzer()
[nltk_data] Downloading package vader_lexicon to
[nltk_data] C:\Users\kmthayer\AppData\Roaming\nltk_data...
[nltk_data] Package vader_lexicon is already up-to-date!
loop through feed posts, finding average sentiment#
We can now combine our previous examples of looping through posts with what we just learned about sentiment analysis and loop variables to find the average sentiment of the posts in a feed.
num_posts = 0
total_sentiment = 0
for post_info in post_info_list:
    # calculate sentiment
    post_sentiment = sia.polarity_scores(post_info.post.record.text)["compound"]
    num_posts += 1
    total_sentiment += post_sentiment
    print("Sentiment: " + str(post_sentiment))
    print(" post text: " + post_info.post.record.text)
    print()
average_sentiment = total_sentiment / num_posts
print("Average sentiment was " + str(average_sentiment))
Sentiment: 0.5093
post text: Look at my cute dog!
Sentiment: 0.3612
post text: I like lizards
Sentiment: 0.5093
post text: Look at my cute cat!
Average sentiment was 0.4599333333333333