8.6.3. Demo: Sentiment Analysis on Bluesky#
Now let’s try using sentiment analysis (and loop variables) on Bluesky:
Normal Bluesky Setup#
We’ll start by doing our normal steps including these helper functions:
helper function for atproto links#
NOTE: You don’t need to worry about the details of how this works. It is only here to make the later code easier to use.
import re  # load a "regular expression" library for helping to parse text
from atproto import IdResolver  # load the atproto IdResolver library to get official ATProto user IDs

# function to convert a feed from a weblink url to the special atproto "at" URI
def getATFeedLinkFromURL(url):
    # Get the user handle and feed id from the weblink url
    match = re.search(r'https://bsky\.app/profile/([^/]+)/feed/([^/]+)', url)
    if not match:
        raise ValueError("Invalid Bluesky feed URL format.")
    user_handle, feed_id = match.groups()

    # Get the official atproto user ID (did) from the handle
    resolver = IdResolver()
    did = resolver.handle.resolve(user_handle)
    if not did:
        raise ValueError(f'Could not resolve DID for handle "{user_handle}".')

    # Construct the at:// URI
    post_uri = f"at://{did}/app.bsky.feed.generator/{feed_id}"
    return post_uri

# function to convert a post's special atproto "at" URI to a weblink url
def getWebLinkFromPost(post):
    # Get the user id and post id from the post's "at" URI
    match = re.search(r'at://([^/]+)/app\.bsky\.feed\.post/([^/]+)', post.uri)
    if not match:
        raise ValueError("Invalid Bluesky atproto post URL format.")
    user_id, post_id = match.groups()
    post_uri = f"https://bsky.app/profile/{user_id}/post/{post_id}"
    return post_uri
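To see what the second helper does without connecting to Bluesky, here is a small sketch that applies the same pattern to a made-up post URI. The function name `web_link_from_at_uri` and the values `did:plc:example123` and `3kabc` are invented for illustration. Unlike getWebLinkFromPost, this hypothetical variant takes the URI text directly rather than a post object.

```python
import re

# Hypothetical standalone variant of getWebLinkFromPost that takes the
# "at" URI string directly instead of a post object.
def web_link_from_at_uri(at_uri):
    match = re.search(r'at://([^/]+)/app\.bsky\.feed\.post/([^/]+)', at_uri)
    if not match:
        raise ValueError("Invalid Bluesky atproto post URI format.")
    user_id, post_id = match.groups()
    return f"https://bsky.app/profile/{user_id}/post/{post_id}"

# Made-up URI, for illustration only
example_uri = "at://did:plc:example123/app.bsky.feed.post/3kabc"
example_link = web_link_from_at_uri(example_uri)
print(example_link)
```

The two captured groups (the user id and the post id) are simply placed into the web address pattern that bsky.app uses for posts.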
Now we can continue logging in to Bluesky and look through multiple posts.
load atproto library#
# Load some code called "Client" from the "atproto" library that will help us work with Bluesky
from atproto import Client
(optional) make a fake Bluesky connection with the fake_atproto library#
For testing purposes, we’ve added this line of code, which loads a fake version of atproto so that it won’t actually connect to Bluesky. If you want to connect to the real Bluesky, don’t run this line of code.
%run ../../fake_apis/fake_atproto.ipynb
login to Bluesky#
# Login to Bluesky
# TODO: put your account name and password below
client = Client(base_url="https://bsky.social")
client.login("your_account_name.bsky.social", "m#5@_fake_bsky_password_$%Ds")
find a list of posts from a feed#
We can now load a feed and find a list of posts.
Note: If you run this on real Bluesky, we can’t guarantee anything about how offensive the posts you find might be.
feedUrl = "https://bsky.app/profile/shouldhaveanimal.bsky.social/feed/aaab56iiatpdo"
atFeedLink = getATFeedLinkFromURL(feedUrl)
post_info_list = client.app.bsky.feed.get_feed({'feed': atFeedLink}).feed
Sentiment Analysis#
load sentiment analysis library and make analyzer#
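import nltk
nltk.download("vader_lexicon")  # download the word list the analyzer needs (skipped if already up-to-date)
from nltk.sentiment import SentimentIntensityAnalyzer
sia = SentimentIntensityAnalyzer()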
[nltk_data] Downloading package vader_lexicon to
[nltk_data] C:\Users\kmthayer\AppData\Roaming\nltk_data...
[nltk_data] Package vader_lexicon is already up-to-date!
loop through posts, finding average sentiment#
We can now combine our previous examples of looping through Bluesky posts with what we just learned about sentiment analysis and loop variables to find the average sentiment of a set of posts.
num_posts = 0
total_sentiment = 0

for post_info in post_info_list:
    # calculate sentiment
    post_sentiment = sia.polarity_scores(post_info.post.record.text)["compound"]
    num_posts += 1
    total_sentiment += post_sentiment

    print("Sentiment: " + str(post_sentiment))
    print(" post text: " + post_info.post.record.text)

average_sentiment = total_sentiment / num_posts
print("Average sentiment was " + str(average_sentiment))
Sentiment: 0.5093
post text: Look at my cute dog!
Sentiment: 0.3612
post text: I like lizards
Sentiment: 0.5093
post text: Look at my cute cat!
Average sentiment was 0.4599333333333333
We can now see the average sentiment of a set of Bluesky posts!
If you log in with your own Bluesky account name and password, you can change the feedUrl to whichever feed you want and see whether people are posting positively or negatively in it.
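One thing to watch for when you try other feeds: the loop divides by num_posts at the end, so a feed that returns no posts would cause a ZeroDivisionError. Below is a minimal sketch of the same loop-variable pattern with a check for that case. The names `average_score` and `toy_score` are hypothetical, and `toy_score` is a stand-in so the example runs without nltk. It is not a real sentiment analyzer.

```python
# Hypothetical helper: average the scores produced by any scoring function.
def average_score(texts, score_fn):
    total = 0
    count = 0
    for text in texts:
        total += score_fn(text)
        count += 1
    if count == 0:
        return None  # no posts, so there is no average to report
    return total / count

# Stand-in scorer for illustration only (NOT a real sentiment analyzer):
# 1 if the text contains the word "cute", otherwise 0.
def toy_score(text):
    return 1 if "cute" in text else 0

sample_texts = ["Look at my cute dog!", "I like lizards"]
print(average_score(sample_texts, toy_score))
print(average_score([], toy_score))
```

In the notebook, you could pass a function that returns sia.polarity_scores(text)["compound"] in place of the stand-in scorer.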
Alternatively, use search instead of a feed#
We can also do a search instead of looking up a feed.
There are some subtle differences in what comes back, and therefore in how we get the text out of each post. A search gives us a list of posts directly, while a feed gives us a list of post_info objects that each contain a post.
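As a rough illustration of that difference, the sketch below uses stand-in objects that mimic the two shapes (`item.post.record.text` for feed items and `item.record.text` for search results). The helper name `get_post_text` is hypothetical, and it assumes that search results have no `post` attribute of their own.

```python
from types import SimpleNamespace

# Hypothetical helper: return the text whether we were given a feed item
# (which wraps the post in a .post attribute) or a search result (the post itself).
def get_post_text(item):
    post = getattr(item, "post", item)  # fall back to the item itself if there is no .post
    return post.record.text

# Stand-in objects that mimic the two shapes used in this section
search_style = SimpleNamespace(record=SimpleNamespace(text="I like lizards"))
feed_style = SimpleNamespace(post=search_style)

print(get_post_text(feed_style))
print(get_post_text(search_style))
```

Both calls print the same text, which shows that the only difference between the two loops in this section is the extra `.post` step for feed items.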
search for posts#
Note: If you run this on real Bluesky, we can’t guarantee anything about how offensive the posts you find might be.
search_query = "news"
search_results = client.app.bsky.feed.search_posts({'q': search_query}).posts
load sentiment analysis library and make analyzer#
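import nltk
nltk.download("vader_lexicon")  # download the word list the analyzer needs (skipped if already up-to-date)
from nltk.sentiment import SentimentIntensityAnalyzer
sia = SentimentIntensityAnalyzer()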
[nltk_data] Downloading package vader_lexicon to
[nltk_data] C:\Users\kmthayer\AppData\Roaming\nltk_data...
[nltk_data] Package vader_lexicon is already up-to-date!
loop through posts, finding average sentiment#
We can now combine our previous examples of looping through Bluesky posts with what we just learned about sentiment analysis and loop variables to find the average sentiment of a set of search results.
num_posts = 0
total_sentiment = 0

for post in search_results:
    # calculate sentiment
    post_sentiment = sia.polarity_scores(post.record.text)["compound"]
    num_posts += 1
    total_sentiment += post_sentiment

    print("Sentiment: " + str(post_sentiment))
    print(" post text: " + post.record.text)

average_sentiment = total_sentiment / num_posts
print("Average sentiment was " + str(average_sentiment))
Sentiment: 0.784
post text: Breaking news: A lovely cat took a nice long nap today!
Sentiment: 0.0
post text: Breaking news: Someone said a really mean thing on the internet today!
Sentiment: 0.7088
post text: Breaking news: Some grandparents made some yummy cookies for all the kids to share!
Sentiment: -0.6114
post text: Breaking news: All the horrors of the universe revealed at last!
Average sentiment was 0.22034999999999996
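To help read these numbers, here is a small sketch that turns each compound score into a label. The cutoffs of 0.05 and -0.05 are the ones commonly suggested for the VADER analyzer that nltk uses here. Treat them as a convention, not a rule, and note that the helper name `label_sentiment` is hypothetical.

```python
# Hypothetical helper using the commonly cited VADER cutoffs of +/-0.05
def label_sentiment(compound):
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"

# Compound scores copied from the output above
scores = [0.784, 0.0, 0.7088, -0.6114]
labels = [label_sentiment(score) for score in scores]
print(labels)
```

Notice that the post about someone saying "a really mean thing" scored 0.0 and so comes out as neutral, and that the positive average hides one strongly negative post. Both are reminders to look at individual posts and not only at the average.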