{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "485c4462-5cf7-4524-8f57-874e8eb6209d",
   "metadata": {},
   "source": [
    "# Demo: Sentiment Analysis on Bluesky"
   ]
  },

                    
  {
   "cell_type": "markdown",
   "id": "cfa189d1-5015-4948-a9be-4c04443bd879",
   "metadata": {},
   "source": [
    "Now let's try using sentiment analysis (and loop variables) on Bluesky:\n",
    "\n",
    "## Normal Bluesky Setup\n",
    "\n",
    "We'll start by doing our normal steps including these helper functions:\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "767dd78e-290d-436a-bbf7-9a2a4f62c6c4",
   "metadata": {},
   "source": [
    "### helper function for atproto links\n",
    "_NOTE: You don't need to worry about the details of how this works, it just is here to make the code later easier to use._"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f886a5cd-7c3e-483a-be2c-67e5e8d1c601",
   "metadata": {},
   "outputs": [],
   "source": [
    "import re #load a \"regular expression\" library for helping to parse text\n",
    "from atproto import IdResolver # Load the atproto IdResolver library to get offical ATProto user IDs\n",
    "\n",
    "# function to convert a feed from a weblink url to the special atproto \"at\" URI\n",
    "def getATFeedLinkFromURL(url):\n",
    "    \n",
    "    # Get the user did and feed id from the weblink url\n",
    "    match = re.search(r'https://bsky.app/profile/([^/]+)/feed/([^/]+)', url)\n",
    "    if not match:\n",
    "        raise ValueError(\"Invalid Bluesky feed URL format.\")\n",
    "    user_handle, feed_id = match.groups()\n",
    "\n",
    "    # Get the official atproto user ID (did) from the handle\n",
    "    resolver = IdResolver()\n",
    "    did = resolver.handle.resolve(user_handle)\n",
    "    if not did:\n",
    "        raise ValueError(f'Could not resolve DID for handle \"{user_handle}\".')\n",
    "\n",
    "    # Construct the at:// URI\n",
    "    post_uri = f\"at://{did}/app.bsky.feed.generator/{feed_id}\"\n",
    "\n",
    "    return post_uri\n",
    "\n",
    "# function to convert a post's special atproto \"at\" URI to a weblink url\n",
    "def getWebLinkFromPost(post):\n",
    "    # Get the user id and post id from the weblink url\n",
    "    match = re.search(r'at://([^/]+)/app.bsky.feed.post/([^/]+)', post.uri)\n",
    "    if not match:\n",
    "        raise ValueError(\"Invalid Bluesky atproto post URL format.\")\n",
    "    user_id, post_id = match.groups()\n",
    "\n",
    "    post_uri = f\"https://bsky.app/profile/{user_id}/post/{post_id}\"\n",
    "    return post_uri"
   ]
  },
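  {
   "cell_type": "markdown",
   "id": "b7e2a410-regex-sketch-md",
   "metadata": {},
   "source": [
    "If you are curious, here is a minimal sketch of what the regular expression in `getATFeedLinkFromURL` pulls out of a feed weblink. It only does the text-matching step and never contacts Bluesky. The names `example_url`, `example_match`, `example_handle`, and `example_feed_id` are made up for this illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b7e2a410-regex-sketch-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "\n",
    "# An example feed weblink (the same feed used later in this demo)\n",
    "example_url = \"https://bsky.app/profile/shouldhaveanimal.bsky.social/feed/aaab56iiatpdo\"\n",
    "\n",
    "# Each ([^/]+) captures one piece of the url between slashes\n",
    "example_match = re.search(r'https://bsky\\.app/profile/([^/]+)/feed/([^/]+)', example_url)\n",
    "example_handle, example_feed_id = example_match.groups()\n",
    "\n",
    "print(\"user handle: \" + example_handle)\n",
    "print(\"feed id: \" + example_feed_id)"
   ]
  },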
  {
   "cell_type": "markdown",
   "id": "0484f5ca-586e-4239-883b-1c472372e323",
   "metadata": {},
   "source": [
    "Now we can continue logging in to Bluesky and look through multiple posts.\n",
    "### load atproto library"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0d45a981-86cd-41f0-bc0a-066afdc985b4",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Load some code called \"Client\" from the \"atproto\" library that will help us work with Bluesky\n",
    "from atproto import Client"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8cdfc113-a6b0-4c6f-a1f1-06a47bb83925",
   "metadata": {},
   "source": [
    "### (optional) make a fake Bluesky connection with the fake_atproto library\n",
    "For testing purposes, we\"ve added this line of code, which loads a fake version of atproto, so it wont actually connect to Bluesky. __If you want to try to actually connect to Bluesky, don't run this line of code.__"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f5c5eac2-09d3-4244-b4d4-63942fda66ef",
   "metadata": {},
   "outputs": [],
   "source": [
    "%run ../../fake_apis/fake_atproto.ipynb"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9d1ed88a-014b-4918-b389-3d410640b060",
   "metadata": {},
   "source": [
    "### login to Bluesky"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "72567e4d-e517-43f1-a949-49fb29120ddf",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Login to Bluesky\n",
    "# TODO: put your account name and password below\n",
    "\n",
    "client = Client(base_url=\"https://bsky.social\")\n",
    "client.login(\"your_account_name.bsky.social\", \"m#5@_fake_bsky_password_$%Ds\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "800c5cf7-bc4a-4f09-b610-49bea3827699",
   "metadata": {},
   "source": [
    "### find a list of posts from a feed\n",
    "We can now load a feed and find a list of posts.\n",
    "\n",
    "_Note: If you run this on real Bluesky, we canâ€™t gurantee anything about how offensive what you might find is._"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "999f3df5-e058-46f0-9c2d-e2c1af048248",
   "metadata": {},
   "outputs": [],
   "source": [
    "feedUrl = \"https://bsky.app/profile/shouldhaveanimal.bsky.social/feed/aaab56iiatpdo\"\n",
    "atFeedLink = getATFeedLinkFromURL(feedUrl)\n",
    "\n",
    "post_info_list = client.app.bsky.feed.get_feed({'feed': atFeedLink}).feed"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a72f67fc-09df-446f-8956-b1f28fd3d0bd",
   "metadata": {},
   "source": [
    "## Sentiment Analysis\n",
    "### load sentiment analysis library and make analyzer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a476b639-df99-4737-83f8-a37fb3654b50",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import nltk\n",
    "nltk.download([\"vader_lexicon\"])\n",
    "from nltk.sentiment import SentimentIntensityAnalyzer\n",
    "sia = SentimentIntensityAnalyzer()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "549b8cd4-61a6-42b3-ad32-7dff878ddc9c",
   "metadata": {},
   "source": [
    "### loop through submissions, finding average sentiment\n",
    "We can now combine our previous examples of looping through reddit submissions with what we just learned of sentiment analysis and looping variables to find the average sentiment of a set of submission titles."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ec912f23-54af-44ce-86b7-3f445bad5fdf",
   "metadata": {},
   "outputs": [],
   "source": [
    "num_posts = 0\n",
    "total_sentiment = 0\n",
    "\n",
    "for post_info in post_info_list:\n",
    "    \n",
    "    #calculate sentiment\n",
    "    post_sentiment = sia.polarity_scores(post_info.post.record.text)[\"compound\"]\n",
    "    num_posts += 1\n",
    "    total_sentiment += post_sentiment\n",
    "\n",
    "    print(\"Sentiment: \" + str(post_sentiment))\n",
    "    print(\"   post text: \" + post_info.post.record.text)\n",
    "    print()\n",
    "\n",
    "\n",
    "average_sentiment = total_sentiment / num_posts\n",
    "print(\"Average sentiment was \" + str(average_sentiment))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c97763a9-325f-4fd0-8b57-501a068f6335",
   "metadata": {},
   "source": [
    "We can now see the average sentiment of a set of bluesky posts! \n",
    "\n",
    "If you use your bluesky bot keys, you can change the `feedUrl` to be whatever one you want and see whether people are posting positively or negatively in it. "
   ]
  },
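  {
   "cell_type": "markdown",
   "id": "c3d9f512-label-sketch-md",
   "metadata": {},
   "source": [
    "One way to read these numbers is to turn each compound score into a label. The sketch below assumes the cutoffs of 0.05 and -0.05 that are commonly suggested for VADER compound scores. The helper name `label_sentiment` and the example scores are made up for illustration, and you may want different cutoffs for your own data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c3d9f512-label-sketch-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hypothetical helper: turn a compound score (between -1 and 1) into a label\n",
    "def label_sentiment(compound_score, cutoff=0.05):\n",
    "    if compound_score >= cutoff:\n",
    "        return \"positive\"\n",
    "    elif compound_score <= -cutoff:\n",
    "        return \"negative\"\n",
    "    else:\n",
    "        return \"neutral\"\n",
    "\n",
    "# Try the helper on a few made-up scores\n",
    "for example_score in [0.6, 0.0, -0.6]:\n",
    "    print(str(example_score) + \" is \" + label_sentiment(example_score))"
   ]
  },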
  {
   "cell_type": "markdown",
   "id": "89d37bac-70ec-495f-bac9-271d38c27ead",
   "metadata": {},
   "source": [
    "## Alternately use search instead of a feed\n",
    "We can also do a search instead of looking up a feed.\n",
    "\n",
    "There are some subtle differences with how things come back, and therefore how we get the text out of the post (basically we get back a list of posts, instead of a list of post_info)\n",
    "\n",
    "### search for posts\n",
    "_Note: If you run this on real Bluesky, we canâ€™t gurantee anything about how offensive what you might find is._"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bebe5195-9beb-4ee3-b279-f49e21090b75",
   "metadata": {},
   "outputs": [],
   "source": [
    "search_query = \"news\"\n",
    "search_results = client.app.bsky.feed.search_posts({'q': search_query}).posts"
   ]
  },
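  {
   "cell_type": "markdown",
   "id": "d4a6e731-shape-sketch-md",
   "metadata": {},
   "source": [
    "To make the difference between feed results and search results concrete, here is a small sketch using stand-in objects instead of real Bluesky data. The names `fake_post` and `fake_post_info` are made up, and the objects only imitate the few attributes this demo uses (`post`, `record`, and `text`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d4a6e731-shape-sketch-code",
   "metadata": {},
   "outputs": [],
   "source": [
    "from types import SimpleNamespace\n",
    "\n",
    "# Stand-in objects (not real atproto objects) with just the attributes we need\n",
    "fake_post = SimpleNamespace(record=SimpleNamespace(text=\"Hello Bluesky!\"))\n",
    "fake_post_info = SimpleNamespace(post=fake_post)\n",
    "\n",
    "# Feed results: each item wraps a post, so the text is at .post.record.text\n",
    "feed_text = fake_post_info.post.record.text\n",
    "\n",
    "# Search results: each item is the post itself, so the text is at .record.text\n",
    "search_text = fake_post.record.text\n",
    "\n",
    "print(\"from feed item: \" + feed_text)\n",
    "print(\"from search item: \" + search_text)"
   ]
  },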
  {
   "cell_type": "markdown",
   "id": "f45c1738-2ac2-4171-b9f4-24f2fa77bf78",
   "metadata": {},
   "source": [
    "### load sentiment analysis library and make analyzerimport nltk"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a8e1da18-eadc-46b1-a9d0-4e20ff25967d",
   "metadata": {},
   "outputs": [],
   "source": [
    "import nltk\n",
    "nltk.download([\"vader_lexicon\"])\n",
    "from nltk.sentiment import SentimentIntensityAnalyzer\n",
    "sia = SentimentIntensityAnalyzer()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1b954975-b515-4919-aee8-132b1e9279e3",
   "metadata": {},
   "source": [
    "### loop through submissions, finding average sentiment\n",
    "We can now combine our previous examples of looping through reddit submissions with what we just learned of sentiment analysis and looping variables to find the average sentiment of a set of submission titles."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1f59f929-6249-4f8c-8f2f-5ef2fde7ab7b",
   "metadata": {},
   "outputs": [],
   "source": [
    "num_posts = 0\n",
    "total_sentiment = 0\n",
    "\n",
    "for post in search_results:\n",
    "    \n",
    "    #calculate sentiment\n",
    "    post_sentiment = sia.polarity_scores(post.record.text)[\"compound\"]\n",
    "    num_posts += 1\n",
    "    total_sentiment += post_sentiment\n",
    "\n",
    "    print(\"Sentiment: \" + str(post_sentiment))\n",
    "    print(\"   post text: \" + post.record.text)\n",
    "    print()\n",
    "\n",
    "\n",
    "average_sentiment = total_sentiment / num_posts\n",
    "print(\"Average sentiment was \" + str(average_sentiment))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "01b13d1a-701c-4a84-b13b-1b42b22007d1",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}