A5: Best Comments#
In this assignment you will modify a recursive function that prints the posts and replies in a Discord channel (ignoring threads). Your goal is to show only the best posts and replies. It is up to you to decide which rules determine the best posts.
Discord Setup#
# Load some code called "discord" that will help us work with Discord
import discord
# Load another library that helps the bot work in Jupyter Notebook
import nest_asyncio
nest_asyncio.apply()
(optional) make a fake Discord connection with the fake_discord library
For testing purposes, we’ve added this line of code, which loads a fake version of discord, so it won’t actually connect to Discord. If you want to actually connect to Discord, don’t run this line of code.
%run ../../../../fake_apis/fake_discord.ipynb
%run discord_keys.py
# set up Discord client with permissions to read message_contents
intents = discord.Intents.default()
intents.message_content = True
Helper function to display text in an indented box#
(You don’t need to worry about how this works. This is the function that displays posts in indented boxes.)
from IPython.display import HTML, Image, display
import html
def display_indented(text, left_margin=0, color="white"):
    display(
        HTML(
            "<pre style='border:solid 1px;padding:3px;margin-left:"+str(left_margin)+"px;background-color:"+color+"'>" +
            html.escape(text) +
            "</pre>"
        )
    )
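As a rough illustration of what the helper above produces, here is a minimal sketch that builds the same HTML string but returns it instead of displaying it. The name build_indented_html is hypothetical and not part of the notebook; the point is only to show that html.escape keeps any HTML-like characters in a post from being interpreted as markup.

```python
import html

def build_indented_html(text, left_margin=0, color="white"):
    # Hypothetical variant of display_indented: returns the HTML string
    # rather than rendering it, so the result can be inspected directly
    return (
        "<pre style='border:solid 1px;padding:3px;margin-left:" + str(left_margin)
        + "px;background-color:" + color + "'>"
        + html.escape(text)
        + "</pre>"
    )

# A post containing HTML-like characters is escaped, not rendered as markup
snippet = build_indented_html("<b>hi</b> & bye", left_margin=20)
print(snippet)
```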
Helper function to reconstruct reply tree#
(You don’t need to worry about how this works. This is the function that takes the list of posts from the channel history and organizes it into a proper reply tree structure.)
async def reconstruct_reply_tree(recent_posts):
    # make a post + replies entry for each post (replies empty for now)
    posts_with_replies_info = [{"post": recent_post, "replies": []} for recent_post in recent_posts]
    # create look-up dictionary for the post+replies entries based on the post id
    post_with_replies_lookup = {post_with_replies["post"].id: post_with_replies for post_with_replies in posts_with_replies_info}
    # start a list that will become our post tree
    post_tree = []
    # go through all the posts_with_replies_info, and either add them to the post they are in
    # reply to (if they are a reply), or add them directly to the post_tree otherwise
    for post_with_replies in posts_with_replies_info:
        if(post_with_replies["post"].type == discord.MessageType.reply):
            # if post is a reply, find what it is a reply to and add it to the replies list of that post
            reply_to_id = post_with_replies["post"].reference.message_id
            if reply_to_id in post_with_replies_lookup:
                # if we find the post this was a reply to,
                # add this post_with_replies to the replies of that post_with_replies info
                reply_to_post_with_replies_info = post_with_replies_lookup[reply_to_id]
                reply_to_post_with_replies_info['replies'].append(post_with_replies)
            else:
                # if we couldn't find the post this was in reply to, print warning and
                # just add it as a regular post
                print("Warning could not find post: " + str(reply_to_id) + ", which message " + str(post_with_replies["post"].id) + " replied to")
                post_tree.append(post_with_replies)
        else: # not a reply, just add to post_tree directly
            post_tree.append(post_with_replies)
    return post_tree
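To make the tree-building idea concrete, here is a simplified, synchronous sketch that runs without Discord. It is not the notebook's helper: build_reply_tree and the reply_to field are hypothetical stand-ins for the real message type and reference, and unlike the helper above it does not print a warning when a parent post is missing.

```python
from types import SimpleNamespace

def build_reply_tree(posts):
    # One entry per post, with an empty replies list to start
    entries = [{"post": p, "replies": []} for p in posts]
    # Look-up table from post id to its entry
    lookup = {entry["post"].id: entry for entry in entries}
    tree = []
    for entry in entries:
        parent_id = entry["post"].reply_to  # hypothetical field; None means top-level
        if parent_id in lookup:
            # Attach the reply underneath the post it answers
            lookup[parent_id]["replies"].append(entry)
        else:
            # Top-level post (or parent not in the loaded history)
            tree.append(entry)
    return tree

# Stand-in posts (not real discord messages)
posts = [
    SimpleNamespace(id=1, reply_to=None, content="I saw a movie once!"),
    SimpleNamespace(id=2, reply_to=1, content="I saw one too!"),
    SimpleNamespace(id=3, reply_to=2, content="What a coincidence!"),
    SimpleNamespace(id=4, reply_to=None, content="Good morning everyone!"),
]
tree = build_reply_tree(posts)
print(len(tree), "top-level posts")
```

Because every entry is looked up by id, a reply to a reply ends up nested two levels deep, which is what lets the printing code recurse through the whole conversation.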
Helper function to load the recent posts from a channel and return the reply tree#
(You don’t need to worry about how this works. This is the function that gets the recent history from a channel and then uses the reconstruct_reply_tree function to turn it into a reply tree data structure. By default, hist_limit is set to get the most recent 30 posts.)
def get_channel_post_tree(channel_id, hist_limit=30):
    # set up discord connection
    client = discord.Client(intents=intents)
    # Provide instructions for what your discord bot should do once it has logged in
    @client.event
    async def on_ready():
        global reply_tree # Save the reply_tree variable outside our running bot
        # Load the discord channel you want to read from
        channel = client.get_channel(channel_id)
        # Get the latest posts in the channel history
        post_history = channel.history(limit=hist_limit)
        # special code to turn the post_history from discord into a python list
        recent_posts = [post async for post in post_history]
        reply_tree = await reconstruct_reply_tree(recent_posts)
        # Tell your bot to stop running
        await client.close()
    # Now that we've defined how the bot should work, start running your bot
    client.run(discord_token)
    return reply_tree
Code to print a channel’s recent posts and replies#
We are providing a function that recursively prints a channel’s recent posts and replies, but it depends on whether the should_display function returns True or False to decide if it actually displays a post. (Note: if should_display returns False for a post, the post won’t be displayed, nor will any replies to it.)
The print_channel_post_and_replies function takes a channel_id, loads the reply post_tree from that channel, and then uses the print_post_and_replies function to print out all posts and replies. By default, show_hidden is set to False (meaning it won’t show anything for posts where should_display returned False; setting it to True will show them in red), and hist_limit is set to load the most recent 30 posts (but you can raise it up to 100).
def print_channel_post_and_replies(channel_id, show_hidden=False, hist_limit=30):
    post_tree = get_channel_post_tree(channel_id, hist_limit=hist_limit)
    print("Below are the posts and replies for post from channel " + str(channel_id) + ":" )
    for post_with_replies_info in post_tree:
        print_post_and_replies(post_with_replies_info, show_hidden=show_hidden)
The print_post_and_replies function takes a given post_with_replies_info and recursively prints that post as well as all replies to that post (as well as all replies to those replies, etc.).
def print_post_and_replies(post_with_replies_info, num_indents=0, show_hidden=False):
    # for convenience save the post and replies info in variables
    post = post_with_replies_info["post"]
    replies = post_with_replies_info["replies"]
    # save the text to display in a post box
    display_text = (
        str(post.content) + "\n" +
        "-- " + str(post.author)
    )
    if(should_display(post)): # check if we should display this comment
        # display the text of this post, indented over
        display_indented(display_text, num_indents*20)
        #print replies (and the replies of those, etc.)
        for reply in replies:
            print_post_and_replies(reply, num_indents = num_indents + 1, show_hidden=show_hidden)
    elif(show_hidden): #If we want to still see which posts we are hiding, color them LightCoral so we can see they are hidden
        display_indented(display_text, num_indents*20, color='LightCoral')
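The recursion can be easier to follow in a text-only form. The sketch below is a hypothetical analogue of print_post_and_replies that works on plain nested dictionaries and returns a list of lines instead of displaying boxes; collect_visible_lines, contains_capital_i, and the "text" key are made-up names for illustration. It shows the same behavior described above: when a post is hidden, its replies are never visited, even with show_hidden turned on.

```python
def collect_visible_lines(entry, should_show, num_indents=0, show_hidden=False):
    # Hypothetical text-only analogue of print_post_and_replies
    lines = []
    text = entry["text"]
    if should_show(text):
        # Shown post: add it, then recurse into its replies one level deeper
        lines.append("  " * num_indents + text)
        for reply in entry["replies"]:
            lines.extend(
                collect_visible_lines(reply, should_show, num_indents + 1, show_hidden)
            )
    elif show_hidden:
        # Hidden post: mark it, but do not recurse into its replies
        lines.append("  " * num_indents + "[hidden] " + text)
    return lines

def contains_capital_i(text):
    # Same demonstration rule as the starter should_display
    return "I" in text

# Stand-in thread (plain dictionaries, not discord messages)
thread = {
    "text": "I saw a movie once!",
    "replies": [
        {"text": "What a coincidence!", "replies": [
            {"text": "I agree!", "replies": []},
        ]},
        {"text": "I never saw a movie :(", "replies": []},
    ],
}

for line in collect_visible_lines(thread, contains_capital_i, show_hidden=True):
    print(line)
```

Note that "I agree!" would pass the rule on its own, but it never appears because the post it replies to was hidden. This is worth keeping in mind when judging how strict your own rule is.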
TODO: Create Your Content Moderation Algorithm#
Your job is to invent and implement your own rule inside the should_display function for which comments count as the “best comments” and therefore should be displayed. The rule can be complicated or simple; it just can’t be the same as the current rule. You can aim to hide only a few comments that you judge to be bad, to show only a few comments that you judge to be the very best, or some combination of those.
When you are making your rule, you may want to use different comparison operators (like == for equals, > for greater than, etc.) and different logical operators (like and for both things must be true, or for at least one thing must be true, etc.). See a list of operators here: https://www.w3schools.com/python/python_operators.asp
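As one possible illustration of combining comparison and logical operators, the sketch below shows a rule that keeps posts that are reasonably long or ask a question, unless they are written entirely in capital letters. The name example_rule, the thresholds, and the stand-in post objects are assumptions made up for this example; your own rule should be different.

```python
from types import SimpleNamespace

def example_rule(post):
    # Comparison operator: is the post at least 20 characters long?
    long_enough = len(post.content) >= 20
    # Membership test: does the post contain a question mark?
    is_question = "?" in post.content
    # isupper() is True when every letter in the text is a capital letter
    is_shouting = post.content.isupper()
    # Logical operators combine the individual checks into one decision
    return (long_enough or is_question) and not is_shouting

# Stand-in posts (not real discord messages) so the sketch runs on its own
samples = [
    SimpleNamespace(content="ok"),
    SimpleNamespace(content="Why?"),
    SimpleNamespace(content="THIS IS A VERY LOUD MESSAGE"),
    SimpleNamespace(content="I really enjoyed that movie."),
]
results = [example_rule(post) for post in samples]
print(results)
```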
Some things you can use when you are deciding whether to display a post or not:
The text of the post: post.content
The post created time: post.created_at
The list of reactions: post.reactions (see more about reactions in the official docs)
Is the message pinned?: post.pinned
You can see more by looking at the official documentation for lists of attributes of a discord message.
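Here is a hedged sketch of how the message attributes listed above might feed into a rule. It assumes that each entry in post.reactions has a count attribute and that post.created_at is a timezone-aware datetime, as described in the discord.py documentation; the fake_discord library may not provide all of these. The function names, the thresholds, and the stand-in posts are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace

def total_reactions(post):
    # Sum the per-emoji counts across every reaction on the message
    return sum(reaction.count for reaction in post.reactions)

def message_rule(post, now):
    # Show pinned posts, plus recent posts with at least two reactions
    is_recent = (now - post.created_at) <= timedelta(days=7)
    return post.pinned or (is_recent and total_reactions(post) >= 2)

# Stand-in objects (not real discord messages) with a fixed "current" time
now = datetime(2024, 5, 10, tzinfo=timezone.utc)
popular = SimpleNamespace(
    content="I saw a movie once!",
    created_at=now - timedelta(days=1),
    reactions=[SimpleNamespace(emoji="thumbsup", count=2),
               SimpleNamespace(emoji="clapper", count=1)],
    pinned=False,
)
old_pinned = SimpleNamespace(
    content="Channel rules",
    created_at=now - timedelta(days=90),
    reactions=[],
    pinned=True,
)
quiet = SimpleNamespace(
    content="Good morning everyone!",
    created_at=now - timedelta(days=1),
    reactions=[],
    pinned=False,
)

for post in (popular, old_pinned, quiet):
    print(post.content, "->", message_rule(post, now))
```

Inside should_display, which only receives the post, you would likely compute the current time yourself, for example with datetime.now(timezone.utc).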
You can also look at attributes of the author such as:
Author name: post.author.display_name
When was the author’s account created?: post.author.created_at
Is the author labeled as a bot?: post.author.bot
The author public flags: post.author.public_flags (like spammer, see official docs on PublicUserFlags)
You can see more by looking at lists of attributes of a discord user.
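The author attributes can be used in a similar way. The sketch below hides posts from bots, from accounts flagged as spammers, and from accounts younger than 30 days. It assumes the attributes behave as the discord.py documentation describes; author_rule, make_post, and the 30-day threshold are hypothetical choices for illustration, and the fake_discord library may not provide every attribute.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace

def author_rule(post, now, min_account_age_days=30):
    author = post.author
    # Hide posts from bots and from accounts flagged as spammers
    if author.bot or author.public_flags.spammer:
        return False
    # Hide posts from very new accounts
    return (now - author.created_at) >= timedelta(days=min_account_age_days)

def make_post(now, bot=False, spammer=False, account_age_days=365):
    # Hypothetical builder for stand-in posts (not real discord messages)
    author = SimpleNamespace(
        display_name="fake_user",
        bot=bot,
        public_flags=SimpleNamespace(spammer=spammer),
        created_at=now - timedelta(days=account_age_days),
    )
    return SimpleNamespace(content="I saw a movie once!", author=author)

now = datetime(2024, 5, 10, tzinfo=timezone.utc)
print(author_rule(make_post(now), now))                      # established human account
print(author_rule(make_post(now, bot=True), now))            # bot account
print(author_rule(make_post(now, account_age_days=3), now))  # very new account
```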
You can use any other information you can figure out about the post as well, such as the sentiment analysis that was demoed previously.
def should_display(post):
    #TODO: Make your own rule
    # for a demonstration, we only display comments that contain a capital "I"
    has_capital_i = "I" in post.content
    if(has_capital_i):
        return True
    else:
        return False
Test our code on discord channel#
In order to test it out, we just need to get a discord channel id and pass it to the print_channel_post_and_replies function. If there are any replies (not threads) in the recent history, we will see them formatted as a reply tree.
As you work on your changes to the should_display function, you can test it out on different channels.
print_channel_post_and_replies(5432167890)
Below are the posts and replies for post from channel 5432167890:
I saw a movie once! -- fake_user
I saw one too! -- pretend_user
I never saw a movie :( -- imaginary_user
If we also want to see which posts are being skipped, we can use an optional argument for print_channel_post_and_replies by setting show_hidden = True, and the comments that are being skipped will show up with a reddish background.
print_channel_post_and_replies(5432167890, show_hidden = True)
Below are the posts and replies for post from channel 5432167890:
I saw a movie once! -- fake_user
I saw one too! -- pretend_user
What a coincidence! -- fake_user
I never saw a movie :( -- imaginary_user
Good morning everyone! -- imaginary_user
TODO! Test it with 3 discord channels#
Now, after you’ve modified the should_display function, try testing out your algorithm on three different channels, answering the follow-up questions after each one.
In the sections below, replace the ?????s with a channel id, and run the code. Then answer the questions about how that went.
At the very end will be more reflection questions.
TODO: Print Discord channel 1#
print_channel_post_and_replies('?????', show_hidden = True)
Below are the posts and replies for post from channel ?????:
I saw a movie once! -- fake_user
I saw one too! -- pretend_user
What a coincidence! -- fake_user
I never saw a movie :( -- imaginary_user
Good morning everyone! -- imaginary_user
TODO: Discord channel 1 follow-up questions#
Write an answer in response to each of these questions (you can edit this text by double clicking it):
Look through the output of print_channel_post_and_replies() based on your modified should_display function.
Did your function tend to keep most posts or tend to hide most posts?
TODO: Your answer here
Do you see any pattern to the contents of posts you showed versus hid (e.g., did it actually select better quality or more interesting posts)?
TODO: Your answer here
TODO: Print Discord Channel 2#
print_channel_post_and_replies('?????', show_hidden = True)
Below are the posts and replies for post from channel ?????:
I saw a movie once! -- fake_user
I saw one too! -- pretend_user
What a coincidence! -- fake_user
I never saw a movie :( -- imaginary_user
Good morning everyone! -- imaginary_user
TODO: Discord channel 2 follow-up questions#
Write an answer in response to each of these questions (you can edit this text by double clicking it):
Look through the output of print_channel_post_and_replies() based on your modified should_display function.
Did your function tend to keep most posts or tend to hide most posts?
TODO: Your answer here
Do you see any pattern to the contents of the posts you showed versus hid (e.g., did it actually select better quality or more interesting posts)?
TODO: Your answer here
TODO: Print Discord channel 3#
print_channel_post_and_replies('?????', show_hidden = True)
Below are the posts and replies for post from channel ?????:
I saw a movie once! -- fake_user
I saw one too! -- pretend_user
What a coincidence! -- fake_user
I never saw a movie :( -- imaginary_user
Good morning everyone! -- imaginary_user
TODO: Discord channel 3 follow-up questions#
Write an answer in response to each of these questions (you can edit this text by double clicking it):
Look through the output of print_channel_post_and_replies() based on your modified should_display function.
Did your function tend to keep most posts or tend to hide most posts?
TODO: Your answer here
Do you see any pattern to the contents of the posts you showed versus hid (e.g., did it actually select better quality or more interesting posts)?
TODO: Your answer here
TODO: Final Reflection questions#
Write an answer in response to each of these questions:
Explain why you chose the rules you did for selecting the best comments.
TODO: Your answer here
What was most challenging about coming up with your rules?
TODO: Your answer here
What additional information or rules do you wish you could have used?
TODO: Your answer here
If someone or some group wanted to make sure their comments were shown by your function, what would they do? How hard would this be?
TODO: Your answer here
If someone or some group wanted to make sure someone else’s comments were NOT shown by your function, what would they do (if anything)? How hard would this be?
TODO: Your answer here
If Discord adopted this rule as a universal rule for which comments to display, what do you think would happen? (e.g., would people change commenting strategies? would comments look different than currently? would it get overwhelmed with spam?)
TODO: Your answer here