A5: Best Comments#

In this assignment you will modify a recursive function that prints the posts and replies in a Discord channel (ignoring threads). Your goal is to show only the best posts and replies. It is up to you to decide what rules determine which posts are the best.

Discord Setup#

# Load some code called "discord" that will help us work with Discord
import discord

# Load another library that helps the bot work in Jupyter Notebook
import nest_asyncio
nest_asyncio.apply()

(optional) make a fake Discord connection with the fake_discord library

For testing purposes, we’ve added this line of code, which loads a fake version of discord, so it won’t actually connect to Discord. If you want to try to actually connect to Discord, don’t run this line of code.

%run ../../../../fake_apis/fake_discord.ipynb
Fake discord is replacing the discord.py library. Fake discord doesn't need real passwords, and prevents you from accessing real discord
%run discord_keys.py
# set up Discord client with permissions to read message_contents
intents = discord.Intents.default()
intents.message_content = True 

Helper function to display text in an indented box#

(You don’t need to worry about how this works. This is the function that displays posts in indented boxes.)

from IPython.display import HTML, Image, display
import html
def display_indented(text, left_margin=0, color="white"):
    display(
        HTML(
            "<pre style='border:solid 1px;padding:3px;margin-left:"+str(left_margin)+"px;background-color:"+color+"'>" + 
            html.escape(text) + 
            "</pre>"
        )
    )

Helper function to reconstruct reply tree#

(You don’t need to worry about how this works. This is the function that takes the list of posts from the channel history and organizes it into a proper reply tree structure.)

async def reconstruct_reply_tree(recent_posts):
    # make a post + replies entry for each post (replies empty for now)
    posts_with_replies_info = [{"post": recent_post, "replies": []} for recent_post in recent_posts]
    
    # create look-up dictionary for the post+replies entries based on the post id
    post_with_replies_lookup = {post_with_replies["post"].id: post_with_replies for post_with_replies in posts_with_replies_info}
    
    # start a list that will become our post tree
    post_tree = []
    
    # go through all the posts_with_replies_info, and either add each one to the post it is in 
    # reply to (if it is a reply), or add it directly to the post_tree otherwise
    for post_with_replies in posts_with_replies_info:
        
        if(post_with_replies["post"].type == discord.MessageType.reply):
            # if post is a reply, find what it is a reply to and add it to the replies list of that post
            reply_to_id = post_with_replies["post"].reference.message_id

            if reply_to_id in post_with_replies_lookup:
                # if we find the post this was a reply to, 
                # add this post_with_replies to the replies of that post_with_replies info
                reply_to_post_with_replies_info = post_with_replies_lookup[reply_to_id]

                reply_to_post_with_replies_info['replies'].append(post_with_replies)

            else:
                # if we couldn't find the post this was in reply to, print warning and
                # just add it as a regular post
                print("Warning could not find post: " + str(reply_to_id) + ", which message " + str(post_with_replies["post"].id) + " replied to")
                post_tree.append(post_with_replies)
        
        else: # not a reply, just add to post_tree directly
            post_tree.append(post_with_replies)
            
    return post_tree
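
To get a feel for the data structure this produces, here is a minimal sketch of the same grouping idea using stand-in post objects instead of real Discord messages. The names FakePost, reply_to, and build_reply_tree are hypothetical simplifications made up for this illustration (real Discord messages use type and reference.message_id, as in the function above), and unlike the function above this sketch does not print a warning when a parent post is missing.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FakePost:
    # Hypothetical stand-in for a Discord message (not part of discord.py)
    id: int
    content: str
    reply_to: Optional[int] = None  # id of the post this one replies to, if any

def build_reply_tree(posts):
    # one {"post", "replies"} entry per post, plus a lookup by post id
    entries = [{"post": p, "replies": []} for p in posts]
    lookup = {entry["post"].id: entry for entry in entries}

    tree = []
    for entry in entries:
        parent = lookup.get(entry["post"].reply_to)
        if parent is not None:
            # this post is a reply to a post we know about
            parent["replies"].append(entry)
        else:
            # not a reply (or parent not found): treat as a top-level post
            tree.append(entry)
    return tree

posts = [
    FakePost(1, "I saw a movie once!"),
    FakePost(2, "I saw one too!", reply_to=1),
    FakePost(3, "What a coincidence!", reply_to=2),
    FakePost(4, "Good morning everyone!"),
]
tree = build_reply_tree(posts)
```

Here tree holds two top-level entries (posts 1 and 4). Post 2 sits in the replies list of post 1, and post 3 sits in the replies list of post 2.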

Helper function to load the recent posts from a channel and return the reply tree#

(You don’t need to worry about how this works. This is the function that gets the recent history from a channel, and then uses the reconstruct_reply_tree function to turn it into a reply tree data structure. By default, hist_limit is set to get the most recent 30 posts.)

def get_channel_post_tree(channel_id, hist_limit=30):
    # set up discord connection
    client = discord.Client(intents=intents)

    # Provide instructions for what your discord bot should do once it has logged in
    @client.event
    async def on_ready():
        global reply_tree # Save the reply_tree variable outside our running bot

        # Load the discord channel you want to read from
        channel = client.get_channel(channel_id)

        # Get the most recent posts in the channel history (up to hist_limit)
        post_history = channel.history(limit=hist_limit)

        #special code to turn the post_history from discord into a python list
        recent_posts = [post async for post in post_history]

        reply_tree = await reconstruct_reply_tree(recent_posts)

        # Tell your bot to stop running
        await client.close()

    # Now that we've defined how the bot should work, start running your bot
    client.run(discord_token)
    
    return reply_tree

Code to print a channel’s recent posts and replies#

We are providing these functions, which recursively print a channel’s recent posts and replies. Whether a post is actually displayed depends on whether the should_display function returns True or False for it. (Note: if should_display returns False for a post, neither that post nor any of its replies will be displayed.)

The print_channel_post_and_replies function takes a channel_id, loads the post_tree of posts and replies from that channel, and then uses the print_post_and_replies function to print out all posts and replies. By default, show_hidden is set to False, meaning nothing is shown for posts where should_display returned False; setting it to True shows those posts in red. hist_limit is set to load the most recent 30 posts by default (you can raise it up to 100).

def print_channel_post_and_replies(channel_id, show_hidden=False, hist_limit=30):
    post_tree = get_channel_post_tree(channel_id, hist_limit=hist_limit)
    
    print("Below are the posts and replies for post from channel " + str(channel_id) + ":" )

    for post_with_replies_info in post_tree:
        print_post_and_replies(post_with_replies_info, show_hidden=show_hidden)

The print_post_and_replies function takes a given post_with_replies_info and recursively prints that post as well as all replies to that post (along with all replies to those replies, etc.).

def print_post_and_replies(post_with_replies_info, num_indents=0, show_hidden=False):
    
    # for convenience save the post and replies info in variables
    post = post_with_replies_info["post"]
    replies = post_with_replies_info["replies"]

    # save the text to display in a post box
    display_text = (
        str(post.content) + "\n" +
        "-- " + str(post.author)
    )
    
    if(should_display(post)): # check if we should display this comment
        
        # display the text of this post, indented over
        display_indented(display_text, num_indents*20)

        #print replies (and the replies of those, etc.)
        for reply in replies:
            print_post_and_replies(reply, num_indents = num_indents + 1, show_hidden=show_hidden)
            
    elif(show_hidden): #If we want to still see which posts we are hiding, color them LightCoral so we can see they are hidden
        display_indented(display_text, num_indents*20, color='LightCoral')
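
If the recursion above is hard to picture, the following stripped-down sketch may help. It uses plain dictionaries and text lines instead of Discord posts and HTML boxes. The names collect_lines, keep, and thread are made up for this illustration and are not part of the assignment code.

```python
def collect_lines(entry, keep, depth=0):
    """Return indented lines for an entry and its replies, skipping rejected branches."""
    if not keep(entry["text"]):
        return []  # a rejected post hides its whole reply branch

    lines = ["    " * depth + entry["text"]]
    for reply in entry["replies"]:
        # each reply is handled by the same function, one level deeper
        lines.extend(collect_lines(reply, keep, depth + 1))
    return lines

thread = {
    "text": "I saw a movie once!",
    "replies": [
        {"text": "I saw one too!", "replies": []},
        {"text": "What a coincidence!", "replies": [
            {"text": "I agree", "replies": []},
        ]},
    ],
}

# same demonstration rule as the assignment: keep text containing a capital "I"
lines = collect_lines(thread, keep=lambda text: "I" in text)
print("\n".join(lines))
```

Notice that "I agree" is never printed even though it contains a capital "I": its parent was rejected, so the function never looks at that branch.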

TODO: Create Your Content Moderation Algorithm#

Your job is to invent and implement your own rule inside the should_display function for what comments count as the “best comments” and therefore should be displayed. The rule can be complicated or simple; it just can’t be the same as the current rule. You can aim to hide only a few comments that you judge to be bad, to show only a few comments that you judge to be the very best, or some combination of the two.

When you are making your rule you may want to use different comparison operators (like == for equals, > for greater than, etc.) and different logical operators (like and for both things must be true, or for at least one thing must be true, etc.). See a list of operators here: https://www.w3schools.com/python/python_operators.asp
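
As a small sketch of how these operators can combine into a single True/False decision, consider the example below. The text, the reaction_count value, and the thresholds are all made up for illustration.

```python
text = "I saw a movie once!"
reaction_count = 3  # made-up value for illustration

is_long = len(text) > 10                   # comparison: greater than
mentions_movie = "movie" in text.lower()   # membership test with "in"
is_shouting = text == text.upper()         # comparison: equals

# logical operators combine the smaller checks into one rule
looks_good = (is_long and mentions_movie) or reaction_count >= 5
looks_good = looks_good and not is_shouting

print(looks_good)
```

Giving each small check its own variable name makes the final rule easier to read and easier to change later.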

Some things you can use when you are deciding whether to display a post or not:

  • The text of the post: post.content

  • The post created time: post.created_at

  • The list of reactions: post.reactions (see more about reactions in the official docs)

  • Is the message pinned?: post.pinned

You can see more by looking at the official documentation for lists of attributes of a discord message

You can also look at attributes of the author such as:

  • author name: post.author.display_name

  • when was the author created?: post.author.created_at

  • is the author labeled as a bot?: post.author.bot

  • The author public flags: post.author.public_flags (like spammer, see official docs on PublicUserFlags)

You can see more by looking at lists of attributes of a discord user

  • You can use any other information you can figure out about the post as well, such as the sentiment analysis that was demoed previously.
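
As one possible illustration of how several of these attributes could be combined, here is a sketch of a rule tried out on stand-in objects. The function name example_rule, the 7-day threshold, and the SimpleNamespace stand-ins are all assumptions made up for this example. It also assumes that each reaction has a count attribute and that created_at is a timezone-aware datetime, as in recent versions of discord.py; the fake_discord library may not provide every attribute used here.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace

def example_rule(post, now=None):
    # Hypothetical rule: hide bots and very new accounts, always show pinned
    # posts, and otherwise require at least one reaction.
    now = now or datetime.now(timezone.utc)
    if post.author.bot:
        return False
    if now - post.author.created_at < timedelta(days=7):
        return False
    if post.pinned:
        return True
    total_reactions = sum(reaction.count for reaction in post.reactions)
    return total_reactions >= 1

# Stand-in objects that only mimic the attributes the rule reads
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
created = datetime(2020, 1, 1, tzinfo=timezone.utc)
person = SimpleNamespace(bot=False, created_at=created)
bot = SimpleNamespace(bot=True, created_at=created)

liked = SimpleNamespace(author=person, pinned=False, reactions=[SimpleNamespace(count=2)])
ignored = SimpleNamespace(author=person, pinned=False, reactions=[])
from_bot = SimpleNamespace(author=bot, pinned=True, reactions=[])

results = [example_rule(p, now=now) for p in (liked, ignored, from_bot)]
print(results)
```

The order of the checks matters: because the bot check comes first, a pinned post from a bot is still hidden under this rule.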

def should_display(post):
    #TODO: Make your own rule
    
    # for a demonstration, we only display comments that contain a capital "I"
    has_capital_i = "I" in post.content
    
    if(has_capital_i):
        return True
    else:
        return False

Test our code on discord channel#

In order to test it out, we just need to get a Discord channel id and pass it to the print_channel_post_and_replies function. If there are any replies (not threads) in the recent history, we will see them formatted as a reply tree.

As you work on your changes to the should_display function, you can test it out on different channels

print_channel_post_and_replies(5432167890)
Fake discord is pretending to set up a client connection
Fake discord bot is fake logging in and starting to run
Fake discord bot is shutting down
Below are the posts and replies for post from channel 5432167890:
I saw a movie once!
-- fake_user
I saw one too!
-- pretend_user
I never saw a movie :(
-- imaginary_user

If we also want to see what posts are being skipped, we can use an optional argument for print_channel_post_and_replies by setting show_hidden = True, and the comments that are being skipped will show up with a reddish background.

print_channel_post_and_replies(5432167890, show_hidden = True)
Fake discord is pretending to set up a client connection
Fake discord bot is fake logging in and starting to run
Fake discord bot is shutting down
Below are the posts and replies for post from channel 5432167890:
I saw a movie once!
-- fake_user
I saw one too!
-- pretend_user
What a coincidence!
-- fake_user
I never saw a movie :(
-- imaginary_user
Good morning everyone!
-- imaginary_user

TODO! Test it with 3 discord channels#

Now, after you’ve modified the should_display function, try testing out your algorithm on three different channels, answering the follow-up questions after each one.

In the sections below, replace the ?????s with a channel id, and run the code. Then answer the questions about how that went.

At the very end will be more reflection questions.
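
If you would like numbers to back up your answers, the sketch below shows one way to count how many posts a rule shows or hides. The function tally_display_decisions and the stand-in posts are hypothetical and not part of the assignment code. It mirrors the printing logic: replies under a hidden post are never checked, so they are counted separately.

```python
from types import SimpleNamespace

def tally_display_decisions(post_tree, rule):
    counts = {"shown": 0, "hidden": 0, "never_checked": 0}

    def count_all(entries):
        # total number of posts in these entries, including nested replies
        return sum(1 + count_all(entry["replies"]) for entry in entries)

    def visit(entry):
        if rule(entry["post"]):
            counts["shown"] += 1
            for reply in entry["replies"]:
                visit(reply)
        else:
            counts["hidden"] += 1
            # replies to a hidden post are skipped without being checked
            counts["never_checked"] += count_all(entry["replies"])

    for entry in post_tree:
        visit(entry)
    return counts

def make_entry(text, replies=()):
    # stand-in post with only a content attribute
    return {"post": SimpleNamespace(content=text), "replies": list(replies)}

demo_tree = [
    make_entry("I saw a movie once!", [
        make_entry("I saw one too!", [
            make_entry("What a coincidence!", [make_entry("I know, right?")]),
        ]),
        make_entry("I never saw a movie :("),
    ]),
    make_entry("Good morning everyone!"),
]

counts = tally_display_decisions(demo_tree, lambda post: "I" in post.content)
print(counts)
```

In your notebook, you could pass the result of get_channel_post_tree together with your own should_display function in place of the demo tree and the lambda.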

TODO: Print Discord channel 1#

print_channel_post_and_replies('?????', show_hidden = True)
Fake discord is pretending to set up a client connection
Fake discord bot is fake logging in and starting to run
Fake discord bot is shutting down
Below are the posts and replies for post from channel ?????:
I saw a movie once!
-- fake_user
I saw one too!
-- pretend_user
What a coincidence!
-- fake_user
I never saw a movie :(
-- imaginary_user
Good morning everyone!
-- imaginary_user

TODO: Discord channel 1 follow-up questions#

Write an answer in response to each of these questions (you can edit this text by double clicking it):

Look through the output of print_channel_post_and_replies() based on your modified should_display function.

Did your function tend to keep most posts or tend to hide most posts?

TODO: Your answer here

Do you see any pattern to the contents of posts you showed versus hid (e.g., did it actually select better quality or more interesting posts)?

TODO: Your answer here

TODO: Print Discord Channel 2#

print_channel_post_and_replies('?????', show_hidden = True)
Fake discord is pretending to set up a client connection
Fake discord bot is fake logging in and starting to run
Fake discord bot is shutting down
Below are the posts and replies for post from channel ?????:
I saw a movie once!
-- fake_user
I saw one too!
-- pretend_user
What a coincidence!
-- fake_user
I never saw a movie :(
-- imaginary_user
Good morning everyone!
-- imaginary_user

TODO: Discord channel 2 follow-up questions#

Write an answer in response to each of these questions (you can edit this text by double clicking it):

Look through the output of print_channel_post_and_replies() based on your modified should_display function.

Did your function tend to keep most posts or tend to hide most posts?

TODO: Your answer here

Do you see any pattern to the contents of the posts you showed versus hid (e.g., did it actually select better quality or more interesting posts)?

TODO: Your answer here

TODO: Print Discord channel 3#

print_channel_post_and_replies('?????', show_hidden = True)
Fake discord is pretending to set up a client connection
Fake discord bot is fake logging in and starting to run
Fake discord bot is shutting down
Below are the posts and replies for post from channel ?????:
I saw a movie once!
-- fake_user
I saw one too!
-- pretend_user
What a coincidence!
-- fake_user
I never saw a movie :(
-- imaginary_user
Good morning everyone!
-- imaginary_user

TODO: Discord channel 3 follow-up questions#

Write an answer in response to each of these questions (you can edit this text by double clicking it):

Look through the output of print_channel_post_and_replies() based on your modified should_display function.

Did your function tend to keep most posts or tend to hide most posts?

TODO: Your answer here

Do you see any pattern to the contents of the posts you showed versus hid (e.g., did it actually select better quality or more interesting posts)?

TODO: Your answer here

TODO: Final Reflection questions#

Write an answer in response to each of these questions:

Why did you choose the rules you did for selecting the best comments?

TODO: Your answer here

What was most challenging about coming up with your rules?

TODO: Your answer here

What additional information or rules do you wish you could have used?

TODO: Your answer here

If someone or some group wanted to make sure their comments were shown by your function, what would they do? How hard would this be?

TODO: Your answer here

If someone or some group wanted to make sure someone else’s comments were NOT shown by your function, what would they do (if anything)? How hard would this be?

TODO: Your answer here

If Discord adopted this rule as a universal rule for which comments to display, what do you think would happen? (e.g., would people change commenting strategies? would comments look different than currently? would it get overwhelmed with spam?)

TODO: Your answer here