Communal Ownership Online

We often think of online communities as a “shared digital commons”. A forum or subreddit or chatroom where people meet and talk. An open source project, where a collection of developers build something together. A wiki, where people gather and organize knowledge. These are online spaces made up of communities of people, serving those same communities. But they are rarely governed by those same communities. More specifically, the technology these platforms are built on does not support shared governance, and any community decision-making must be awkwardly superimposed. Let’s examine the problem, and what solutions might look like.

Internet platforms usually only support one of two models of resource ownership:

  1. Single Administrator: One user owns each GitHub repository and decides who gets commit access. If a repository is owned by an “organization”, that organization is owned by a single user who decides which users and teams are in the org, and what authority each one has. One user owns each Discord server, and each subreddit. Powers may be delegated from these “main” owners, but they hold ultimate control and cannot be overruled or removed.

  2. No Administrator: Platforms like Twitter or Snapchat don’t have a sense of “shared community resources”, so each post is simply owned by the user that submitted it. On platforms like IRC, there may be chat channels with no operators, where every user is on equal footing without moderation power.

The single administrator model arises by default: When someone sets up a webserver to host a website, they have total control over the server, and so are implicitly the sole administrator of the website. This made a lot of sense in the 90s and early 00s when most online communities were self-hosted websites, and the line between server administration and community moderation was often unclear. It makes less sense as “online communities” become small compartments within larger websites like Reddit, Discord, GitHub, Trello, or Wikia. There are server administrators for these sites, of course, but they’re often several levels removed from the communities hosted on them. The single administrator model makes almost no sense for peer-to-peer communities like groups on Cabal, Cwtch, or IPFS, or Freenet sites, all of which have no real “server infrastructure”.

The idea of “shared ownership of an online space” is nothing new. Many subreddits are operated by several moderators with equal power, who can do anything except expel the original owner or close the subreddit. Discord server owners frequently create moderator or half-moderator roles to delegate most governance, except the election of new moderators. While technically a benevolent dictatorship, these are functionally oligarchies so long as the benevolent dictator chooses to never exercise their powers. Many prominent open source projects have a constitution or other guiding documents that define a “steering committee” or “working groups” or rough parliamentary systems for making major choices about a project’s future. Whoever controls the infrastructure of these open source projects, from their websites, to their git repositories, to chat servers or bug trackers or forums, is honor-bound to abide by the decisions of the group.

But this is exactly the problem: While we can define social processes for decision-making, elections, and delegation, we’re awkwardly implementing those social processes over technology that only understands the benevolent dictator model of “single administrator with absolute power”, and hoping everyone follows the rules. Often they do. When someone goes “power mad” in a blatant enough way, the community might fork around them, migrating to a new subreddit or discord server or git repository and replacing the malfunctioning human. However, there’s a high social cost to forking - rebuilding any infrastructure that needs to be replaced, informing the entire community about what’s going on, selecting replacement humans, and moving everyone over. Often few people migrate to a fork, and it fizzles out. Occasionally there’s disagreement over the need to fork, so the community splits, and both versions run for a time, wasting effort duplicating one another’s work. The end result is that while online benevolent dictators are ostensibly replaceable, it’s a difficult and costly process.

Wouldn’t it be better if the technology itself were built to match the social decision-making processes of the group?

Let’s focus on open source as an example. Let’s say that, by social contract, there’s a committee of “core developers” for a project. A minimum of two core developers must agree on minor decisions like accepting a pull request or closing an issue, and a majority of developers must agree on major decisions like adding or removing core developers or closing the project.

Under the present model, the community votes on each of the above operations, and then a user with the authority to carry out the action acts according to the will of the group. But there’s nothing preventing a FreeBSD core developer from approving their own pull requests, ignoring the social requirement for code review. Similarly, when an npm user’s account is compromised there’s nothing preventing the rogue account from uploading an “update” containing malware to the package manager.

But what if the platform itself enforced the social bylaws? Attempting to mark a new release for upload to npm triggers an event, and two developers must hit the “confirm” button before the release is created. If there are steps like “signing the release with our private key”, it may be possible to break up that authority cryptographically with Shamir Secret Sharing so that any two core developers can reproduce the key and sign the release - but this is going too far on a tangent.

Configuring the platform to match the group requires codifying bylaws in a way the platform can understand (something I’ve written about before), and so the supported types of group decision-making will be limited by the platform. Some common desirable approaches might be:

  • Threshold approval, where 3 people from a group must approve an action

  • Percentage voting, where a minimum % of a group’s members must approve an action

  • Veto voting, where actions are held “in escrow” for a certain amount of time, then auto-approved if no one from a group has vetoed them

This last option is particularly interesting, and allows patterns like “anyone can accept a pull request, as long as no one says no within the next 24 hours”.
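To make these concrete, here’s a minimal sketch of how a platform might represent and evaluate such rules. Everything here (the Rule and PendingAction structures, the decide function) is hypothetical and invented for illustration, not any existing platform’s API:

    # A minimal sketch of platform-enforced bylaws. All names are hypothetical.
    from dataclasses import dataclass, field
    from datetime import datetime, timedelta

    @dataclass
    class Rule:
        kind: str                  # "threshold", "percentage", or "veto"
        group: set                 # usernames whose votes this rule consults
        count: int = 2             # approvals needed (threshold rules)
        fraction: float = 0.5      # share of the group needed (percentage rules)
        window: timedelta = timedelta(hours=24)  # escrow period (veto rules)

    @dataclass
    class PendingAction:
        description: str
        created: datetime
        approvals: set = field(default_factory=set)
        vetoes: set = field(default_factory=set)

    def decide(rule: Rule, action: PendingAction, now: datetime) -> str:
        """Return 'approved', 'rejected', or 'pending' for a held action."""
        approvals = action.approvals & rule.group
        vetoes = action.vetoes & rule.group
        if rule.kind == "threshold":
            return "approved" if len(approvals) >= rule.count else "pending"
        if rule.kind == "percentage":
            needed = rule.fraction * len(rule.group)
            return "approved" if len(approvals) >= needed else "pending"
        if rule.kind == "veto":
            if vetoes:
                return "rejected"
            return "approved" if now - action.created >= rule.window else "pending"
        raise ValueError(f"unknown rule kind: {rule.kind}")

    # "A pull request needs approval from two core developers":
    core = {"alice", "bob", "carol", "dave"}
    merge = PendingAction("merge a pull request", created=datetime.now(),
                          approvals={"alice", "bob"})
    print(decide(Rule(kind="threshold", group=core), merge, datetime.now()))  # approved

The veto pattern above is just kind="veto": the action sits in escrow and is approved automatically once the window passes with no objection.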

There’s a lot of potential depth here: instead of giving a list of users blanket commit access to an entire repository, we can implement more nuanced permissions. Maybe no users have direct commit access and all need peer approval for their pull requests. Maybe sub-repositories (or sub-folders within a repository?) are delegated to smaller working groups, which either have direct commit access to their region, or can approve pull requests within their region among themselves, without consulting the larger group.
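As a sketch of what that delegation could look like (reusing the hypothetical Rule class and core group from the sketch above, with made-up team names), rules could be attached to path prefixes instead of to the whole repository:

    # Hypothetical: map path prefixes to the rule governing changes there.
    docs_team = {"erin", "frank"}
    driver_team = {"grace", "heidi", "ivan"}

    scoped_rules = {
        "/":         Rule(kind="percentage", group=core, fraction=0.5),
        "/docs/":    Rule(kind="veto", group=docs_team),
        "/drivers/": Rule(kind="threshold", group=driver_team, count=2),
    }

    def rule_for(path: str) -> Rule:
        """The most specific (longest) matching prefix wins."""
        matches = [prefix for prefix in scoped_rules if path.startswith(prefix)]
        return scoped_rules[max(matches, key=len)]

So a change under /docs/ only needs to survive the docs team’s veto window, while anything else falls back to the project-wide rule.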

Now a repository, or a collection of repositories under the umbrella of a single project, can be “owned” by a group in an actionable way, rather than “owned by a single person hopefully acting on behalf of the group.” Huge improvement! The last thing to resolve is how the bylaws themselves get created and evolve over time.

Bylaws Bootstrapping

The simplest way of creating digital bylaws is through a very short-lived benevolent dictator. When a project is first created, the person creating it pastes in the first set of bylaws, configuring the platform to their needs. If they’re starting the project on their own then this is natural. If they’re starting the project with a group then they should collaborate on the bylaws, but the risk of abuse at this stage is low: If the “benevolent dictator” writes bylaws the group disagrees with, then the group refuses to participate until the bylaws are rewritten, or they make their own project with different bylaws. Since the project is brand-new, the usual costs to “forking” do not apply. Once bylaws are agreed upon, the initial user is bound by them just like everyone else, and so loses their “benevolent dictator” status.

Updating community bylaws is usually described as part of the bylaws: Maybe it’s a special kind of pull request, where accepting the change requires 80% approval among core members, or any other specified threshold. Therefore, no “single administrator” is needed for updating community rules, and the entire organization can run without a benevolent dictator forever after its creation.
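In the same hypothetical notation as the sketches above, the amendment procedure is just one more entry in the bylaws, so rule changes flow through the same machinery as any other action:

    # Hypothetical bylaws: amending the bylaws is itself a governed action.
    bylaws = {
        "merge_pull_request": Rule(kind="threshold", group=core, count=2),
        "add_core_developer": Rule(kind="percentage", group=core, fraction=0.5),
        "amend_bylaws":       Rule(kind="percentage", group=core, fraction=0.8),
    }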

Limitations and Downsides

There is a possible edge case where a group gets “stuck” - maybe their bylaws require that 70% of members approve any pull request, and too many of their members are inactive to reach this threshold. If they also can’t reach the thresholds for adding or expelling members, or for changing the bylaws, then the project grinds to a halt. This is an awkward state, but it replaces a similar edge case under the existing model: What if the benevolent dictator drops offline? If the user who can approve pull requests or add new approved contributors is hospitalized, or forgets their password and no longer has the email address used for password recovery, what can you do? The project is frozen; it cannot proceed without the administrator! In both unfortunate edge cases, the solution is probably “fork the repository or team, replacing the inaccessible user(s).” If anything, the bylaws model provides more options for overcoming an inactive user edge case - for example, the rules may specify “removing users requires 70% approval, or 30% approval with no veto votes for two weeks”, offering a loophole that is difficult to abuse but allows easily reconfiguring the group if something goes wrong.

One distinct advantage of the current “implicit benevolent dictator” model is the ability to follow the spirit of the law rather than the letter. For example, if a group requires total consensus for an important decision, and a single user is voting down every option because they’re not having their way, a group of humans would expel the troublemaker for their immature temper-tantrum. If the platform is ultimately controlled by a benevolent dictator, then they can act on the community’s behalf and remove the disruptive user, bylaws or no. If the platform is automated and only permits actions according to the bylaws, the group loses this flexibility. This can be defended against with planning: A group may have bylaws like “we use veto-voting for approving all pull requests and changes in membership, but we also have a percentage voting option where 80% of the group can vote to kick out any user that we decide is abusing their veto powers.” Unfortunately, groups may not always anticipate these problems before they occur, and might not have built in such fallback procedures. This can be somewhat mitigated by providing lots of example bylaws. Much like how a platform might prompt “do you want your new repository to have an MIT, BSD, or GPL license? We can paste the license file in for you right now,” we could offer “here are example bylaws for a group with code review requirements, a group with percentage agreement on decisions, and a group with veto actions. Pick one and tweak to your needs.”

The General Case

We often intend for web communities to be “community-run”, or at least, “run by a group of benevolent organizers from the group.” In reality, many are run by a single user, leaving them fragile to abuse and neglect. This post outlines an approach to make collective online ownership a reality at a platform level. This could mitigate the risk of rogue users, compromised users, and inactive moderators or administrators that have moved on from the project or platform without formally stepping down.

Posted 7/30/2021


The Efficacy of Subreddit Bans

Deplatforming is a moderation technique where high-profile users or communities are banned from a platform in an effort to inhibit their behavior. It’s somewhat controversial, because while the intention is usually good (stopping the spread of white supremacy, incitement to violence, etc), the impacts aren’t well understood: do banned users or groups return under alternate names? Do they move to a new website and regroup? When they’re pushed to more obscure platforms, are they exchanging a larger audience for an echo chamber where they can more effectively radicalize the people that followed them? Further, this is only discussing the impact of deplatforming, and not whether private social media companies should have the responsibility of governing our shared social sphere in the first place.

We have partial answers to some of the above questions, like this recent study that shows that deplatformed YouTube channels that moved to alt-tech alternatives lost significant viewership. Anecdotally we know that deplatformed users sometimes try to return under new accounts, but if they regather enough of their audience then they’ll also gather enough attention from the platform to be banned again. Many finer details remain fuzzy: when a community is banned but the users in that community remain on the platform, how does their behavior change? Do some communities respond differently than others? Do some types of users within a community respond differently? If community-level bans are only sometimes effective at changing user behavior, then under what conditions are they most and least effective?

I’ve been working on a specific instance of this question with my fabulous co-authors Sam Rosenblatt, Guillermo de Anda Jáuregui, Emily Moog, Briane Paul V. Samson, Laurent Hébert-Dufresne, and Allison M. Roth. The formal academic version of our paper can be found here (it’s been accepted, but not yet formally published, so the link is to a pre-release version of the paper). This post is an informal discussion about our research containing my own views and anecdotes.

We’ve been examining the fallout from Reddit’s decision to change their content policies and ban 2000 subreddits last year for harmful speech and harassment. Historically, Reddit has strongly favored community self-governance. Each subreddit is administered by moderators: volunteer users that establish their own rules and culture within a subreddit, and ban users from the subreddit as they see fit. Moderators, in turn, rely on the users in their community to report rule-violating content and apply downvotes to bury offensive or off-topic comments and posts. Reddit rarely intervened and banned entire communities before this change in content policy.

Importantly, Reddit left all users on the platform while banning each subreddit. This makes sense from a policy perspective: How does Reddit distinguish between someone that posted a few times in a white supremacist subreddit to call everyone a racist, and someone who’s an enthusiastic participant in those spaces? It also provides us with an opportunity to watch how a large number of users responded to the removal of their communities.

The Plan

We selected 15 banned subreddits with the most users-per-day that were open for public participation at the time of the ban. (Reddit has invite-only “private subreddits”, but we can’t collect data from these, and even if we could, the invite-only aspect makes them hard to compare to their public counterparts.) We gathered all comments from these subreddits in the half-year before they were banned, then used those comments to identify the most active commenters during that time, as well as a random sample of participants for comparison. For each user, we downloaded all their comments from every public subreddit for two months before and after the subreddit was banned. This is our window into their before-and-after behavior.

Next, we found vocab words that make up an in-group vocabulary for the banned subreddit. Details on that in the next section. Finally, we can calculate how much more or less a user comments after the subreddit has been banned, and whether their comments contain a greater or smaller percentage of words from the banned subreddit’s vernacular. By looking at this change in behavior across many users, we can start to draw generalizations about how the population of the subreddit responded. By comparing the responses to different subreddit bans, we can see some broader patterns in how subreddit bans work.

In-Group Language

We want some metric for measuring whether users from a subreddit talk about the same things now that the subreddit has been banned. Some similar studies measure things like “did the volume of hate speech go down after banning a subreddit”, using a pre-defined list of “hate words”. We want something more generalizable. As an example, did QAnon users continue using phrases specific to their conspiracy theory (like WWG1WGA - their abbreviated slogan “Where we go one, we go all”) after the QAnon subreddits were banned? Ideally, we’d like to detect these in-group terms automatically, so that we can run this analysis on large amounts of text quickly without an expert reading posts by hand to highlight frequent terms.

Here’s roughly the process we used:

  1. Take the comments from the banned subreddit

  2. Gather a random sample of comments from across all of Reddit during the same time frame (we used 70 million comments for our random sample)

  3. Compare the two sets to find words that appear disproportionately on the banned subreddit

In theory, comparing against “Reddit as a whole during the same time period” rather than, for example, “all public domain English-language books” should not only find disproportionately frequent terms, but should filter out “Reddit-specific words” (subreddit, upvote, downvote, etc), and words related to current events unless those events are a major focus point for the banned subreddit.

There’s a lot of hand-waving between steps 2 and 3: before comparing the two sets of comments we need to apply a ton of filtering to remove punctuation, make all words singular and lower-case, “stem” words (so “faster” becomes “fast”), etc. This combines variants of a word into one ‘token’ to more accurately count how many times it appears. We also filtered out comments by bots, and the top 10,000 English words, so common words can never count as in-group language even if they appear very frequently in a subreddit.

Step 3 is also more complicated than it appears: You can’t compare word frequencies directly, because words that appear once in the banned subreddit and never in the random Reddit sample would technically occur “infinitely more frequently” in the banned subreddit. We settled on a method called Jensen-Shannon Divergence, which basically compares the word frequencies from the banned subreddit text against an average of the banned subreddit’s frequencies and the random Reddit comments’ frequencies. The result is what we want - words that appear much more in the banned subreddit than on Reddit as a whole have a high score, while words that appear frequently in both or infrequently in one sample get a low score.
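As a rough illustration of that scoring step (this is a simplification, not the exact formulation from the paper), the per-word score can be computed from the two frequency distributions like so:

    # Rough sketch of scoring in-group words, loosely following the
    # Jensen-Shannon divergence idea described above. Illustrative only.
    from collections import Counter
    from math import log2

    def word_frequencies(comments):
        """comments are assumed to already be cleaned, stemmed, bot-filtered,
        and stripped of the top 10,000 English words."""
        counts = Counter(word for comment in comments for word in comment.split())
        total = sum(counts.values())
        return {word: count / total for word, count in counts.items()}

    def in_group_scores(subreddit_comments, random_sample_comments):
        p = word_frequencies(subreddit_comments)       # banned subreddit
        q = word_frequencies(random_sample_comments)   # random Reddit sample
        scores = {}
        for word, p_w in p.items():
            m_w = (p_w + q.get(word, 0.0)) / 2         # mixture of the two distributions
            scores[word] = p_w * log2(p_w / m_w)       # large when the word is much more
                                                       # common in the subreddit than overall
        return scores

    # The top 100 words by score become the subreddit's linguistic fingerprint:
    # fingerprint = sorted(scores, key=scores.get, reverse=True)[:100]

Words that appear at similar rates in both samples score near zero, and words that barely appear in the subreddit contribute almost nothing, which matches the behavior described above.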

This method identifies “focus words” for a community - maybe not uniquely identifying words, but things they talk about frequently. As an example, here were some of the top vocab words from r/incels before its ban:

femoids
subhuman
blackpill
cucks
degenerate
roastie
stacy
cucked

Lovely. We’ll take the top 100 words from each banned subreddit using this approach and use that as a linguistic fingerprint. If you use lots of these words frequently, you’ve probably spent a lot of time in incel forums.

Results within a Subreddit

If a ban is effective, we expect to see users either become less active on Reddit overall, or remain active but no longer speak the same way. If we don’t see a significant change in the users’ activity or language, it suggests the ban didn’t impact them much. If users become more active or use lots more in-group language, it suggests the ban may have even backfired, radicalizing users and pushing them to become more engaged or work hard to rebuild their communities.

The following scatterplots show user reactions, on a scale from -1 to +1, with each point representing a single user. A -1 for activity means a user made 100% of their comments before the subreddit was banned, whereas +1 would mean a user made 100% of their comments after the subreddit was banned, while a score of 0 means they made as many comments before as after. Change in in-group vernacular usage is similarly scaled, from -1 (“only used vocab words before the ban”) to +1 (“only used vocab words after the ban”), with 0 again indicating no change in behavior. Since many points overlap on top of one another, distribution plots on the edges of the graph show overall trends.
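For reference, a score on this -1 to +1 scale can be computed as a simple normalized difference of the before and after counts. This is a sketch of the idea, not the paper’s exact code:

    # Sketch of the -1..+1 change score described above: -1 means all of a
    # user's comments came before the ban, +1 means all came after, and 0
    # means equal amounts on both sides.
    def change_score(before: int, after: int) -> float:
        if before + after == 0:
            return 0.0          # no activity in either window
        return (after - before) / (after + before)

    # A user with 80 comments before the ban and 20 after:
    print(change_score(80, 20))   # -0.6, a substantial drop in activity

The vocabulary score follows the same pattern, applied to in-group vocabulary usage before and after the ban.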

The Donald

For those unfamiliar, r/the_donald was a subreddit dedicated to Donald Trump and MAGA politics, with no official connection to Trump or his team. Many vocab words were related to Republican politics (“mueller”, “illegals”, “comey”, “collusion”, “nra”, “globalist”), with a bent towards ‘edgy online communities’ (“kek”, “cucks”, etc).

[Plot of r/the_donald user reactions to subreddit ban]

Top users from r/the_donald showed almost zero change in in-group vocabulary usage, and only a slight decrease in activity. By contrast, arbitrary users from r/the_donald were all over the place: many didn’t use much in-group vocabulary to begin with, so any increased or decreased word usage swings their behavior score wildly. Random user activity change follows a smooth normal distribution, indicating that the ban had no effect. For both top posters and random users, the ban seems to have been ineffectual, but this contrasts with our next subreddit…

Gendercritical

r/gendercritical was a TERF subreddit - ostensibly feminist and a discussion space for women’s issues, but viciously anti-trans. Vocabulary includes feminist topic words (“misogyny”, “patriarchy”, “radfem”, “womanhood”), plus “transwomen”, “intersex”, a number of gendercritical-specific trans slurs, and notable mention “rowling”.

[Plot of r/gendercritical user reactions to subreddit ban]

Here we see markedly different results. A large number of the top r/gendercritical users dramatically dropped in activity after the subreddit ban, or left the platform altogether. Some of those who remained stopped using gendercritical vocab words, while others ramped up vocabulary usage. Random users once again show a normally distributed change in activity, indicating no impact, with a marked number of users who stopped all usage of gendercritical vocabulary.

Subreddit Comparison

Rather than share all 15 subreddit plots in detail (they’re in the supplemental section of the research paper if you really want to look!), here’s a summary plot, showing the median change in vocabulary and activity for top/random users from each subreddit.

[Scatterplot summarizing all subreddit responses to ban]

This plot indicates three things:

  1. Top users always drop in activity more than random users (as a population - individual top users may not react this way)

  2. While vocabulary usage decreases across both populations, top users do not consistently drop vocabulary more than random users

  3. Some subreddits respond very differently to bans than others

That third point leads us to more questions: Why do some subreddits respond so differently to the same banning process than others? Is there something about the content of the subreddits? The culture?

Subreddit Categorization

In an effort to answer the above questions, we categorized subreddits based on their vocabulary and content. We drew up the following groups:

  • Dark Jokes (darkjokecentral, darkhumorandmemes, imgoingtohellforthis2): Edgy humor, often containing racist or other bigoted language

  • Anti Political (consumeproduct, soyboys, wojak): Believe that most activism and progressive views are performative and should be ridiculed

  • Mainstream Right Wing (the_donald, thenewright, hatecrimehoaxes): Explicitly right-wing, but clearly delineated from the next category

  • Extreme Right Wing (debatealtright, shitneoconssay): Self-identify as right-wing political extremists, openly advocate for white supremacy

  • Uncategorized (ccj2, chapotraphouse, gendercritical, oandaexclusiveforum)

(Note that the uncategorized subreddits aren’t necessarily hard to describe, but we don’t have enough similar subreddits to make any kind of generalization)

Let’s draw the same ban-response plot again, but colored by subreddit category:

[Scatterplot summarizing all subreddit responses to ban, colored by subreddit category]

Obviously our sample size is tiny - some categories only have two subreddits in them! - so results from this plot are far from conclusive. Fortunately, all we’re trying to say here is “subreddit responses to bans aren’t entirely random, we see some evidence of a pattern where subreddits with similar content respond kinda similarly, someone should look into this further.” So what do we get out of this?

  • Dark Jokes: minimal change in activity; minimal change in vocabulary

  • Anti Political: top users decrease in activity; top users decrease in vocabulary

  • Mainstream Right Wing: minimal change in activity; inconsistent change in vocabulary

  • Extreme Right Wing: all users decrease activity significantly, especially top users; minimal change in vocabulary

The clearest pattern for both top and random users is that “casual racism” in dark joke subreddits is the least impacted by a subreddit ban, while right wing political extremists are the most affected.

What Have We Learned?

We’ve added a little nuance to our starting question of “do subreddit bans work?” Subreddit bans make the most active users less active, and in some circumstances lead users to largely abandon the vocabulary of the banned group. Subreddit response is not uniform, and, from early evidence, loosely correlates with how “extreme” the subreddit content is. This could be valuable when establishing moderation policy, but it’s important to note that this research only covers the immediate fallout after a subreddit ban: how individual user behavior changes in the 60 days post-ban. Most notably, it doesn’t cover network effects within Reddit (which subreddits do users move to?) or cross-platform trends (do users from some banned subreddits migrate to alt-tech platforms, but not others?).

Posted 7/1/2021


Tor with VPNs (Don’t!)

I see a lot of questions on forums by people asking how to “use Tor with a VPN” for “added security”, and a lot of poor advice given in response. Proposals fall into two categories:

  1. Connecting to Tor through a VPN (You → VPN → Tor → Internet)

  2. Connecting to a VPN through Tor (You → Tor → VPN → Internet)

The first is useless and unnecessary, the second is catastrophically harmful. Let’s dig in.

Using a VPN to connect to Tor

In the first case, users want to connect to Tor through a VPN, with one of the following goals:

  1. Add more levels of proxies between them and the ‘net for safety

  2. Hide that they’re connecting to Tor from their ISP

  3. Hide that they’re connecting to Tor from Tor

The first goal is theoretically an alright idea, especially if you know little about Tor’s design or haven’t thought much about your threat model. More proxies = safer, right? In practice, it doesn’t add much: any adversary able to break Tor’s three-level onion routing is unlikely to have any trouble breaking a single-hop VPN, either through legal coercion or traffic analysis. Adding a VPN here won’t hurt, but you’re losing money and slowing down your connection for a questionable improvement in “security” or “anonymity”.

The second goal is a good idea if you live in a country which forbids use of Tor - but there are better solutions here. If Tor is legal in your country, then your ISP can’t identify anything about your Tor usage besides when you were connected to Tor, and approximately how much data you moved. If Tor is not legal in your country, the Tor Project provides ‘bridges’, which are special proxies designed to hide that you are connecting to Tor. These bridges don’t stand out as much as a VPN and don’t have a money trail tying them to you, so they are probably safer.

The last objective, hiding your IP address from Tor, is silly. Because of the onion routing design, Tor can’t see anything but your IP address and approximately how much data you’ve moved. Tor doesn’t care who you are, and can’t see what you’re doing. But sure, a VPN could hide your IP address from the Tor entry guard.

Using Tor to connect to a VPN

This is where we enter the danger zone. To explain why this is a horrible idea, we need to expand the original diagram: connecting to a VPN through Tor really means You → Entry Guard → Relay → Exit Node → VPN → Internet.

When you connect to “Tor”, you aren’t connecting to a single proxy server, but to a series of three proxy servers. All of your data is encrypted in layers, like an envelope stuffed inside another envelope. When you communicate with the Tor entry guard, it can see that you’re sending encrypted data destined for a Tor relay, but doesn’t know anything else, so it removes the outermost envelope and sends the message along. When the relay receives the envelope it doesn’t know that you’re the original sender, it only knows that it received data from an entry guard, destined for an exit node. The relay strips off the outermost envelope and forwards it along. The exit node receives an envelope from a relay destined for some host on the Internet, dutifully strips the envelope and sends the final data to the Internet host. When the exit node receives a response, the entire process runs in reverse, using a clever ephemeral key system, so each computer in the circuit still only knows who its two neighbors are.
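The layering can be illustrated with a toy example. This is not how Tor’s real cryptography works (Tor negotiates ephemeral keys for each circuit and uses its own cell format); it’s only meant to show how each hop peels one envelope without seeing the rest:

    # Toy illustration of onion layering using symmetric keys. Not real Tor.
    from cryptography.fernet import Fernet

    # Pretend these keys were negotiated with the guard, relay, and exit node.
    guard_key, relay_key, exit_key = (Fernet.generate_key() for _ in range(3))

    message = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"

    # The client wraps the message in three envelopes, innermost first.
    onion = Fernet(exit_key).encrypt(message)
    onion = Fernet(relay_key).encrypt(onion)
    onion = Fernet(guard_key).encrypt(onion)

    # Each hop removes exactly one layer and forwards the rest.
    at_relay = Fernet(guard_key).decrypt(onion)     # guard sees only ciphertext for the relay
    at_exit = Fernet(relay_key).decrypt(at_relay)   # relay sees only ciphertext for the exit
    plaintext = Fernet(exit_key).decrypt(at_exit)   # exit sees the request, not who sent it
    assert plaintext == message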

The safety and anonymity in Tor comes from the fact that no server involved knows both who you are, and who you’re talking to. Each proxy server involved can see a small piece of the puzzle, but not enough to put all the details together. Compromising Tor requires either finding a critical bug in the code, or getting the entry guard, relay, and exit node to collude to identify you.

When you add a VPN after Tor, you’re wrecking Tor’s entire anonymity guarantee: The VPN can see where you’re connecting to, because it just received the data from the Tor exit node, and it knows who you are, because you’re paying the VPN provider. So now the VPN is holding all the pieces of the puzzle, and an attacker only needs to compromise that VPN to deanonymize you and see all your decrypted network traffic.

(There is one use-case for placing a proxy after Tor: If you are specifically trying to visit a website that blocks Tor exit nodes. However, this is still a compromise, sacrificing anonymity for functionality.)

What if the VPN doesn’t know who I am?

How are you pulling that off? Paying the VPN with cryptocurrency? Cool, this adds one extra financial hop, so the VPN doesn’t have your name and credit card, but it has your wallet address. If you use that wallet for any other purchases, that’s leaking information about you. If you filled that wallet through a cryptocurrency exchange, and you paid the exchange with a credit card or PayPal, then they know who you are.

Even if you use a dedicated wallet just for this VPN, and filled it through mining, so there’s no trail back to you whatsoever, using the same VPN account every time you connect is assigning a unique identifier to all of your traffic, rather than mixing it together with other users like Tor does.

What if you use a new dedicated wallet to make a new VPN account every time you connect, and all those wallets are filled independently through mining so none of them can be traced back to you or each other? Okay, this might work, but what an incredible amount of tedious effort to fix a loss in anonymity, when you could just… not use a VPN after Tor.

Tl;dr

Just don’t! Just use Tor! Or, if you’re in a region where using Tor would make you unsafe, use Tor + bridges. VPNs are ineffectual at best and harmful at worst when combined with Tor.

Posted 6/26/2021


Reimagine the Internet Day 5: New Directions in Social Media Research

This week I’m attending the Reimagine the Internet mini-conference, a small and mostly academic discussion about decentralizing away from a corporate-controlled Internet to realize a more socially positive network. This post is a collection of my notes from the fifth day of talks, following my previous post.

Today’s final session was on new directions in (academic) social media research, and some well thought-out criticisms of the decentralization zeitgeist.

An Illustrated Field Guide to Social Media

Several researchers have been collaborating on an Illustrated Field Guide to Social Media, which categorizes social media according to user interaction dynamics as follows:

  • Civic Logic (Parlio, Ahwaa, vTaiwan): Strict speech rules, intended for discussion and civic engagement, not socialization

  • Local Logic (Nextdoor, Front Porch Forum): Geo-locked discussions, often within neighborhoods or towns, often with extremely active moderation, intended for local news and requests for assistance

  • Crypto Logic (Steemit, DTube, Minds): Platforms reward creators with cryptocurrency tokens for content and engagement, and often allow spending tokens to influence platform governance, under the belief that sponsorship will lead to high quality content

  • Great Shopping Mall (WeChat, Douyin): Social media serving corporate interests with government oversight before users (think WeChat Pay and strong censorship); community safety concerns are government-prompted rather than userbase-driven

  • Russian Logic (VKontakte): Simultaneously “free” and “state-controlled”, stemming from a network initially built for public consumption beyond the state, then retroactively surveilled and controlled, with an added mandate of “Internet sovereignty” that demands Russian platforms supersede Western websites within the country

  • Creator Logic (YouTube, TikTok, Twitch): Monetized one-to-many platforms where content creators broadcast to an audience, the platform connects audiences with creators and advertisers, and the platform dictates successful monetization parameters, while itself heavily influenced by advertisers

  • Gift Logic (AO3, Wikipedia): Collaborative efforts of love, usually non-commercial and rejecting clear boundaries of ownership, based in reciprocity, volunteerism, and feedback, such as fanfiction, some open source software development, or Wikipedia

  • Chat Logic (Discord, Snapchat, iMessage): Semi-private, semi-ephemeral real-time spaces with community self-governance where small discussions take place without unobserved lurkers, like an online living room

  • Alt-Tech Logic (Gab, Parler): Provides space for people and ideas outside of mainstream acceptable behavior, explicitly for far-right, nationalist, or bigoted viewpoints

  • Forum Logic (Reddit, 4chan, Usenet): Topic-based chat with strongly defined in-group culture and practices, often featuring gatekeeping and community self-governance

  • Q&A Logic (Yahoo Answers, StackOverflow, Quora): Mostly binary roles of ‘askers’ and ‘answerers’ (often with toxic relations), heavy moderation, focuses on recognition and status, but also reciprocity and benevolence

The authors compare platforms along five axes (affordances, technology, ideology, revenue model, and governance), with numerous real-world examples of platforms in each category. The summary above doesn’t nearly do the book justice; it’s well worth a read, and I’d like to dedicate a post just to the field guide in the future.

The Limits of Imagination

Evelyn Douek, Harvard law school doctoral candidate and Berkman Klein Center affiliate, had some excellent critiques of small-scale decentralization.

Changing Perceptions on Online Free Speech

She framed online free speech perspectives as coming from three “eras”:

  1. The Rights Era, where platforms are expected to be financially motivated, and will maybe engage in light censorship on those grounds (copyright, criminal content, etc), but should otherwise be as hands-off as possible

  2. The Public Health Era, where platforms are expected to be stewards of public society, and should take a more active role in suppressing hatespeech, harassment, and other negative behavior

  3. The Legitimacy Era, where platforms are directed by, or at least accountable to, the public rather than solely corporate interests, bringing public health interests to the forefront of moderation and platform policy

Under this framing we’re currently in the “public health era”, imagining an Internet that more closely resembles the “legitimacy era”. Reddit is expected to ban subreddits for hate speech and inciting violence even if they don’t meet the criteria of illegal content; the public demands that Twitter and Facebook ban Donald Trump without a court order asking them to; and so on. We ask major platforms to act as centralized gatekeepers and intervene in global culture. When we imagine a decentralized Internet, maybe a fediverse like Mastodon or Diaspora, we’re often motivated by distributing responsibility for moderation and content policy, increasing self-governance, and increasing diversity of content by creating spaces with differing content policies.

Will Decentralization Save Us?

Or is decentralization at odds with the public health era? This is mostly about moderation at scale. Facebook employs a small army of content moderators (often from the Philippines, often underpaid and without mental health support despite mopping up incredibly upsetting material daily), and we cannot expect small decentralized communities to replicate that volume of labor.

Does this mean that hatespeech and radicalization will thrive in decentralized spaces? Maybe, depending on the scale of the community. In very small, purposeful online spaces, like subreddits or university discord servers, the content volume is low enough to be moderated, and the appropriate subjects well-defined enough for consistent moderation. On a larger and more general-purpose network like the Mastodon fediverse this could be a serious issue.

In one very real example, Peloton, the Internet-connected stationary bike company, had to ban QAnon hashtags from their in-workout-class chat. As a fitness company, they don’t have a ton of expertise with content moderation in their social micro-community.

Content Cartels / Moderation as a Service

There’s been a push to standardize moderation across major platforms, especially related to child abuse and terrorism. This often revolves around projects like PhotoDNA, which is basically fuzzy-hashing to generate a fingerprint for each image, then comparing the fingerprints against vast databases of fingerprints for child abuse images, missing children, terrorist recruitment videos, etc.

This is a great idea, so long as the databases are vetted so that we can be confident they are being used for their intended purpose. That vetting can’t be taken for granted: Finland maintains a national website blocklist for child pornography, and upon analysis, under 1% of blocked domains actually contained the alleged content.

Nevertheless, the option to centralize some or all moderation, especially in larger online platforms, is tempting. Negotiating the boundary between “we want moderation-as-a-service, to make operating a community easier”, and “we want distinct content policies in each online space, to foster diverse culture” is tricky.

Moderation Along the Stack

Moderation can occur at multiple levels, and every platform is subject to it eventually. For example, we usually describe Reddit in terms of “community self-governance” because each subreddit has volunteer moderators unaffiliated with the company that guide their own communities. When subreddit moderators are ineffectual (such as in subreddits dedicated to hatespeech), then Reddit employees intervene. However, when entire sites lack effectual moderation, such as 8chan, Bitchute, or countless other alt-tech platforms, their infrastructure providers act as moderators. This includes domain registrars, server hosting like AWS, content distribution networks like CloudFlare, and comment-hosting services like Disqus, all of whom have terminated service for customers hosting abhorrent content in the past.

All of this is important to keep in mind when we discuss issues of deplatforming or decentralization, and the idea that users may create a space “without moderation”.

Conclusion

The big takeaway from both conversations today is to look before you leap: What kind of community are you building online? What do you want user interactions and experiences to look like? What problem are you solving with decentralization?

The categories of social media outlined above, and the discussion of moderation and governance at multiple scales, with differing levels of centralization, add a rich vocabulary for discussing platform design and online community building.

This wraps up Reimagine the Internet: a satisfying conclusion to discussions on the value of small communities, diversity of culture and purpose, locality, and safety, as well as the challenges we will face with decentralization and micro-community creation. This series provides a wealth of viewpoints from which to design or critique many aspects of sociotechnical networks, and I look forward to returning to these ideas in more technical and applied settings in the future.

Posted 5/14/2021


Reimagine the Internet Day 4: Building Defiant Communities

This week I’m attending the Reimagine the Internet mini-conference, a small and mostly academic discussion about decentralizing away from a corporate-controlled Internet to realize a more socially positive network. This post is a collection of my notes from the fourth day of talks, following my previous post.

Today’s session was on building platforms in hostile environments, and community-building while facing censorship and dire consequences if deanonymized.

MidEast Tunes

The first speaker, Esra’a Al Shafei, had spent some time building news and chat sites in Bahrain (which has a very restricted press), but quickly and repeatedly fell afoul of censorship via ISP-level domain blocking. This was before widespread use of proxies and VPNs, so even if the service could have stayed up hosted remotely, the userbase would have been cut off.

Instead, she settled on a music site, sort of indie-west-african-eastern-asian-spotify, MidEast Tunes. Music streaming was harder to justify blocking than text news sites, but still provided an outlet for political speech. This grew into a collaboration system, sort of a web-based GarageBand, where users could supply samples and work together to create tracks. This spawned cross-cultural, international, feminist connections.

Ahwaa

Years later, now that proxies are prevalent and domain blocking is more challenging, she’s returned to making an LGBT+ positive forum. While the censorship evasion is easier, the site still faces many problems, from anonymity concerns to trolls.

Anonymity is secured by forbidding most/all photo posts, representing each user with a customizable but vague cartoon avatar, and providing only the broadest user profiles, like “lesbian in Saudi”.

Infiltration is discouraged through a Reddit-like karma system. Users receive upvote hearts from others for each kind message they post, and site features like chat are restricted based on total upvote hearts. In the opposite case, sufficient downvotes lead to shadowbanning. Therefore, infiltrating the platform to engage in harassment requires posting hundreds or thousands of supportive LGBT-positive messages, and harassers are automatically hidden via shadowbanning. Not a perfect system, but it cuts down on the need for moderation dramatically.

Switter

The second speaker, Eliza Sorensen, co-founded Switter and Tryst, sex-worker positive alternatives to Twitter and Backpage, respectively. After FOSTA/SESTA, many US-based companies banned all sex workers or sex work-related accounts to protect themselves from legal liability. Others aggressively shadow-banned even sex-positive suggestive accounts, hiding them from feeds, search, and discovery. Not only is this censorship morally absurd in its own right, but it also took away a valuable safety tool for sex workers. Open communication allowed them to vet clients and exchange lists of trustworthy and dangerous clients, making the entire profession safer. Locking down and further-criminalizing the industry hasn’t stopped sex work, but has made it much more dangerous. Hacking//Hustling has documented just how insidious FOSTA/SESTA is, and the horrible impacts the legislation has had.

Fortunately, Australia has legalized sex work, although it is heavily regulated. This makes it possible to host a sex worker-positive mastodon instance within Australia, providing a safer alternative to major platforms. This is not a “solution” by any means - FOSTA/SESTA criminalizes a wide variety of behavior (like treating anyone that knowingly provides housing to a sex worker as a “sex trafficker”), and that social safety net can’t be restored with a decentralized Twitter clone. Nevertheless, it’s a step in harm reduction.

Conclusions

Both speakers stressed the importance of small-scale, purpose-built platforms. Creating platforms for specific purposes allows more care, context-awareness, and safety. Scalability is an impulse from capitalism, stressing influence and profit, and is often harmful to online communities.

This seems like a potential case for federation and protocol interoperability. Thinking especially of the Switter case, existing as a Mastodon instance means users outside of Switter can interact with users on the platform without creating a purpose-specific account there. It’s not cut off, and this helps with growth, but the community on Switter is specifically about sex work, and can best provide support and a safe environment for those users. In other cases, like Ahwaa, complete isolation seems mandatory, and reinforces context-awareness and multiple personas. Click here for my notes from day 5.

Posted 5/13/2021

