Democratic Censorship

There’s been a lot of discourse recently about the responsibility social media giants like Facebook and Twitter have to police their communities. Facebook incites violence by allowing false rumors to circulate. Twitter only recently banned large communities of neo-Nazis and white supremacists organizing on the site. Discord continues to be an organizational hub for Nazis and the Alt-Right. There’s been plenty of discussion about why these platforms have so little moderation, ranging from their business model (incendiary content drives views, which benefits Facebook), to a lack of resources, to a lack of incentive.

I’d like to explore a new side of the issue: Why should a private company have the role of a cultural censor, and how can we redesign our social media to democratize censorship?

To be absolutely clear, censorship serves an important role in social media in stopping verbal and emotional abuse, stalking, toxic content, and hate speech. It can also harm at-risk communities when applied too broadly, as seen in recent well-intentioned U.S. legislation endangering sex workers.

On Freedom of Speech

Censorship within the context of social media is not incompatible with free speech. First, Freedom of Speech in the United States is generally understood to protect government criticism, political speech, and advocacy of unpopular ideas. It does not traditionally protect speech inciting immediate violence, obscenity, or inherently illegal content like child pornography. Since stalking, abuse, and hate speech do not contribute to a public social or political discourse, they fall squarely outside the domain of the USA’s First Amendment.

Second, it’s important to note that censorship in social media means a post is deleted or an account is locked. Being banned from a platform is more akin to exile than to arrest, and leaves the opportunity to form a new community accepting of whatever content was banned.

Finally there’s the argument that freedom of speech applies only to the government and public spaces, and is inapplicable to a privately-owned online space like Twitter or Facebook. I think had the U.S. Bill of Rights been written after the genesis of the Internet this would be a non-issue, and we would have a definition for a public commons online. Regardless, I want to talk about what should be, rather than what is legally excusable.

The Trouble with Corporate Censors

Corporations have public perceptions which affect their valuations. Therefore, any censorship by the company beyond what is legally required will be predominantly focused on protecting the ‘image’ of the company and avoiding controversy, so they are not branded as a safe haven for bigots, Nazis, or criminals.

Consider Apple’s censorship of the iOS App Store - repeatedly banning drone-strike maps with minimal explanatory feedback. I don’t think Apple made the wrong decision here; they reasonably didn’t want to be at the epicenter of a patriotism/pro-military/anti-war-movement debate, since it has nothing to do with their corporation or public values. However, I do think that it’s unacceptable that Apple is in the position of having this censorship choice to begin with. A private corporation, once they have sold me a phone, should not have say over what I can and cannot use that phone to do. Cue Free Software Foundation and Electronic Frontier Foundation essays on the rights of the user.

The same argument applies to social media. Facebook and Twitter have a vested interest in limiting conversations that reflect poorly on them, but do not otherwise need to engender a healthy community dynamic.

Sites like Reddit that are community-moderated have an advantage here: their communities are self-policing, both via the main userbase downvoting inappropriate messages until they are hidden, and via appointed moderators directly removing unacceptable posts. This works well in large subreddits, but since moderators have authority only within their own sub-communities, there are still entire subreddits accepting of or dedicated to unacceptable content, and there are no moderators to review private messages or ban users site-wide. A scalable solution will require stronger public powers.

Federated Communities

The privacy, anonymity, and counter-cultural communities have been advocating “federated” services like Mastodon as an alternative to centralized systems like Twitter and Facebook. The premise is simple: anyone can run their own miniature social network, and the networks can be linked at will to create a larger community.

Privacy researcher Sarah Jamie Lewis has written about the limitations of federated models before, but it boils down to “You aren’t creating a decentralized democratic system, you’re creating several linked centralized systems, and concentrating power in the hands of a few.” With regard to censorship this means moving from corporate censors to a handful of individual censors. Perhaps an improvement, but not a great one. While in theory users could react to censorship by creating a new Mastodon instance and flocking to it, in reality users are concentrated around a handful of large servers where the community is most vibrant.

Components of a Solution

A truly self-regulatory social community should place control over censorship of content in the hands of the public, exclusively. When this leads to a Tyranny of the Majority (as I have no doubt it would), then the affected minorities have an incentive to build a new instance of the social network where they can speak openly. This is not an ideal solution, but is at least a significant improvement over current power dynamics.

Community censorship may take the form of voting, as in Reddit’s “Upvotes” and “Downvotes”. It may involve a majority-consensus to expel a user from the community. It may look like a more sophisticated republic, where representatives are elected to create a temporary “censorship board” that removes toxic users after quick deliberation. The key is to involve the participants of the community in every stage of decision making, so that they shape their own community standards instead of having them delivered by a corporate benefactor.

Care needs to be taken to prevent bots from distorting these systems of governance and giving a handful of users de facto censorship authority. Fortunately, this is a technical problem that’s been explored for a long time, and it can be mitigated by deploying anti-bot measures like CAPTCHAs, or by instituting a system like “voting for representatives on a blockchain”, where creating an army of bot-votes would become prohibitively expensive.

This should be not only compatible, but desirable, for social media companies. Allowing the community to self-rule shifts the responsibility for content control away from the platform provider, and means they no longer need to hire enormous translator and moderator teams to maintain community standards.

Posted 4/22/18

Hacker Community Espionage

I recently got to see a talk at the Chaos Communication Congress titled “When the Dutch secret service knocks on your door”, with the following description:

This is a story of when the Dutch secret service knocked on my door just after OHM2013, what some of the events that lead up to this, our guesses on why they did this and how to create an environment where we can talk about these things instead of keeping silent.

Since the talk was not recorded, the following is my synopsis and thoughts. This post was written about a week after the talk, so some facts may be distorted by poor memory recall.

  • The speaker was approached by members of the Dutch secret service at his parents’ house. They initially identified themselves as members of the department of the interior, but when asked whether they were part of the secret service, they conceded that they were.

  • The agents began by offering all-expenses-paid travel to any hackathon or hackerspace. All the speaker needed to do was write a report about their experience and send it back. A relatively harmless act, but it means they would be an unannounced informant in hacker communities.

  • When the author refused, the agents switched to harder recruitment techniques. They pursued the author at the gym, sat nearby in cafes when the author held meetings for nonprofits, and likely deployed an IMSI catcher to track them at a conference.

  • Eventually, the author got in contact with other members of the hacker community that had also been approached. Some of them went further through the recruitment process. The offers grew, including “attend our secret hacker summer camp, we’ll let you play with toys you’ve never heard of,” and “If you want to hack anything we can make sure the police never find out.” In either of these cases the recruit is further indebted to the secret service, either by signing NDAs or similar legal commitments to protect government secrets, or by direct threat, wherein the government can restore the recruit’s disappeared criminal charges at any time.

I have two chief concerns about this. First, given how blatant the secret service was in their recruitment attempts, and that we only heard about their attempts in December of 2017, we can safely assume many people accepted the government’s offer. Therefore, there are likely many informants working for the secret service already.

Second, this talk was about the Netherlands - a relatively small country not known for an excessive surveillance regime like those of the Five Eyes. If the Netherlands has a large group of informants spying on hackerspaces and conferences around the globe, then many other countries likely do as well, not to mention the more extreme measures taken by countries with more resources.

From this, we can conclude there are likely informants in every talk at significant conferences. Every hackerspace with more than token attendance is monitored. This is not unprecedented - the FBI had a vast array of informants during the COINTELPRO era that infiltrated leftist movements throughout the United States (along with much less savory groups like the KKK), and since shortly after 9/11 has used a large group of Muslim informants to search for would-be terrorists.

Posted 1/7/18

Alcoholics Anonymous as Decentralized Architecture

Most examples of decentralized organization are contemporary: Black Lives Matter, Antifa, the Alt-Right, and other movements developed largely on social media. Older examples of social decentralization tend to be failures: Collapsed Hippie communes of the 60s, anarchist and communist movements that quickly collapsed or devolved to authoritarianism, the “self-balancing free market,” and so on.

But not all leaderless movements are short-lived failures. One excellent example is Alcoholics Anonymous: An 82-year-old mutual aid institution dedicated to helping alcoholics stay sober. Aside from their age, AA is a good subject for study because they’ve engaged in a great deal of self-analysis, and have very explicitly documented their organizational practices.

Let’s examine AA’s Twelve Traditions and see what can be generalized to other organizations. The twelve traditions are reproduced below:

  1. Our common welfare should come first; personal recovery depends on AA unity.

  2. For our group purpose there is but one ultimate authority - a loving God as He may express Himself in our group conscience.

  3. The only requirement for AA membership is a desire to stop drinking.

  4. Each group should be autonomous except in matters affecting other groups or AA as a whole.

  5. Each group has but one primary purpose - to carry its message to the alcoholic who still suffers.

  6. An AA group ought never endorse, finance or lend the AA name to any related facility or outside enterprise, lest problems of money, property and prestige divert us from our primary purpose.

  7. Every AA group ought to be fully self-supporting, declining outside contributions.

  8. Alcoholics Anonymous should remain forever nonprofessional, but our service centers may employ special workers.

  9. AA, as such, ought never be organized; but we may create service boards or committees directly responsible to those they serve.

  10. Alcoholics Anonymous has no opinion on outside issues; hence the AA name ought never be drawn into public controversy.

  11. Our public relations policy is based on attraction rather than promotion; we need always maintain personal anonymity at the level of press, radio and films.

  12. Anonymity is the spiritual foundation of all our traditions, ever reminding us to place principles before personalities.

The above twelve rules can be distilled to three themes:

  • The group comes first

  • The group is single-issue

  • The group should be independent of any external or internal structures

The first theme stresses anonymity in an interesting way: Not to protect individual members (many of whom want to be anonymous when in an organization like AA), but to prevent the rise of “rock-stars”, or powerful individuals with celebrity status. Personal power is prone to abuse, both at an inter-personal level (see the plethora of sexual abuse cases in the news right now), and at a structural level, where the organization becomes dependent on this single individual, and is drawn in to any conflict surrounding the celebrity.

The solution to a rock-star is to kick them out of the organization, and maintain a healthier community without them. AA has gone a step further however, and outlines how to prevent the rise of a rock-star by preventing any personal identification when communicating to the outside world. When you are speaking to the press you are Alcoholics Anonymous, and may not use your name. For further discussion on rock-stars in tech communities, see this article.

The single-issue design is an unusual choice. Many social movements like the Black Panthers stress solidarity, the idea that we should unite many movements to increase participants and pool resources. This is the same principle behind a general strike, and broad, cross-issue activist networks like the Indivisible movement. However, focusing on a single issue continues the trend of resisting corruption and abuse of power. AA keeps a very strict, simple mission, with no deviations.

The last theme, total organizational independence, is also unusual. Organizations that fear external attack, like terrorist cells, may operate in isolation from other cells with little to no higher-level coordination. Organizations avoiding internal corruption, like the Occupy movement, or fraternities, may limit internal leadership and centralization of power using systems like Robert’s Rules of Order or Clusters & Spokes Councils, or they may organize more anarchically, through organic discussion on social media. Avoiding both internal and external hierarchy, however, sacrifices both large-scale coordination and quick decision making. This works for Alcoholics Anonymous, because their mission is predefined and doesn’t require a great deal of complex leadership and decision making. It is also used by Antifa, where local groups have no contact with one another and rely on collective sentiment to decide on actions.

Overall, AA is an interesting introduction to decentralized organizations. I will revisit these ideas as I learn more.

Posted 1/6/18

Halftone QR Codes

I recently encountered a very neat encoding technique for embedding images into Quick Response Codes, like so:

Halftone QR Code Example

A full research paper on the topic can be found here, but the core of the algorithm is actually very simple:

  1. Generate the QR code with the data you want

  2. Dither the image you want to embed, creating a black and white approximation at the appropriate size

  3. Triple the size of the QR code, such that each QR block is now represented by a grid of 9 pixels

  4. Set the 9 pixels to values from the dithered image

  5. Set the middle of the 9 pixels to whatever the color of the QR block was supposed to be

  6. Redraw the required control blocks on top in full detail, to make sure scanners identify the presence of the code

That’s it! Setting the middle pixel of each cluster of 9 generally lets QR readers get the correct value for the block, and gives you 8 pixels to represent an image with. Occasionally a block will be misread, but the QR standard includes lots of redundant checksumming blocks to repair damage automatically, so the correct data will almost always be recoverable.
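Steps 2 through 5 are easy to sketch in Python on plain 0/1 matrices - a toy stand-in for real QR modules and a dithered image, since a real implementation would generate the code and dither the artwork with proper libraries:

```python
def halftone_qr(qr, art):
    """Embed a dithered 0/1 image into a QR module matrix.

    qr:  n x n list of 0/1 QR modules (1 = black)
    art: 3n x 3n list of 0/1 dithered image pixels
    Returns the 3n x 3n halftone matrix.
    """
    n = len(qr)
    # Steps 3-4: triple the QR size and fill every pixel from the image
    out = [row[:] for row in art]
    # Step 5: force the centre pixel of each 3x3 cell back to the QR value,
    # so scanners still read the correct module colour
    for r in range(n):
        for c in range(n):
            out[3 * r + 1][3 * c + 1] = qr[r][c]
    return out
```

Step 6 (redrawing the finder and timing patterns at full size) is omitted here; a real encoder would copy those control blocks over verbatim afterwards.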

There is a reference implementation in JavaScript of the algorithm I’ve described. I have extended that code so that when a pixel on the original image is transparent the corresponding pixel of the final image is filled in with QR block data instead of dither data. The result is that the original QR code “bleeds in” to any space unused by the image, so you get this:

Halftone QR with background bleed

Instead of this:

Halftone QR without background bleed

This both makes the code scan more reliably and makes it more visually apparent to a casual observer that they are looking at a QR code.

The original researchers take this approach several steps further, and repeatedly perturb the dithered image to get a result that both looks better and scans more reliably. They also create an “importance matrix” to help determine which features of the image are most critical and should be prioritized in the QR rendering. Their code can be found here, but be warned that it’s a mess of C++ with Boost written for Microsoft’s Visual Studio on Windows, and I haven’t gotten it running. While their enhancements yield a marked improvement in image quality, I wish to forgo the tremendous complexity increase necessary to implement them.

Posted 12/19/17

Cooperative Censorship

I have long been an opponent of censorship by any authority. Suppression of ideas stifles discussion, and supports corruption, authoritarianism, and antiquated, bigoted ideas. I have put a lot of thought into distributed systems, like Tor or FreeNet, that circumvent censorship, or make it possible to host content that cannot be censored.

However, the recent Charlottesville protests show another side of the issue. Giving the alt-right a prolific voice online and in our media has allowed the Nazi ideology to flourish. This isn’t about spreading well-reasoned ideas or holding educational discussion - the goal of white supremacists is to share a message of racial superiority and discrimination based wholly in old hateful prejudice, not science or intellectual debate.

The progress of different hosting providers shutting down the Daily Stormer neo-Nazi community site shows how hesitant Corporate America is to censor - whether out of concern for bad PR, loss of revenue, perception of being responsible for the content they facilitate distribution of, or (less likely) an ideological opposition to censorship.

Ultimately, I still believe in the superiority of decentralized systems. Money-driven corporations like GoDaddy and Cloudflare should not be in the position where they are cultural gatekeepers that decide what content is acceptable and what is not. At the same time, a distributed system that prevents censorship entirely may provide an unreasonably accessible platform for hate speech. No censorship is preferable to authoritarian censorship, but is there a way to build distributed community censorship, where widespread rejection of content like white supremacy can stop its spread, without allowing easy abuse of power? If it is not designed carefully such a system would be prone to Tyranny of the Majority, where any minority groups or interests can be oppressed by the majority. Worse yet, a poorly designed system may allow a large number of bots to “sway the majority”, effectively returning to an oligarchic “tyranny of the minority with power” model. But before ruling the concept out, let’s explore the possibility some…

Existing “Distributed Censorship” Models

Decentralized Twitter clone Mastodon takes a multiple-instances approach to censorship. Effectively, each Mastodon server is linked together, or “federated”, but can refuse to federate with particular servers if the server admin chooses to. Each server then has its own content guidelines - maybe one server allows pornography, while another server forbids pornography and will not distribute posts from servers that do. This allows for evasion of censorship and the creation of communities around any subject, but content from those communities will not spread far without support from other servers.

Facebook lookalike Diaspora has a similar design, distributing across many independently operated servers called “pods”. However, content distribution is primarily decided by the user, not the pod administrator. While the pod administrator chooses what other pods to link to, the user independently chooses which users in those pods their posts will be sent to, with a feature called “aspects”. This ideally lets a user segment their friend groups from family or work colleagues, all within the same account, although there is nothing preventing users from registering separate accounts to achieve the same goal.

Both of these models distribute censorship power to server administrators, similar to forum or subreddit moderators. This is a step in the right direction from corporate control, but still creates power inequality between the relatively few server operators and the multitude of users. In the Mastodon example, the Mastodon Monitoring Project estimates that there are about 2400 servers, and 1.5 million registered users. That is, about 0.16% of the population have censorship control. While there’s nothing technical stopping a user from starting their own server and joining the 0.16%, it does require a higher expertise level, a server to run the software on, and a higher time commitment. This necessarily precludes most users from participating in censorship (and if we had 1.5 million Mastodon servers then administering censorship would be unwieldy).

Other Researchers’ Thoughts

The Digital Currency Initiative and the Center for Civic Media (both MIT groups) released a relevant report recently on decentralized web technologies, their benefits regarding censorship, and adoption problems the technologies face. While the report does not address the desirability of censoring hate speech, it does bring up the interesting point that content selection algorithms (like the code that decides what to show on your Twitter or Facebook news feeds) are as important to censorship as actual control of what posts are blocked. This presents something further to think about - is there a way to place more of the selection algorithm under user control without loading them down with technical complexity? This would allow for automatic but democratic censorship, that may alleviate the disproportionate power structures described above.

Posted 8/19/17

Braess’s Paradox

I had the great fortune of seeing a talk by Brian Hayes on Braess’s Paradox, an interesting network congestion phenomenon. In this post I’ll talk about the problem, and some ramifications for other fields.

The Problem

Consider a network of four roads. Two roads are extremely wide, and are effectively uncongested regardless of how many cars are present. They still have speed limits, so we’ll say there’s a constant traversal time of one hour for these roads. The other two roads, while more direct and thereby faster, have only a few lanes, and are extremely prone to congestion. As an estimate, we’ll say the time it takes to traverse these roads scales linearly with the number of cars on them, such that if all “N” cars are on one road it takes one hour to traverse.

Plain Network

If a driver wants to get from point A to point B, what route is fastest? Clearly, by symmetry, the two paths are the same length. Therefore, the driver should take whatever path is less-congested, or select randomly if congestion is equal. Since half the cars will be on each path, the total commute time is about 1.5 hours for all drivers.

However, consider the following change:

Magic Shortcut Network

In this network we’ve added a new path that’s extremely fast (no speed limits, because they believe in freedom), to the point that we’ll consider it instantaneous.

What is the optimal path for a driver now? A lone driver will obviously take the first direct road, then the shortcut, then the second direct road. However, if all “N” drivers take this route the small roads will be overloaded, increasing their travel time to one hour each. The total commute for each driver will now be two hours.

Consider that you are about to start driving, and the roads are overloaded. If you take the short route your commute will be two hours long. However, if you take the long route your commute will be two hours long, and the other roads will be less overloaded (since without you only N-1 cars are taking the route), so everyone else will have a commute slightly shorter than two hours. This means from a greedy algorithm perspective there is always an incentive to take the more direct route, and help perpetuate the traffic problem.

Simply put, adding a shortcut to the network made performance worse, not better.
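The commute times above can be checked with a few lines of arithmetic - a sketch of the two equilibria described, not a general traffic simulator:

```python
def commute_without_shortcut(n):
    # Drivers split evenly between the two symmetric paths; each path is
    # one congested road (cars / n hours) plus one wide road (1 hour)
    return (n / 2) / n + 1.0

def commute_with_shortcut(n):
    # Every driver takes congested road -> free shortcut -> congested road,
    # so both small roads carry all n cars (n / n = 1 hour each)
    return n / n + n / n

print(commute_without_shortcut(1000))  # 1.5 hours
print(commute_with_shortcut(1000))     # 2.0 hours
```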

The Solution

There are a number of potential solutions to the problem. Law enforcement might demand that drivers select their routes randomly, saving everyone half an hour of commute. Similarly, self-driving cars may enforce random path selection, improving performance without draconian-feeling laws. These are both “greater good” solutions, which assume drivers’ willingness to sacrifice their own best interests for the interests of the group. Both solutions leave an incentive for drivers to cheat - after all, the shortcut is faster so long as only a few people are using it.

Another option is limiting information access. The entire problem hinges on the assumption that users know the to-the-moment traffic information for each possible route, and plan their travel accordingly. Restricting user information to only warn about extreme congestion or traffic accidents effectively prohibits gaming the system, and forces random path selection.


Braess’s Paradox is an interesting problem where providing more limited information improves performance for all users. Are there parallels in other software problems? Any system where all nodes are controlled by the same entity can be configured for the “greater good” solution, but what about distributed models like torrenting, where nodes are controlled by many people?

In a torrenting system, users have an incentive to “cheat” by downloading chunks of files without doing their share and uploading in return. Consider changing the system so users do not know who has the chunks they need, and must make trades with various other nodes to acquire chunks, discovering after the fact whether each trade yielded what they were looking for. Users now must participate in order to acquire the data they want. This may slow the acquisition of data, since you can no longer request specific chunks, but it may also improve the total performance of the system, since there will be far more seeders uploading data fragments.

The performance detriment could even be alleviated by allowing the user to request X different chunks in their trade, and the other end must return the appropriate chunks if they have them. This limits wasteful exchanges, while still ensuring there are no leechers.
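A toy sketch of one such blind trade, with hypothetical names throughout (real BitTorrent works nothing like this - it only illustrates the exchange proposed above):

```python
import random

def blind_swap(a_chunks, b_chunks, a_wants, b_wants, batch=4):
    """One blind exchange between peers A and B.

    Each side names up to `batch` chunk ids it still wants, without
    knowing whether the other side holds them, and receives whichever
    requested chunks the other side actually has."""
    a_req = random.sample(sorted(a_wants), min(batch, len(a_wants)))
    b_req = random.sample(sorted(b_wants), min(batch, len(b_wants)))
    a_gets = {cid: b_chunks[cid] for cid in a_req if cid in b_chunks}
    b_gets = {cid: a_chunks[cid] for cid in b_req if cid in a_chunks}
    return a_gets, b_gets
```

Because each side must answer the other's request to receive anything, every download is paid for with an upload, which is what removes the incentive to leech.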

Fun thought experiment that I expect has many more applications.

Posted 8/8/17

Merkle’s Puzzle-box Key Exchange

Cryptography is fantastic, but much of it suffers from being unintuitive and math-heavy. This can make it challenging to teach to those without a math or computer science background, but makes it particularly difficult to develop a sense of why something is secure.

There are a handful of cryptographic systems however, that are delightfully easy to illustrate and provide a great introduction to security concepts. One such system is Merkle’s Puzzles.

The Problem

Similar to the Diffie-Hellman Key Exchange, the goal of Merkle’s Puzzle Boxes is for two parties (we’ll call them Alice and Bob) to agree on a password to encrypt their messages with. The problem is that Alice and Bob can only communicate in a public channel where anyone could be listening in on them. How can they exchange a password securely?

The Process

Alice creates several puzzle boxes (since she has a computer, we’ll say she makes several thousand of them). Each puzzle box has three pieces:

  1. A random number identifying the box
  2. A long, random, secure password
  3. A hash of parts 1 and 2

Each “box” is then encrypted with a weak password that can be brute-forced without taking too long. Let’s say it’s a five character password.

Alice then sends all her encrypted puzzle boxes to Bob:

Alice sending puzzle boxes

Bob selects one box at random, and brute-forces the five character password. He knows he has the right password because the hash inside the box will match the hash of parts 1 and 2 in the box. He then announces back to Alice the number of the box he broke open.

Bob sending back number of chosen box

Alice (who has all the unlocked puzzle boxes already) looks up which box Bob has chosen, and they both begin encrypting their messages with the associated password.
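The whole exchange fits in a short toy script. Here the “weak password” is a small integer and the cipher is XOR against a hash of it - stand-ins chosen for illustration, not what a real deployment would use:

```python
import hashlib
import os
import random

KEYSPACE = 1000  # stand-in for the "five character password": brute-forceable by design

def toy_encrypt(data: bytes, key: str) -> bytes:
    # XOR against a hash of the key; applying it twice decrypts
    pad = hashlib.sha256(key.encode()).digest()
    return bytes(b ^ pad[i % len(pad)] for i, b in enumerate(data))

def make_puzzle(box_id: int):
    """One box: (1) an id, (2) a strong random password, (3) a hash of both."""
    password = os.urandom(16).hex()
    check = hashlib.sha256(f"{box_id}:{password}".encode()).hexdigest()
    weak_key = str(random.randrange(KEYSPACE))
    box = toy_encrypt(f"{box_id}:{password}:{check}".encode(), weak_key)
    return box, password

def crack_puzzle(box: bytes):
    """Brute-force the weak key; the embedded hash tells us when we have it."""
    for guess in range(KEYSPACE):
        try:
            box_id, password, check = toy_encrypt(box, str(guess)).decode().split(":")
        except (UnicodeDecodeError, ValueError):
            continue  # wrong key: garbage bytes
        if hashlib.sha256(f"{box_id}:{password}".encode()).hexdigest() == check:
            return int(box_id), password
    raise ValueError("no key found")

# Alice makes many boxes and keeps the passwords; Bob cracks one at
# random and announces only its id back to her.
alice_boxes = {i: make_puzzle(i) for i in range(50)}
announced_id, bob_password = crack_puzzle(alice_boxes[random.randrange(50)][0])
assert bob_password == alice_boxes[announced_id][1]  # both now share a secret
```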

Why is it Secure?

If we have an eavesdropper, Eve, listening in to the exchange, then she can capture all the puzzle boxes. She can also hear the number of the puzzle box Bob has broken in to when he announces it back to Alice. Eve doesn’t know, however, which box has that number. The only way to find out is to break in to every puzzle box until she finds the right one.

Eve breaking every puzzle box

This means while it is an O(1) operation for Bob to choose a password (he only has to break one box), it is an O(n) operation for Eve to find the right box by smashing all of them.

This also means if we double the number of puzzle-boxes then the exchange has doubled in security, because Eve must break (on average) twice as many boxes to find what she’s looking for.

Why don’t we use Puzzle-boxes online?

Merkle’s puzzles are a great way of explaining a key exchange, but computationally they have a number of drawbacks. First, making the system more secure puts a heavy workload on Alice. But more importantly, it assumes the attackers and defenders have roughly the same computational power.

An O(n) attack complexity means Eve only needs n times more CPU time than Bob to crack the password - so if the key exchange is configured to take three seconds, and there are a million puzzle boxes, then breaking every box would take that same computer about 35 days. But if the attacker has a computer 100 times faster than Bob’s (say they have a big GPU cracking cluster) then it will only take them 0.35 days to break the password. A more capable attacker like a nation state could crack such a system almost instantly.
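The arithmetic behind those figures, for the curious:

```python
boxes = 1_000_000
bob_seconds = 3                     # Bob brute-forces exactly one box
eve_seconds = boxes * bob_seconds   # worst case, Eve must break every box

print(eve_seconds / 86_400)         # ~34.7 days on Bob's hardware
print(eve_seconds / 100 / 86_400)   # ~0.35 days with a 100x faster cluster
```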

If Eve is recording the encrypted conversation then she can decrypt everything after the fact once she breaks the puzzle box challenge. This means even the original 35-day attack is viable, let alone the GPU-cluster attack. As a result, we use much more secure algorithms like Diffie-Hellman instead.

Posted 7/17/17

Port Knocking

Port knocking is a somewhat obscure technique for hiding network services. Under normal circumstances an attacker can use a port scanner to uncover what daemons are listening on a server:

$ nmap -sV
Starting Nmap 6.46 ( ) at 2017-07-17
Nmap scan report for (XX.XXX.XX.XX)
Host is up (0.075s latency).
Not shown: 990 filtered ports
22/tcp   open   ssh        (protocol 2.0)
80/tcp   open   http       Apache httpd
443/tcp  open   ssl/http   Apache httpd
465/tcp  closed smtps
587/tcp  open   smtp       Symantec Enterprise Security manager smtpd
993/tcp  open   ssl/imap   Dovecot imapd

Note: Port scanning is illegal in some countries - consult local law before scanning others.

Sometimes however, a sysadmin may not want their services so openly displayed. You can’t brute-force ssh logins if you don’t know sshd is running.

The Technique

With port knocking, a daemon on the server secretly listens for network packets. A prospective client must make connections to a series of ports, in order, without interruption and in quick succession. Note that these ports do not need to be open on the server - attempting to connect to a closed port is enough. Once this sequence is entered, the server will allow access to the hidden service for the IP address in question.

This sounds mischievously similar to steganography - we’re hiding an authentication protocol inside failed TCP connections! With that thought driving me, it sounded like writing a port-knocking daemon would be fun.

Potential Designs

There are several approaches to writing a port-knocker. One is to run as a daemon listening on several ports. This is arguably the simplest approach, and doesn’t require root credentials, but is particularly weak because a port scanner will identify the magic ports as open, leaving the attacker only the knocking combination to discover.

Another approach (used by Moxie Marlinspike’s knockknock) is to listen to kernel logs for rejected incoming TCP connections. This approach has the advantage of not requiring network access at all, but requires that the kernel output such information to a log file, making it less portable.

The third (and most common) approach to port knocking is to use packet sniffing to watch for incoming connections. This has the added advantage of working on any operating system that libpcap (or a similar packet-sniffing library) has been ported to. Unfortunately it also requires inspecting each packet passing the computer, and usually requires root access.

Since I have some familiarity with packet manipulation in Python already, I opted for the last approach.

The Implementation

With Scapy, the core of the problem is trivial:

from scapy.all import IP, TCP, sniff

def process_packet(packet):
        src = packet[IP].src      # source address, from the IP header
        port = packet[TCP].dport  # destination port, from the TCP header
        if( port in sequence ):
                knock_finished = addKnock(sequence, src, port, clients)
                if( knock_finished ):
                        trigger(username, command, src)
        # Sequence broken - forget this client's progress
        elif( src in clients ):
                del clients[src]

sniff(filter="tcp", prn=process_packet)

The rest is some semantics about when to remove clients from the list, and dropping from root permissions before running whatever command was triggered by the port knock. Code available here.
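From the client’s side, the knock is nothing special: just a series of ordinary TCP connection attempts in the right order. A minimal sketch (the host and port sequence here are placeholders):

```python
import socket
import time

def knock(host, ports, delay=0.3):
    """Attempt a TCP connection to each port in sequence.

    Refused or filtered connections are fine - the attempt itself is the knock.
    """
    for port in ports:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(0.5)
        try:
            s.connect((host, port))
        except (socket.timeout, ConnectionRefusedError, OSError):
            pass  # closed/filtered ports still register the SYN on the server
        finally:
            s.close()
        time.sleep(delay)  # small pause keeps the knocks arriving in order

knock("192.0.2.10", [7013, 7025, 7031])  # placeholder host and sequence
```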

Posted 7/17/17

Decentralized Networks

A solution to the acyclic graph problem has been found! This post adds to the continuing thread on modeling social organizations with Neural Networks (post 1) (post 2) (post 3)

The Dependency Problem

The issue with cyclic neural networks is dependencies. If we say Agent A factors in information from Agent B when deciding what message to transmit, but Agent B factors in messages from Agent A to make its decision, then we end up with an infinite loop of dependencies. One solution is to kickstart the system with a “dummy value” for Agent B (something like “On iteration 1, Agent B always transmits 0”), but this is clunky, difficult to perform in a library like Tensorflow, and still doesn’t mesh well with arbitrary evaluation order (for each iteration, do you evaluate A or B first?).

Instead, we can bypass the problem with a one-directional loop. The trick is as follows:

  1. Agent A0 sends a message (not dependent on B)
  2. Agent B0 receives A0’s message, and decides on a message to send
  3. Agent A1 receives B0’s message, and factors it (along with the messages A0 received) into deciding on a message to send
  4. Agent B1 receives A1’s message, and factors it (along with the messages B0 received) into deciding on a message to send

We have now created a dependency tree where A can rely on B’s message, and B can rely on the message generated in response, but all without creating an infinite loop. When judging the success of such a network, we look only at the outputs of A1 and B1, not their intermediate steps (A0 and B0).

If it’s absolutely essential you can create a third layer, where A2 depends on the message sent by B1, and so on. As you add layers you approach the behavior of the original circular dependency, but using more than two or three layers usually slows down computation considerably without yielding significantly different results.
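The unrolled dependency can be sketched with plain functions standing in for the agents (the message logic here - sum and max - is made up purely for illustration):

```python
def agent_a(inbox):
    # hypothetical message logic: A transmits the sum of everything it heard
    return sum(inbox)

def agent_b(inbox):
    # hypothetical message logic: B transmits the largest value it heard
    return max(inbox)

env_a, env_b = [3, 5], [2]  # each agent's private view of the environment

msg_a0 = agent_a(env_a)             # A0: depends on nothing from B
msg_b0 = agent_b(env_b + [msg_a0])  # B0: hears A0's message
msg_a1 = agent_a(env_a + [msg_b0])  # A1: hears B0's message (plus A0's inputs)
msg_b1 = agent_b(env_b + [msg_a1])  # B1: hears A1's message (plus B0's inputs)

# only the outputs of A1 and B1 are judged; A0 and B0 are intermediate steps
print(msg_a1, msg_b1)  # → 16 16
```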

Social-Orgs, Revisited

With the above solution in mind, let’s re-evaluate the previous social-group problems with two layers of Agents instead of one. Each layer can send all of the data it’s received, with no communications cost or noise, to its counterpart one layer up. This effectively creates the A0 and A1 dynamic described above. When evaluating the success of the network we will look at the accuracy of environmental estimates from only the outermost layer, but will count communications costs from all layers.

Multilayer simple trial

Tada! A social organization that doesn’t revolve around A0 reading every piece of the environment itself!

Note: In the above graph, most nodes appear in only one layer (0 or 1), not both. This is because the agent’s other layer does not listen to anything, and is not shown in the graph. More complex examples will include both layers more frequently.

The result is still unlikely - all information passes through A2 before reaching the agents (even A0 gets information about three environment nodes through A2) - but it’s already more balanced than previous graphs.

Next Steps

A better evaluation algorithm is needed. With the two-layer solution there is no longer a requirement for centralization - but there is no incentive for decentralization, either. A real human organization has not only total costs, but individual costs as well. Making one person do 100 units of work is not equivalent to making 10 people do 10 units of work. Therefore, we need a cost algorithm where communications become exponentially more expensive as they are added to a worker. This should make it “cheaper” to distribute labor across several workers.
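A convex per-worker cost captures this. As a sketch (the base cost and growth rate are chosen arbitrarily), if each additional link a worker maintains costs more than the last, distributing labor becomes the cheaper option:

```python
def worker_cost(n_links, base=1.0, rate=1.5):
    # each successive link costs `rate` times more than the previous one
    return sum(base * rate ** i for i in range(n_links))

# one worker handling 100 links is vastly more expensive than
# ten workers handling 10 links each
assert 10 * worker_cost(10) < worker_cost(100)
```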


This post is based off of research I am working on at the Santa Fe Institute led by David Wolpert and Justin Grana. Future posts on this subject are related to the same research, even if they do not explicitly contain this attribution.

Posted 7/9/17

Ruggedized Networks

This post adds to my posts on modeling social organizations with Neural Networks (post 1) (post 2)

The Problem

The original model defined the objective as minimizing communications costs while getting an estimate of the environment state, and sharing that estimate with all nodes. This objective has a flaw, in that it is always cheaper to let a single agent read an environment node, making that agent a single point of failure. The flaw is exacerbated by the directed-acyclic-graph limitation: since A0 must always read from the environment, it is always cheapest to have the entire network rely on A0 for information.

An Attack

I recently worked on developing a welfare function emphasizing robustness, or in this case, the ability of the network to complete its objective when a random agent is suddenly removed. The result should be a network without any single points of failure, although I am not accounting for multi-agent failures.

The result is as follows:

Diagram of robustness trial

In this diagram, all agents receive information from A0. However, most of them also receive information from A2, which receives information from A1, which is monitoring the entire environment. As a result, when A0 is disabled, only nodes A3 and A5 are negatively affected.

How it Works

To force robustness I created eleven parallel versions of the graph. They have identical listen weights (the amount any agent tries to listen to any other agent), and begin with identical state weights (how information from a particular agent is factored into the estimate of the environment) and identical output weights (how different inputs are factored into the message that is sent).

The difference is that in each of these parallel graphs (except the first one) a single Agent is disabled, by setting all of its output weights to zero. The welfare of the solution is the average of the welfare for each graph.
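The averaging scheme can be sketched as follows. The welfare function here is a stand-in with diminishing returns, not the one from the research, but it shows why averaging over ablations rewards redundancy:

```python
import math

def toy_welfare(out_weights, disabled=None):
    # hypothetical welfare: diminishing returns on total delivered signal
    w = [0.0 if i == disabled else x for i, x in enumerate(out_weights)]
    return math.sqrt(sum(w))

def robust_welfare(out_weights, welfare):
    n = len(out_weights)
    scores = [welfare(out_weights)]  # the intact graph
    scores += [welfare(out_weights, disabled=i) for i in range(n)]  # one agent down
    return sum(scores) / len(scores)

balanced = [1.0, 1.0]  # two agents share the communication load
lopsided = [2.0, 0.0]  # one agent carries everything: a single point of failure
assert robust_welfare(balanced, toy_welfare) > robust_welfare(lopsided, toy_welfare)
```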

But why are the state weights and output weights allowed to diverge? Aren’t we measuring ten completely different graphs then?

Not quite. The topology of the network is defined by its listen weights, so we will end up with the same graph layout in each configuration. To understand why the other weights are allowed to diverge, consider an analogy to a corporate scenario:

You are expected to get reports from several employees (Orwell, Alice, and Jim) and combine them to make your own final report. When you hear Jim has been fired, you no longer wait for his report to make your own. Taking input (which isn’t coming) from Jim into consideration would be foolish.

Similarly, each graph adjusts its state weights and output weights to no longer depend on information from the deleted agent, representing how a real organization would immediately respond to the event.

Then why can’t the listen weights change, too?

This model represents instantaneous reaction to an agent being removed. While over time the example corporation would either replace Jim or restructure around his absence, you cannot instantly redesign the corporate hierarchy and change who is reporting to whom. Meetings take time to schedule, emails need to be sent, and so on.

Next Steps

This objective is still limited by the acyclic graph structure, but provides a baseline for valuing resiliency mathematically. Once the acyclic problem is tackled this solution will be revisited.


This post is based off of research I am working on at the Santa Fe Institute led by David Wolpert and Justin Grana. Future posts on this subject are related to the same research, even if they do not explicitly contain this attribution.

Posted 7/6/17

Cryptocurrency Tutorial

I was recently asked to give a talk on bitcoin and other related cryptocurrencies. My audience was to be a group of scientists and mathematicians, so people with significant STEM backgrounds, but not expertise in computer science. In preparation for giving my talk, I wrote this breakdown on the ins and outs of cryptocurrencies.

UPDATE 7/11/17

I gave the talk, it went great! Slides here [PDF].


What is Bitcoin?

Bitcoin is a decentralized currency. There is no governing body controlling minting or circulation, making it appealing to those who do not trust governments or financial institutions like Wall Street.

Whereas most currencies have a physical paper representation, bitcoin is exchanged by appending to a “blockchain”: a global ledger of who owns which pieces of currency, and which transactions were made when.

Where does the value of Bitcoin come from?

Bitcoin is a fiat currency - its value comes exclusively from what people are willing to exchange it for. This seems ephemeral, but is not uncommon; it is the same principle behind the value of the US dollar, at least since the United States left the gold standard.

Is Bitcoin anonymous?

Yes and no. All bitcoin transactions are public, and anyone can view the exact amount of money in a bitcoin wallet at any given time. However, bitcoin wallets are not tied to human identities, so as long as you keep the two distinct (which can be challenging), it is effectively “anonymous”.

How is Bitcoin handled legally?

Some countries consider bitcoin to be a currency (with a wildly fluctuating exchange rate), while others regard it as a commodity with an unstable value. Most countries will tax bitcoins in some way or another, but due to the aforementioned anonymity it is easy to avoid paying taxes on bitcoins.

What is the blockchain?

The blockchain is a technology solving two problems:

  1. How do we know who has what currency?
  2. How do we prevent someone from spending currency that isn’t theirs?

The second problem includes preventing someone from “double-spending” a bitcoin they legitimately own.

A blockchain is a sequence of “blocks”, where each block holds “facts”. These facts describe every transaction of bitcoins from one person to another. To make a transaction, you must create a block describing the transaction, and convince the majority of the nodes in the bitcoin blockchain to accept your transaction.

What does a block consist of?

A block has four fields:

  1. A string describing all contained facts
  2. The identifier of the previous block in the blockchain (maintains an explicit order for all transactions)
  3. A random string
  4. The SHA256 hash of all of the above

A block is accepted into the blockchain if and only if its SHA256 hash starts with at least n leading zeroes. This makes generating a block equivalent to hash cracking (keep changing the random string until you get the hash you want), and the larger n is, the more challenging the problem is to solve.

For example, if n=5:

A losing block hash (will be rejected): 8f41c2e7d90a…
A winning block hash (will be accepted): 000007b3a1fc…

The number of leading zeroes n is increased periodically by group consensus, so that even as more people begin to work on generating blocks, the rate of new blocks remains approximately constant (about one every ten minutes). This makes it extremely unlikely that two new and valid blocks will be generated near the same time, creating a continual chain of events that makes double-spending effectively impossible.

Looking for a new valid block is colloquially referred to as “bitcoin mining”.
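A toy miner makes the hash-cracking framing concrete. This is a sketch, not bitcoin’s actual block format (real blocks hash a binary header, and real difficulty is far higher than the n=4 used here to keep the demo fast):

```python
import hashlib
from itertools import count

def mine(facts, prev_id, n):
    """Keep changing the random string (nonce) until the hash has n leading zeroes."""
    target = "0" * n
    for nonce in count():
        block = f"{facts}|{prev_id}|{nonce}"
        digest = hashlib.sha256(block.encode()).hexdigest()
        if digest.startswith(target):
            return block, digest

block, digest = mine("Alice pays Bob 1 BTC", "id-of-previous-block", 4)
assert digest.startswith("0000")  # roughly 16^4 ≈ 65,000 attempts on average
```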

Note: The hashing algorithm (sha256) is specific to bitcoin. Other cryptocurrencies may use different hashing algorithms to discourage the use of GPUs in mining.

Can I spend someone else’s coins by mining a block?

Bitcoins are tied to a “bitcoin wallet”, which is a public/private keypair. To send coins to a new wallet you must make a blockchain fact describing a transfer of X bitcoins from one wallet’s public key to another, signed with the private key of the originating wallet. Therefore unless you have access to the private key, you’ll be unable to control the bitcoins associated with it.

Why would anyone mine blocks?

Each successfully mined block yields the miner some currency. They include their own wallet address as one of the facts in the block, and receive a set reward (12.5 BTC for bitcoin as of 2017 - the reward halves periodically) at that address. This is also why you must pay a small transaction fee to send anyone a bitcoin - you are asking someone to include your transaction in their massive mining effort.

Doesn’t this mean there are a fixed number of bitcoins in the world?

The supply of new bitcoins is limited by design. The block reward halves every 210,000 blocks (50 BTC, then 25, then 12.5, and so on), and once the reward rounds down to nothing, no new bitcoins are created. Summing that series puts the maximum number of bitcoins at about 21 million.
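The roughly 21 million figure can be reproduced from bitcoin’s halving schedule: the 50 BTC subsidy halves every 210,000 blocks, and the resulting geometric series converges (real bitcoin rounds rewards to whole satoshis, which ends the series slightly below this sum):

```python
# subsidy per block: 50 BTC for blocks 0-209,999, then 25, then 12.5, ...
# after 33 terms the subsidy is far below one satoshi, so the sum has converged
total_btc = sum(210_000 * 50 / 2 ** i for i in range(33))
print(round(total_btc))  # → 21000000
```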

This upper limit poses problems of its own. Bitcoins are finite, so if you send some to a non-existent address, or forget your private key, those coins are effectively destroyed forever. This scarcity, along with commodity speculation, is responsible for the incredible fluctuation in the value of bitcoin.

Trust Issues

One problem with a decentralized currency like bitcoin is that there is no revocation of money transfers. With a bank, you can make a purchase with a credit card, and later dispute that purchase, claiming you did not receive what you paid for, and the bank can reverse the charge. You can also use banks and lawyers to create contracts, agreeing to pay a certain amount before a service is rendered and a certain amount after, with other complications like security deposits.

None of this infrastructure exists with bitcoin, making it an extremely scam-prone transaction system. Some people use escrow services, but these are all very ad-hoc. This is also one of the reasons bitcoin is commonly used in ransomware attacks, or for purchases of drugs or stolen property on the “deep web”.

What about alt-coins?

There are several variations on bitcoin, called “alternative-coins” or “alt-coins”. Some of the most interesting are:


Namecoin

Namecoin treats the blockchain as an extremely distributed database of information tied to specific identities. It’s effectively the same as bitcoin, except in addition to storing “coins” with particular wallets, you can store domain names, email addresses, public encryption keys, and more.

In theory, this removes the need for centralized DNS servers or domain registrars on the Internet. Everyone can perform DNS lookups by looking for the domain name in question in the blockchain, and can transfer domains to each other in exchange for namecoins.


Ethereum

Ethereum tries to solve the trust issues of bitcoin by allowing you to write programmatically-enforceable contracts and embedding them into the blockchain.

Consider the following blockchain:

ABC Blockchain

Block A contains a program with pseudocode like the following:

if( security_deposit_received and date == December 5th and house_not_destroyed )
    send(security_deposit, from=Bob, to=Alice)
else if( date > December 5th )
    close_contract()  # the house was destroyed, so Bob keeps the deposit

When block A is added to the chain the code inside is evaluated by every node in the chain. The code is re-evaluated as each subsequent block is added, until after December 5th when the code can be safely ignored.

Block B contains a transfer of $1000 from Alice to Bob, as a security deposit.

On December 5th, if the house is not destroyed, the security deposit is automatically returned to Alice by Bob.

Ethereum therefore allows you to create contracts which are enforceable without lawyers or banks, and cannot be violated by either party once issued.
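The evaluation loop can be sketched in Python. The state fields and contract logic here are hypothetical, mirroring the pseudocode above:

```python
def contract(state):
    """Hypothetical escrow contract, re-evaluated by every node as blocks arrive."""
    if state["deposit_received"] and state["date"] == 5 and not state["house_destroyed"]:
        return ("send", 1000, "Bob", "Alice")  # return the security deposit
    if state["date"] > 5:
        return ("close", 0, None, None)        # house destroyed: Bob keeps it
    return None  # nothing to do yet

# blocks arriving on December 3rd through 5th, with the house intact
actions = [contract({"deposit_received": True, "house_destroyed": False, "date": d})
           for d in (3, 4, 5)]
print(actions[-1])  # → ('send', 1000, 'Bob', 'Alice')
```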

Other uses for Ethereum contracts include provably-fair gambling, and generic distributed computation, where you pay each participating node for running your application.

Ethereum suffers from a few issues:

  • The complexity makes it less approachable than Bitcoin
  • Without widespread cryptographically verifiable Internet-of-Things devices the types of contracts you can express are limited
  • All code is publicly viewable, but not changeable, so if someone finds a security hole in your code, it cannot be easily patched

Despite these limitations, Ethereum has much more functionality than other cryptocurrencies and is gaining in popularity.


Dogecoin

The best cryptocurrency. It uses a logarithmic reward function, so the first few blocks yield many dogecoins, while later blocks yield fewer. This guarantees that lots of coins enter circulation very quickly, making it a viable currency immediately after launch. It also uses scrypt instead of SHA256, and so doesn’t suffer from the same GPU- and ASIC-mining problems plaguing bitcoin.

Dogecoin was started as a meme in 2013, but is collectively valued at over $340 million as of June 2017, which its user-base finds hilarious. However, because of the massive number of coins in circulation, a single dogecoin is only worth about $0.00095.

The Dogecoin community is particularly noteworthy for donating more than $30,000 to ensure the Jamaican bobsledding team could travel to the 2014 Winter Olympics.

Posted 7/5/17