I’m Milo Trujillo, a computer and social scientist, activist, and hacker.
Blog posts cover a myriad of technical and social topics, often focused on security and organizing resistance.
Taking a break from recent social architecture posts, here’s some more technical security content. It’s a pretty soft introduction to reverse engineering and binary patching for those unfamiliar with the topic.
One of the tools I use in my social research recently became abandonware. The product requires annual registration, but earlier this year the developer’s website disappeared, no one has been able to reach them by email for months, and I literally cannot give them my money to reactivate the product. Unfortunately, I am still reliant on the tool for my work. With no alternative, let’s see if the software can be reactivated through more technical means.
The tool in question is a graph layout plugin for Cytoscape, so it’s distributed as a JAR file. JAR files are bundled executables in Java - they’re literally ZIP archives with a few specific files inside.
The goal is to find where the software checks whether it’s registered, and patch the binary so it always returns true.
Since this is Java, we’ll start with a decompiler. The goal of a decompiler is to go from compiled bytecode back to human-readable sourcecode in a language like Java or C. On platforms like x86 this is often very messy and unreliable, both because x86 is disgustingly complicated, and because there are many valid C programs that could compile to the same assembly. Fortunately, today we’re working with JVM bytecode, which is relatively clean and well-understood, so the JVM to Java decompilers are surprisingly capable.
I’ll be using JD-Gui, the “Java Decompiler”. Sounds like the tool for the job. Open up JD-GUI, tell it to open the JAR file, and we’re presented with a screen like this:
Wandering through the
.class files in
com (where most of the code resides), we eventually find reference to
public static boolean isActivated(), which sure sounds promising. Here’s the method definition:
If either the product is activated, or it’s still in a trial period, the function returns true. This appears to be our golden ticket. Let’s change this method so that it always returns true.
There are two techniques for patching Java code. The apparently simple option would be to use a decompiler to get Java code out of this JAR, change the Java code, then recompile it back in to a JAR. However, this assumes the decompiler was 100% perfectly accurate, and the JAR you get at the end is going to look rather different than the one you started with. Think throwing a sentence in to Google Translate and going from English to Japanese back to English again.
The cleaner technique is to leave the JAR intact, and patch this one method while it is still JVM bytecode. The decompiler helped find the target, but we’ll patch at the assembly level.
First, we’ll need to extract contents of the JAR. Most JVM bytecode editors work at the level of
.class files, and won’t auto-extract and repack the JAR for us.
$ mkdir foo $ cp foo.jar foo $ cd foo $ jar -xf foo.jar
Now all the individual files visible in the JD-Gui sidebar are unpacked on the filesystem to edit at our leisure. Let’s grab the Java Bytecode Editor, open the
DualLicenseManager.class file, and scroll to the
Above is the assembly for the short Java if-statement we saw in JD-GUI. If
isProductActivated() returns true, jump to line 14 and push a 1 on the stack before jumping to line 19. Else, if
isTrialActivated() returns false, jump to line 18 and push a 0 on the stack. Then, return the top item on the stack.
Uglier than the Java, but not hard to follow. The patch is trivially simple - change the
iconst_0 to an
iconst_1, so that no matter what, the method always pushes and returns a 1.
Then we save the method, and it’s time to re-pack the JAR.
Re-creating the JAR is only slightly more complicated than unpacking:
$ jar -cvf foo.jar * $ jar -uvfm foo.jar META-INF/MANIFEST.MF
For some reason creating a JAR from the contents of a folder ignores the manifest and creates a new one. Since we specifically want to include the contents of the manifest file (which include some metadata necessary for the plugin to connect with Cytoscape), we explicitly update the manifest of the JAR with the manifest.mf file.
From here, we can reinstall the plugin in Cytoscape and it runs like a charm.
Often, this kind of binary patching is less straightforward. This work would have been much more time-consuming (though not much more challenging) if the executable had not included symbols, meaning none of the method names are known. Most C and C++ executables do not include symbols, so you are forced to learn what each function does by reading the assembly or looking at the context in which the function is called. This is done for performance more than security, since including symbols makes the executable larger, and is only helpful for the developers during debugging.
More security-minded engineers will use tools like Packers to obfuscate how the code works and make it more difficult to find the relevant method. These are not insurmountable, but usually require watching the packer decompress the main program in memory, waiting for the perfect moment, then plucking it from memory to create an unpacked executable.
Another option is including some kind of checksumming so that the program can detect that it has been tampered with, and refuses to run, or modifies its behavior. This is rarely helpful however, since the reverse engineer can simply look for the appropriate
hasBeenTamperedWith() function and patch it to always return false. An insidious programmer could try to hide this kind of code, but it’s a cat-and-mouse game with the reverse engineer that will only buy them time.
Ultimately, this tool was built by scientists, not battle-hardened security engineers, and included no such counter-measures. Should they resurface I will gladly continue paying for their very useful software.
There’s been a lot of discourse recently about the responsibility social media giants like Facebook and Twitter have to police their communities. Facebook incites violence by allowing false-rumors to circulate. Twitter only recently banned large communities of neo-nazis and white supremacists organizing on the site. Discord continues to be an organizational hub for Nazis and the Alt-Right. There’s been plenty of discussion about why these platforms have so little moderatorship, ranging from their business model (incendiary content drives views and is beneficial for Facebook), to a lack of resources, to a lack of incentive.
I’d like to explore a new side of the issue: Why should a private company have the role of a cultural censor, and how can we redesign our social media to democratize censorship?
To be absolutely clear, censorship serves an important role in social media in stopping verbal and emotional abuse, stalking, toxic content, and hate speech. It can also harm at-risk communities when applied too broadly, as seen in recent well-intentioned U.S. legislation endangering sex workers.
Censorship within the context of social media is not incompatible with free-speech. First, Freedom of Speech in the United States is largely regarded to apply to government criticism, political speech, and advocacy of unpopular ideas. These do not traditionally include speech inciting immediate violence, obscenity, or inherently illegal content like child pornography. Since stalking, abuse, and hate-speech do not contribute to a public social or political discourse, they fall squarely outside the domain of the USA’s first amendment.
Second, it’s important to note that censorship in social media means a post is deleted or an account is locked. Being banned from a platform is more akin to exile than to arrest, and leaves the opportunity to form a new community accepting of whatever content was banned.
Finally there’s the argument that freedom of speech applies only to the government and public spaces, and is inapplicable to a privately-owned online space like Twitter or Facebook. I think had the U.S. Bill of Rights been written after the genesis of the Internet this would be a non-issue, and we would have a definition for a public commons online. Regardless, I want to talk about what should be, rather than what is legally excusable.
Corporations have public perceptions which effect their valuations. Therefore, any censorship by the company beyond what is legally required will be predominantly focused on protecting the ‘image’ of the company and avoiding controversy so they are not branded as a safe-haven for bigots, nazis, or criminals.
Consider Apple’s censorship of the iOS App Store - repeatedly banning drone-strike maps with minimal explanatory feedback. I don’t think Apple made the wrong decision here; they reasonably didn’t want to be at the epicenter of a patriotism/pro-military/anti-war-movement debate, since it has nothing to do with their corporation or public values. However, I do think that it’s unacceptable that Apple is in the position of having this censorship choice to begin with. A private corporation, once they have sold me a phone, should not have say over what I can and cannot use that phone to do. Cue Free Software Foundation and Electronic Frontier Foundation essays on the rights of the user.
The same argument applies to social media. Facebook and Twitter have a vested interest in limiting conversations that reflect poorly on them, but do not otherwise need to engender a healthy community dynamic.
Sites like Reddit that are community-moderated have an advantage here: Their communities are self-policing, both via the main userbase downvoting inappropriate messages until they are hidden, and via appointed moderators directly removing unacceptable posts. This works well in large subreddits, but since moderators have authority only within their own sub-communities there are still entire subreddits accepting of or dedicated to unacceptable content, and there are no moderators to review private messages or ban users site wide. A scalable solution will require stronger public powers.
The privacy, anonymity, and counter-cultural communities have been advocating “federated” services like Mastadon as an alternative to centralized systems like Twitter and Facebook. The premise is simple: Anyone can run their own miniature social network, and the networks can be linked at will to create a larger community.
Privacy Researcher Sarah Jamie Lewis has written about the limitations of federated models before, but it boils down to “You aren’t creating a decentralized democratic system, you’re creating several linked centralized systems, and concentrating power in the hands of a few.” With regards to censorship this means moving from corporate censors to a handful of individual censors. Perhaps an improvement, but not a great one. While in theory users could react to censorship by creating a new Mastadon instance and flocking to it, in reality users are concentrated around a handful of large servers where the community is most vibrant.
A truly self-regulatory social community should place control over censorship of content in the hands of the public, exclusively. When this leads to a Tyranny of the Majority (as I have no doubt it would), then the effected minorities have an incentive to build a new instance of the social network where they can speak openly. This is not an ideal solution, but is at least a significant improvement over current power dynamics.
Community censorship may take the form of voting, as in Reddit’s “Upvotes” and “Downvotes”. It may involve a majority-consensus to expel a user from the community. It may look like a more sophisticated republic, where representatives are elected to create a temporary “censorship board” that removes toxic users after quick deliberation. The key is to involve the participants of the community in every stage of decision making, so that they shape their own community standards instead of having them delivered by a corporate benefactor.
Care needs to be taken to prevent bots from distorting these systems of governance, and giving a handful of users de-facto censorship authority. Fortunately, this is a technical problem that’s been explored for a long time, and can be stifled by deploying anti-bot measures like CAPTCHAs, or by instituting some system like “voting for representatives on a blockchain”, where creating an army of bot-votes would become prohibitively expensive.
This should be not only compatible, but desirable, for social media companies. Allowing the community to self-rule shifts the responsibility for content control away from the platform provider, and means they no longer need to hire enormous translator and moderator teams to maintain community standards.
I recently got to see a talk at the Chaos Communication Congress titled “When the Dutch secret service knocks on your door”, with the following description:
This is a story of when the Dutch secret service knocked on my door just after OHM2013, what some of the events that lead up to this, our guesses on why they did this and how to create an environment where we can talk about these things instead of keeping silent.
Since the talk was not recorded, the following is my synopsis and thoughts. This post was written about a week after the talk, so some facts may be distorted by poor memory recall.
The speaker was approached by members of the Dutch secret service at his parents’ house. They initially identified themselves as members of the department of the interior, but when asked whether they were part of the secret service, they capitulated.
The agents began by offering all-expenses-paid travel to any hackathon or hackerspace. All the speaker needed to do was write a report about their experience and send it back. A relatively harmless act, but it means they would be an unannounced informant in hacker communities.
When the author refused, the agents switched to harder recruitment techniques. They pursued the author at the gym, sat nearby in cafes when the author held meetings for nonprofits, and likely deployed an IMSI catcher to track them at a conference.
Eventually, the author got in contact with other members of the hacker community that had also been approached. Some of them went further through the recruitment process. The offers grew, including “attend our secret hacker summer camp, we’ll let you play with toys you’ve never heard of,” and “If you want to hack anything we can make sure the police never find out.” In either of these cases the recruit is further indebted to the secret service, either by signing NDAs or similar legal commitments to protect government secrets, or by direct threat, wherein the government can restore the recruit’s disappeared criminal charges at any time.
I have two chief concerns about this. First, given how blatant the secret service was in their recruitment attempts, and that we only heard about their attempts in December of 2017, we can safely assume many people accepted the government’s offer. Therefore, there are likely many informants working for the secret service already.
Second, this talk was about the Netherlands - a relatively small country not known for their excessive surveillance regimes like the Five Eyes. If the Netherlands has a large group of informants spying on hackerspaces and conferences around the globe, then many other countries will as well, not to mention more extreme measures likely taken by countries with more resources.
From this, we can conclude there are likely informants in every talk at significant conferences. Every hackerspace with more than token attendance is monitored. This is not unprecedented - the FBI had a vast array of informants during the COINTELPRO era that infiltrated leftist movements throughout the United States (along with much less savory groups like the KKK), and since shortly after 9/11 has used a large group of Muslim informants to search for would-be terrorists.
Most examples of decentralized organization are contemporary: Black Lives Matter, Antifa, the Alt-Right, and other movements developed largely on social media. Older examples of social decentralization tend to be failures: Collapsed Hippie communes of the 60s, anarchist and communist movements that quickly collapsed or devolved to authoritarianism, the “self-balancing free market,” and so on.
But not all leaderless movements are short-lived failures. One excellent example is Alcoholics Anonymous: An 82-year-old mutual aid institution dedicated to helping alcoholics stay sober. Aside from their age, AA is a good subject for study because they’ve engaged in a great deal of self-analysis, and have very explicitly documented their organizational practices.
Let’s examine AA’s Twelve Traditions and see what can be generalized to other organizations. The twelve traditions are reproduced below:
Our common welfare should come first; personal recovery depends on AA unity.
For our group purpose there is but one ultimate authority - a loving God as He may express Himself in our group conscience.
The only requirement for AA membership is a desire to stop drinking.
Each group should be autonomous except in matters affecting other groups or AA as a whole.
Each group has but one primary purpose - to carry its message to the alcoholic who still suffers.
An AA group ought never endorse, finance or lend the AA name to any related facility or outside enterprise, lest problems of money, property and prestige divert us from our primary purpose.
Every AA group ought to be fully self-supporting, declining outside contributions.
Alcoholics Anonymous should remain forever nonprofessional, but our service centers may employ special workers.
AA, as such, ought never be organized; but we may create service boards or committees directly responsible to those they serve
Alcoholics Anonymous has no opinion on outside issues; hence the AA name ought never be drawn into public controversy.
Our public relations policy is based on attraction rather than promotion; we need always maintain personal anonymity at the level of press, radio and films.
Anonymity is the spritual foundation of all our traditions, ever reminding us to place principles before personalitites.
The above twelve rules can be distilled to three themes:
The group comes first
The group is single-issue
The group should be independent of any external or internal structures
The first theme stresses anonymity in an interesting way: Not to protect individual members (many of whom want to be anonymous when in an organization like AA), but to prevent the rise of “rock-stars”, or powerful individuals with celebrity status. Personal power is prone to abuse, both at an inter-personal level (see the plethora of sexual abuse cases in the news right now), and at a structural level, where the organization becomes dependent on this single individual, and is drawn in to any conflict surrounding the celebrity.
The solution to a rock-star is to kick them out of the organization, and maintain a healthier community without them. AA has gone a step further however, and outlines how to prevent the rise of a rock-star by preventing any personal identification when communicating to the outside world. When you are speaking to the press you are Alcoholics Anonymous, and may not use your name. For further discussion on rock-stars in tech communities, see this article.
The single-issue design is an unusual choice. Many social movements like the Black Panthers stress solidarity, the idea that we should unite many movements to increase participants and pool resources. This is the same principle behind a general strike, and broad, cross-issue activist networks like the Indivisible movement. However, focusing on a single issue continues the trend of resisting corruption and abuse of power. AA keeps a very strict, simple mission, with no deviations.
The last theme, total organizational independence, is also unusual. Organizations that fear external attack, like terrorist cells, may operate in isolation from other cells with little to no higher-level coordination. Organizations avoiding internal corruption, like the Occupy movement, or fraternities, may limit internal leadership and centralization of power using systems like Robert’s Rules of Order or Clusters & Spokes Councils, or they may organize more anarchically, through organic discussion on social media. Avoiding both internal and external hierarchy, however, sacrifices both large-scale coordination and quick decision making. This works for Alcoholics Anonymous, because their mission is predefined and doesn’t require a great deal of complex leadership and decision making. It is also used by Antifa, where local groups have no contact with one another and rely on collective sentiment to decide on actions.
Overall, AA is an interesting introduction to decentralized organizations. I will revisit these ideas as I learn more.
I recently encountered a very neat encoding technique for embedding images into Quick Response Codes, like so:
A full research paper on the topic can be found here, but the core of the algorithm is actually very simple:
Generate the QR code with the data you want
Dither the image you want to embed, creating a black and white approximation at the appropriate size
Triple the size of the QR code, such that each QR block is now represented by a grid of 9 pixels
Set the 9 pixels to values from the dithered image
Set the middle of the 9 pixels to whatever the color of the QR block was supposed to be
Redraw the required control blocks on top in full detail, to make sure scanners identify the presence of the code
That’s it! Setting the middle pixel of each cluster of 9 generally lets QR readers get the correct value for the block, and gives you 8 pixels to represent an image with. Occasionally a block will be misread, but the QR standard includes lots of redundant checksumming blocks to repair damage automatically, so the correct data will almost always be recoverable.
Instead of this:
This both makes the code scan more reliably and makes it more visually apparent to a casual observer that they are looking at a QR code.
The original researchers take this approach several steps further, and repeatedly perturb the dithered image to get a result that both looks better and scans more reliably. They also create an “importance matrix” to help determine which features of the image are most critical and should be prioritized in the QR rendering. Their code can be found here, but be warned that it’s a mess of C++ with Boost written for Microsoft’s Visual Studio on Windows, and I haven’t gotten it running. While their enhancements yield a marked improvement in image quality, I wish to forgo the tremendous complexity increase necessarily to implement them.