Network Coding

Posted 6/6/17

I was recently introduced to the idea of Network Coding, a relatively obscure and rarely implemented technique for simplifying routing and increasing network throughput and security.

This post is an introduction to the theory, but there may be future posts on its application.

The Problem

We’ll start with the quintessential Network Coding problem - The Butterfly Network.

Assume we have data streams A and B, both of which must pass through the network to reach clients 1 and 2. The network topology looks something like the following:

Empty routing diagram

We’ll say that this is a simple network, where each node can only transmit one bit at a time. Therefore, we have a bottleneck: The first switch can either transmit data stream A, or data stream B, but not both at once. This means with traditional routing we must choose to prioritize either client 1 or 2, delaying the other.

Network Coding, however, provides an alternate solution. The first switch combines the two data streams, and transmits A+B to both clients. The result looks as follows:

Network Coding diagram

Now client 1 can reconstruct data stream B by subtracting A from A+B, and client 2 can similarly reconstruct A by subtracting B from A+B. As a result we have satisfied both clients using only one transmission, with no delays.

Layering Signals

The key to network coding, of course, is how to combine and separate data streams. In a computer science context, with bits and bytes, we can define the layering as xor, such that A+B means A^B. This is extremely convenient, as with the properties of xor we can reconstruct the original streams with B=A^(A^B) and A=B^(A^B).

In a more generic “signals” context, you can think of A+B as layering two waves on top of each other, then extracting either wave by subtracting the other.

Security Benefits

Network coding has a number of advantages besides solving the butterfly routing problem, but one I mentioned already was security. Layering data with network coding provides defense against some types of eavesdropping attacks, because given either A or A+B it is impossible to extract B. This makes it potentially advantageous to fragment your data and send it over multiple channels, making full message recovery difficult for an attacker.