A not-so-prefix prefix
[This post (as well as some future posts) is based on the last piece of my PhD work. I didn't have enough time to write a formal and proper paper before I graduated, though I did write it down as one chapter in my dissertation. My advisor briefly talked about publishing it in some conference before I left for work, and I did give it some thought. But I know it's not worth the effort. First, the contribution is negligible: this is not the first work on improving TCP in a wireless context by incorporating network coding, the simulation work is quite immature, and I have no interest in writing a better simulation. I'd love to have a real implementation that works on a real system, but that's gonna take some time, which I don't have. Second, there is real effort involved in just "publishing a paper". I'm not talking about coming up with an idea, getting it prototyped or implemented, and then writing it down as a paper. I'm talking about the whole "publish" process itself. And for me there is simply no benefit in it. So, honestly, I have no interest in publishing a paper at all. Here is what I'm gonna do instead: I'll extract this part out of my dissertation, remove the "dependencies" on other chapters, and put it on arxiv.org. And during the process I will also talk about it here, in this blog.]
I'd assume everyone knows how TCP works. If you don't, you can always wiki it, or read some classic books on the topic (e.g., the must-read TCP/IP Illustrated Vol. 1), and Stanford offers an open course on computer networks every now and then. I really encourage everyone interested in this topic to enroll in that course. Even if you already know a lot about networks and TCP/IP, it will give you some new perspective. And it's offered by Nick McKeown.
For this post, you need to know some basics of TCP congestion control algorithms. The rule of thumb in such algorithms is the AIMD mechanism: additive increase, multiplicative decrease. In simple words, when the sending side of a TCP stream detects no congestion, the size of the congestion window, which bounds all the data the sender can freely push onto the network, increases linearly. But when TCP thinks there is congestion in the network, it halves the congestion window, which in effect cuts the sending rate sharply. As a result, the sending rate, or the congestion window, of a TCP flow looks like a sawtooth over time, something like the following:
The issue comes from the "MD" part of AIMD, and from how TCP "recognizes" congestion. TCP assumes there is congestion in the network (or, more accurately, somewhere along the path from source to destination) if it observes packet loss. That made a lot of sense back in the day when TCP was created, long before we had Wi-Fi and 3G/4G around us all the time. But now that we live in a wireless world, the assumption that packet loss can only be caused by congestion is hardly right. When you lose packets during wireless transmission, whether because of interference or signal errors, the TCP source shouldn't reduce its sending rate, because, again, there is no real congestion in the picture. Such behavior naturally leads to inferior performance. If we ever move to a multi-hop wireless world (apparently some already have), it's gonna be even worse.
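As a side note, the AIMD sawtooth described above is easy to sketch in a few lines of code. This is a toy simulation, not real TCP: the loss schedule here is purely hypothetical, injected at fixed rounds just to make the window shape visible.

```python
# Toy AIMD simulation: additive increase each round (think "per RTT"),
# multiplicative decrease (halving) whenever a loss is observed.

def aimd(rounds, loss_rounds, cwnd=1.0, incr=1.0):
    """Return the congestion window size after each round."""
    history = []
    for r in range(rounds):
        if r in loss_rounds:
            cwnd = max(1.0, cwnd / 2)  # multiplicative decrease
        else:
            cwnd += incr               # additive increase
        history.append(cwnd)
    return history

# Hypothetical losses at rounds 5 and 9 produce the sawtooth:
trace = aimd(rounds=12, loss_rounds={5, 9})
print(trace)
# → [2.0, 3.0, 4.0, 5.0, 6.0, 3.0, 4.0, 5.0, 6.0, 3.0, 4.0, 5.0]
```

Notice that in this toy model every loss halves the window, whether the loss came from congestion or from a noisy wireless link; that indifference is exactly the problem discussed above.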
Enter Network Coding
It's impossible to explain all of network coding in a blog post. Again, you can google it, wiki it, and there are books dedicated to the topic. As a research topic it's significantly younger than TCP, somewhat younger than Wi-Fi, and actually slightly older than the iPhone. But it's not an iPhone kind of thing: it's not in everyone's pocket, though I'd argue that if it were, we'd live in a much better world. It's a sub-field of information theory and computer networks.
Here is my go-to explanation of network coding for anyone who doesn't know it yet but knows a thing or two about networks. In the abstract, all current networking devices (including your laptop and smartphone, which deliver packets from themselves to the AP in your apartment) arguably do only two things: store and forward. Of course this is a lie. They also do queueing, QoS, access control and firewalling. They exchange routing information among themselves, and update their routing tables afterwards. But all of that is, again, in service of store and forward: they decide how to store, what to store, what to forward and where to forward. Essentially, all networking devices do two things: store and forward. There is almost no manipulation of the packet payload itself.
Network coding wants to change this. Or, let's say, one genius of our age, German mathematician Rudolf Ahlswede, wanted to change this. In 1998 he decided it was time to rewrite the history of information theory, communication and networks. And he did, by publishing the seminal paper "Network Information Flow". In that work, he and his co-authors proved that by allowing the network (i.e., the intermediate nodes between a source and a destination) to do some "manipulation" on the packets, throughput can be increased. The manipulation they talked about was encoding and decoding.
In their paper, they came up with the "butterfly network" to intuitively demonstrate why routing is inferior to network coding in multicast. My go-to example for illustrating network coding is a simpler scenario, featuring the very famous Alice and Bob, plus Chloe. It relies on wireless networking, but is much easier to understand.
Suppose Alice wants to send a bit 1 to Bob, and Bob wants to send a bit 0 to Alice. Unfortunately they are a little too far from each other. In between them sits Chloe, who's nice enough to do them a favor by playing the "router" role. Here is what would happen without network coding:
(1) Alice sends 1 to Chloe.
(2) Chloe sends 1 to Bob.
(3) Bob sends 0 to Chloe.
(4) Chloe sends 0 to Alice.
The order of (2) and (3), of course, is interchangeable. That's four transmissions in total. But we can do better, if they all know some mathematics and are allowed to manipulate the packets:
(1) Alice sends 1 to Chloe. Chloe holds on to it.
(2) Bob sends 0 to Chloe. Chloe holds on to it.
(3) Chloe calculates 1 XOR 0 = 1, and broadcasts the result.
Now Alice and Bob each XOR their own original data with the data they just received. Alice gets 0, and Bob gets 1, so each gets the intended message. And we saved one transmission.
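The whole three-transmission exchange boils down to two XOR operations. Here is a minimal sketch of it; the variable names simply follow the example above.

```python
# The Alice/Bob/Chloe exchange, with Chloe coding the two bits together.

alice_msg = 1  # the bit Alice wants Bob to receive
bob_msg   = 0  # the bit Bob wants Alice to receive

# Transmissions (1) and (2): Chloe receives both bits and holds on to them.
# Transmission (3): Chloe broadcasts the XOR of the two.
coded = alice_msg ^ bob_msg

# Each side decodes by XORing the broadcast with its own original bit.
bob_decodes   = coded ^ bob_msg    # Bob recovers Alice's bit
alice_decodes = coded ^ alice_msg  # Alice recovers Bob's bit

print(alice_decodes, bob_decodes)  # → 0 1
```

The same trick works on whole packets, not just single bits, since XOR applies byte by byte; that's the basis of practical coded-forwarding schemes.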
Now, real research on network coding is a lot more complex than exchanging bits with some magic from a router. In the past 10-20 years many people in the research community have worked on this topic. Some of them are just like Dr. Ahlswede: mathematicians who see the network as a mathematical model. They work on different coding algorithms and try to figure out which works better in what context. Others are networking people, who work on the practical side of network coding. For example, in reality Chloe shouldn't hold on to any packet just to create a coding opportunity; the delay incurred is definitely something you want to avoid. So how do you achieve the same throughput without deliberately holding on to a packet? And how does Chloe even know which two packets should be coded together if there are more people around her? Those are the questions the networking people try to answer. In the past decade we saw many papers published in this area, and many fundamental problems on both the theoretical and practical sides of network coding got solved. But here is one issue: almost none of those works successfully got out of research labs.
Until someone said: let's make TCP better in wireless networks with some help from network coding.
I will end this post right here. :)
(To be continued)