This article is part 4 of the Blockchain train journal, start reading here: Catching the Blockchain Train.
Oh no, a White Paper!
The crypto currency/blockchain world loves its white papers, and IPFS is no exception. It started with the famous Bitcoin: A Peer-to-Peer Electronic Cash System white paper by Satoshi Nakamoto (which in turn references to another white paper you don't want to know about). We will have a look at the Bitcoin white paper in a future post when we dive into crypto currencies.
Two things you need to know about white papers:
- They are distributed in PDF format (because it is harder to change than HTML!)
- white papers are hard, scary and contain at least one mathematical formula.
The IPFS white paper is here: IPFS - Content Addressed, Versioned, P2P File System (DRAFT 3). And, as it should, hosted on IPFS.
To understand how IPFS works it's a good idea to walk through this white paper step by step. I'll limit this post to chapter 2 of the white paper, having a look at the underlying technologies:
- Distributed Hash Tables
- Block Exchanges - BitTorrent
- Version Control Systems - Git
- Self-Certified Filesystems - SFS
Mapped on a ISO like stack:
OK, here we go!
Distributed Hash Tables (DHT)
What is a DHT?
A DHT is like a Python dict or Perl hash (if you have the key, you can retrieve the value), but the data is distributed over multiple nodes. The Wikipedia article Distributed hash table gives a good introduction.
In the case of IPFS, the key is a hash over the content. So ask an IPFS node for the content with hash QmcPx9ZQboyHw8T7Afe4DbWFcJYocef5Pe4H3u7eK1osnQ and the IPFS node will lookup in the DHT which nodes have the content.
How the specific value is found efficiently (fast, with a little as possible network requests) and how to manage the DHT so that changes (nodes that enter/leave the network, or new entries to the table) are absorbed easily is different for the various DHT implementations that exist.
Kademlia and friends
Back to the white paper. Three DHT implementations are mentioned explicitly, starting with Kademlia. Kademlia has a white paper of its own but let's not go there...
Kademlia is the DHT protocol that is used in almost all popular P2P systems and again the Wikipedia article on Kedemlia is a great introduction.
In short, Kedemlia uses the ID of the nodes to get step by step closer to the node with the desired hash (from the Wikipedia article):
When searching for some value, the algorithm needs to know the associated key and explores the network in several steps. Each step will find nodes that are closer to the key until the contacted node returns the value or no more closer nodes are found. This is very efficient: Like many other DHTs, Kademlia contacts only O(log(n)) nodes during the search out of a total of n nodes in the system.
The details can start to get quite complicated and I don't think it won't add to our goal of understanding IPFS, so we move on. If you like this sort of thing: this is a nice introduction and if you want to dig deeper the link to the Kademlia white paper is ^^^.
IPFS mentions two other DHT implementations that are added to standard Kedemlia:
- Coral DSHT: this improves the lookup performance and decreases resource use.
- S/Kademlia: makes Kademlia more resistant against malicious attacks.
The IPFS DHT in practice
The DHT is used in IPFS for routing, in other words:
- to announce added data to the network
- and help locate data that is requested by any node.
The white paper states:
Small values (equal to or less than 1KB) are stored directly on the DHT. For values larger, the DHT stores references, which are the NodeIds of peers who can serve the block.
So let's have a look if we can access the DHT directly and add and retrieve small data blocks.
$ ipfs daemon # make sure this runs (and install/setup ipfs first) $ echo 'my tiny text' | ipfs add # add content to the node, smaller than 1KB QmfQKcXMLGvCxx6opNDwb1ptD1LJER6MPHdsMHCB1CXpFF $ ipfs cat QmfQKcXMLGvCxx6opNDwb1ptD1LJER6MPHdsMHCB1CXpFF # just check my tiny text $ ipfs dht get /ipfs/QmfQKcXMLGvCxx6opNDwb1ptD1LJER6MPHdsMHCB1CXpFF # returns not found
So now we have that text there we would like to access it from the DHT, but apparently, the only
ipfs dht get request that is supported is for keys starting with
OK, and so we create an IPNS and see if we can query that directly:
$ ipfs name publish QmfQKcXMLGvCxx6opNDwb1ptD1LJER6MPHdsMHCB1CXpFF # point a IPNS address to our content Published to QmYebHWdWStasXWZQiXuFacckKC33HTbicXPkdSi5Yfpz6: /ipfs/QmfQKcXMLGvCxx6opNDwb1ptD1LJER6MPHdsMHCB1CXpFF $ ipfs resolve QmYebHWdWStasXWZQiXuFacckKC33HTbicXPkdSi5Yfpz6 # check it # never returns, hmm IPNS doesn't seem to be ready for production $ ipfs dht get /ipns/QmYebHWdWStasXWZQiXuFacckKC33HTbicXPkdSi5Yfpz6 # same thing but directly # does return binary data starting with 4/ipfs/QmfQKcXMLGvCxx6opNDwb1ptD1LJER6MPHdsMHCB1CXpFF
It sort of does work, but it is limited to IPNS for now it seems.
What else can we do with the DHT?
Use the DHT to find out which nodes can provide some data:
$ ipfs dht findprovs QmfQKcXMLGvCxx6opNDwb1ptD1LJER6MPHdsMHCB1CXpFF QmYebHWdWStasXWZQiXuFacckKC33HTbicXPkdSi5Yfpz6 QmRf4ERGvYpVo6HRa2VueZT8pWi8YvyLS3rW6ad2y83tdN # ^^^ there are both my nodes, so that works # Now ask the DHT for the address of the first peer that was returned $ ipfs dht findpeer QmYebHWdWStasXWZQiXuFacckKC33HTbicXPkdSi5Yfpz6 /ip4/22.214.171.124/tcp/4001 /ip4/126.96.36.199/tcp/38689 /ip4/127.0.0.1/tcp/4001 /ip4/192.168.1.30/tcp/4001 /ip6/::1/tcp/4001 /ip4/192.168.1.3/tcp/4001 /ip4/188.8.131.52/tcp/40443 # Oooh nice! We will visit this address notation in the next episode.
All in all, we saw a bit of the DHT in action, but not as intuitive and functional as I hoped for.
Time to move on to the next exciting technology applied by IPFS:
How does BitTorrent work?
We all know about BitTorrent, but some of us (yeah, me too) need to dig deeper to really get it. This presentation is a great introduction: Feross Aboukhadijeh: WebTorrent - JSConf.Asia 2014. The speaker talks about how he implemented a BitTorrent client that can run in the browser, thanks to WebRTC (which is a cool technology that might come in handy later on for our project). It also explains the working of the DHT again to drive it home.
More to read or watch:
- Peer-to-peer networking with BitTorrent (PDF). If you read only one article, make it this one.
- Peer to Peer Content Delivery - BitTorrent (video) + the slides (PDF).
- The BitTorrent Protocol Specification v2 (the spec, not hard to read).
BitTorrent and IPFS
The exchange of data (blocks) in IPFS is inspired by BitTorrent but is not 100% BitTorrent. The white paper mentions two BitTorrent features that IPFS uses:
- tit-for-tat strategy (if you don't share, you won't receive either)
- get rare pieces first (improves performance and more, see the first PDF above)
A notable difference is that where in BitTorrent each file has a separate swarm of peers (forming a P2P network with each other) where IPFS is one big swarm of peers for all data. The IPFS BitTorrent variety is called BitSwap and I'll go over that in the next episode.
Let's try if we can see something of the swarm in action.
# Make sure you have the daemon running $ ipfs swarm peers /ip4/184.108.40.206/tcp/4001/ipfs/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ /ip4/220.127.116.11/tcp/4001/ipfs/QmSoLju6m7xTh3DuokvT3886QRYqxAzb1kShaanJgW36yx /ip4/18.104.22.168/tcp/4001/ipfs/QmSoLSafTMBsPKadTEgaXctDQVcqN88CNLHXMkTNwMKPnu /ip4/22.214.171.124/tcp/4001/ipfs/QmSoLMeWqB7YGVLJN3pNLQpmmEk35v6wYtsMGLzSr5QBU3
When you just start the daemon you get connected to a couple of "seed peers". After a while the number of peers grows rapidly.
Another command to query the swarm:
$ ipfs swarm addrs QmNRuQrwtGgeVQgndTny4A4Rukg7GR7vzDJrVJxUBfqevk (4) /ip4/127.0.0.1/tcp/4001 /ip4/172.31.41.39/tcp/4001 /ip4/126.96.36.199/tcp/4001 /ip6/::1/tcp/4001 QmNSJohkvvBVmHeN7KUZnm4X84GA6Znbv6ZmvsTAjbw3AB (5) /ip4/10.0.1.8/tcp/4001 /ip4/127.0.0.1/tcp/4001 /ip4/188.8.131.52/tcp/16012 /ip6/::1/tcp/4001 /ip6/fd15:462e:f8fd:695e:9161:27dd:7f78:d242/tcp/4001 QmNTJyhCYcbv5GdnqtdEwTfJCgY6pV4PJiNTtAmuoxnQak (3) /ip4/127.0.0.1/tcp/4001 /ip4/184.108.40.206/tcp/4001 /ip6/::1/tcp/4001 QmNTZy7TfXvsHczwwV3EYbxRZN5gthgicG9GiroD7C4ZrP (4) /ip4/127.0.0.1/tcp/4001 /ip4/172.20.255.127/tcp/4001 /ip4/220.127.116.11/tcp/4001 /ip6/::1/tcp/4001 . . .
These are the addresses in the swarm the node knows of, the hash on top is the peerId.
With this information, you can connect to a peer like
$ ipfs swarm connect /ip4/18.104.22.168/tcp/34746/ipfs/QmfTgdg6GkqJtUrWAYo69GjcLrjQq9LjTjgW3KZ1ux1X6U connect QmfTgdg6GkqJtUrWAYo69GjcLrjQq9LjTjgW3KZ1ux1X6U success
More about BitSwap later...
Let's take a break
Pfew, that was a lot of theory and dry material, and we are not even halfway! So to let it all sink in and to free up the brain for the rest go for a walk in the park and listen to this podcast: The Quiet Master of Cryptocurrency — Nick Szabo. It goes all over the place, but I found it very interesting. It zooms out from bit level to things like "what is money?".
Highly recommended listening, and when you're done...
Version Control Systems - Git
The white paper's entire section on Version Control Systems is reproduced here:
Version Control Systems provide facilities to model files changing over time and distribute different versions efficiently.
The popular version control system Git provides a powerful Merkle DAG object model that captures changes to a filesystem tree in a distributed-friendly way.
- Immutable objects represent Files (blob), Directories (tree), and Changes (commit).
- Objects are content-addressed, by the cryptographic hash of their contents.
- Links to other objects are embedded, forming a Merkle DAG. This provides many useful integrity and workflow properties.
- Most versioning metadata (branches, tags, etc.) are simply pointer references, and thus inexpensive to create and update.
- Version changes only update references or add objects.
- Distributing version changes to other users is simply transferring objects and updating remote references.
Let's step through these lines one by one to understand it:
Ad 1. Immutable objects represent Files (blob), Directories (tree), and Changes (commit).
Git only adds data. So blobs, trees, and commits are immutable. What the last version of either of these is, is determined by special references (see point 4 and 5).
Ad 2. Objects are content-addressed, by the cryptographic hash of their contents.
In git, the file or directory name is not used when they are referenced. Git hashes the content (or listing or commit) with SHA1 and uses these hashes in its database. This article makes it very insightful how that works: Git under the hood (Start reading at "Now let’s look at how Git does all this")
Ad 3. Links to other objects are embedded, forming a Merkle DAG. This provides many useful integrity and workflow properties.
Oh my, a Merkle DAG! No worries, it is actually not as scary as it sounds.
A Merkle tree is a binary tree where the parent contains the hash of the concatenation of the hashes of the two children. This explains the integrity property: any change in a data block results in a change of the root node. With just a little bit of meta-data (uncles and parents, which can be untrusted) and a trusted root node, we can verify the validity of the block.
In short: the Merkle DAG is more general as it is not a binary tree but a graph, and any node can contain data, not just the leaf nodes as in the Merkle Tree.
It's still all a bit vague, but I'll address it again when we have a look at the IPFS Merkle DAG.
Ad 4. and 5 Most versioning metadata (branches, tags, etc.) are simply pointer references, and thus inexpensive to create and update. Version changes only update references or add objects.
This is visualized here: Git for Computer Scientists where the branch, the HEAD, and tags are simply references to commits.
Ad 6. Distributing version changes to other users is simply transferring objects and updating remote references.
What it says here, no need to explain I think :)
Self-Certified Filesystems - SFS
This is used to implement the IPNS name system for IPFS. It allows us to generate an address for a remote filesystem, where the user can verify the validity of the address.
From the white paper:
SFS introduced a technique for building Self-Certified Filesystems: addressing remote filesystems using the following scheme/sfs/<Location>:<HostID>
where Location is the server network address, and:HostID = hash(public_key || Location)
Thus the name of an SFS file system certifies its server.
I think that speaks for itself, we will see it in action in the next episode: Understanding the IPFS White Paper part 2
Let me know what you think of this post by tweeting to me @pors!