Distributed hash table (DHT) in IPFS

The English passages are taken from the official documentation; the passage that follows each one is a translation based on my own understanding.

Distributed hash tables (DHTs)

To find which peers are hosting the content you're after (discovery), IPFS uses a distributed hash table, or DHT. A hash table is a database of keys to values. A distributed hash table is one where the table is split across all the peers in a distributed network. To find content, you ask these peers.

To find out which nodes are hosting the content you want (a process called discovery), IPFS uses a distributed hash table (DHT). A hash table is a database that maps keys to values. A distributed hash table is one whose table is split across all the nodes in a distributed network. To find content, you ask these nodes.

Once you know where your content is (or more precisely, which peers are storing each of the blocks that make up the content you're after), you use the DHT again to find the current location of those peers (routing). So, in order to get to content, you use libp2p to query the DHT twice.

Once you know where your content is (or, more precisely, which nodes are storing each of the blocks that make up the content you want), you use the DHT again to find the current location of those nodes (a process called routing). So, to get to the content, you use libp2p to query the DHT twice.
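
The two queries can be pictured with a toy model. This is only a sketch: both tables are ordinary Python dictionaries rather than a real DHT, and the names providers, peer_addresses and locate, as well as the keys and addresses, are made up for illustration and are not part of libp2p.

```python
# Toy model of the two DHT queries described above.
# In a real DHT both tables are spread across many peers.

# Query 1 (discovery): content key -> IDs of the peers storing that block.
providers = {
    "block-key-1": ["peer-A", "peer-B"],
}

# Query 2 (routing): peer ID -> current network address of that peer.
peer_addresses = {
    "peer-A": "203.0.113.5:4001",
    "peer-B": "198.51.100.7:4001",
}


def locate(content_key):
    """Return (peer_id, address) pairs for the peers holding content_key."""
    result = []
    for peer_id in providers.get(content_key, []):   # first query
        address = peer_addresses.get(peer_id)        # second query
        if address is not None:
            result.append((peer_id, address))
    return result


print(locate("block-key-1"))
```

A key with no known provider simply yields an empty list.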

Distributed Hash Tables (DHTs) are distributed key-value stores where keys are cryptographic hashes.
DHTs are, by definition, distributed. Each "peer" (or "node") is responsible for a subset of the DHT.
When a peer receives a request, it either answers it, or the request is passed to another peer until a peer that can answer it is found.

Distributed hash tables (DHTs) are distributed key-value stores in which the keys are cryptographic hashes.
By definition, DHTs are distributed. Each node is responsible for a subset of the DHT.
When a node receives a request, it either answers it or passes it on to another node, until a node that can answer it is found.

Depending on the implementation, a request not answered by the first node contacted can be:

  • forwarded from peer to peer, with the last peer contacting the requesting peer

  • forwarded from peer to peer, with the answer forwarded following the same path

  • answered with the contact information of a node that has better chances to be able to answer. IPFS uses this strategy.

Depending on the implementation, a request that is not answered by the first node contacted can be:

  • forwarded from node to node, with the last node contacting the requesting node
  • forwarded from node to node, with the answer forwarded back along the same path
  • answered with the contact information of a node that is more likely to be able to answer. IPFS uses this strategy.

DHTs' decentralization provides advantages compared to a traditional key-value store, including:

  • scalability, since a request for a hash of length n takes at most log2(n) steps to resolve.

  • fault tolerance via redundancy, so that lookups are possible even if peers unexpectedly leave or join the DHT. Additionally, requests can be addressed to any peer if another peer is slow or unavailable.

  • load balancing, since requests are made to different nodes and no unique peers process all the requests.

Compared with a traditional key-value store, the decentralization of DHTs provides the following advantages:

  • Scalability, because a request for a hash of length n takes at most log2(n) steps to resolve.
  • Fault tolerance through redundancy, so that lookups remain possible even if nodes unexpectedly leave or join the DHT. In addition, a request can be sent to any other node if one node is slow or unavailable.
  • Load balancing, because requests are made to different nodes and no single node handles all the requests.

Peer IDs

Each peer has a peerID, which is a hash with the same length n as the DHT keys.

Each node has an ID, called peerID, which is a hash with the same length as the key in DHT.

Buckets

A subset of the DHT maintained by a peer is called a 'bucket'.
A bucket maps to hashes with the same prefix as the peerID, up to m bits. There are 2^m buckets. Each bucket maps to 2^(n-m) hashes.

The subset of the DHT maintained by a node is called a bucket.
A bucket maps to the hashes that share the same prefix as the peerID, up to m bits. There are 2^m buckets in total, and each bucket maps to 2^(n-m) hashes.

For example, if m=2^16 and we use hexadecimal encoding (four bits per displayed character), the peer with peerID 'ABCDEF12345' maintains mapping for hashes starting with 'ABCD'.
Some hashes falling into this bucket would be ABCD38E56, ABCD09CBA or ABCD17ABB, just as examples.

Note: I think the official text is wrong here. m=2^16 should be m=16, since a four-character hexadecimal prefix such as 'ABCD' is 16 bits.

For example, if m=2^16 and we use hexadecimal encoding (four bits per displayed character), a node with the peerID 'ABCDEF12345' maintains the mapping for hashes starting with 'ABCD'.
Some of the hashes that fall into this bucket would be ABCD38E56, ABCD09CBA, or ABCD17ABB.
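
The bucket example can be checked with a short sketch. It assumes m = 16 bits, which is what the four-character prefix 'ABCD' implies, and it treats IDs as hexadecimal strings. The helper names bucket_prefix, in_bucket and bucket_counts are hypothetical, chosen only for this illustration.

```python
def bucket_prefix(hex_id, m_bits=16):
    """Return the hex prefix that covers the first m_bits of hex_id."""
    if m_bits % 4 != 0:
        raise ValueError("m_bits must be a multiple of 4 for a hex prefix")
    return hex_id[: m_bits // 4]   # one hex character encodes 4 bits


def in_bucket(peer_id, key, m_bits=16):
    """True if key shares its first m_bits with peer_id."""
    return bucket_prefix(peer_id, m_bits) == bucket_prefix(key, m_bits)


def bucket_counts(n_bits, m_bits):
    """Return (number of buckets, hashes per bucket) for n-bit hashes."""
    return 2 ** m_bits, 2 ** (n_bits - m_bits)


peer_id = "ABCDEF12345"
for key in ("ABCD38E56", "ABCD09CBA", "ABCD17ABB", "ABCE00000"):
    print(key, in_bucket(peer_id, key))
```

The first three keys share the prefix 'ABCD' with the peerID and so fall into its bucket, while 'ABCE00000' does not.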

The size of a bucket is related to the size of the prefix. The longer the prefix, the fewer hashes each peer has to manage, and the more peers are needed.
Several peers can be in charge of the same bucket if they have the same prefix.
In most DHTs, including IPFS's Kademlia implementation, the size of the buckets (and the size of the prefix), are dynamic.

The size of a bucket depends on the size of the prefix. The longer the prefix, the fewer hashes each node has to manage, and the more nodes are needed.
Several nodes can be responsible for the same bucket if they have the same prefix.
In most DHTs, including IPFS's Kademlia implementation, the bucket size (and the prefix size) is dynamic.

Peer lists

Peers also keep a connection to other peers in order to forward requests if the requested hash is not in their own bucket.

The node maintains connections with other nodes to forward requests whose hash value is not in its own bucket.

If hashes are of length n, a peer will keep n-1 lists of peers:

  • the first list contains peers whose IDs have a different first bit.
  • the second list contains peers whose IDs have first bits identical to its own, but a different second bit
    ...
  • the m-th list contains peers whose IDs have their first m-1 bits identical, but a different m-th bit
    ...

If hashes are of length n, a node keeps n-1 lists of nodes:

  • The first list contains nodes whose IDs have a different first bit
  • The second list contains nodes whose IDs have the same first bit as its own, but a different second bit
    ...
  • The m-th list contains nodes whose IDs have the same first m-1 bits as its own, but a different m-th bit
    ...
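
Which list another node belongs to is determined by the position of the first bit in which the two IDs differ. The sketch below assumes IDs are plain integers of n_bits bits; list_index is a hypothetical helper, not a libp2p function.

```python
def list_index(own_id, other_id, n_bits):
    """Return the 1-based index of the list that other_id falls into.

    The index is the position (counted from the most significant bit)
    of the first bit in which the two IDs differ.
    """
    diff = own_id ^ other_id          # bits set where the IDs differ
    if diff == 0:
        raise ValueError("the two IDs are identical")
    return n_bits - diff.bit_length() + 1


own = 0b10110000
print(list_index(own, 0b00110000, 8))  # first bit differs
print(list_index(own, 0b11110000, 8))  # first bit equal, second differs
print(list_index(own, 0b10100000, 8))  # first three bits equal, fourth differs
```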

The higher m is, the harder it is to find peers that have the same ID up to m bits. The lists of "closest" peers typically remain empty.
"Close" here is defined by the XOR distance, so the longer the prefix two IDs share, the closer they are.
Lists also have a maximum number of entries (k); otherwise the first list would contain half the network, the second a fourth of the network, and so on.

The higher m is, the harder it is to find nodes whose IDs are the same up to m bits. The lists of the "closest" nodes usually remain empty.
"Close" here is defined by the XOR distance, so the longer the prefix two IDs share, the closer they are.
Each list is also limited to at most k entries; otherwise the first list would contain half of the network's nodes, the second a quarter, and so on.
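
The relation between shared prefix and XOR distance, and the cap of k entries per list, can be sketched as follows. IDs are assumed to be plain integers, and the PeerList class is a deliberate simplification: it just refuses new entries once it is full, which is not necessarily what a real Kademlia implementation does when a list fills up.

```python
def xor_distance(a, b):
    """XOR distance between two IDs, read as an unsigned integer."""
    return a ^ b


def common_prefix_len(a, b, n_bits):
    """Number of leading bits that two n_bits-wide IDs share."""
    return n_bits - (a ^ b).bit_length()


class PeerList:
    """A peer list holding at most k entries (simplified policy)."""

    def __init__(self, k):
        self.k = k
        self.peers = []

    def add(self, peer_id):
        """Add peer_id; return False if it is a duplicate or the list is full."""
        if peer_id in self.peers or len(self.peers) >= self.k:
            return False
        self.peers.append(peer_id)
        return True


own = 0b10110000
near = 0b10110111   # shares 5 leading bits with own
far = 0b10000000    # shares 2 leading bits with own
print(common_prefix_len(own, near, 8), xor_distance(own, near))
print(common_prefix_len(own, far, 8), xor_distance(own, far))
```

The ID with the longer shared prefix has the smaller XOR distance, which is the sense of "closer" used above.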

How to use DHTs

When a peer receives a lookup request, it will either answer with a value if it falls into its own bucket, or answer with the contacting information (IP+port, peerID, etc.) of a closer peer. The requesting peer can then send its request to this closer peer. The process goes on until a peer is able to answer it.
A request for a hash of length n will take at most log2(n) steps, or even log_(2^m)(n) steps.

When a node receives a lookup request, it either answers with the value, if the hash falls into its own bucket, or answers with the contact information (IP + port, peerID, etc.) of a closer node. The requesting node can then send its request to this closer node. The process continues until a node is able to answer.
A request for a hash of length n takes at most log2(n) steps, or even log_(2^m)(n) steps.
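
The lookup procedure can be sketched as a small simulation. This is a toy model under stated assumptions, not the IPFS implementation: IDs and keys are small integers, each node refers the requester to the closest node it knows by XOR distance, and the Peer class and iterative_lookup function are names invented for the example.

```python
class Peer:
    def __init__(self, peer_id):
        self.peer_id = peer_id
        self.store = {}    # key -> value held by this peer
        self.known = []    # other Peer objects this peer can refer to

    def handle_lookup(self, key):
        """Answer with the value, a closer peer, or nothing at all."""
        if key in self.store:
            return ("value", self.store[key])
        own_distance = self.peer_id ^ key
        closer = [p for p in self.known if (p.peer_id ^ key) < own_distance]
        if not closer:
            return ("none", None)
        return ("closer", min(closer, key=lambda p: p.peer_id ^ key))


def iterative_lookup(start, key, max_steps=32):
    """Follow referrals from start until a peer answers; return (value, path)."""
    peer = start
    path = [peer.peer_id]
    for _ in range(max_steps):
        kind, payload = peer.handle_lookup(key)
        if kind == "value":
            return payload, path
        if kind == "none":
            return None, path
        peer = payload               # ask the closer peer next
        path.append(peer.peer_id)
    return None, path


a, b, c = Peer(0b0001), Peer(0b1000), Peer(0b1101)
a.known, b.known = [b], [c]
c.store[0b1100] = "some value"
print(iterative_lookup(a, 0b1100))
```

Here the requester asks a, is referred to b, then to c, which holds the value. Every referral strictly reduces the XOR distance to the key, so the loop cannot cycle.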

Keys and Hashes

In IPFS's Kademlia DHT, keys are SHA256 hashes. PeerIDs are those of libp2p, the networking library used by IPFS.

We use the DHT to look up two types of objects, both represented by SHA256 hashes:

  • Content IDs of the data added to IPFS. A lookup of this value will give the peerIDs of the peers having this immutable content.
  • IPNS records. A lookup will give the last Content ID associated with this IPNS address, enabling the routing of mutable content.

We use the DHT to look up two types of objects, both represented by SHA256 hashes:

  • Content IDs of the data added to IPFS. A lookup of this value gives the peerIDs of the nodes that have this immutable content.
  • IPNS records. A lookup gives the latest Content ID associated with this IPNS address, enabling the routing of mutable content.
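
Since both kinds of keys are SHA256 hashes, they live in the same 256-bit space and can be compared by XOR distance. The sketch below uses Python's standard hashlib for this. The input byte strings are placeholders: the text does not say exactly which bytes IPFS hashes for a Content ID, an IPNS record or a peerID, so that part is an assumption, and dht_key and closest_peers are hypothetical helper names.

```python
import hashlib


def dht_key(data: bytes) -> int:
    """Map arbitrary bytes to a 256-bit integer key using SHA256."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def closest_peers(content: bytes, peers, count=2):
    """Return the count peers whose keys are closest (by XOR) to the content key."""
    key = dht_key(content)
    return sorted(peers, key=lambda p: dht_key(p) ^ key)[:count]


peers = [b"peer-A", b"peer-B", b"peer-C", b"peer-D"]   # placeholder peer names
print(hex(dht_key(b"hello ipfs")))
print(closest_peers(b"hello ipfs", peers))
```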

Consequently, IPFS's DHT is one of the ways to achieve mutable and immutable content routing. It's currently the only one implemented.

Therefore, IPFS's DHT is one way to achieve routing of both mutable and immutable content, and it is currently the only one implemented.


Tags: network Database encoding

Posted on Fri, 13 Mar 2020 01:18:20 -0400 by gman-03