Saturday, November 9, 2013

I saw Tristan Slominski speak at JavaScript Austin on Kademlia DHT.

This talk was technically a talk on approaching distributed systems in the JavaScript space, with Tristan Slominski making his way through a presentation touching on the Lambda Architecture of Nathan Marz, indeed.com's Boxcar, and of course the fallacies of distributed computing, which are:

  1. the network is reliable
  2. latency isn't a problem
  3. bandwidth isn't a problem
  4. the network is secure
  5. topology won't change
  6. the administrator will know what to do
  7. transport cost isn't a problem
  8. the network is homogeneous
  9. the system is atomic/monolithic
  10. the system is finished
  11. business logic can and should be centralized

 
 

...but the reality was that while we skimmed other topics, the evening focused on a heavy geek out on Kademlia DHT (distributed hash tables) and how the nodes within a DHT find each other and speak with each other. I'm not complaining though. It was very interesting to learn how the technology driving BitTorrent works under the hood.

There is an exponential/logarithmic relationship between the number of nodes a system can support and the number of neighboring nodes that any one node needs to know about: the network can grow exponentially while each node's list of neighbors grows only logarithmically. The variable k represents the number of comrades that any one participant keeps data on, and since this is a setting you control in configuration, you really need to understand how DHTs work when using them.

Each node stores a snapshot of the data for its "friends." The snapshot may be old, however. Here is how updating data works: a vector clock number gets incremented on a node each time that node is updated. When another node polls to make sure that its cache of what it thinks its friends have is up to date, it compares the snapshot of the vector clock it holds for a friend against the friend's actual vector clock, which may be greater. If it is greater, then the data cache at the polling node is updated. This concept is called gossip. Scuttlebutt reconciliation is the specific implementation of gossip here.

Merkle trees, a way to nest hashes within hashes, were mentioned too, as was phi accrual, which is used for failure detection.
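To make the polling step concrete, here is a minimal sketch of how I picture it, not code from the talk. It assumes a single version counter per node (a simplification of the vector clock described above), and the names `GossipNode`, `set`, and `poll` are hypothetical ones I made up for illustration.

```javascript
// Hypothetical sketch of version-based gossip between nodes.
// Assumption: each node has one version counter that only ever increases.
class GossipNode {
  constructor(id) {
    this.id = id;
    this.version = 0;         // incremented on every local update
    this.data = {};           // this node's own key/value state
    this.friends = new Map(); // friend id -> { version, data } snapshot
  }

  // Local write: change the state and increment the version counter.
  set(key, value) {
    this.data[key] = value;
    this.version += 1;
  }

  // Poll a friend: refresh the cached snapshot only when the friend's
  // actual version is greater than the version stored in the cache.
  // Returns true if the cache was updated, false if it was already current.
  poll(friend) {
    const cached = this.friends.get(friend.id);
    if (cached && cached.version >= friend.version) {
      return false;
    }
    this.friends.set(friend.id, {
      version: friend.version,
      data: { ...friend.data }, // copy, so the snapshot can go stale
    });
    return true;
  }
}

const a = new GossipNode('a');
const b = new GossipNode('b');

b.set('color', 'blue');
const firstPoll = a.poll(b);  // no snapshot yet, so the cache is filled
const secondPoll = a.poll(b); // versions match, so nothing is copied
b.set('color', 'green');
const thirdPoll = a.poll(b);  // b's version is greater, so the cache updates
```

The sketch copies the friend's whole state on every update to keep things short. As I understand it, a real implementation would more likely exchange only the changes newer than the version the polling node already has.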
