CS 3410: Distributed Systems

Spring 2018 | Paper (due Thursday) | Topic | Project | Presentations
Jan 8–12 | 1. Google File System | the internet | Go tour |
Jan 15–19 (MLK Day) | 2. Bigtable | Go, RPC | |
Jan 22–26 | 3. Case study: Google | peer-to-peer | Go exercises |
Jan 29–Feb 2 | 4. Chord | concurrency | Chat service |
Feb 5–9 | 5. Dynamo | | Chord: linked list ring |
Feb 12–16 | 6. Borg | | Chord: finger tables |
Feb 19–23 (Presidents’ Day) | 7. Case study: Facebook | containers | Chord: fault tolerance |
Feb 26–Mar 2 | 8. Paxos | | |
Mar 5–9 | 9. Chubby | consensus | Paxos: shell, RPC server, etc. | Kademlia, Session Guarantees
Mar 12–16 (Spring break) | | | |
Mar 19–23 | 10. Megastore | | Paxos: all but propose | Byzantine Generals, CRUSH
Mar 26–30 | 11. MapReduce | | Paxos | —, Snapshots
Apr 2–6 | 12. Case study: Twitter | databases | | Petal, CAP
Apr 9–13 | 13. Spanner | | | Clocks, FLP
Apr 16–20 | | | |
Apr 23–27 (Wednesday last day) | | | MapReduce |

Resources


Projects

Code to discover your own IP address. This does not work in all cases, but it is a useful starting point:
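
A minimal sketch of one common approach, offered as an assumption rather than as the course's own code: walk the machine's network interfaces, skip any that are down or loopback, and return the first IPv4 address found. The names `getLocalAddress` and `firstIPv4` are illustrative. This can pick the wrong address on hosts with several active interfaces (VPNs, container bridges), and behind NAT it yields a private address rather than a publicly reachable one.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// firstIPv4 returns the first non-loopback IPv4 address in addrs.
// It is a hypothetical helper, split out so the selection rule is easy to read.
func firstIPv4(addrs []net.Addr) (string, bool) {
	for _, addr := range addrs {
		// Interface addresses normally arrive as *net.IPNet (IP plus mask).
		ipnet, ok := addr.(*net.IPNet)
		if !ok || ipnet.IP.IsLoopback() {
			continue
		}
		// To4 returns nil for addresses that are not IPv4.
		if ip4 := ipnet.IP.To4(); ip4 != nil {
			return ip4.String(), true
		}
	}
	return "", false
}

// getLocalAddress scans interfaces that are up and not loopback and
// returns the first IPv4 address it finds on any of them.
func getLocalAddress() (string, error) {
	ifaces, err := net.Interfaces()
	if err != nil {
		return "", err
	}
	for _, iface := range ifaces {
		if iface.Flags&net.FlagUp == 0 || iface.Flags&net.FlagLoopback != 0 {
			continue
		}
		addrs, err := iface.Addrs()
		if err != nil {
			continue // skip interfaces whose addresses cannot be read
		}
		if ip, ok := firstIPv4(addrs); ok {
			return ip, nil
		}
	}
	return "", errors.New("no non-loopback IPv4 address found")
}

func main() {
	addr, err := getLocalAddress()
	if err != nil {
		fmt.Println("could not determine local address:", err)
		return
	}
	fmt.Println("local address:", addr)
}
```

If the result is wrong for a given machine, a reasonable fallback is to let the user supply the address explicitly, for example through a command-line flag.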


Papers

  1. The Google File System
  2. Bigtable: A Distributed Storage System for Structured Data
    • What does data look like from the client’s perspective, i.e., what are rows, column families, and columns?
    • How are responsibilities shared between master, tablet servers, and clients? What happens when each of these components fails and/or restarts?
    • How does Bigtable support small, random reads and writes when the data is stored in GFS, which is optimized for large, sequential reads and writes? Consider the interaction of SSTables (with compactions), memtables, and redo logs.
    • How is data stored on disk and in memory for efficient access? Consider the sequence of events for each of these cases:
      • Random data reads
      • Sequential data reads
      • Random data writes
      • Sequential data writes
    • What are a group commit, a Bloom filter, and a commit log?
  3. Case study: Google
  4. Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
  5. Dynamo: Amazon’s Highly-available Key-value Store
  6. Large-scale cluster management at Google with Borg
  7. Case study: Facebook
  8. Paxos
  9. The Chubby lock service for loosely-coupled distributed systems
  10. Megastore: Providing Scalable, Highly Available Storage for Interactive Services
  11. MapReduce: Simplified Data Processing on Large Clusters
  12. Case study: Twitter
  13. Spanner: Google’s Globally-Distributed Database
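
As a starting point for the Bigtable question about Bloom filters, the following is a minimal sketch in Go under stated assumptions, not the design described in the paper. The type `Bloom`, its methods, and the choice of FNV hashing with double hashing to derive bit positions are all illustrative. The property to notice is that a lookup can return a false positive but never a false negative, which is what lets a reader skip files that definitely do not contain a key.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Bloom is a fixed-size Bloom filter: m bits and k probe positions per key.
type Bloom struct {
	bits []bool
	k    int
}

// NewBloom creates an empty filter with m bits and k probes per key.
func NewBloom(m, k int) *Bloom {
	return &Bloom{bits: make([]bool, m), k: k}
}

// positions derives k bit indexes for key from two FNV hashes
// (double hashing: h1 + i*h2). This scheme is an assumption of the sketch.
func (b *Bloom) positions(key string) []int {
	h1 := fnv.New64a()
	h1.Write([]byte(key))
	a := h1.Sum64()

	h2 := fnv.New64()
	h2.Write([]byte(key))
	c := h2.Sum64() | 1 // force an odd step so probes do not all coincide

	out := make([]int, b.k)
	for i := range out {
		out[i] = int((a + uint64(i)*c) % uint64(len(b.bits)))
	}
	return out
}

// Add sets every bit that key maps to.
func (b *Bloom) Add(key string) {
	for _, p := range b.positions(key) {
		b.bits[p] = true
	}
}

// MayContain reports false only if key was definitely never added.
// A true result means "possibly present" and may be a false positive.
func (b *Bloom) MayContain(key string) bool {
	for _, p := range b.positions(key) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	f := NewBloom(1024, 3)
	f.Add("row:alice")
	fmt.Println(f.MayContain("row:alice")) // true: added keys are always found
}
```

Larger bit arrays lower the false-positive rate at the cost of memory, which is the trade-off to weigh when answering the question.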

Presentations

First presentation:

Final presentation:

Last Updated 04/23/2018