CS 3410: Distributed Systems

Spring 2019 | Paper (due Wednesday) | Topic | Project
Jan 7–11 | 1. Google File System | the internet | Go tour
Jan 14–18 | 2. Bigtable | | Go, RPC
Jan 21–25 (MLK Day) | 3. Case study: Google | peer-to-peer | MUD: command interpreter
Jan 28–Feb 1 | 4. Chord | concurrency | MUD: single user
Feb 4–8 | 5. Dynamo | | MUD: multi user
Feb 11–15 | 6. Borg | | Chord: linked list ring
Feb 18–22 (Presidents’ Day) | 7. Case study: Facebook | containers | Chord: finger tables
Feb 25–Mar 1 | 8. Paxos | | Chord: fault tolerance
Mar 4–8 | 9. Chubby | consensus | Paxos: shell, RPC server, etc.
Mar 11–15 (Spring break) | | |
Mar 18–22 | 10. Megastore | | Paxos: all but propose
Mar 25–29 | 11. MapReduce | | Paxos
Apr 1–5 | 12. Case study: Twitter | databases |
Apr 8–12 | 13. Spanner | |
Apr 15–19 | | |
Apr 22–26 (Wednesday last day) | | | MapReduce

Resources


Projects

Code to discover your own IP address. This does not work in all cases, but it is a useful starting point:


Papers

  1. The Google File System
  2. Bigtable: A Distributed Storage System for Structured Data
    • What does data look like from the client’s perspective, i.e., what are rows, column families, and columns?
    • How are responsibilities shared between master, tablet servers, and clients? What happens when each of these components fails and/or restarts?
    • How does Bigtable support small, random reads and writes when the data is stored in GFS, which is optimized for large, sequential reads and writes? Consider the interaction of SSTables (with compactions), memtables, and redo logs.
    • How is data stored on disk and in memory for efficient access? Consider the sequence of events for each of these cases:
      • Random data reads
      • Sequential data reads
      • Random data writes
      • Sequential data writes
    • What are a group commit, a Bloom filter, and a commit log?
  3. Case study: Google
  4. Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
  5. Dynamo: Amazon’s Highly-available Key-value Store
  6. Large-scale cluster management at Google with Borg
  7. Case study: Facebook
  8. Paxos
  9. The Chubby lock service for loosely-coupled distributed systems
  10. Megastore: Providing Scalable, Highly Available Storage for Interactive Services
  11. MapReduce: Simplified Data Processing on Large Clusters
  12. Case study: Twitter
  13. Spanner: Google’s Globally-Distributed Database

Presentations

Last Updated 04/21/2019