Project 3: Paxos

Introduction

In this assignment, you will implement a simple in-memory database that is replicated across multiple hosts using the Paxos distributed consensis protocol.

You can download my 64-bit Linux implementing to play with:

Save the file as paxos-sol and make it executable:

chmod 755 paxos-sol

The launch it using:

./paxos-sol 3410 3411 3412

This will launch a single instance that will listen on port 3410 (the first one listed) and will expect to find additional instances listening on ports 3411 and 3412, respectively. You can then launch additional instances:

./paxos-sol 3411 3410 3412
./paxos-sol 3412 3410 3411

You can also launch the instances on different machines by supplying IPv4 addresses and port numbers in the format 1.2.3.4:3410

Requirements

Your system must implement the services to support Paxos, and it must support a simple command-line user interface. Each instance will run an RPC server and act as an RPC client. No background tasks will be necessary.

User interface

When running, and instance of your database should support a simple command-line interface. The user can issue any of the following commands:

The main commands you will use are those related to finding and setting key/value pairs in the database. A <key> is any sequence of one or more non-space characters, as is a value.

In addition, another operation is useful mainly for debugging:

The executable also supports a couple of command-line options:

Replicas

Each instance participating in the system is called a replica. All of the replicas together constitute a cell. Each replica tracks a series of slots. Each slot represents a single command that should be applied to the database (a put, get, or delete operation). The goal of the system is to ensure that every replica agrees on the same operation in each slot, and applies them to the database in the same order. Since the database starts out empty and the operations are all deterministic, this ensures that each replica has the exact same database when the same number of slots have been applied.

Because of asynchronous messages, a replica may discover the decision for slot six before it has discovered the decision for slot five. If this is the case, it must wait until it learns the decision for slot five and apply it before applying the decision from slot six.

Note that get operations are synchronized just like put and delete operations, so the cell must agree on the point at which a get operation actually reads the database.

For simplicity, each replica should track a complete history if slots over its entire run. It should never garbage collect old slots. It is up to you to manage the process of applying operations to the database. Note that a simple Go map will be sufficient for holding the database itself.

Paxos

Each replica will implement all necessary roles in the Paxos algorithm.

Acceptor role

The simplest ones to describe and understand are those associated with the acceptor role. Each of these operations should be applied to a specific slot number, which should be part of the request data. As mentioned earlier, it is possible that the protocol will be working on multiple slots concurrently.

The sequence numbers (called seq above) should be pairs of values, each consisting of a number and an address. I recommend writing methods on this type to compare two sequence numbers, and also to convert them to strings for printing, e.g.:

func (elt Seq) String() string { ... }

Any type that has such a method can be printed easily using the various log and fmt commands.

For comparisons, define something like:

func (this Seq) Cmp(that Seq) int { ... }

where Cmp returns <0 if this<that, >0 if this>that, and 0 if they are equal.

Learner role

The learner role is even simpler:

Applying commands is generally straightforward. I recommend storing commands using a struct that tracks:

Methods on these command objects to convert them to strings and compare them to each other (if the address and tag match, they should be considered equal) are also useful.

When a user enters a command from the shell interface, the command may not be executed until later, since it must wait until the command is decided upon in some slot. That decision comes in the form of a decide message, so you must have some way to communicate that it succeeded back to the user. I suggest doing the following:

  1. When you issue a command, assign it a randomly-generated tag value to track it.

  2. Create a string channel with capacity 1 where the response to the command can be communicated back to the shell code that issued the command.

  3. Store the channel in a map associated with the entire replica. It should map the address and tag number (combine these into a string) to the channel.

  4. Whenever a decision is applied, check if the command has a channel waiting for it by generating the same address/tag key and looking it up. If found, send the result across the channel, then remove the channel from the map and throw it away. If not channel is found, do nothing; the command was proposed on a different replica.

Proposer role

The proposer is a little more complicated. It is given a command to execute, and it must repeatedly propose it (in progressively higher slot numbers) until the command is decided upon by the cell.

It always tries to propose in the first undecided slot that it knows of. Note that it may be out-of-sync; the slot may already be decided, but this replica may now know about it.

It starts a single round of Paxos by picking a value n. If it has already attempted to propose something in this slot, or the acceptor handlers on the same replica have already received prepare and/or accept requests for this slot, it will know of a higher value of n than one. Otherwise it can start with one (combined with its address).

Each time you start over, check to see if the slot has been decided. While you were running through the steps, a decision may have been reached and you may have been notified about it. In this case, you must move on to the next undecided slot before you start over. Or, if the value decided (while you slept) is the value you were trying to propose, you should not re-propose it in another slot.

Testing

Here are a few suggestions of scenarios to test and what to expect.