Log repair #4

Merged: 7 commits merged into master on Apr 21, 2014
Conversation

quux00 (Collaborator) commented Apr 19, 2014

I now have the leader_loop refactored sufficiently that I'm ready to start on the log repair back-and-forth conversation between leader and follower.

A few important changes so far:

  1. "c" and "p" channels pulled out of Server struct as I describe below
  2. leader_loop now logs a client cmd only after majority-1 peers have already logged it; I think this is still consistent with the Raft spec and allows avoiding having the leader truncate its logs. Since one of the rules is "the leader only ever appends" to its log, this looks correct to me.
  3. leader loop now has two selects:
    • first over events (from network_listener) vs. aeresponses from peers (those not needed to make a majority to signal back to client)
    • second embedded select is over aresponses and timeout when waiting to respond to client as to whether a command was successfully logged in some defined time window
  4. I added a placeholder "else if" clause in leader_loop for when it receives AERequests from peers => meaning another peer thinks it is now leader, but the code that handle that use case is completely missing, so that's something someone can work on
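
A rough structural sketch of that two-phase flow, in the pre-1.0 Rust style of this branch. This is not the actual code: the names leader_loop_sketch and peer_acks_needed are made up here, and plain blocking recv() calls stand in for the Select-based dispatch and the timeout handling.

    // Illustrative only - shows the shape of the two-level wait in leader_loop,
    // not the real implementation (which selects over both receivers at once).
    fn leader_loop_sketch(event_recvr: &Receiver<~Event>,
                          aeresp_recvr: &Receiver<~AEResponse>,
                          peer_acks_needed: uint) {
        loop {
            // Outer wait: the next client cmd from network_listener, or a stray
            // AEResponse from a peer that wasn't needed to reach the majority.
            let _ev = event_recvr.recv();

            // ... leader sends an AEReq to each peer for the new cmd ...

            // Inner wait: collect AEResponses until (majority-1) peers have
            // logged the cmd; the real code also selects on a timeout here so
            // the client gets an answer within a defined time window.
            let mut acks = 0u;
            while acks < peer_acks_needed {
                let _resp = aeresp_recvr.recv();
                acks += 1;
            }
            // Only now does the leader append the entry to its own log
            // (completing the majority) and respond OK to the client.
        }
    }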

This is the prelude for starting on the log repair conversation that happens between follower and leader, which will be done next.

Change to the Server struct

I hit an ownership/borrow problem with the "c" and "p" Event Sender and Receiver channels embedded in the Server struct. In this branch I have pulled them out.

Now I create the Sender/Receiver channel pair in the run method and pass the Receiver into the serve_loop method:

// create the Event channel pair; the Sender goes to the network_listener
// task, the Receiver stays with the server
let (chsend, chrecv): (Sender<~Event>, Receiver<~Event>) = channel();
let conx_str = format!("{}:{:u}", &self.ip, self.tcpport);
let svr_id = self.id;
spawn(proc() {
    network_listener(conx_str, chsend, svr_id);
});

self.serve_loop(chrecv);

and then pass it into the serve_loop submethods that need it (so far):

    fn serve_loop(&mut self, event_recvr: Receiver<~Event>) {
        debug!("Now serving => loop until Stopped");
        loop {
            match self.state {
                Follower     => self.follower_loop(&event_recvr),
                Candidate    => self.candidate_loop(),
                Leader       => self.leader_loop(&event_recvr),
                Snapshotting => self.snapshotting_loop(),
                Stopped      => break
            }
            std::io::timer::sleep(1);
        }
        debug!("Serve_loop END");
    }

Let me know if this will conflict with any changes you are making. This model also allows us to use the select! macro rather than having to manually code with the Select impl.

quux00 added 7 commits April 19, 2014 09:12
…e logged it

This saves having to truncate leader logs on a failed attempt to handle client cmds.
I think this is consistent with the Raft spec, which is vague on the ordering here.
Also started to handle the "rejection scenario" where the AEReq is actively rejected
by a peer because its log is out of date, but there is still a long way to go on log repair.
Cosmetic/comment changes to log.rs and peer.rs
This is to properly determine when (majority-1) of the peers have
logged the cmd in question. Log repair is still not (fully) implemented.
Also refactored leader_loop a bit - pulled some code out into
other helper methods.
This would be for cases where another peer claims to be the leader.
This case is NOT yet handled at all in the code. Just putting a placeholder
for where the logic goes.
Event Sender/Receiver now being passed into serve_loop and submethods.
Added select at top of leader_loop to choose between event receiver and
AEResponse receiver.  Passes existing tests.  Need to refactor/clean up
and then add more/better tests.  Log repair/catch up not yet implemented,
but placeholder section exists in the code.
Created ldr_process_cmd_from_client, pulling code out of leader_loop.
process_aeresponse_from_peer is now used in two places to make the code more DRY.
Also changed the Makefile to have all: server as the first target, to match
the pull request.
nathantypanski (Collaborator) commented
wrt 2, sec. 5.3 of the Raft spec says:

In the simple case of a leader replicating entries from its current term, a log entry is committed once it is stored on a majority of servers [...] The leader keeps track of the highest index it knows to be committed, and it includes that index in future AppendEntries RPCs (including heartbeats) so that the other servers eventually find out. Once a follower learns that a log entry is committed, it applies the entry to its local state machine (in log order).

So it seems like the right behavior - "majority of servers" and all. I question why you chose majority-1 - perhaps figuring that the leader's own append would then make it a majority - but leaving that question aside, it is a useful change.

I read over the rest of the changes, and barring the fact that we seem to have added more unsafe {} code (which might be inevitable, I am not sure) it seems good to me. I haven't really dug into this codebase yet and tried modifying things, so none of this will harm any work I'm doing.

👍 from me.

quux00 (Collaborator, Author) commented Apr 20, 2014
Michael: leader_loop now logs a client cmd only after majority-1 peers have already logged it

Nathan: I question why you chose majority-1 - perhaps figuring that the leader would then make it the majority

That is correct: the leader logs it as soon as majority-1 peers have signaled that they logged it, thus making a majority (e.g., in a 5-server cluster, 2 peer acks plus the leader's own append give 3 of 5).

The Raft spec states:

Each client request contains a command
to be executed by the replicated state machines. The
leader appends the command to its log as a new entry, then
issues AppendEntries RPCs in parallel to each of the other
servers to replicate the entry. When the entry has been
safely replicated (as described below), the leader applies
the entry to its state machine and returns the result of that
execution to the client.

Because we haven't yet defined the StateMachine to be anything other than the commit_idx, in order to stay consistent with the "Leader Append-Only" rule (p. 5 of the Raft spec), I had to change the order specified above: only after majority-1 peers have logged it does the leader log it. If that logging works, then it responds with OK/success to the client.
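
For concreteness, the arithmetic behind that threshold. This is a standalone sketch, not code from this branch, and the function names are made up:

    // Majority arithmetic for the "log only after majority-1 peer acks" rule.
    fn majority(num_servers: uint) -> uint {
        num_servers / 2 + 1
    }

    fn peer_acks_needed(num_servers: uint) -> uint {
        majority(num_servers) - 1
    }

    fn main() {
        // 5-server cluster: majority is 3, so 2 peer acks plus the leader's
        // own append put the entry on a majority of servers.
        assert!(peer_acks_needed(5) == 2);
        // 3-server cluster: 1 peer ack plus the leader's own append.
        assert!(peer_acks_needed(3) == 1);
    }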

barring the fact that we seem to have added more unsafe {} code

The only unsafe code is for the Select handling, which is required for reasons I'm not clear about. Usually this is hidden inside the select! macro (which uses these unsafe blocks), but the select! macro has two flaws that make it hard to use in all circumstances:

  1. it can't be used with a channel that is in a struct (Cannot use select! with Receiver contained in a struct rust-lang/rust#12902)
  2. it can't be used with a borrowed reference to a channel, because it blindly assumes you have an owned pointer and then takes a reference to it; if you pass it a reference, it tries to take a reference to the reference, which blows up at compile time.
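
For reference, a minimal standalone sketch of the manual Select pattern those unsafe blocks come from, roughly as the 0.x API looked at the time (the channels and values here are invented for illustration, not taken from the codebase):

    use std::comm::Select;

    fn main() {
        let (tx1, rx1) = channel::<int>();
        let (tx2, rx2) = channel::<int>();
        tx1.send(1);
        tx2.send(2);

        // Build the select set by hand; Handle::add() is the unsafe part,
        // since the Handle must not move while it is registered.
        let select = Select::new();
        let mut h1 = select.handle(&rx1);
        let mut h2 = select.handle(&rx2);
        unsafe {
            h1.add();
            h2.add();
        }

        // wait() blocks until one of the registered receivers is ready and
        // returns the id of its Handle.
        let ready = select.wait();
        if ready == h1.id() {
            println!("rx1: {}", h1.recv());
        } else if ready == h2.id() {
            println!("rx2: {}", h2.recv());
        }
    }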

nathantypanski (Collaborator) commented
Oh, unsafe {} is needed because add() on a Handle is unsafe. I don't know how to link to the docs besides linking to master, but: link.

This method is unsafe because it requires that the Handle is not moved while it is added to the Select set.

lenary merged commit 2f95535 into master on Apr 21, 2014