MIT 6.824 (Lab 3A KV storage)

When the dead wood is in spring, it is not luxuriant,
Young and cherish the people around the mirror.

Foreword

Today's exam went fairly smoothly, but for some reason I still feel out of sorts, and when I feel out of sorts I write a blog post. Lab 3A asks us to implement a distributed KV server on top of Raft.

Implementation process

Requirements:

  1. A single client issues requests serially.
  2. Multiple clients can issue requests concurrently.
  3. Once one client has read a new value, every other client must also be able to read that value (linearizability).

My Lab 3 code is based on this author's write-up; this article is only for my own review and learning, and you can jump straight to that link instead.

type Clerk struct {
	mu sync.Mutex
	servers []*labrpc.ClientEnd
	// You will have to modify this struct.
	// Only need to ensure that clientId is unique across restarts, so clientId + seqId safely identifies each request
	clientId int64 // unique client ID
	seqId    int64 // monotonically increasing request ID on this client
	leaderId int   // index of the last known leader
}

SeqId gives write requests idempotency. If a write has actually been applied but the server dies before the reply arrives, the client will retry; this monotonically increasing request ID lets the server recognize the retry and avoid applying the write twice.
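
The server-side dedup rule this enables is simple. A minimal sketch (the helper name shouldApply is mine; in the real code the check appears inline in applyLoop further below):

// shouldApply is an illustrative helper, not part of the lab skeleton: a write op
// is applied only if its SeqId is strictly greater than the last SeqId the server
// has already applied for that client, so a retried duplicate is simply skipped.
func shouldApply(seqMap map[int64]int64, clientId, seqId int64) bool {
	prev, exist := seqMap[clientId]
	return !exist || seqId > prev
}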

func MakeClerk(servers []*labrpc.ClientEnd) *Clerk {
	ck := new(Clerk)
	ck.servers = servers
	// You'll have to add code here.
	ck.clientId = nrand()
	return ck
}

This function creates the client (Clerk).
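
nrand() comes with the lab skeleton and picks the random clientId; it looks roughly like this (a 62-bit random integer drawn from crypto/rand, so collisions between clients are practically impossible):

import (
	"crypto/rand"
	"math/big"
)

// nrand draws a random 62-bit integer, which is why clientId needs no coordination
// between clients: the collision probability is negligible.
func nrand() int64 {
	max := big.NewInt(int64(1) << 62)
	bigx, _ := rand.Int(rand.Reader, max)
	return bigx.Int64()
}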

// Put or Append
type PutAppendArgs struct {
	Key   string
	Value string
	Op    string // "Put" or "Append"
	// You'll have to add definitions here.
	// Field names must start with capital letters,
	// otherwise RPC will break.
	ClientId int64
	SeqId int64
}
 
type PutAppendReply struct {
	Err Err
}

The PutAppend RPC arguments need the extra fields ClientId int64 and SeqId int64.

func (ck *Clerk) PutAppend(key string, value string, op string) {
	// You will have to modify this function.
	args := PutAppendArgs{
		Key: key,
		Value: value,
		Op: op,
		ClientId: ck.clientId,
		SeqId: atomic.AddInt64(&ck.seqId, 1),
	}
 
	DPrintf("Client[%d] PutAppend, Key=%s Value=%s", ck.clientId, key, value)
 
	leaderId := ck.currentLeader()
	for {
		reply := PutAppendReply{}
		if ck.servers[leaderId].Call("KVServer.PutAppend", &args, &reply) {
			if reply.Err == OK {	// success
				break
			}
		}
		leaderId = ck.changeLeader()
		time.Sleep(1 * time.Millisecond)
	}
}

This is the client-side RPC loop. If the call fails or returns an error, the leader has likely changed during the operation, so the Clerk switches to the next server and resends the request.
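
For completeness, the lab skeleton exposes Put and Append as thin wrappers around PutAppend, roughly like this:

// Put and Append reuse the retry loop in PutAppend; only the Op string differs.
func (ck *Clerk) Put(key string, value string) {
	ck.PutAppend(key, value, "Put")
}

func (ck *Clerk) Append(key string, value string) {
	ck.PutAppend(key, value, "Append")
}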

func (ck *Clerk) currentLeader() (leaderId int) {
	ck.mu.Lock()
	defer ck.mu.Unlock()
	leaderId = ck.leaderId
	return
}
 
func (ck *Clerk) changeLeader() (leaderId int) {
	ck.mu.Lock()
	defer ck.mu.Unlock()
	ck.leaderId = (ck.leaderId + 1) % len(ck.servers)
	return ck.leaderId
}
type GetArgs struct {
	Key string
	// You'll have to add definitions here.
	ClientId int64
	SeqId int64
}
 
type GetReply struct {
	Err   Err
	Value string
}

Get RPC arguments and reply.

func (ck *Clerk) Get(key string) string {
	// You will have to modify this function.
	args := GetArgs{
		Key: key,
		ClientId: ck.clientId,
		SeqId: atomic.AddInt64(&ck.seqId, 1),
	}
 
	DPrintf("Client[%d] Get starts, Key=%s ", ck.clientId, key)
 
	leaderId := ck.currentLeader()
	for {
		reply := GetReply{}
		if ck.servers[leaderId].Call("KVServer.Get", &args, &reply) {
			if reply.Err == OK {	// Hit
				return reply.Value
			} else if reply.Err == ErrNoKey {	// non-existent
				return ""
			}
		}
		leaderId = ck.changeLeader()
		time.Sleep(1 * time.Millisecond)
	}
}

Get is handled in much the same way as PutAppend. Note that a Get can fail in two ways: the leader changed, or the requested key does not exist.

const (
	OK             = "OK"
	ErrNoKey       = "ErrNoKey"
	ErrWrongLeader = "ErrWrongLeader"
)

type Err string

// Put or Append
type PutAppendArgs struct {
	Key   string
	Value string
	Op    string // "Put" or "Append"
	// You'll have to add definitions here.
	// Field names must start with capital letters,
	// otherwise RPC will break.

	ClientId int64
	SeqId    int64
}

type PutAppendReply struct {
	Err Err
}

type GetArgs struct {
	Key string
	// You'll have to add definitions here.
	ClientId int64
	SeqId    int64
}

type GetReply struct {
	Err   Err
	Value string
}

These are the structures used throughout Lab 3, defined in common.go.

const (
	OP_TYPE_PUT    = "Put"
	OP_TYPE_APPEND = "Append"
	OP_TYPE_GET    = "Get"
)

type Op struct {
	// Your definitions here.
	// Field names must start with capital letters,
	// otherwise RPC will break.
	Index    int
	Term     int
	Type     string
	Key      string
	Value    string
	SeqId    int64
	ClientId int64
}
type OpContext struct {
	op          *Op
	committed   chan byte // closed once the op has been applied
	wrongLeader bool      // the term at this index differs from op.Term, i.e. leadership changed
	ignored     bool      // stale SeqId, the write was skipped (idempotency)

	keyExist bool   // Get: whether the key existed
	value    string // Get: the value that was read
}
type KVServer struct {
	mu      sync.Mutex
	me      int
	rf      *raft.Raft
	applyCh chan raft.ApplyMsg
	dead    int32 // set by Kill()

	maxraftstate int // snapshot if log grows this big

	kvStore map[string]string  // kv storage
	reqMap  map[int]*OpContext // Log index - > request context
	seqMap  map[int64]int64    // Client ID - > client seq
	// Your definitions here.
	lastAppliedIndex int
}

Above are the server-side Op definition and the KVServer structure. reqMap holds the RPC calls currently in flight (log index -> waiting context), and seqMap records the largest request ID applied for each clientId, which is what the write-idempotency check uses.

func StartKVServer(servers []*labrpc.ClientEnd, me int, persister *raft.Persister, maxraftstate int) *KVServer {
	// call labgob.Register on structures you want
	// Go's RPC library to marshall/unmarshall.
	labgob.Register(&Op{})
 
	kv := new(KVServer)
	kv.me = me
	kv.maxraftstate = maxraftstate
 
	// You may need initialization code here.
 
	kv.applyCh = make(chan raft.ApplyMsg)
	kv.rf = raft.Make(servers, me, persister, kv.applyCh)
 
	// You may need initialization code here.
	kv.kvStore = make(map[string]string)
	kv.reqMap = make(map[int]*OpContext)
	kv.seqMap = make(map[int64]int64)
 
	go kv.applyLoop()
 
	return kv
}

This function starts the service: it calls Raft's Make function to initialize this Raft peer and then starts the KVServer's apply loop.

func Make(peers []*labrpc.ClientEnd, me int,
	persister *Persister, applyCh chan ApplyMsg) *Raft {

	rf := &Raft{}
	rf.peers = peers
	rf.persister = persister
	rf.me = me

	// Your initialization code here (2A, 2B, 2C).
	rf.role = ROLE_FOLLOWER
	rf.leaderId = -1
	rf.votedFor = -1
	rf.lastActiveTime = time.Now()
	rf.lastIncludedIndex = 0
	rf.lastIncludedTerm = 0
	rf.applyCh = applyCh
	// rf.nextIndex = make([]int, len(rf.peers))
	// rf.matchIndex = make([]int, len(rf.peers))
	// initialize from state persisted before a crash
	rf.readPersist(persister.ReadRaftState())

	//rf.installSnapshotToApplication()

	DPrintf("RaftNode[%d] Make again", rf.me)
	// start ticker goroutine to start elections
	go rf.electionLoop()

	go rf.appendEntriesLoop()

	go rf.applyLogLoop()
	//go rf.ticker()
	DPrintf("Raftnode[%d]start-up", me)
	return rf
}
func (kv *KVServer) PutAppend(args *PutAppendArgs, reply *PutAppendReply) {
	// Your code here.
	reply.Err = OK
 
	op := &Op{
		Type: args.Op,
		Key: args.Key,
		Value: args.Value,
		ClientId: args.ClientId,
		SeqId: args.SeqId,
	}
 
	// Write raft layer
	var isLeader bool
	op.Index, op.Term, isLeader = kv.rf.Start(op)
	if !isLeader {
		reply.Err = ErrWrongLeader
		return
	}
 
	opCtx := newOpContext(op)
 
	func() {
		kv.mu.Lock()
		defer kv.mu.Unlock()
 
		// Save the RPC context and wait for the commit callback. The context at the same index may be overwritten after a leader change, but the earlier RPC will then time out, exit, and make its client retry
		kv.reqMap[op.Index] = opCtx
	}()
 
	// Clean up context before RPC end
	defer func() {
		kv.mu.Lock()
		defer kv.mu.Unlock()
		if one, ok := kv.reqMap[op.Index]; ok {
			if one == opCtx {
				delete(kv.reqMap, op.Index)
			}
		}
	}()
 
	timer := time.NewTimer(2000 * time.Millisecond)
	defer timer.Stop()
	select {
	case <- opCtx.committed:	// If submitted
		if opCtx.wrongLeader {	// Similarly, the term in the index position is different, indicating that the leader has changed and the client needs to write to the new leader again
			reply.Err = ErrWrongLeader
		} else if opCtx.ignored {
			// The request ID was stale and the write was skipped; for the MIT lab it is enough to still reply OK
		}
	case <- timer.C:	// If the submission fails for 2 seconds, ask the client to retry
		reply.Err = ErrWrongLeader
	}
}

Here Raft's Start function is called. The PutAppend logic is: write the op into the Raft log, wait for applyLoop to signal the commit through the committed channel, and then return from the RPC handler (the ignored flag is where idempotency shows up).

func (rf *Raft) Start(command interface{}) (int, int, bool) {
	index := -1
	term := -1
	isLeader := true

	// Your code here (2B).
	rf.mu.Lock()
	defer rf.mu.Unlock()
	if rf.role != ROLE_LEADER {
		return -1, -1, false
	}
	logEntry := LogEntry{
		Command: command,
		Term:    rf.currentTerm,
	}
	rf.log = append(rf.log, logEntry)
	index = rf.lastIndex()
	term = rf.currentTerm
	rf.persist()

	DPrintf("RaftNode[%d] Add Command, logIndex[%d] currentTerm[%d]", rf.me, index, term)
	return index, term, isLeader
}

As expected, the leader simply appends the operation to its log here.

func newOpContext(op *Op) (opCtx *OpContext) {
	opCtx = &OpContext{
		op:        op,
		committed: make(chan byte),
	}
	return
}
func (kv *KVServer) Get(args *GetArgs, reply *GetReply) {
	// Your code here.
	reply.Err = OK
 
	op := &Op{
		Type: OP_TYPE_GET,
		Key: args.Key,
		ClientId: args.ClientId,
		SeqId: args.SeqId,
	}
 
	// Write raft layer
	var isLeader bool
	op.Index, op.Term, isLeader = kv.rf.Start(op)
	if !isLeader {
		reply.Err = ErrWrongLeader
		return
	}
 
	opCtx := newOpContext(op)
 
	func() {
		kv.mu.Lock()
		defer kv.mu.Unlock()
 
		// Save the RPC context and wait for the commit callback. The context at the same index may be overwritten after a leader change, but the earlier RPC will then time out, exit, and make its client retry
		kv.reqMap[op.Index] = opCtx
	}()
 
	// Clean up context before RPC end
	defer func() {
		kv.mu.Lock()
		defer kv.mu.Unlock()
		if one, ok := kv.reqMap[op.Index]; ok {
			if one == opCtx {
				delete(kv.reqMap, op.Index)
			}
		}
	}()
 
	timer := time.NewTimer(2000 * time.Millisecond)
	defer timer.Stop()
	select {
	case <-opCtx.committed: // If submitted
		if opCtx.wrongLeader { // Similarly, the term in the index position is different, indicating that the leader has changed and the client needs to write to the new leader again
			reply.Err = ErrWrongLeader
		} else if !opCtx.keyExist { // key does not exist
			reply.Err = ErrNoKey
		} else {
			reply.Value = opCtx.value	// Return value
		}
	case <- timer.C:	// If the submission fails for 2 seconds, ask the client to retry
		reply.Err = ErrWrongLeader
	}
}

Get requests are handled almost exactly like PutAppend; only the error handling differs slightly, and the overall logic is the same.

func (kv *KVServer) applyLoop() {
	for !kv.killed() {
		select {
		case msg := <- kv.applyCh:
			cmd := msg.Command
			index := msg.CommandIndex
 
			func() {
				kv.mu.Lock()
				defer kv.mu.Unlock()
 
				// Operation log
				op := cmd.(*Op)
 
				opCtx, existOp := kv.reqMap[index]
				prevSeq, existSeq := kv.seqMap[op.ClientId]
				kv.seqMap[op.ClientId] = op.SeqId
 
				if existOp { // If an RPC is waiting on this index, check whether the term still matches the one recorded when the op was started
					if opCtx.op.Term != op.Term {
						opCtx.wrongLeader = true
					}
				}
 
				// Only client write requests with monotonically increasing ID are processed
				if op.Type == OP_TYPE_PUT || op.Type == OP_TYPE_APPEND {
					if !existSeq || op.SeqId > prevSeq { // If it is an incremental request ID, accept its change
						if op.Type == OP_TYPE_PUT {	// put operation
							kv.kvStore[op.Key] = op.Value
						} else if op.Type == OP_TYPE_APPEND {	// Append operation
							if val, exist := kv.kvStore[op.Key]; exist {
								kv.kvStore[op.Key] = val + op.Value
							} else {
								kv.kvStore[op.Key] = op.Value
							}
						}
					} else if existOp {
						opCtx.ignored = true
					}
				} else {	// OP_TYPE_GET
					if existOp {
						opCtx.value, opCtx.keyExist = kv.kvStore[op.Key]
					}
				}
				DPrintf("RaftNode[%d] applyLoop, kvStore[%v]", kv.me, kv.kvStore)
 
				// Wake up pending RPC
				if existOp {
					close(opCtx.committed)
				}
			}()
		}
	}
}

applyLoop listens on the kv.applyCh channel and applies each committed log entry to kvStore. The SeqId handling here is elegant: the client increments SeqId atomically, so it is monotonically increasing, and the server can simply ignore stale (already applied) requests. Finally, closing the committed channel wakes up the blocked PutAppend/Get handlers, which inspect the context and return from the RPC. Marvellous!
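
To tie the pieces together, here is a purely illustrative sketch of the whole path; exampleFlow is not part of the lab code, and the servers slice would come from the lab's test harness (labrpc):

// exampleFlow is illustrative only: Clerk RPC -> KVServer handler -> rf.Start()
// -> applyLoop applies the op -> committed channel closed -> RPC reply.
func exampleFlow(servers []*labrpc.ClientEnd) {
	ck := MakeClerk(servers)
	ck.PutAppend("x", "1", "Put") // committed through raft, then applied to kvStore
	_ = ck.Get("x")               // the Get is also written to the raft log before reading kvStore
}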

Afterword

Overall, 3A is not very difficult, but one test case kept timing out and I really don't know why yet; I'll set that problem aside for now. A few takeaways:

  1. Linearizability requires each client to issue ops serially, because the server cannot guarantee that the op sent first takes effect first. In real engineering, I think the client should queue requests by seqId and submit them to the KV server serially.
  2. As long as each client's ID is guaranteed to be unique, seqId does not need to be persisted. Making the client ID unique is relatively simple, e.g. ip + pid + timestamp (see the sketch after this list).
  3. Write idempotency is decided by comparing an op's seqId with the seqId of the last log entry applied for that clientId; only a strictly larger seqId is executed.
  4. Read consistency is achieved by writing the read itself as a log entry into Raft. When that entry is committed, the kvStore data at that moment is returned to the RPC, so the data seen by the RPC satisfies linearizability, i.e. subsequent reads will continue to observe at least this value.
  5. The RPC server needs timeout handling, because leader election may happen right after the old leader appends to its local log; the new leader may truncate that entry, so the corresponding index never commits. The client therefore times out, rewrites the op, and eventually gets it committed and obtains the final result.
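
For point 2, a hedged sketch of how such a client ID could be built outside the lab (makeClientId and its inputs are illustrative, not part of the lab code): hostname, pid and a timestamp hashed into an int64.

import (
	"fmt"
	"hash/fnv"
	"os"
	"time"
)

// makeClientId is an illustrative alternative to nrand(): it derives an id from
// hostname + pid + timestamp, which is unique enough across restarts in practice.
func makeClientId() int64 {
	host, _ := os.Hostname()
	h := fnv.New64a()
	fmt.Fprintf(h, "%s-%d-%d", host, os.Getpid(), time.Now().UnixNano())
	return int64(h.Sum64())
}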
