Zookeeper Watcher process analysis (combined with source code)

Overview

ZK provides a publish/subscribe facility for distributed data. A typical publish/subscribe system defines a one-to-many relationship in which multiple subscribers listen to the same topic object; when the state of the topic changes, all subscribers are notified. ZK implements this distributed notification with the Watcher mechanism.

ZK lets a client register a Watcher with the server. When a specified event on the server triggers that Watcher, the server sends an event notification to the client that registered it.

The general process is as follows. The client registers a Watcher with the ZK server. If the registration succeeds, the client stores the Watcher locally in its ClientWatchManager. When the event fires, the ZK server sends a notification to the client, and the client looks up the matching Watcher in the ClientWatchManager and invokes its callback.

Watcher interface

So what exactly is a Watcher, and what is it for?

/**
 * This interface specifies the public interface an event handler class must
 * implement. A ZooKeeper client will get various events from the ZooKeeper
 * server it connects to. An application using such a client handles these
 * events by registering a callback object with the client. The callback object
 * is expected to be an instance of a class that implements Watcher interface.
 */
@InterfaceAudience.Public
public interface Watcher {
    void process(WatchedEvent event);
}

If you register an instance of a class implementing this interface with the ZK server, its process method is called back whenever the server notifies the client of an event.

WatchedEvent

So what does a WatchedEvent contain?

public class WatchedEvent {
    /**
     * Enumeration of states the ZooKeeper may be at the event
     */
    private final KeeperState keeperState;
    /**
     * Enumeration of types of events that may occur on the ZooKeeper
     */
    private final EventType eventType;
    private String path;
}

KeeperState and EventType are two enumeration classes, representing the notification state and the event type respectively. path is the path of the data node that the client is watching.

Common combinations of KeeperState and EventType

KeeperState   | EventType              | Trigger condition                                                               | Explanation
SyncConnected | None(-1)               | The client and server successfully establish a session                         | Client and server are connected
SyncConnected | NodeCreated(1)         | The data node being watched is created                                         | Client and server are connected
SyncConnected | NodeDeleted(2)         | The data node being watched is deleted                                         | Client and server are connected
SyncConnected | NodeDataChanged(3)     | The data of the watched node changes (data content or data version number)    | Client and server are connected
SyncConnected | NodeChildrenChanged(4) | The child node list of the watched node changes                                | Client and server are connected

For the NodeDataChanged event type, a change means either a change of the node's data content or a change of its data version. Whenever a client calls the data update interface, the dataVersion changes even if the new content is identical to the old, so the corresponding Watcher is triggered. Because the version changes on every update, a check based on dataVersion is not fooled by the ABA problem typical of optimistic locking.
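
To make the version argument concrete, here is a minimal sketch of a version-checked update. VersionedNode is a hypothetical stand-in, not ZooKeeper's DataNode, and it returns false on a version mismatch where the real server reports an error to the caller. It assumes, as ZooKeeper's setData does, that an expected version of -1 means "any version".

```java
import java.util.Arrays;

// Simplified stand-in for a data node; not ZooKeeper's DataNode.
class VersionedNode {
    private byte[] data = new byte[0];
    private int version = 0;

    // Conditional update: succeeds only if expectedVersion matches the current
    // version, or is -1 ("any version"). Every successful write bumps the
    // version, even when the new bytes equal the stored ones.
    synchronized boolean setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            return false; // stale writer: the node was updated in the meantime
        }
        data = Arrays.copyOf(newData, newData.length);
        version++;
        return true;
    }

    synchronized int getVersion() {
        return version;
    }

    synchronized byte[] getData() {
        return Arrays.copyOf(data, data.length);
    }
}

class VersionedNodeDemo {
    public static void main(String[] args) {
        VersionedNode node = new VersionedNode();
        node.setData("A".getBytes(), -1);
        int seen = node.getVersion();          // remember the version we read
        node.setData("B".getBytes(), -1);      // another writer: A -> B
        node.setData("A".getBytes(), -1);      // and back again: B -> A
        // The content looks unchanged, but the remembered version is stale.
        boolean accepted = node.setData("C".getBytes(), seen);
        System.out.println("conditional write accepted: " + accepted);
    }
}
```

Comparing content alone would let the last write through, because the node holds "A" again. Comparing versions does not, which is the point the paragraph above makes.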

WatcherEvent

WatchedEvent contains the following method:

 /**
     *  Convert WatchedEvent to type that can be sent over network
     */
    public WatcherEvent getWrapper() {
        return new WatcherEvent(eventType.getIntValue(), keeperState.getIntValue(), path);
    }

WatcherEvent and WatchedEvent represent the same thing: both encapsulate a server-side event. WatchedEvent is the object used for logical processing, while WatcherEvent is the entity object used for network transport. As the code above shows, a WatcherEvent is created from the values of the WatchedEvent's properties.

The API documentation at http://people.apache.org/~larsgeorge/zookeeper-1215258/build/docs/dev-api/org/apache/zookeeper/proto/WatcherEvent.html shows that WatcherEvent implements the Record interface:

public class WatcherEvent
extends Object
implements org.apache.jute.Record

The Record interface defines the serialization and deserialization methods:

@InterfaceAudience.Public
public interface Record {
    void serialize(OutputArchive archive, String tag) throws IOException;
    void deserialize(InputArchive archive, String tag) throws IOException;
}
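
The idea of a transport object that writes and reads its own fields can be sketched with the JDK's data streams. This is only an illustration: WireEvent is a hypothetical class, and DataOutputStream does not produce the wire format of ZooKeeper's jute archives. What it shows is the contract that matters, namely that deserialize reads the fields in exactly the order serialize wrote them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical transport object: two int codes and a path, like WatcherEvent.
class WireEvent {
    int type;
    int state;
    String path;

    WireEvent() {
    }

    WireEvent(int type, int state, String path) {
        this.type = type;
        this.state = state;
        this.path = path;
    }

    // Write the three fields in a fixed order.
    void serialize(DataOutputStream out) throws IOException {
        out.writeInt(type);
        out.writeInt(state);
        out.writeUTF(path);
    }

    // Read the fields back in the same order they were written.
    void deserialize(DataInputStream in) throws IOException {
        type = in.readInt();
        state = in.readInt();
        path = in.readUTF();
    }

    static byte[] toBytes(WireEvent event) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            event.serialize(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static WireEvent fromBytes(byte[] bytes) {
        WireEvent event = new WireEvent();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            event.deserialize(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return event;
    }
}

class WireEventDemo {
    public static void main(String[] args) {
        // 3 is the NodeDataChanged code from the table above; the state code is an example value.
        WireEvent received = WireEvent.fromBytes(WireEvent.toBytes(new WireEvent(3, 3, "/app/config")));
        System.out.println(received.type + " " + received.state + " " + received.path);
    }
}
```

Keeping only primitive codes and a string in the transport object is what lets the sender and receiver convert to and from their own richer event types at the edges.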

Related processes

Broadly, the mechanism can be divided into three processes:

  • Client registration Watcher
  • Server processing Watcher
  • Client callback Watcher

Client registration Watcher

When creating a ZK client instance, we can pass a default Watcher to the constructor:

 public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher) 

The Watcher passed in is saved in ZKWatchManager as the default Watcher for the entire session:

watchManager.defaultWatcher = watcher;

In addition, the ZK client can register a Watcher with the ZK server through three interfaces: getData, getChildren and exists.

We use the getData interface for the analysis:

public byte[] getData(final String path, Watcher watcher, Stat stat){
  .....
}
public byte[] getData(String path, boolean watch, Stat stat) throws KeeperException, InterruptedException {
        return getData(path, getDefaultWatcher(watch), stat);
}

If the watch parameter is true, getDefaultWatcher returns the default Watcher that was passed in when the ZooKeeper instance was created. If no default Watcher was set, it throws an IllegalStateException:

 private Watcher getDefaultWatcher(boolean required) {
        if (required) {
            if (watchManager.defaultWatcher != null) {
                return watchManager.defaultWatcher;
            } else {
                throw new IllegalStateException("Default watcher is required, but it is null.");
            }
        }
        return null;
    }

Here is the complete getData code

 public byte[] getData(final String path, Watcher watcher, Stat stat) throws KeeperException, InterruptedException {
        final String clientPath = path;
        PathUtils.validatePath(clientPath);

        // the watch contains the un-chroot path
        // Create a watch registration of data type
        WatchRegistration wcb = null;
        if (watcher != null) {
            wcb = new DataWatchRegistration(watcher, clientPath);
        }

        // Prepend the chroot (if any) so the client path becomes the full server-side path
        final String serverPath = prependChroot(clientPath);

        RequestHeader h = new RequestHeader();
        h.setType(ZooDefs.OpCode.getData);
        GetDataRequest request = new GetDataRequest();
        request.setPath(serverPath);
        // Mark whether there is a watcher
        request.setWatch(watcher != null);
        GetDataResponse response = new GetDataResponse();
        
        ReplyHeader r = cnxn.submitRequest(h, request, response, wcb);
        if (r.getErr() != 0) {
            throw KeeperException.create(KeeperException.Code.get(r.getErr()), clientPath);
        }
        if (stat != null) {
            DataTree.copyStat(response.getStat(), stat);
        }
        return response.getData();
    }

In summary, getData does three things:

  1. Create a DataWatchRegistration (only if a Watcher was supplied)
  2. Convert the path (the client may be using a chroot, so the client path is converted to the server-side path before the request is sent)
  3. Submit the request through ClientCnxn

public ReplyHeader submitRequest(
        RequestHeader h,
        Record request,
        Record response,
        WatchRegistration watchRegistration,
        WatchDeregistration watchDeregistration) throws InterruptedException {
        ReplyHeader r = new ReplyHeader();
        Packet packet = queuePacket(
            h,
            r,
            request,
            response,
            null,
            null,
            null,
            null,
            watchRegistration,
            watchDeregistration);
           ....
           ....
        return r;
    }

The request is then wrapped in a Packet and added to the outgoing queue:

public Packet queuePacket(
        RequestHeader h,
        ReplyHeader r,
        Record request,
        Record response,
        AsyncCallback cb,
        String clientPath,
        String serverPath,
        Object ctx,
        WatchRegistration watchRegistration,
        WatchDeregistration watchDeregistration) {
        Packet packet = null;

        packet = new Packet(h, r, request, response, watchRegistration);
         
        synchronized (state) {
                       ...
              ....
                outgoingQueue.add(packet);
            }
        }

Finally, the request is sent to the server, and the response is processed in SendThread#readResponse:

void readResponse(ByteBuffer incomingBuffer) throws IOException {
            ByteBufferInputStream bbis = new ByteBufferInputStream(incomingBuffer);
            BinaryInputArchive bbia = BinaryInputArchive.getArchive(bbis);
            ReplyHeader replyHdr = new ReplyHeader();

            replyHdr.deserialize(bbia, "header");
            switch (replyHdr.getXid()) {
            case PING_XID:
               ....
               ....
                return;
              case AUTHPACKET_XID:
                ...
                ...
              return;
                // Handle notifications pushed by the server
            case NOTIFICATION_XID:
                LOG.debug("Got notification session id: 0x{}",
                    Long.toHexString(sessionId));
                WatcherEvent event = new WatcherEvent();
                event.deserialize(bbia, "response");

                // convert from a server path to a client path
                if (chrootPath != null) {
                    String serverPath = event.getPath();
                    if (serverPath.compareTo(chrootPath) == 0) {
                        event.setPath("/");
                    } else if (serverPath.length() > chrootPath.length()) {
                        event.setPath(serverPath.substring(chrootPath.length()));
                     } else {
                         LOG.warn("Got server path {} which is too short for chroot path {}.",
                             event.getPath(), chrootPath);
                     }
                }

                WatchedEvent we = new WatchedEvent(event);
                LOG.debug("Got {} for session id 0x{}", we, Long.toHexString(sessionId));
                // Added to the event queue and processed by EventThread
                eventThread.queueEvent(we);
                return;
            default:
                break;
            }

            // Take the next Packet off the pending queue
            Packet packet;
            synchronized (pendingQueue) {
                if (pendingQueue.size() == 0) {
                    throw new IOException("Nothing in the queue, but got " + replyHdr.getXid());
                }
                packet = pendingQueue.remove();
            }
            /*
             * Since requests are processed in order, we better get a response
             * to the first request!
             */
            try {
                   ....
                   .....
            } finally {
                  // Save Watcher in ClientWatchManager
                finishPacket(packet);
            }
        }

What does this method do?

  1. Deserialize the reply header and check its XID to determine whether the message is a server-side notification. If so, the event is added to the event queue and handled by EventThread.
  2. Otherwise, remove the corresponding Packet from the pending queue.
  3. Call finishPacket for the follow-up processing:

 protected void finishPacket(Packet p) {
        int err = p.replyHeader.getErr();
        if (p.watchRegistration != null) {
            p.watchRegistration.register(err);
        }
        ...
        ...            
 }

Finally, finishPacket passes the error code from the reply header to the WatchRegistration, which registers the Watcher in the corresponding Map<String, Set<Watcher>> on the client.
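
The deferred registration can be sketched as follows. PendingRegistration and PathWatcher are hypothetical names, and the rule "register only when the error code is 0" is my simplifying assumption about what the registration does with the code it receives. The important point is the timing: the Watcher is stored on the client only after the server's reply has arrived.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Stand-in for the Watcher interface; the event is reduced to a path.
interface PathWatcher {
    void process(String path);
}

// Hypothetical model of a registration that waits for the server's reply.
class PendingRegistration {
    private final Map<String, Set<PathWatcher>> watches; // shared client-side table
    private final PathWatcher watcher;
    private final String clientPath;

    PendingRegistration(Map<String, Set<PathWatcher>> watches, PathWatcher watcher, String clientPath) {
        this.watches = watches;
        this.watcher = watcher;
        this.clientPath = clientPath;
    }

    // Called when the reply arrives; rc is the error code from the reply header.
    void register(int rc) {
        if (rc != 0) {
            return; // assumption: a failed request leaves no watch behind
        }
        synchronized (watches) {
            watches.computeIfAbsent(clientPath, p -> new HashSet<>()).add(watcher);
        }
    }
}

class PendingRegistrationDemo {
    public static void main(String[] args) {
        Map<String, Set<PathWatcher>> dataWatches = new HashMap<>();
        PathWatcher watcher = path -> System.out.println("changed: " + path);
        new PendingRegistration(dataWatches, watcher, "/app/config").register(0);
        new PendingRegistration(dataWatches, watcher, "/missing").register(-101); // a non-zero error code
        System.out.println(dataWatches.keySet());
    }
}
```

Registering after the reply rather than before the request avoids holding a client-side Watcher for a watch the server never accepted.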

Server processing Watcher

Let's start with a few major component classes

WatchManager is the manager of Watchers on the ZK server. It has two storage structures, watchTable and watch2Paths, which index Watchers along two dimensions.

  • watchTable maps a data node path to the set of Watchers registered on it.
  • watch2Paths maps a Watcher to the set of data node paths it is watching.

ServerCnxn is the connection interface between a ZooKeeper client and the server; it represents one client-server connection. The default implementation is NIOServerCnxn, and starting from 3.4.0 a Netty-based implementation, NettyServerCnxn, was introduced.

ServerCnxn also implements the Watcher interface, so we can treat it as a Watcher object.

Paths and ServerCnxn instances are what get stored in WatchManager.

After receiving a client request, the server decides in FinalRequestProcessor#processRequest whether the request needs to register a Watcher.

case OpCode.getData: {
                lastOp = "GETD";
                GetDataRequest getDataRequest = new GetDataRequest();
                ByteBufferInputStream.byteBuffer2Record(request.request, getDataRequest);
                path = getDataRequest.getPath();
                              // Call the method to process the getData request
                rsp = handleGetDataRequest(getDataRequest, cnxn, request.authInfo);
                requestPathMetricsCollector.registerRequest(request.type, path);
                break;
            }
private Record handleGetDataRequest(Record request, ServerCnxn cnxn, List<Id> authInfo) throws KeeperException, IOException {
          ....
        ....
        // Note: the request carries only a boolean flag saying whether the client wants a Watcher registered.
        // If the flag is set, the connection itself (cnxn) is passed as the Watcher.
        byte[] b = zks.getZKDatabase().getData(path, stat, getDataRequest.getWatch() ? cnxn : null);
        return new GetDataResponse(b, stat);
    }
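// org.apache.zookeeper.server.ZKDatabase
public byte[] getData(String path, Stat stat, Watcher watcher) throws KeeperException.NoNodeException {
        return dataTree.getData(path, stat, watcher);
    }

// org.apache.zookeeper.server.DataTree
public byte[] getData(String path, Stat stat, Watcher watcher) throws KeeperException.NoNodeException {
        // Look up the node first, as setData does; a missing node is an error
        DataNode n = nodes.get(path);
        byte[] data = null;
        if (n == null) {
            throw new KeeperException.NoNodeException();
        }
        synchronized (n) {
            n.copyStat(stat);
            if (watcher != null) {
                // Here, dataWatches is the corresponding instance of the IWatchManager interface
                dataWatches.addWatch(path, watcher);
            }
            data = n.data;
        }
        updateReadStat(path, data == null ? 0 : data.length);
        return data;
    }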

Finally, the Watcher is stored in both watchTable and watch2Paths:

 @Override
    public boolean addWatch(String path, Watcher watcher) {
        return addWatch(path, watcher, WatcherMode.DEFAULT_WATCHER_MODE);
    }

    @Override
    public synchronized boolean addWatch(String path, Watcher watcher, WatcherMode watcherMode) {
        if (isDeadWatcher(watcher)) {
            return false;
        }
        // Look up the Watchers already registered on this path
        Set<Watcher> list = watchTable.get(path);
        if (list == null) {
            list = new HashSet<>(4);
            watchTable.put(path, list);
        }
        list.add(watcher);
        // Look up the paths this Watcher is already watching
        Set<String> paths = watch2Paths.get(watcher);
        if (paths == null) {
            paths = new HashSet<>();
            watch2Paths.put(watcher, paths);
        }

        watcherModeManager.setWatcherMode(watcher, path, watcherMode);
        return paths.add(path);
    }
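
Why keep two maps when one would be enough to find the Watchers for a path? A minimal sketch, with hypothetical names (TwoWayWatchIndex, byPath, byWatcher) and none of the watcher modes from the real code, shows the benefit of the reverse index: all watches held by one connection can be dropped without scanning every path.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical two-way index; W is whatever represents a watcher (a connection, for example).
class TwoWayWatchIndex<W> {
    private final Map<String, Set<W>> byPath = new HashMap<>();    // plays the role of watchTable
    private final Map<W, Set<String>> byWatcher = new HashMap<>(); // plays the role of watch2Paths

    // Record the watch in both directions; returns false if it already existed.
    synchronized boolean addWatch(String path, W watcher) {
        byPath.computeIfAbsent(path, p -> new HashSet<>()).add(watcher);
        return byWatcher.computeIfAbsent(watcher, w -> new HashSet<>()).add(path);
    }

    // Drop every watch held by one watcher, for example when its connection closes.
    synchronized void removeWatcher(W watcher) {
        Set<String> paths = byWatcher.remove(watcher);
        if (paths == null) {
            return;
        }
        for (String path : paths) {
            Set<W> watchers = byPath.get(path);
            if (watchers != null) {
                watchers.remove(watcher);
                if (watchers.isEmpty()) {
                    byPath.remove(path);
                }
            }
        }
    }

    // Snapshot of the watchers currently registered on a path.
    synchronized Set<W> watchersOf(String path) {
        Set<W> watchers = byPath.get(path);
        if (watchers == null) {
            return new HashSet<>();
        }
        return new HashSet<>(watchers);
    }

    synchronized int watchedPathCount() {
        return byPath.size();
    }
}

class TwoWayWatchIndexDemo {
    public static void main(String[] args) {
        TwoWayWatchIndex<String> index = new TwoWayWatchIndex<>();
        index.addWatch("/a", "conn1");
        index.addWatch("/b", "conn1");
        index.addWatch("/a", "conn2");
        index.removeWatcher("conn1"); // conn1 disconnects
        System.out.println(index.watchersOf("/a") + " on /a, " + index.watchedPathCount() + " watched path(s)");
    }
}
```

The cost of the second map is that every add and remove has to keep both sides consistent, which is why the methods here, like addWatch above, are synchronized.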

Triggering a Watcher

A NodeDataChanged event is triggered when the data content or the dataVersion of a node changes.

Let's look at the org.apache.zookeeper.server.DataTree#setData method:

public Stat setData(String path, byte[] data, int version, long zxid, long time) throws KeeperException.NoNodeException {
        Stat s = new Stat();
        DataNode n = nodes.get(path);
        if (n == null) {
            throw new KeeperException.NoNodeException();
        }
        byte[] lastdata = null;
        synchronized (n) {
            lastdata = n.data;
            nodes.preChange(path, n);
            n.data = data;
            n.stat.setMtime(time);
            n.stat.setMzxid(zxid);
            n.stat.setVersion(version);
            n.copyStat(s);
            nodes.postChange(path, n);
        }
      
        ....
        ....
        updateWriteStat(path, dataBytes);
        // Trigger the data Watchers through the IWatchManager
        dataWatches.triggerWatch(path, EventType.NodeDataChanged);
        return s;
    }
 @Override
    public WatcherOrBitSet triggerWatch(String path, EventType type) {
        return triggerWatch(path, type, null);
    }

    @Override
    public WatcherOrBitSet triggerWatch(String path, EventType type, WatcherOrBitSet supress) {
      // Encapsulate as WatchedEvent 
        WatchedEvent e = new WatchedEvent(type, KeeperState.SyncConnected, path);
        Set<Watcher> watchers = new HashSet<>();
        PathParentIterator pathParentIterator = getPathParentIterator(path);
        synchronized (this) {
            for (String localPath : pathParentIterator.asIterable()) {
                Set<Watcher> thisWatchers = watchTable.get(localPath);
                // No Watcher is registered on this path
                if (thisWatchers == null || thisWatchers.isEmpty()) {
                    continue;
                }
                Iterator<Watcher> iterator = thisWatchers.iterator();
                while (iterator.hasNext()) {
                    Watcher watcher = iterator.next();
                    WatcherMode watcherMode = watcherModeManager.getWatcherMode(watcher, localPath);
                    if (watcherMode.isRecursive()) {
                         
                    } else if (!pathParentIterator.atParentPath()) {
                        watchers.add(watcher);
                        if (!watcherMode.isPersistent()) {
                          // Remove
                            iterator.remove();
                            Set<String> paths = watch2Paths.get(watcher);
                            if (paths != null) {
                              // Remove from watch2Paths
                                paths.remove(localPath);
                            }
                        }
                    }
                }
               
            }
        }
        for (Watcher w : watchers) {
            if (supress != null && supress.contains(w)) {
                continue;
            }
          // Call the process method
            w.process(e);
        }
                .....
        .....
        return new WatcherOrBitSet(watchers);
    }

As mentioned above, ServerCnxn implements the Watcher interface. Let's look at org.apache.zookeeper.server.NIOServerCnxn#process:

@Override
    public void process(WatchedEvent event) {
        // The XID in the reply header is set to -1 (NOTIFICATION_XID), the value
        // that SendThread.readResponse checks for, as analyzed above
        ReplyHeader h = new ReplyHeader(ClientCnxn.NOTIFICATION_XID, -1L, 0);

        // Convert the WatchedEvent to a WatcherEvent
        WatcherEvent e = event.getWrapper();
        // Send the notification to the client
        sendResponse(h, e, "notification", null, null, ZooDefs.OpCode.error);
    }

Basic process

  • Wrap the event in a WatchedEvent.
  • Look up the matching Watchers in watchTable and, unless a Watcher is persistent, remove it and the path from watchTable and watch2Paths (so it is triggered only once).
  • Call each Watcher's process method, which for a ServerCnxn sends the notification to the client.

Client callback Watcher

Let's first look at the EventThread class.

It extends Thread and uses a LinkedBlockingQueue<Object> named waitingEvents to hold the events waiting to be processed; its run method keeps taking events from that queue and processing them.

We already know that on the client a notification is handled in SendThread#readResponse (this code also appeared above, in the client registration section):

case NOTIFICATION_XID:
                LOG.debug("Got notification session id: 0x{}",
                    Long.toHexString(sessionId));
                WatcherEvent event = new WatcherEvent();
                event.deserialize(bbia, "response");

                // convert from a server path to a client path
                if (chrootPath != null) {
                    String serverPath = event.getPath();
                    if (serverPath.compareTo(chrootPath) == 0) {
                        event.setPath("/");
                    } else if (serverPath.length() > chrootPath.length()) {
                        event.setPath(serverPath.substring(chrootPath.length()));
                     } else {
                         LOG.warn("Got server path {} which is too short for chroot path {}.",
                             event.getPath(), chrootPath);
                     }
                }

                WatchedEvent we = new WatchedEvent(event);
                LOG.debug("Got {} for session id 0x{}", we, Long.toHexString(sessionId));
                // Added to the event queue and processed by EventThread
                eventThread.queueEvent(we);
                return;
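
The chroot handling in this snippet is easier to follow when isolated. The helper below is hypothetical: toClientPath mirrors the branch shown above, while toServerPath is my assumption of what the prependChroot call in getData does, since its body is not shown in this article.

```java
// Hypothetical helper; ChrootPaths is not a ZooKeeper class.
final class ChrootPaths {
    private ChrootPaths() {
    }

    // Client path -> server path: put the chroot in front.
    static String toServerPath(String chroot, String clientPath) {
        if (chroot == null) {
            return clientPath;
        }
        // The client addresses the chroot node itself as "/".
        return clientPath.length() == 1 ? chroot : chroot + clientPath;
    }

    // Server path -> client path: strip the chroot again.
    static String toClientPath(String chroot, String serverPath) {
        if (chroot == null) {
            return serverPath;
        }
        if (serverPath.equals(chroot)) {
            return "/";
        }
        if (serverPath.length() > chroot.length()) {
            return serverPath.substring(chroot.length());
        }
        // Shorter than the chroot: left unchanged here; the snippet above logs a warning.
        return serverPath;
    }
}

class ChrootPathsDemo {
    public static void main(String[] args) {
        String chroot = "/app";
        String serverPath = ChrootPaths.toServerPath(chroot, "/config");
        System.out.println(serverPath + " <-> " + ChrootPaths.toClientPath(chroot, serverPath));
    }
}
```

The conversion has to happen in both directions because Watchers are stored under the client path, while the server only ever sees and reports the full path.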

The event is then added to the waitingEvents queue:

public void queueEvent(WatchedEvent event) {
            queueEvent(event, null);
        }

        private void queueEvent(WatchedEvent event, Set<Watcher> materializedWatchers) {
            if (event.getType() == EventType.None && sessionState == event.getState()) {
                return;
            }
            sessionState = event.getState();
            final Set<Watcher> watchers;
            if (materializedWatchers == null) {
                // Look up the matching Watchers in the ClientWatchManager; this also removes
                // them from the corresponding Map, because a watch is one-time
                watchers = watcher.materialize(event.getState(), event.getType(), event.getPath());
            } else {
                watchers = new HashSet<Watcher>();
                watchers.addAll(materializedWatchers);
            }
            WatcherSetEventPair pair = new WatcherSetEventPair(watchers, event);
            // Add to waitingEvents and wait for the run method to be taken out for processing
            waitingEvents.add(pair);
        }

The run method takes events off the queue and hands them to processEvent:

 public void run() {
            try {
                isRunning = true;
                while (true) {
                    Object event = waitingEvents.take();
                    if (event == eventOfDeath) {
                        wasKilled = true;
                    } else {
                        processEvent(event);
                    }
                  ......
                  ......
                }}
        }

        private void processEvent(Object event) {
            try {
                if (event instanceof WatcherSetEventPair) {
                    // each watcher will process the event
                    WatcherSetEventPair pair = (WatcherSetEventPair) event;
                    for (Watcher watcher : pair.watchers) {
                        try {
                          // Calling the process method, serial synchronous processing
                            watcher.process(pair.event);
                        } catch (Throwable t) {
                            LOG.error("Error while calling watcher.", t);
                        }
                    }
                } }
          .......
          .......

    }
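
The pattern used by EventThread, one consumer thread draining a blocking queue and isolating each callback's failure, can be sketched on its own. SerialDispatcher is a hypothetical class: it queues plain Runnables instead of WatcherSetEventPair objects, and it stops as soon as it sees the shutdown marker, which is simpler than the real run method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical single-consumer dispatcher in the style of EventThread.
class SerialDispatcher extends Thread {
    private static final Object SHUTDOWN = new Object(); // plays the role of eventOfDeath
    private final BlockingQueue<Object> waiting = new LinkedBlockingQueue<>();
    private final List<Throwable> failures = new ArrayList<>();

    void submit(Runnable callback) {
        waiting.add(callback);
    }

    void shutdown() {
        waiting.add(SHUTDOWN);
    }

    @Override
    public void run() {
        try {
            while (true) {
                Object next = waiting.take(); // blocks until something is queued
                if (next == SHUTDOWN) {
                    return; // FIFO order: everything queued earlier has already run
                }
                try {
                    ((Runnable) next).run(); // callbacks run one at a time, in queue order
                } catch (Throwable t) {
                    // One failing callback must not stop the ones queued behind it.
                    synchronized (failures) {
                        failures.add(t);
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    int failureCount() {
        synchronized (failures) {
            return failures.size();
        }
    }
}

class SerialDispatcherDemo {
    public static void main(String[] args) throws InterruptedException {
        SerialDispatcher dispatcher = new SerialDispatcher();
        dispatcher.start();
        dispatcher.submit(() -> System.out.println("first callback"));
        dispatcher.submit(() -> { throw new IllegalStateException("broken callback"); });
        dispatcher.submit(() -> System.out.println("third callback still runs"));
        dispatcher.shutdown();
        dispatcher.join();
        System.out.println("failures recorded: " + dispatcher.failureCount());
    }
}
```

A single thread gives callbacks a well-defined order for free. The price is that a callback that blocks holds up every event queued behind it.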

Summary

Features of Watcher

  • One-time: on both the client and the server, a Watcher is removed from storage once it has been triggered (the server code above skips the removal only for watches in a persistent mode), so it must be registered again to receive further notifications.
  • Serial execution on the client: callbacks run one after another on the EventThread. An exception thrown by one Watcher is caught and logged, so it does not stop the remaining callbacks, but a slow callback delays every event queued behind it.
  • Lightweight: WatchedEvent is the smallest notification unit in the mechanism and contains only three parts: notification state, event type and node path. It does not carry the node's content, so the client has to fetch the data from the server itself after receiving the notification.
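
Taken together, the one-time and lightweight properties lead to a common usage pattern: on every notification the client reads the data again and sets a new watch in the same call. The sketch below models that with a hypothetical in-memory OneShotStore instead of a real server; all class names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Stand-in for Watcher: the notification carries only the path, never the data.
interface NodeWatcher {
    void process(String path);
}

// Hypothetical in-memory store with one-time data watches.
class OneShotStore {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, Set<NodeWatcher>> watches = new HashMap<>();

    // Read the value and optionally leave a watch, like getData(path, watcher, stat).
    synchronized String getData(String path, NodeWatcher watcher) {
        if (watcher != null) {
            watches.computeIfAbsent(path, p -> new HashSet<>()).add(watcher);
        }
        return data.get(path);
    }

    void setData(String path, String value) {
        Set<NodeWatcher> fired;
        synchronized (this) {
            data.put(path, value);
            fired = watches.remove(path); // one-time: watches are dropped before they fire
        }
        if (fired != null) {
            for (NodeWatcher w : fired) {
                w.process(path);
            }
        }
    }
}

// A client that re-reads and re-registers each time it is notified.
class RefreshingReader implements NodeWatcher {
    private final OneShotStore store;
    final List<String> observed = new ArrayList<>();

    RefreshingReader(OneShotStore store) {
        this.store = store;
    }

    // Initial read plus the first watch.
    void watch(String path) {
        observed.add(store.getData(path, this));
    }

    @Override
    public void process(String path) {
        // The event has no payload: fetch the new value and set the watch again.
        observed.add(store.getData(path, this));
    }
}

class RefreshingReaderDemo {
    public static void main(String[] args) {
        OneShotStore store = new OneShotStore();
        store.setData("/cfg", "v1");
        RefreshingReader reader = new RefreshingReader(store);
        reader.watch("/cfg");
        store.setData("/cfg", "v2");
        store.setData("/cfg", "v3");
        System.out.println(reader.observed);
    }
}
```

In this single-threaded model no update can slip in between the notification and the re-registration. Against a real server several updates can happen in that window, so the client may receive fewer notifications than there were changes, but the read that re-registers the watch still returns the latest value.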


Posted on Sat, 06 Jun 2020 01:32:13 -0400 by kmarsh