In an earlier post, I outlined a proposed overhaul of the scheme that knfsd uses to track client names on stable storage. There’s another aspect we need to consider as well. First, some background…
There are actually two kinds of client “IDs” that we need to consider:
- The long, client-generated string that (supposedly) uniquely identifies the client. This object is referred to in RFC 3530 as an nfs_client_id4. This is the thing that the client-tracking daemon will be tracking; henceforth, I’ll call it the “client name”. This value is generally used only to acquire an ID of the other type.
- The shorthand (64-bit) value that the server hands out to the client. This is referred to as a clientid4 in the RFC, and it’s the value that’s generally used for “day to day” stateful operations. We’ll call this the “client id”.
Currently, the server generates client ids by combining a 32-bit “boot time” value with a 32-bit counter. This is generally fine in a single-node configuration, since it’s highly unlikely that you’d ever get a client id collision there. Avoiding collisions matters because when a call comes in with a particular client id, we need to be able to use it to (quickly) find the actual client-tracking object on the server. A rough sketch of the current scheme is below.
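Here’s a minimal sketch of that scheme in C, loosely modeled on the kernel’s fs/nfsd/nfs4state.c (the names and layout are illustrative, and exact details vary by kernel version):

```c
#include <stdint.h>

/*
 * Illustrative sketch of the current clientid4 scheme: the 64-bit
 * clientid4 is just the server's boot time in one half and a
 * per-boot counter in the other.
 */
struct clientid {
	uint32_t cl_boot;	/* seconds since epoch when nfsd started */
	uint32_t cl_id;		/* per-boot counter, bumped per client */
};

static uint32_t boot_time;		/* recorded once at startup */
static uint32_t current_clientid;	/* next counter value to hand out */

static void gen_clid(struct clientid *clid)
{
	clid->cl_boot = boot_time;
	clid->cl_id = current_clientid++;
}
```

Note that nothing in gen_clid() distinguishes one server from another: two servers that record the same boot_time will hand out identical values.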
In a cluster, however, this scheme is probably not going to be sufficient. If two cluster hosts boot at exactly the same second, then they can hand out the same client id, which is a big problem. Consider the following example:
Hosts foo and bar are clustered NFS servers. They boot at the same time and hand out the same client id to two different clients. Now an address floats from foo to bar, and the client that was originally talking to foo does an operation against bar that requires the client id. bar sees that client id, confuses it with the one that it handed out itself, and ends up operating on the wrong client’s state.
There are a few different ways to avoid this:
- We can try to ensure that client id collisions like this never happen. That sounds simple, but it’s difficult to do in practice.
- Instead of identifying clients by client id alone, we can use a client id + server address tuple (see the sketch after this list).
- We can containerize nfsd and ensure that each address is started within its own container. Presumably a containerized nfsd would have its own list of clients, so a collision between containers would be no big deal.
- We can consider abandoning the floating-address model altogether and rely on clients using round-robin DNS and fs_locations info to find the servers. If the client knows that it must reclaim state before it ever talks to the failover host, then collisions aren’t a problem.
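To make the second option more concrete, here’s a hedged sketch of what tuple-based lookup might look like. The structures and helpers below are hypothetical illustrations, not actual nfsd code; the point is simply that the server address the client targeted becomes part of the lookup key, so two clients that happen to hold the same 64-bit client id no longer collide as long as they arrived via different addresses:

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical client-tracking entry, keyed on (client id, server address). */
struct nfs4_client {
	uint64_t		cl_clientid;	/* the clientid4 we handed out */
	struct sockaddr_storage	cl_srvaddr;	/* local address the client used */
	struct nfs4_client	*cl_next;	/* hash-chain link */
};

#define CLIENT_HASH_SIZE 256			/* must be a power of two */
static struct nfs4_client *client_hash[CLIENT_HASH_SIZE];

static unsigned int clientid_hashval(uint64_t clientid)
{
	return (unsigned int)(clientid & (CLIENT_HASH_SIZE - 1));
}

/*
 * Look up a client by the (clientid4, server address) tuple. A bare
 * clientid match is no longer enough: the entry must also have been
 * created via the same server address that this request came in on.
 * A real implementation would compare the relevant sockaddr fields
 * (family, address, port) rather than memcmp'ing the whole struct.
 */
static struct nfs4_client *
find_client(uint64_t clientid, const struct sockaddr_storage *srvaddr)
{
	struct nfs4_client *clp;

	for (clp = client_hash[clientid_hashval(clientid)];
	     clp != NULL; clp = clp->cl_next) {
		if (clp->cl_clientid == clientid &&
		    memcmp(&clp->cl_srvaddr, srvaddr, sizeof(*srvaddr)) == 0)
			return clp;
	}
	return NULL;	/* treat as a stale/unknown clientid */
}
```

The cost of this approach is that every stateful operation needs the request’s destination address plumbed through to the lookup.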
Containerizing sounds like the best approach, but perhaps abandoning floating addresses for this would be easier?