Monthly Archives: April 2012

Active/Active NFSv4 on Clustered Filesystems: avoiding the VFS layer

So far, most of my posts about this proposed design have been about teaching the VFS and the underlying filesystems how to handle the state properly for NFSv4 semantics. We could, however, consider another design: a clustered daemon that runs and tracks the state independently of the VFS. In other words, we could go with a more Samba/CTDB-style design here.

knfsd would upcall to a daemon, and that daemon would be the final arbiter of the state. Such a daemon could be integrated with Samba/CTDB so that knfsd and Samba are aware of each other’s state. In principle, that would allow you to export the same filesystem via both knfsd and Samba/CTDB and get reasonable semantics for both. Oplocks and delegations would get broken appropriately, share mode reservations should work, and (possibly) we could make locking work in a more consistent fashion than we do today.
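To make the "final arbiter" idea a bit more concrete, here is a minimal sketch of the kind of share reservation check such a daemon might perform. Everything in it is an assumption made for illustration: the names (struct arbiter, arbiter_open and so on) are hypothetical, and a fixed in-memory array stands in for the clustered database. Only the conflict rule itself comes from NFSv4: an open conflicts if it requests access that an existing open denies, or denies access that an existing open holds.

```c
#include <stdbool.h>
#include <stddef.h>

/* Bit values mirror the NFSv4 OPEN share_access/share_deny bits (READ = 1, WRITE = 2). */
#define SHARE_READ  0x1u
#define SHARE_WRITE 0x2u

#define MAX_OPENS 16

/* Hypothetical record of one open, whether it arrived via knfsd or Samba. */
struct open_state {
    unsigned long file_id;  /* stand-in for a cluster-wide file identifier */
    unsigned int access;    /* access bits this open holds */
    unsigned int deny;      /* access bits this open denies to others */
};

/* Hypothetical in-memory table standing in for the clustered database. */
struct arbiter {
    struct open_state opens[MAX_OPENS];
    size_t count;
};

/* An existing open conflicts if it denies what we want, or we deny what it holds. */
static bool share_conflict(const struct open_state *held,
                           unsigned int access, unsigned int deny)
{
    return (held->deny & access) != 0 || (held->access & deny) != 0;
}

/*
 * Returns true and records the open if it is granted. Returns false if it
 * conflicts with an existing open on the same file, or if the table is full.
 */
static bool arbiter_open(struct arbiter *arb, unsigned long file_id,
                         unsigned int access, unsigned int deny)
{
    for (size_t i = 0; i < arb->count; i++) {
        const struct open_state *held = &arb->opens[i];

        if (held->file_id == file_id && share_conflict(held, access, deny))
            return false;
    }
    if (arb->count == MAX_OPENS)
        return false;

    arb->opens[arb->count++] = (struct open_state){ file_id, access, deny };
    return true;
}
```

With this in place, an open for read that denies write would cause a later open for write on the same file to be refused, no matter which of the two servers it came in through. A real daemon would also have to deal with close, client expiry and node failure, none of which is shown here.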

Of course, there’s a catch: nothing else would be aware of that state. It would all be tied up inside the CTDB database (or whatever clustered DB infrastructure we’d end up using). The result would be an NFS+SMB “appliance”. I think that’s a less desirable design than a more general-purpose one, but it might be easier to achieve, and we might be able to hammer it out more quickly since we’d avoid a lot of the VFS-layer work.

In the near term, we don’t really need to make this decision. Either way, we’ll still need to be able to swap in the correct set of operations to handle it, so the focus for now can be on simply abstracting out the parts of the server code that we’ll need to swap out to handle this later. It should even be possible to do this sort of design as an interim step, and then add what the VFS would need for a more general solution later.
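As a rough illustration of what "swapping in the correct set of operations" could look like, here is a small sketch of an operations table with two interchangeable backends. This is not existing knfsd code: struct state_ops, the two backends and set_state_ops are all hypothetical names, and both backends are stubs that simply grant every request. The point is only that the server code calls through the table and never needs to know which design sits behind it.

```c
#include <stddef.h>

/* Hypothetical table of the state decisions the server would delegate. */
struct state_ops {
    const char *name;
    /* Return 0 to grant the open, nonzero to refuse it. */
    int (*check_open)(unsigned long file_id, unsigned int access,
                      unsigned int deny);
};

/* Counts calls into the daemon backend so the dispatch is observable. */
static int daemon_upcalls;

/* Stub for the VFS-based design: would ask the filesystem; grants everything here. */
static int vfs_check_open(unsigned long file_id, unsigned int access,
                          unsigned int deny)
{
    (void)file_id; (void)access; (void)deny;
    return 0;
}

/* Stub for the daemon-based design: would upcall to userspace; only counts here. */
static int daemon_check_open(unsigned long file_id, unsigned int access,
                             unsigned int deny)
{
    (void)file_id; (void)access; (void)deny;
    daemon_upcalls++;
    return 0;
}

static const struct state_ops vfs_state_ops = {
    .name = "vfs",
    .check_open = vfs_check_open,
};

static const struct state_ops daemon_state_ops = {
    .name = "daemon",
    .check_open = daemon_check_open,
};

/* The backend currently in use; the server only ever calls through this pointer. */
static const struct state_ops *active_ops = &vfs_state_ops;

static void set_state_ops(const struct state_ops *ops)
{
    active_ops = ops;
}

/* What the server's open path would call, regardless of backend. */
static int server_handle_open(unsigned long file_id, unsigned int access,
                              unsigned int deny)
{
    return active_ops->check_open(file_id, access, deny);
}
```

Calling set_state_ops(&daemon_state_ops) reroutes every later server_handle_open() call to the daemon backend without touching the caller, which is what would let the daemon-based design serve as an interim step before a VFS-based one.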