> In the recent past, have seen multiple instances where we needed to
> send some data along with the fop on wire. And we have been using the
> xdata for the same. Eg:
>
> 1. Add lease-id, transaction id to every fop.

The original purpose of xdata was to convey extra information without having to bump a protocol version - e.g. hints that could safely be ignored, or new fields that are subject to experimentation during the prototype phase of a new translator. For something that applies to every fop, and particularly that we expect to be a permanent part of how our protocol operates, I believe we should bump the protocol version.

> 2. Send the xattrs to invalidate in xdata.
> 3. Send the share flags during open.
>
> There were concerns raised around this for the below reasons:
> 1. word-size and endianness

This should only be an issue if people are using dict_set_bin. In most cases I've seen, xdata values are ints that are represented internally as strings, which should avoid any endian-ness issues.

> 2. security issues

I don't see how this is an issue for xdata any more than it is for separate fields. All of our security is either at the connection level (TLS) or the request level (uid/gid). Whatever's in xdata now would be exactly as in/secure if it were in a new field associated with a new protocol version.

> 3. Backward compatibility issues i.e. old/new client and server
> combination.

This is the biggest issue, though I'll add one more: performance/complexity. Creating dicts and adding keys to them involves more memory allocations than we really want in our main I/O paths. Because allocation errors have to be checked, this also makes code more complex and cumbersome than it would be with normal fields.

> Initiating this mail to arrive at a conclusion, whether we can use
> xdata or we need to find a different solution, if so what is the
> solution. Your thoughts comments are appreciated.
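
To make the word-size/endianness point above a bit more concrete, here is a minimal sketch in plain C. It is not Gluster code: the helper names (encode_int_as_text, decode_int_from_text, raw_bytes_seen_by_opposite_endian) are hypothetical and only illustrate why an int carried as decimal text survives a trip between machines of different byte order, while raw bytes copied from memory do not.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: render an integer as decimal text, roughly how a
 * string-backed value would travel. Returns 0 on success, -1 if the
 * buffer is too small. */
static int
encode_int_as_text(int64_t value, char *buf, size_t len)
{
    assert(buf != NULL);
    int n = snprintf(buf, len, "%" PRId64, value);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: parse the text back. The result does not depend
 * on the byte order or word size of either peer. Returns 0 on success,
 * -1 on malformed or out-of-range input. */
static int
decode_int_from_text(const char *buf, int64_t *out)
{
    char *end = NULL;
    errno = 0;
    long long v = strtoll(buf, &end, 10);
    if (end == buf || *end != '\0' || errno == ERANGE)
        return -1;
    *out = (int64_t)v;
    return 0;
}

/* Simulates the failure mode of shipping raw memory: copy the in-memory
 * bytes of a 32-bit value and reassemble them in reversed order, which
 * is what a peer of the opposite byte order would effectively read. */
static uint32_t
raw_bytes_seen_by_opposite_endian(uint32_t value)
{
    unsigned char in[4], rev[4];
    uint32_t out;
    memcpy(in, &value, sizeof in);
    for (size_t i = 0; i < 4; i++)
        rev[i] = in[3 - i];
    memcpy(&out, rev, sizeof out);
    return out;
}
```

With these helpers, 42 goes out as the text "42" and comes back as 42 on any peer, whereas the raw 32-bit value 1 is read as 0x01000000 by a peer of the opposite byte order. The text form does cost an allocation and a parse, which feeds into the performance concern raised above.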
>
> Solution 1:
> To get rid of sending xdata on wire, one of the solution could be to
> have protocol versioning in Gluster. With this we can modify xdr
> structures for each release and get rid of xdata. But this will be
> huge work.

We should have protocol versioning anyway, but that's no reason to get rid of xdata. It's still a useful feature, even if we change things that are currently using it to use separate fields instead. We can consider these on a case-by-case basis.

> Solution 2:
> - Change dict, to not be an opaque structure , but an array of data
> elements which is a union of (int, string etc.).

I don't see how this would be an improvement. Each key in xdata is already such a union, just represented a bit inefficiently, and addressing them by name instead of by index is more portable/extensible. We'd gain some in efficiency, but at the cost of having to coordinate use of indices across all translators (including third-party). That would also increase the risk of misinterpreting a field set for one purpose as meaning something totally different, which could be disastrous. With named fields the risk of such collisions is negligible.

> - Backward compatibility issues is when the newer server/client adds
> data to dict but the old client/server fails to read the dict. This is
> the responsibility of the programmer to make sure, that this case
> doesn't fail silently, op version can be used if it is done as a part
> of adding new feature/volume set. Another approach would be, if client
> has a list of capabilities(features) the server supports it can
> accordingly tune itself to access the xdata.

I'm a big fan of negotiating specific capabilities during the initial connection handshake, instead of tying everything to a protocol version. Believe it or not, several of the DECnet protocols had a nice way of doing this using extensible bitfields.
I first encountered this in '89; for me it remains the gold standard against which other protocol-versioning or capability-negotiation strategies tend to fall short. We could easily do such negotiation for "global" protocol features (e.g. lease/transaction ID) as part of our connection handshake, and individual translators could do something similar via GF_FOP_IPC.
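
For illustration only, here is a rough sketch in plain C of what handshake-time capability negotiation with a variable-length bitfield could look like. This is neither the DECnet encoding nor existing Gluster code: the capability numbers, struct cap_set, and the cap_* functions are all hypothetical names I made up for the example. The idea is that each peer advertises a bitmap of whatever length it knows about, bits beyond the shorter bitmap are treated as unsupported, and both sides use the intersection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capability numbers; real assignments would need to be
 * coordinated once and never reused. */
enum {
    CAP_LEASE_ID    = 0,
    CAP_TXN_ID      = 1,
    CAP_SHARE_FLAGS = 9   /* lives in the second byte on purpose */
};

#define CAP_MAX_BYTES 8

/* A bitmap plus the number of bytes the sender actually knows about.
 * Newer peers simply send more bytes; older peers send fewer. */
struct cap_set {
    uint8_t bits[CAP_MAX_BYTES];
    size_t  len;
};

/* Mark a capability as supported. Returns 0, or -1 if the bit does not
 * fit in this sketch's fixed-size buffer. */
static int
cap_add(struct cap_set *s, unsigned bit)
{
    assert(s != NULL);
    size_t byte = bit / 8;
    if (byte >= CAP_MAX_BYTES)
        return -1;
    s->bits[byte] |= (uint8_t)(1u << (bit % 8));
    if (s->len < byte + 1)
        s->len = byte + 1;
    return 0;
}

/* A bit beyond the advertised length counts as "not supported". */
static int
cap_has(const struct cap_set *s, unsigned bit)
{
    size_t byte = bit / 8;
    if (byte >= s->len)
        return 0;
    return (s->bits[byte] >> (bit % 8)) & 1u;
}

/* The agreed set is the intersection, truncated to the shorter bitmap,
 * so an old peer never has to understand bits it has not heard of. */
static struct cap_set
cap_negotiate(const struct cap_set *a, const struct cap_set *b)
{
    struct cap_set out;
    memset(&out, 0, sizeof out);
    out.len = a->len < b->len ? a->len : b->len;
    for (size_t i = 0; i < out.len; i++)
        out.bits[i] = a->bits[i] & b->bits[i];
    return out;
}
```

If a client advertises CAP_LEASE_ID and CAP_SHARE_FLAGS while an older server advertises only CAP_LEASE_ID and CAP_TXN_ID in a single byte, the negotiated set contains just CAP_LEASE_ID, and the client knows not to send share flags to that server. Per-translator negotiation over GF_FOP_IPC could follow the same pattern with its own bit assignments.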