Hi Bruce,

On Fri, Aug 30, 2019 at 10:08 PM J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
>
> On Thu, Aug 29, 2019 at 09:12:49PM +0300, Alex Lyakas wrote:
> > We evaluated the network namespaces approach. But, unfortunately, it
> > doesn't fit easily into how our system is currently structured. We
> > would have to create and configure interfaces for every namespace, and
> > have a separate IP address (presumably) for every namespace.
>
> Yes.
>
> > All this
> > seems a bit of an overkill, to just have several local filesystems
> > exported to the same client (which is when we hit the issue). I would
> > assume that some other users would argue as well that creating a
> > separate network namespace for every local filesystem is not the way
> > to go from the administration point of view.
>
> OK, makes sense.
>
> And I take it you don't want to go around to each client and shut things
> down cleanly. And you're fine with the client applications erroring out
> when you yank their filesystem out from underneath them.

It's not that we don't want to unmount at each client. The problem is
that we don't control the client machines, as they are owned by
customers. We definitely recommend that customers unmount before
un-exporting. But in some cases it still doesn't happen, most notably
in automated environments. Since the un-export operation is initiated
by the customer, I presume she understands that the NFS client might
error out if it was not unmounted properly beforehand.

>
> (I wonder what happens these days when that happens on a linux client
> when there are dirty pages. I think you may just end up with a useless
> mount that you can't get rid of till you power down the client.)

Right, in some cases, the IO process gets stuck forever.

>
> > Regarding the failure injection code, we did not actually enable and
> > use it. We instead wrote some custom code that is highly modeled after
> > the failure injection code.
>
> Sounds interesting.... I'll try to understand it and give some comments
> later.
> ...
> > Currently this code is invoked from a custom procfs entry, by
> > user-space application, before unmounting the local file system.
> >
> > Would moving this code into the "unlock_filesystem" infrastructure be
> > acceptable?
>
> Yes. I'd be interested in patches.
>
> > Since the "share_id" approach is very custom for our
> > usage, what criteria would you suggest for selecting the openowners to
> > be "forgotten"?
>
> The share_id shouldn't be necessary. I'll think about it.
>
> --b.