Jonathan Cameron wrote:
> On Thu, 17 Oct 2024 16:39:57 -0500
> Ira Weiny <ira.weiny@xxxxxxxxx> wrote:
>
> > Jonathan Cameron wrote:
> > > On Mon, 07 Oct 2024 18:16:27 -0500
> > > ira.weiny@xxxxxxxxx wrote:
> > >
> > > [snip]
> > > > Simplify extent tracking with the following restrictions.
> > > >
> > > > 1) Flag for removal any extent which overlaps a requested
> > > >    release range.
> > > > 2) Refuse the offer of extents which overlap already accepted
> > > >    memory ranges.
> > > > 3) Accept again a range which has already been accepted by the
> > > >    host.  Eating duplicates serves three purposes.  First, this
> > > >    simplifies the code if the device should get out of sync with
> > > >    the host.
> > >
> > > Maybe scream about this a little.  AFAIK that happening is a device
> > > bug.
> >
> > Agreed but because of the 2nd purpose this is difficult to scream about because
> > this situation can come up in normal operation.  Here is the scenario:
> >
> > 1) Device has 2 DCD partitions active, A and B
> > 2) Host crashes
> > 3) Region X is created on A
> > 4) Region Y is created on B
> > 5) Region Y scans for extents
> > 6) Region X surfaces a new extent while Y is scanning
> > 7) Gen number changes due to new extent in X
> > 8) Region Y rescans for existing extents and sees duplicates.
> >
> > These duplicates need to be ignored without signaling an error.
> Hmm. If we can know that path is the trigger (should be able to
> as it's a scan after a gen number change), can we just muffle the
> screams on that path?  (Halloween is close, the analogies will get
> ever worse :)

OK, yeah, since this would be a device error we should do something here.  But
the code is going to be somewhat convoluted if it has to print an error
whenever this happens.

What if we make this a warning and change the rescan debug message to a warning
as well?  That would leave enough bread crumbs to determine if a device is
failing, without a lot of extra code to alter print messages on the fly.

Ira
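
For illustration only, here is a minimal userspace sketch of restrictions 2 and 3 with the duplicate case reported as a warning rather than an error.  This is not the driver code: `struct extent`, `check_offer()` and the `OFFER_*` values are hypothetical names, `fprintf(stderr, ...)` stands in for whatever kernel print helper would be used, and treating a "duplicate" as an exact start/length match is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified extent: the half-open range [start, start + len). */
struct extent {
	uint64_t start;
	uint64_t len;
};

enum offer_result {
	OFFER_ACCEPT,		/* no overlap with already accepted memory */
	OFFER_DUPLICATE,	/* already accepted: eat it, but leave a warning */
	OFFER_REFUSE,		/* overlaps accepted memory without matching it */
};

/* Two half-open ranges overlap when each one starts before the other ends. */
static bool extents_overlap(const struct extent *a, const struct extent *b)
{
	return a->start < b->start + b->len && b->start < a->start + a->len;
}

/*
 * Classify an offered extent against the accepted set.  The accepted
 * extents are assumed not to overlap each other (restriction 2), so an
 * exact match cannot also overlap some other accepted extent.
 */
static enum offer_result check_offer(const struct extent *accepted, size_t n,
				     const struct extent *offer)
{
	for (size_t i = 0; i < n; i++) {
		if (accepted[i].start == offer->start &&
		    accepted[i].len == offer->len) {
			/* Warning, not error: a rescan can legitimately hit this. */
			fprintf(stderr,
				"warning: duplicate extent [0x%llx, len 0x%llx]\n",
				(unsigned long long)offer->start,
				(unsigned long long)offer->len);
			return OFFER_DUPLICATE;
		}
		if (extents_overlap(&accepted[i], offer))
			return OFFER_REFUSE;
	}
	return OFFER_ACCEPT;
}
```

With a warning on the duplicate path and a matching warning where the rescan is triggered, the log would show whether duplicates line up with a generation number change or point at a misbehaving device.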