On Tue, Oct 16, 2018 at 9:05 AM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
> On Mon, Oct 15, 2018 at 6:58 PM Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:
> >
> > In CephFS testing, we've observed transient failures caused by what
> > appears to be messages being dropped [1,2]. These appear to have been
> > caused by the recent refactor PR [3,4] but I have no evidence other
> > than the problems appearing during testing with [4] after [4] was
> > merged.
> >
> > I'm running tests [5] to see if I can get more debugging (debug ms =
> > 20) but I wanted to canvass for ideas/advice before I get much deeper.
> > Has anyone else seen transient failures with messages getting dropped?
>
> I will note that these tickets are both from after patch 1 but before
> patch 2.

No, the tickets were both from testing with the second patch ([4] in my
OP). I'll report back if I can reproduce this with higher debugging.

--
Patrick Donnelly