On Fri, 26 Jul 2013 10:19:16 -0400 "J.Bruce Fields" <bfields@xxxxxxxxxxxxxx> wrote:

> On Fri, Jul 26, 2013 at 06:33:03AM +1000, NeilBrown wrote:
> > On Thu, 25 Jul 2013 16:18:05 -0400 "J.Bruce Fields" <bfields@xxxxxxxxxxxxxx>
> > wrote:
> >
> > > On Thu, Jul 25, 2013 at 11:30:23AM +1000, NeilBrown wrote:
> > > >
> > > > Since we enabled auto-tuning for sunrpc TCP connections we do not
> > > > guarantee that there is enough write-space on each connection to
> > > > queue a reply.
> ...
> > > This is great, thanks!
> > >
> > > Inclined to queue it up for 3.11 and stable....
> >
> > I'd agree for 3.11.
> > It feels a bit border-line for stable.  "dead-lock" and "has been seen in the
> > wild" are technically enough justification...
> > I'd probably mark it as "please don't apply to -stable until 3.11 is released"
> > or something like that, just for a bit of breathing space.
> > Your call though.
>
>
> So my takeaway from http://lwn.net/Articles/559113/ was that Linus and
> Greg were requesting that:
>
>     - criteria for -stable and late -rc's should really be about the
>       same, and
>     - people should follow Documentation/stable-kernel-rules.txt.
>
> So as an exercise to remind me what those rules are:
>
> Easy questions:
>
>     - "no bigger than 100 lines, with context."  Check.
>     - "It must fix only one thing."  Check.
>     - "real bug that bothers people".  Check.
>     - "tested": yep.  It doesn't actually say "tested on stable
>       trees", and I recall this did land you with a tricky bug one
>       time when a prerequisite was omitted from the backport.
>
> Judgement calls:
>
>     - "obviously correct": it's short, but admittedly subtle, and
>       performance regressions can take a while to get sorted out.
>     - "It must fix a problem that causes a build error (but not for
>       things marked CONFIG_BROKEN), an oops, a hang, data
>       corruption, a real security issue, or some "oh, that's not
>       good" issue.  In short, something critical."  We could argue
>       that "server stops responding" is critical, though not to the
>       same degree as a panic.
>     - OR: alternatively: "Serious issues as reported by a user of a
>       distribution kernel may also be considered if they fix a
>       notable performance or interactivity issue."  The only bz I've
>       personally seen was the result of artificial testing of some
>       kind, and it sounds like your case involved a disk failure?
>
> --b.

Looks like good analysis ... except that it doesn't seem conclusive.  Being
conclusive would make it really good. :-)

The case that brought it to my attention doesn't require the fix.
A file system was mis-behaving (blocking when it should return EJUKEBOX) and
this resulted in nfsd behaviour different than my expectation.
I expected nfsd to keep accepting requests until all threads were blocked.
However only 4 requests were accepted (which is actually better behaviour, but
not what I expected).
So I looked into it and thought that what I found wasn't really right.  Which
turned out to be the case, but not the way I thought...

So my direct experience doesn't argue for the patch going to -stable at all.
If the only other reports are from artificial testing then I'd leave it out
of -stable.  I don't feel -rc4 (that's next I think) is too late for it though.

NeilBrown