> > 23.09.2014 14:55, Chris.Pringle@xxxxxxxxxxxxxxx wrote:
> >
> > > I've attached the patch for reference.
> >
> > ported you fixes to 3.14.19 + patch-3.14.12-rt9.patch.gz
> > Intel Celeron G1620, while working :)
> >
> > http://i67.fastpic.ru/big/2014/0923/39/df38663bd2414afe064c1978ad7b3d39.jpg
> >
> Thanks for trying. Unfortunately, I don't think this is a workable fix as
> I'm pretty convinced that changing the spin lock type is not the right
> thing to do for RT.. I have backported all fixes in aio.c and workqueue.c
> from v3.14 (rt9) to v3.12 and have seen some improvement in one of my test
> cases (although I have suspicions it's just masking the problem), however
> the other is still failing as previously posted. In v3.0 of the kernel,
> the aio stuff never used the new percpu ref counting so I think this is a
> new problem introduced/revealed by that. I guess the question is, do we
> expect it to be safe to call schedule_work() from atomic context (as Ben
> LaHaise suggested)? If the answer is yes, then there is a bug in the
> workqueues, if the answer is no, then the bug is in fs/aio.c in the way it
> cleans up reqs via the percpu refcounting (which eventually calls
> schedule_work()).

Okay, just for completeness in case anyone else runs into this issue or
wants to pick it up...

The original patch I sent out does not fix the problem, as there is
another lock (local_lock_irqsave etc.) in there which would also need
changing; changing this to a raw spinlock causes other areas of the
kernel to complain about scheduling while atomic. I didn't think this
was the right way to fix it anyway.

I believe the problem is with the way aio uses the percpu ref counting.
It's fine without RT, but with RT the aio cleanup path eventually calls
schedule_work() (which is not currently safe to call from atomic
context) from within percpu_ref_kill_rcu (atomic), and the kernel then
spits out a BUG.
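To make the failure mode concrete, here is a rough userspace model of it.
This is not kernel code and none of the names below are kernel APIs; they
are made up purely for illustration. A flag stands in for "we are in
atomic context" and a counter stands in for the "scheduling while atomic"
report. The first function runs a possibly-sleeping callback directly
inside the atomic section; the second only records the callback there and
runs it once the atomic section has ended.

```c
/* Userspace model only: every name here is hypothetical, not a kernel API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 8

typedef void (*callback_fn)(void);

static bool in_atomic_section;   /* models "atomic context" */
static int sleep_violations;     /* models "scheduling while atomic" reports */
static callback_fn deferred[MAX_DEFERRED];
static size_t deferred_count;

/* Models an operation that may sleep, e.g. taking a sleeping lock on RT. */
static void may_sleep_operation(void)
{
    if (in_atomic_section)
        sleep_violations++;
}

/* Models the release path that ends up queueing work. */
static void release_callback(void)
{
    may_sleep_operation();
}

/* Direct variant: the callback runs inside the atomic section. */
int run_release_direct(void)
{
    sleep_violations = 0;

    in_atomic_section = true;
    release_callback();
    in_atomic_section = false;

    return sleep_violations;
}

/* Deferred variant: only record the callback while atomic, run it later. */
int run_release_deferred(void)
{
    sleep_violations = 0;
    deferred_count = 0;

    in_atomic_section = true;
    assert(deferred_count < MAX_DEFERRED);
    deferred[deferred_count++] = release_callback;
    in_atomic_section = false;

    /* Drain the recorded callbacks outside the atomic section. */
    for (size_t i = 0; i < deferred_count; i++)
        deferred[i]();
    deferred_count = 0;

    return sleep_violations;
}
```

In this model run_release_direct() reports one violation and
run_release_deferred() reports none. Whether anything like the deferred
variant is appropriate in the real percpu ref counting code is exactly the
open question; the sketch only shows the shape of the problem.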
I think there are a few ways we could fix this specific problem:

1) We could fix schedule_work() so it can be called from atomic context,
   which would solve the problem for aio with the way it uses percpu ref
   counting. It only solves this one case, however, and I think some
   clarification around what should and shouldn't be callable from
   atomic context would be helpful.

2) For RT we could change aio so it's not reliant on percpu ref counting
   (like the v3.0 implementation), but this is likely to give quite a
   different implementation from the stock kernel; either that, or look
   at a change to the mainline kernel's aio implementation (i.e. not in
   the RT patches).

3) Maybe we could find a way of changing the percpu ref counting to make
   it safe to call non-atomic code (with the RT patches), so that it
   executes functions in a non-atomic context perhaps?

I also think some clarification on the following two things would help
decide on the best way to address this:

1) Is it intended that schedule_work() be callable from an atomic
   context? Currently it's not, but I've not seen any documentation to
   suggest whether it is or is not supposed to be callable from atomic
   context.

2) Should functions run by percpu_ref_kill_rcu be running in an atomic
   or non-atomic context? Currently it's atomic, but again, I've not
   seen any documentation to suggest whether it is or is not supposed to
   be running in atomic context.

I think it needs someone who has extensive knowledge/experience with
these areas of code to take care of this issue, as this is not really my
area of expertise. For now I've gone with solution (2) and ported the
original v3.0 aio implementation onto v3.12 to resolve the problem for
our use.

Happy to assist with testing a v3.12 fix on our platform (we can't
upgrade to v3.14 due to lock-in with the Freescale SDK) if anyone has
any suggested fixes.

DISCLAIMER: Privileged and/or Confidential information may be contained
in this message.
If you are not the addressee of this message, you may not copy, use or
deliver this message to anyone. In such event, you should destroy the
message and kindly notify the sender by reply e-mail. It is understood
that opinions or conclusions that do not relate to the official business
of the company are neither given nor endorsed by the company. Thank You.