Hi Jens:
    After debugging the v3 patch, I found a bug in it. On the first
fio_rbd_getevents() pass, fri->io_seen is set to 1 and is never reset
to 0, so the program spins forever in this loop:

	do {
		this_events = rbd_iter_events(td, &events, min, wait);

		if (events >= min)
			break;
		if (this_events)
			continue;

		wait = 1;
	} while (1);

this_events and events stay 0, because fri->io_seen is always 1, so no
completions are ever reaped.

The fix is: in _fio_rbd_finish_read_aiocb, _fio_rbd_finish_write_aiocb
and _fio_rbd_finish_sync_aiocb, add "fio_rbd_iou->io_seen = 0;" after
"fio_rbd_iou->io_complete = 1;" (a sketch of the fixed callback follows
the quoted thread below). The attachment is the new patch.

2014-10-27 17:27 GMT+08:00 Ketor D <d.ketor@xxxxxxxxx>:
> Hi, Jens:
>     I have tested your v2 and v3 patches.
>     The v2 patch gets SIGABRT and crashes. The v3 patch hangs.
>
>     Why not simply comment out the usleep()?
>
>
> 2014-10-26 6:25 GMT+08:00 Mark Kirkwood <mark.kirkwood@xxxxxxxxxxxxxxx>:
>>
>> On 26/10/14 08:20, Jens Axboe wrote:
>>>
>>> On 10/24/2014 10:50 PM, Mark Kirkwood wrote:
>>>>
>>>> On 25/10/14 16:47, Jens Axboe wrote:
>>>>>
>>>>> Since you're running rbd tests... Mind giving this patch a go? I don't
>>>>> have an easy way to test it myself. It has nothing to do with this
>>>>> issue, it's just a potentially faster way to do the rbd completions.
>>>>>
>>>>
>>>> Sure - but note I'm testing this on my i7 workstation (4x OSDs running
>>>> on 2x Crucial M550s) so not exactly server grade :-)
>>>>
>>>> With that in mind, I'm seeing slightly *slower* performance with the
>>>> patch applied: e.g. for 128k blocks - 2 runs, 1 uncached and the next
>>>> cached.
>>>
>>> Yeah, that doesn't look good. Mind trying this one out? I wonder if we
>>> doubly wait on them - or perhaps rbd_aio_wait_for_complete() isn't
>>> working correctly. If you try this one, we should know more...
>>>
>>> Goal is, I want to get rid of that usleep() in getevents.
>>>
>>
>> Testing with the v3 patch applied hangs. I did wonder if we had somehow
>> hit a new variant of the cache issue - so I reran with it disabled in
>> ceph.conf. Result is the same:
>>
>> $ fio read-test.fio
>> rbd_thread: (g=0): rw=read, bs=128K-128K/128K-128K/128K-128K,
>> ioengine=rbd, iodepth=32
>> fio-2.1.13-88-gb2ee7
>> Starting 1 process
>> rbd engine: RBD version: 0.1.8
>> Jobs: 1 (f=1): [R(1)] [0.1% done] [0KB/0KB/0KB /s] [0/0/0 iops] [eta
>> 01h:25m:15s]
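For reference, a minimal sketch of what the fixed read callback would
look like. This is not verbatim from the patch: the signature is
assumed from librbd's rbd_callback_t convention, and the struct and
field names are taken from the identifiers quoted above. The write and
sync callbacks get the same one-line addition:

	/*
	 * Sketch only -- signature assumed from librbd's rbd_callback_t;
	 * struct layout inferred from the identifiers quoted above.
	 */
	static void _fio_rbd_finish_read_aiocb(rbd_completion_t comp, void *data)
	{
		struct fio_rbd_iou *fio_rbd_iou = data;

		fio_rbd_iou->io_complete = 1;
		/*
		 * The io_u gets recycled for the next I/O, so clear
		 * io_seen when its new completion fires. If io_seen
		 * stays 1 from the previous getevents pass,
		 * rbd_iter_events() never counts another event and
		 * the wait loop above spins forever.
		 */
		fio_rbd_iou->io_seen = 0;
	}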
Attachment:
rbd-comp-v4.patch
Description: Binary data
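For anyone wanting to reproduce the hang, a job file matching the
parameters in Mark's quoted output (rw=read, 128k blocks, ioengine=rbd,
iodepth=32, one job named rbd_thread) would look roughly like this;
the clientname/pool/rbdname values are placeholders and must point at
a real, pre-created RBD image:

	; Rough reconstruction of read-test.fio from the quoted output.
	; clientname/pool/rbdname below are placeholders.
	[rbd_thread]
	ioengine=rbd
	clientname=admin
	pool=rbd
	rbdname=fio-test
	rw=read
	bs=128k
	iodepth=32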