2010/11/29 Stefan Hajnoczi <stefanha@xxxxxxxxx>:
> On Thu, Nov 25, 2010 at 6:06 AM, Yoshiaki Tamura
> <tamura.yoshiaki@xxxxxxxxxxxxx> wrote:
>> event-tap controls when to start FT transaction, and provides proxy
>> functions to called from net/block devices. While FT transaction, it
>> queues up net/block requests, and flush them when the transaction gets
>> completed.
>>
>> Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@xxxxxxxxxxxxx>
>> Signed-off-by: OHMURA Kei <ohmura.kei@xxxxxxxxxxxxx>
>> ---
>>  Makefile.target |    1 +
>>  block.h         |    9 +
>>  event-tap.c     |  794 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  event-tap.h     |   34 +++
>>  net.h           |    4 +
>>  net/queue.c     |    1 +
>>  6 files changed, 843 insertions(+), 0 deletions(-)
>>  create mode 100644 event-tap.c
>>  create mode 100644 event-tap.h
>
> event_tap_state is checked at the beginning of several functions. If
> there is an unexpected state the function silently returns. Should
> these checks really be assert() so there is an abort and backtrace if
> the program ever reaches this state?
>
>> +typedef struct EventTapBlkReq {
>> +    char *device_name;
>> +    int num_reqs;
>> +    int num_cbs;
>> +    bool is_multiwrite;
>
> Is multiwrite logging necessary? If event tap is called from within
> the block layer then multiwrite is turned into one or more
> bdrv_aio_writev() calls.
>
>> +static void event_tap_replay(void *opaque, int running, int reason)
>> +{
>> +    EventTapLog *log, *next;
>> +
>> +    if (!running) {
>> +        return;
>> +    }
>> +
>> +    if (event_tap_state != EVENT_TAP_LOAD) {
>> +        return;
>> +    }
>> +
>> +    event_tap_state = EVENT_TAP_REPLAY;
>> +
>> +    QTAILQ_FOREACH(log, &event_list, node) {
>> +        EventTapBlkReq *blk_req;
>> +
>> +        /* event resume */
>> +        switch (log->mode & ~EVENT_TAP_TYPE_MASK) {
>> +        case EVENT_TAP_NET:
>> +            event_tap_net_flush(&log->net_req);
>> +            break;
>> +        case EVENT_TAP_BLK:
>> +            blk_req = &log->blk_req;
>> +            if ((log->mode & EVENT_TAP_TYPE_MASK) == EVENT_TAP_IOPORT) {
>> +                switch (log->ioport.index) {
>> +                case 0:
>> +                    cpu_outb(log->ioport.address, log->ioport.data);
>> +                    break;
>> +                case 1:
>> +                    cpu_outw(log->ioport.address, log->ioport.data);
>> +                    break;
>> +                case 2:
>> +                    cpu_outl(log->ioport.address, log->ioport.data);
>> +                    break;
>> +                }
>> +            } else {
>> +                /* EVENT_TAP_MMIO */
>> +                cpu_physical_memory_rw(log->mmio.address,
>> +                                       log->mmio.buf,
>> +                                       log->mmio.len, 1);
>> +            }
>> +            break;
>
> Why are net tx packets replayed at the net level but blk requests are
> replayed at the pio/mmio level?
>
> I expected everything to replay either as pio/mmio or as net/block.

Stefan,

After running some heavy load tests, I realized that we have to take a
hybrid approach to replay for now. The reason is that the point at which
a device moves to its next state (e.g. virtio decreases inuse) differs
between net and block. For example, virtio-net decreases inuse upon
returning from the net layer, but virtio-blk does so inside the
callback. If we use only pio/mmio replay, some net requests get lost
even though event-tap tries to replay them, because the device state
has already advanced. This doesn't happen with block, because its state
is still old enough to replay from. Note that the hybrid approach won't
cause duplicated requests on the secondary.
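
To make the timing difference concrete, here is a toy model in C. It is
only a sketch and not the actual virtio code: ToyDevice, toy_submit(),
toy_complete() and toy_replayable_from_device_state() are hypothetical
names made up for this mail, and "inuse" is reduced to a plain counter.
It assumes event-tap has queued the request and the failover happens
before the completion callback runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: inuse counts requests the device still considers
 * outstanding. All names here are hypothetical, not QEMU APIs. */
typedef struct ToyDevice {
    int inuse;             /* outstanding requests as seen by the device */
    bool retire_on_return; /* true: net-like, false: blk-like */
} ToyDevice;

/* Submit one request while event-tap is holding it in its queue. */
static void toy_submit(ToyDevice *dev)
{
    dev->inuse++;
    if (dev->retire_on_return) {
        /* net-like: the state advances as soon as the lower layer
         * returns, although the request has only been queued. */
        dev->inuse--;
    }
}

/* Completion callback: a blk-like device retires the request here. */
static void toy_complete(ToyDevice *dev)
{
    if (!dev->retire_on_return) {
        assert(dev->inuse > 0);
        dev->inuse--;
    }
}

/* Requests that a pio/mmio-level replay could still regenerate from the
 * device state alone, in this simplified model. */
static int toy_replayable_from_device_state(const ToyDevice *dev)
{
    return dev->inuse;
}
```

In this model, a net-like device reports nothing left to replay right
after toy_submit(), so the queued request has to be replayed at the net
level. A blk-like device still reports one outstanding request until
toy_complete() runs, so replaying at the pio/mmio level can recover it.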

Thanks,

Yoshi