2010/11/29 Stefan Hajnoczi <stefanha@xxxxxxxxx>: > On Thu, Nov 25, 2010 at 6:06 AM, Yoshiaki Tamura > <tamura.yoshiaki@xxxxxxxxxxxxx> wrote: >> event-tap controls when to start FT transaction, and provides proxy >> functions to called from net/block devices. While FT transaction, it >> queues up net/block requests, and flush them when the transaction gets >> completed. >> >> Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@xxxxxxxxxxxxx> >> Signed-off-by: OHMURA Kei <ohmura.kei@xxxxxxxxxxxxx> >> --- >> Makefile.target | 1 + >> block.h | 9 + >> event-tap.c | 794 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ >> event-tap.h | 34 +++ >> net.h | 4 + >> net/queue.c | 1 + >> 6 files changed, 843 insertions(+), 0 deletions(-) >> create mode 100644 event-tap.c >> create mode 100644 event-tap.h > > event_tap_state is checked at the beginning of several functions. If > there is an unexpected state the function silently returns. Should > these checks really be assert() so there is an abort and backtrace if > the program ever reaches this state? I'm wondering whether abort is too strong, but I think you're right because stopping Kemari may not be enough as an error handling. > >> +typedef struct EventTapBlkReq { >> + char *device_name; >> + int num_reqs; >> + int num_cbs; >> + bool is_multiwrite; > > Is multiwrite logging necessary? If event tap is called from within > the block layer then multiwrite is turned into one or more > bdrv_aio_writev() calls. If we move event-tap into block layer, I guess it won't be necessary. >> +static void event_tap_replay(void *opaque, int running, int reason) >> +{ >> + EventTapLog *log, *next; >> + >> + if (!running) { >> + return; >> + } >> + >> + if (event_tap_state != EVENT_TAP_LOAD) { >> + return; >> + } >> + >> + event_tap_state = EVENT_TAP_REPLAY; >> + >> + QTAILQ_FOREACH(log, &event_list, node) { >> + EventTapBlkReq *blk_req; >> + >> + /* event resume */ >> + switch (log->mode & ~EVENT_TAP_TYPE_MASK) { >> + case EVENT_TAP_NET: >> + event_tap_net_flush(&log->net_req); >> + break; >> + case EVENT_TAP_BLK: >> + blk_req = &log->blk_req; >> + if ((log->mode & EVENT_TAP_TYPE_MASK) == EVENT_TAP_IOPORT) { >> + switch (log->ioport.index) { >> + case 0: >> + cpu_outb(log->ioport.address, log->ioport.data); >> + break; >> + case 1: >> + cpu_outw(log->ioport.address, log->ioport.data); >> + break; >> + case 2: >> + cpu_outl(log->ioport.address, log->ioport.data); >> + break; >> + } >> + } else { >> + /* EVENT_TAP_MMIO */ >> + cpu_physical_memory_rw(log->mmio.address, >> + log->mmio.buf, >> + log->mmio.len, 1); >> + } >> + break; > > Why are net tx packets replayed at the net level but blk requests are > replayed at the pio/mmio level? > > I expected everything to replay either as pio/mmio or as net/block. It's my mistake, sorry about that. We're just in way of moving replay from pio/mmio to net/block, and I mixed up. I'll revert it to pio/mmio replay in the next spin. BTW, I would like to ask a question regarding this. There is a callback which net/block calls after processing the requests, and is there a clean way to set this callback on the failovered host upon replay? >> +static void event_tap_blk_load(QEMUFile *f, EventTapBlkReq *blk_req) >> +{ >> + BlockRequest *req; >> + ram_addr_t page_addr; >> + int i, j, len; >> + >> + len = qemu_get_byte(f); >> + blk_req->device_name = qemu_malloc(len + 1); >> + qemu_get_buffer(f, (uint8_t *)blk_req->device_name, len); >> + blk_req->device_name[len] = '\0'; >> + blk_req->num_reqs = qemu_get_byte(f); >> + >> + for (i = 0; i < blk_req->num_reqs; i++) { >> + req = &blk_req->reqs[i]; >> + req->sector = qemu_get_be64(f); >> + req->nb_sectors = qemu_get_be32(f); >> + req->qiov = qemu_malloc(sizeof(QEMUIOVector)); > > It would make sense to have common QEMUIOVector load/save functions > instead of inlining this code here. OK. >> +static int event_tap_load(QEMUFile *f, void *opaque, int version_id) >> +{ >> + EventTapLog *log, *next; >> + int mode; >> + >> + event_tap_state = EVENT_TAP_LOAD; >> + >> + QTAILQ_FOREACH_SAFE(log, &event_list, node, next) { >> + QTAILQ_REMOVE(&event_list, log, node); >> + event_tap_free_log(log); >> + } >> + >> + /* loop until EOF */ >> + while ((mode = qemu_get_byte(f)) != 0) { >> + EventTapLog *log = event_tap_alloc_log(); >> + >> + log->mode = mode; >> + switch (log->mode & EVENT_TAP_TYPE_MASK) { >> + case EVENT_TAP_IOPORT: >> + event_tap_ioport_load(f, &log->ioport); >> + break; >> + case EVENT_TAP_MMIO: >> + event_tap_mmio_load(f, &log->mmio); >> + break; >> + case 0: >> + DPRINTF("No event\n"); >> + break; >> + default: >> + fprintf(stderr, "Unknown state %d\n", log->mode); >> + return -1; > > log is leaked here... Oops:( > >> + } >> + >> + switch (log->mode & ~EVENT_TAP_TYPE_MASK) { >> + case EVENT_TAP_NET: >> + event_tap_net_load(f, &log->net_req); >> + break; >> + case EVENT_TAP_BLK: >> + event_tap_blk_load(f, &log->blk_req); >> + break; >> + default: >> + fprintf(stderr, "Unknown state %d\n", log->mode); >> + return -1; > > ...and here. Oops again:( Will fix them. Yoshi > > Stefan > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html