On 15.06.2015 21:52, Doug Dumitru wrote:

>> Sounds pretty easy and I also got surprisingly far with my little kernel
>> module. I've so far implemented ctr, dtr, map and status.
>
> Congratulations, you are actually a long way there.

Thanks, but I think I have the mountain still ahead -- still, I would
really like to figure out the nitty-gritty.

> You have to allocate a bio, populate it, allocate pages for buffer,
> populate the bvec, and call make_request (or generic make request). You
> will get the completion from the bio on the bottom half of the interrupt
> handler, so how much work you can do there is debatable. You cannot start
> a new IO from there, which you need to. You will probably want to start a
> helper thread and have the completion routine schedule itself onto your
> thread. Once you are back on your thread, you can do just about anything.
>
> Because you need to do IO, you will not be able to do a simple bio "bounce
> redirect". You will need to do the IO yourself (ie, call another make
> request), but you can use the caller's bvec for this, so there is no data
> copy required. Once the request completes, you can then fin the caller.

Oh, wow. This sounds truly terrifying. Let's dive in! I tried to read your
hints one word at a time. So here's the somewhat pseudocodish solution to
my homework:

    struct bio *b = bio_alloc(GFP_NOIO, 1);
    b->bi_size = 8;
    bio_alloc_pages(b, GFP_NOIO);
    b->bi_sector = 1234;
    b->bi_bdev = lc->metadev->bdev;
    b->bi_rw = READ;
    b->bi_private = local_ctx;
    b->bi_end_io = read_complete_callback;
    generic_make_request(b);

    static void read_complete_callback(struct bio *b, int error)
    {
        // ???
        // bv_page is a struct page *, so turn it into an address first
        printk(KERN_INFO "First read byte: %02x\n",
               ((u8 *)page_address(b->bi_io_vec[0].bv_page))[0]);
    }

So I hope this is even remotely close to what I should end up with. This
will allocate a new bio with, as I understand it, one page buffer in
b->bi_io_vec. This buffer is then allocated with bio_alloc_pages to 8
sectors in size (i.e. exactly one page of 4096 bytes).
Then the read address, block device and read mode are set. I pass some
kind of local context so I can do something meaningful in the callback
and specify the callback function. Then I execute the request. As I
understand, this executes asynchronously. So here comes the threading
into play, right? Just pseudocode (because I can't judge how far I'm off
here), but let's say this is map():

    void read_complete_callback()
    {
        semaphore_inc(&local_ctx->semaphore);
    }

    void map()
    {
        local_ctx->semaphore.value = 0;
        // Issue read as above
        generic_make_request(b);
        semaphore_dec(&local_ctx->semaphore);
        // Now the concurrent async IO has finished and we interpret the data
        [...]
    }

Oh boy, I really don't know if this is even remotely close. Any hints, as
easy as they may seem to you guys, are really greatly appreciated. I've
never worked with this stuff.

> If you cannot continue because devices are not present or the right size,
> yes you should fail the ctr routine.

Alright!

> If you want to set up /proc or other monitoring stuff, you can use the init
> routine, probably plus some statics, to set up "views" into your module. If
> you want to support multiple instances (and you should), set up a
> /proc/{yourname} directory on the init and then populate it with
> sub-directories every time you create a device.

Okay, I'll try to do this (want to make statistics available via procfs
later on), but one construction site at a time for me.

>> - Can I determine the size the bio in map() will have already in ctr()
>> somehow? Can I assume it will never change if it was once determined?
>> The reason is that for my example I need to make sure the chunk size is
>> an integer multiple of the bio size and I would only like to check this
>> once (in ctr) and not every time (in map).
>
> Block size will not change. The size of requests to you is limited by the
> setup of ti->max_io_len. If you don't set this with recent kernels, you
> will only get 4K, which is not all that efficient.
> This is actually part of another big topic of "stacked limits", which
> someone could write a book on (and I would read it).

So if I would want to do a large I/O operation (say write one megabyte of
data to a block device somewhere within my driver) I'd have to make lots
of calls to generic_make_request?

Thank you so much for your help,
Best regards,
Johannes

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
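
To put a number on the closing question about large writes, here is a
small userspace C sketch (not kernel code) of the arithmetic only. It
assumes 512-byte sectors and a per-request cap given in sectors; the
helper name requests_needed and the SECTOR_SIZE constant are made up for
illustration. Whether a bio issued from inside the driver is really
capped at one page is the open question in the thread, so the 4K case is
just the figure mentioned above, not a statement about the block layer.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u   /* assumption: 512-byte sectors */

/*
 * Hypothetical helper: how many requests are needed to cover total_bytes
 * when one request may carry at most max_sectors sectors.
 */
static uint64_t requests_needed(uint64_t total_bytes, uint32_t max_sectors)
{
    uint64_t max_bytes;

    assert(max_sectors > 0);
    max_bytes = (uint64_t)max_sectors * SECTOR_SIZE;

    /* Ceiling division, so a partial tail still gets its own request. */
    return (total_bytes + max_bytes - 1) / max_bytes;
}
```

With an 8-sector (4096-byte) cap, requests_needed(1024 * 1024, 8) comes
out at 256; with a hypothetical 256-sector cap it would be 8.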