Re: Newbie device mapper questions

Doug Dumitru <doug@xxxxxxxxxx> · Mon, 15 Jun 2015 12:52:57 -0700

On Mon, Jun 15, 2015 at 11:39 AM, Johannes Bauer <dfnsonfsduifb@xxxxxx> wrote:
Hi list,

so I've had this idea stuck in my head for a while and am finally no
longer too intimidated by the dm API to actually give it a shot and play
around. I'm just getting used to the DM internals, so please excuse me if
I sound like an idiot; I'm just new to DM.

Something that I would like to implement first is a device mapper target
that takes three block devices as input: two equally sized devices (src1
and src2) and a separate metadata device (meta). I want to map chunks of
the src devices to bits of a bitmap in the meta device. Whether the bit
is set in the meta device decides whether src1 or src2 is returned.
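
For illustration, the lookup described above can be sketched in plain userspace C. This is only a sketch under assumptions that are not in the original mail: the function and enum names are hypothetical, the bitmap is assumed to already be in memory, bits are assumed to be numbered LSB-first within each byte, and a set bit is assumed to select src2.

```c
#include <stdint.h>

/* Hypothetical names; nothing here is taken from the original module. */
enum src_choice { USE_SRC1 = 0, USE_SRC2 = 1 };

/*
 * Decide which source device backs a given 512-byte sector.
 *
 * bitmap:        in-memory copy of the bitmap stored on the meta device
 * bitmap_bits:   number of valid bits (one per chunk)
 * sector:        sector offset relative to the start of the target
 * chunk_sectors: chunk size in sectors
 *
 * Returns USE_SRC1 or USE_SRC2, or -1 if the arguments are unusable.
 * Assumption: bit clear selects src1, bit set selects src2.
 */
int select_src(const uint8_t *bitmap, uint64_t bitmap_bits,
               uint64_t sector, uint32_t chunk_sectors)
{
    uint64_t chunk;

    if (chunk_sectors == 0)
        return -1;

    chunk = sector / chunk_sectors;
    if (chunk >= bitmap_bits)
        return -1;

    /* Assumed layout: bit (chunk % 8) of byte (chunk / 8), LSB first. */
    return ((bitmap[chunk / 8] >> (chunk % 8)) & 1) ? USE_SRC2 : USE_SRC1;
}
```

In a real target the same arithmetic would run in map(), with the bitmap read from the meta device ahead of time rather than on every request.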

Sounds pretty easy and I also got surprisingly far with my little kernel
module. I've so far implemented ctr, dtr, map and status.

Congratulations, you are actually a long way there.

In map() I actually do the switching operation. I've looked at how
dm-linear implements this and copied a lot from it. Currently I
do static switching (fixed block size, ignoring the meta device). Here
are my questions:

- How can I read within the kernel from the block device lc->meta->bdev?
If I call "read_dev_sector" from "map" this results in a deadlock, so I'm
guessing this is not how it's supposed to work. The bcache module must
perform something similar (because it also reads and writes metadata,
only in a much more complex way), but I'll be damned if I could find out
where the actual reading/writing is performed in the code. What are
things that I should look at?

You have to allocate a bio, populate it, allocate pages for the buffer, populate the bvec, and call make_request (or generic_make_request()).  You will get the bio's completion in the bottom half of the interrupt handler, so how much work you can do there is debatable.  You cannot start a new IO from there, which is exactly what you need to do.  You will probably want to start a helper thread and have the completion routine hand the work off to that thread.  Once you are back on your thread, you can do just about anything.

Because you need to do IO, you will not be able to do a simple bio "bounce redirect".  You will need to do the IO yourself (i.e., call another make request), but you can use the caller's bvec for this, so no data copy is required.  Once the request completes, you can then finish the caller's bio.
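
The hand-off between the completion routine and the helper thread can be modelled in userspace C as a plain deferred queue. This is a simplified, single-threaded sketch with hypothetical names; it only shows the shape of the pattern (the completion side does nothing but queue, the worker side does the follow-up). In the kernel the queue would need a spinlock, since the completion runs in interrupt context, and the hand-off would typically go through a workqueue (INIT_WORK() and queue_work()) or a dedicated kernel thread.

```c
#include <stddef.h>

/* Hypothetical per-request state; not taken from any real target. */
struct pending_io {
    unsigned long long sector;  /* sector the caller asked for */
    int meta_error;             /* result of the metadata read */
    int dispatched;             /* set once the follow-up IO was issued */
    struct pending_io *next;
};

struct deferred_queue {
    struct pending_io *head;
    struct pending_io *tail;
};

/*
 * Models the completion routine: it must not start new IO, so it only
 * records the result and appends the request to the queue.
 */
void meta_read_done(struct deferred_queue *q, struct pending_io *io, int error)
{
    io->meta_error = error;
    io->next = NULL;
    if (q->tail)
        q->tail->next = io;
    else
        q->head = io;
    q->tail = io;
}

/*
 * Models the helper thread: here it is safe to issue the follow-up IO.
 * Returns the number of requests that were dispatched.
 */
int worker_drain(struct deferred_queue *q)
{
    struct pending_io *io;
    int dispatched = 0;

    while ((io = q->head) != NULL) {
        q->head = io->next;
        if (q->head == NULL)
            q->tail = NULL;
        io->next = NULL;

        if (io->meta_error == 0) {
            io->dispatched = 1;  /* stand-in for submitting the real bio */
            dispatched++;
        }
    }
    return dispatched;
}
```

Requests whose metadata read failed are left undispatched in this sketch; a real target would complete the caller's bio with an error at that point.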

- Is the ctr callback the appropriate place to fail if a logical error
occurs? For example, if two src devices of dissimilar size are passed to
dmsetup?

If you cannot continue because devices are not present or the right size, yes you should fail the ctr routine.
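
As a rough userspace sketch of the kind of check a ctr routine could make before accepting the table: the function name and the error strings below are hypothetical, and the rule that the device size must be a whole number of chunks is an assumed policy, not something stated in this thread. In a real ctr the message would go into ti->error and the negative errno would be returned to device mapper.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SHIFT 9  /* 512-byte sectors */

/*
 * Validate the two source device sizes (given in bytes) and the chunk
 * size (given in sectors). Returns 0 on success or -EINVAL on failure;
 * *error is set to a short description, or NULL on success.
 */
int check_src_sizes(uint64_t src1_bytes, uint64_t src2_bytes,
                    uint32_t chunk_sectors, const char **error)
{
    uint64_t src1_sectors = src1_bytes >> SECTOR_SHIFT;
    uint64_t src2_sectors = src2_bytes >> SECTOR_SHIFT;

    if (chunk_sectors == 0) {
        *error = "Chunk size must be non-zero";
        return -EINVAL;
    }
    if (src1_sectors != src2_sectors) {
        *error = "Source devices differ in size";
        return -EINVAL;
    }
    /* Assumed policy: no partial chunk at the end of the device. */
    if (src1_sectors % chunk_sectors != 0) {
        *error = "Device size is not a multiple of the chunk size";
        return -EINVAL;
    }

    *error = NULL;
    return 0;
}
```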

If you want to set up /proc or other monitoring stuff, you can use the init routine, probably plus some statics, to set up "views" into your module.  If you want to support multiple instances (and you should), set up a /proc/{yourname} directory in the init routine and then populate it with subdirectories every time you create a device.

- Is i_size_read(lc->src1dev->bdev->bd_inode) the correct way of
determining the size of the underlying block device? If not, which
function is?

... I am happy to leave out answers that I don't know ...

- Can I safely assume the logical sector size is fixed to be 512 bytes
in all cases?

Probably not, but maybe.  You are in control of the hardware.
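
One way to avoid relying on that assumption: the block layer counts sectors in 512-byte units regardless of the device, but a device's logical block size can be larger (4096 bytes is common), and the kernel reports it through bdev_logical_block_size(). A userspace sketch of an alignment check follows; the function name is hypothetical and the logical block size is simply passed in as a parameter.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512u  /* unit in which sector numbers are counted */

/*
 * Check whether an IO, given in 512-byte sectors, starts and ends on a
 * logical block boundary of the underlying device.
 */
bool io_is_block_aligned(uint64_t sector, uint32_t nr_sectors,
                         uint32_t logical_block_size)
{
    uint32_t sectors_per_block;

    /* The logical block size must be a non-zero multiple of 512. */
    if (logical_block_size < SECTOR_SIZE ||
        logical_block_size % SECTOR_SIZE != 0)
        return false;

    sectors_per_block = logical_block_size / SECTOR_SIZE;
    return sector % sectors_per_block == 0 &&
           nr_sectors % sectors_per_block == 0;
}
```

With a 4096-byte logical block size, an 8-sector IO starting at sector 8 passes, while the same IO starting at sector 1 does not.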

- In the dm-linear example, bio_sectors(bio) is checked. This gives, if
I understand it correctly, the size in sectors of the BIO (usually this
is 8). What I don't understand is in which cases this can become zero
(dm-linear has an if that checks for bio_sectors(bio) != 0).

.. just a sanity check.  If you get a call of zero size, it means something else is broken.

- Can I determine the size the bio in map() will have already in ctr()
somehow? Can I assume it will never change once it has been determined?
The reason is that for my example I need to make sure the chunk size is
an integer multiple of the bio size and I would only like to check this
once (in ctr) and not every time (in map).

Block size will not change.  The size of requests to you is limited by the setup of ti->max_io_len.  If you don't set this with recent kernels, you will only get 4K, which is not all that efficient.  This is actually part of another big topic of "stacked limits", which someone could write a book on (and I would read it).
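
As far as I understand it, once a target sets max_io_len (for example through dm_set_target_max_io_len() in ctr), device mapper splits incoming bios so that none of them crosses a boundary of that length. Setting it to the chunk size would therefore let map() assume each bio lies within one chunk. The boundary arithmetic looks roughly like the userspace sketch below; the function name is hypothetical and this is not a copy of the kernel's code.

```c
#include <stdint.h>

/*
 * Given an IO that starts at `sector` (relative to the target start)
 * and is `nr_sectors` long, return how many of those sectors fit before
 * the next chunk boundary. A chunk size of 0 means "no limit".
 */
uint32_t sectors_to_chunk_end(uint64_t sector, uint32_t nr_sectors,
                              uint32_t chunk_sectors)
{
    uint32_t left;

    if (chunk_sectors == 0)
        return nr_sectors;

    /* Sectors remaining in the chunk that contains `sector`. */
    left = chunk_sectors - (uint32_t)(sector % chunk_sectors);
    return nr_sectors < left ? nr_sectors : left;
}
```

For a 128-sector chunk, an 8-sector IO starting at sector 124 would be cut to 4 sectors, and the remainder would arrive as a separate bio in the next chunk.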

Thank you very much for helping out a complete newbie :-)

Best regards,

Johannes

--

dm-devel mailing list

dm-devel@xxxxxxxxxx

https://www.redhat.com/mailman/listinfo/dm-devel

-- 
Doug Dumitru
EasyCo LLC
