Re: About bio_endio

Hello Alvin,

On Tue, Jun 24, 2014 at 9:53 PM, Alvin Abitria <abitria.alvin@xxxxxxxxx> wrote:
> Hello Pranay!
>
> Thanks for your reply.  I apologize for my very late reply; I was
> preoccupied at work earlier.
>
>
> On Tue, Jun 24, 2014 at 1:07 PM, Pranay Srivastava <pranjas@xxxxxxxxx>
> wrote:
>>
>> Hello Alvin,
>>
>> On Mon, Jun 23, 2014 at 10:39 PM, Alvin Abitria <abitria.alvin@xxxxxxxxx>
>> wrote:
>> > Hello,
>> >
>> > I'm developing a block driver using the make_request method,
>> > effectively bypassing the existing scsi/request stack in the block
>> > layer.  That means I'm working directly with bios.  As prescribed
>> > in the Linux documentation, and from referring to similar drivers
>> > in the kernel, you close out a bio with the bio_endio function.
>>
>> So that means you are just passing the bios along without request
>> structures, if I understand correctly?  I don't know how you handle
>> blk_finish_plug without a request or request queue, but I may be
>> wrong about how you are handling it.
>>
> Yes, I'm working at the bio level.  No struct requests, and I haven't
> used blk_finish_plug yet.  The block driver approach I'm implementing
> is along the same lines as the nvme and mtip32xx drivers in the
> drivers/block directory (though differing at the hardware-specific
> level, of course).

Ok, I get it now. But why not use request structures? You could use at
least one for chaining bios - see below for what I mean.

>>
>> >
>> > I usually invoke bio_endio during successful I/O completion,
>> > meaning with an error code of zero.  But there are cases where
>> > this is not fulfilled, or there are error cases.  My question is,
>> > what are the valid error codes that can be used with it?  My
>> > initial impression is that with any error code other than zero,
>>
>> I think -EIO is the one you should use.
>>
>> > bio_endio will fail.  I've read somewhere that -EBUSY is not
>> > recognized, and I tried -EIO but my driver crashed.  I got a panic
>> > in some dio_xxx function leading from bio_endio(bio, -EIO).  I
>> > would like to block subsequent bios sent
>>
>> If it's okay for you to post the error, can you do that? I was
>> looking at the code for dio_end_io, but it would help if you could
>> post the exact crash backtrace, if you have it.
>
>
> Here you go:
>
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffff811b9a80>] bio_check_pages_dirty+0x50/0xe0
> PGD 42e5e4067 PUD 42e6e7067 PMD 0
> Oops: 0000 [#1] SMP
> last sysfs file:
> /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
> CPU 7
> Modules linked in: block_module(U) fuse ip6table_filter ip6_tables
> ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4
> nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle
> iptable_filter ip_tables bridge autofs4 sunrpc 8021q garp stp llc
> cpufreq_ondemand freq_table pcc_cpufreq ipv6 vhost_net macvtap macvlan tun
> kvm_intel kvm uinput power_meter hpilo hpwdt sg tg3 microcode serio_raw
> iTCO_wdt iTCO_vendor_support ioatdma dca shpchp ext4 mbcache jbd2 sd_mod
> crc_t10dif hpsa pata_acpi ata_generic ata_piix dm_mirror dm_region_hash
> dm_log dm_mod [last unloaded: scsi_wait_scan]
>
> Pid: 3740, comm: fio Not tainted 2.6.32-358.el6.x86_64 #1 HP ProLiant DL380p
> Gen8
> RIP: 0010:[<ffffffff811b9a80>]  [<ffffffff811b9a80>]
> bio_check_pages_dirty+0x50/0xe0
> RSP: 0018:ffff8804191618c8  EFLAGS: 00010046
> RAX: 2000000000000000 RBX: ffff88041909f0c0 RCX: 00000000000011ae
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffff8804191618f8 R08: ffffffff81c07728 R09: 0000000000000040
> R10: 0000000000000002 R11: 0000000000000002 R12: 0000000000000000
> R13: ffff8804191b9b80 R14: ffff8804191b9b80 R15: 0000000000000000
> FS:  00007fcd43e2d720(0000) GS:ffff8800366e0000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000041e69f000 CR4: 00000000000407e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process fio (pid: 3740, threadinfo ffff880419160000, task ffff88043341f500)
> Stack:
>  0000000000000000 ffff88043433f400 ffff88043433f520 ffff88041909f0c0
> <d> ffff88041909f0c0 ffff8804191b9b80 ffff880419161948 ffffffff811bdc38
> <d> ffff880419161968 0000000034236400 00000000fffffffb ffff88043433f400
> Call Trace:
>  [<ffffffff811bdc38>] dio_bio_complete+0xc8/0xd0

You are using a different kernel than mine. See if the offset within
this function leads you to:

                spin_lock_irqsave(&bio_dirty_lock, flags);
                bio->bi_private = bio_dirty_list;  /* the line the offset pointed me to */
                bio_dirty_list = bio;
                spin_unlock_irqrestore(&bio_dirty_lock, flags);

You are not doing bio_put in the driver code, right? I'm assuming you
are not doing bio_get either. Which filesystem are you testing this
on? Can you check whether it happens on a simpler one like ext2, which
uses the generic direct_IO call? That would help narrow it down.
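
For reference, the usual lifetime convention looks like the sketch
below (not your driver's code). An unbalanced bio_put() in a driver
frees the bio and its pages early, which would produce exactly this
kind of NULL dereference in bio_check_pages_dirty:

        /* The submitter (here the direct-IO code) owns the bio; its
         * bi_end_io (dio_bio_end_aio) drops the last reference.  A
         * driver takes its own reference only if it must touch the
         * bio after completing it: */
        bio_get(bio);           /* optional extra reference */
        bio_endio(bio, error);  /* runs the submitter's bi_end_io */
        bio_put(bio);           /* drop only the reference we took */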

>  [<ffffffff811bef4f>] dio_bio_end_aio+0x2f/0xd0
>  [<ffffffff811b920d>] bio_endio+0x1d/0x40
>  [<ffffffffa02c15a1>] block_make_request+0xe1/0x150 [block_module]
>  [<ffffffff8125ccce>] generic_make_request+0x25e/0x530
>  [<ffffffff811bae72>] ? bvec_alloc_bs+0x62/0x110
>  [<ffffffff8125d02d>] submit_bio+0x8d/0x120
>  [<ffffffff811bdf6c>] dio_bio_submit+0xbc/0xc0
>  [<ffffffff811be951>] __blockdev_direct_IO_newtrunc+0x631/0xb30
>  [<ffffffff8111afe3>] ? filemap_fault+0xd3/0x500
>  [<ffffffff811beeae>] __blockdev_direct_IO+0x5e/0xd0
>  [<ffffffff811bb280>] ? blkdev_get_blocks+0x0/0xc0
>  [<ffffffff811bc347>] blkdev_direct_IO+0x57/0x60
>  [<ffffffff811bb280>] ? blkdev_get_blocks+0x0/0xc0
>  [<ffffffff8111bb8b>] generic_file_aio_read+0x6bb/0x700
>  [<ffffffff81166a2a>] ? kmem_getpages+0xba/0x170
>  [<ffffffff81166f87>] ? cache_grow+0x217/0x320
>  [<ffffffff811bb893>] blkdev_aio_read+0x53/0xc0
>  [<ffffffff8111c633>] ? mempool_alloc+0x63/0x140
>  [<ffffffff811bb840>] ? blkdev_aio_read+0x0/0xc0
>  [<ffffffff811cadc4>] aio_rw_vect_retry+0x84/0x200
>  [<ffffffff811cc784>] aio_run_iocb+0x64/0x170
>  [<ffffffff811cdbb1>] do_io_submit+0x291/0x920
>  [<ffffffff811ce250>] sys_io_submit+0x10/0x20
>  [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
> Code: 31 e4 45 31 ff eb 15 0f 1f 40 00 0f b7 43 28 41 83 c4 01 41 83 c7 01
> 44 39 e0 7e 36 4d 63 ec 49 c1 e5 04 4f 8d 2c 2e 49 8b 7d 00 <48> 8b 07 a8 10
> 75 06 66 a9 00 c0 74 d3 e8 5e 59 f7 ff 49 c7 45
> RIP  [<ffffffff811b9a80>] bio_check_pages_dirty+0x50/0xe0
>  RSP <ffff8804191618c8>
> CR2: 0000000000000000
>
> Basically, after receiving the bio from generic_make_request, I
> checked and found that I already had the maximum number of
> outstanding pending I/Os (bios) and couldn't accommodate this bio for
> now.  Rather than just exiting the make_request() routine, I decided
> to close this bio, i.e. return it to the kernel, via
> bio_endio(bio, -EIO).  No processing or modification was done to the
> bio; I just used the callback fn above.
>
>>
>> > to me after reaching my queue depth, when no tags are left, and so
>> > I want to use bio_endio with an error code.
>>
>> If you have a request queue then you could call blk_stop_queue and
>> blk_start_queue, but I don't know whether that is relevant to your
>> case.
>
>
> I do have a request queue allocated using blk_alloc_queue, but in my
> case it is more of a dummy queue - I don't use it the way struct
> requests would.  Its only function for me is to register my
> make_request function and to declare the maximum I/O size and the
> maximum number of buffer segments that can be sent to my driver.
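
(For reference, I take it the setup you describe looks roughly like
this - just a sketch; the limit setters below are the 2.6.32 mainline
names, and max_sectors/max_segments/dev stand in for your values:)

        q = blk_alloc_queue(GFP_KERNEL);
        blk_queue_make_request(q, block_make_request);
        /* advertise limits so the upper layers split bios accordingly */
        blk_queue_max_sectors(q, max_sectors);         /* max I/O size */
        blk_queue_max_phys_segments(q, max_segments);  /* max buffer segments */
        q->queuedata = dev;                            /* driver context */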

See, request generation is not in your hands. You can't tell a
filesystem not to send more requests; what your driver can do is delay
their processing until it is able to handle them. So I don't think you
should return an error from make_request - at the very least, hold on
to those bios so you can process them later. We don't want to lie to
applications that an I/O error occurred, right?

For this purpose I thought you might use one static request structure:
since you are already overriding the default behaviour, just put those
bios in that request and chain them. You would have to do a bit more
work that way to avoid interfering with the request code anywhere
else, so if putting those bios on a plain list is easier, go that way
instead (a sketch follows below).
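
Here is a minimal sketch of the list approach, assuming a hypothetical
driver context (block_dev, block_submit_to_hw and block_complete_one
are placeholders for your own code; bio_list is in <linux/bio.h>, and
the pending list must be set up once with bio_list_init()):

        struct block_dev {
                spinlock_t      lock;
                struct bio_list pending;   /* bios we could not start yet */
                unsigned int    in_flight; /* bios handed to hardware */
                unsigned int    depth;     /* hardware queue depth */
        };

        static int block_make_request(struct request_queue *q, struct bio *bio)
        {
                struct block_dev *dev = q->queuedata;
                unsigned long flags;

                spin_lock_irqsave(&dev->lock, flags);
                if (dev->in_flight >= dev->depth) {
                        /* no tags left: park the bio instead of failing it */
                        bio_list_add(&dev->pending, bio);
                        spin_unlock_irqrestore(&dev->lock, flags);
                        return 0;
                }
                dev->in_flight++;
                spin_unlock_irqrestore(&dev->lock, flags);

                block_submit_to_hw(dev, bio);  /* your hardware submit path */
                return 0;
        }

        /* on each hardware completion: finish one bio, unpark the next */
        static void block_complete_one(struct block_dev *dev,
                                       struct bio *done, int err)
        {
                struct bio *next;
                unsigned long flags;

                bio_endio(done, err);
                spin_lock_irqsave(&dev->lock, flags);
                dev->in_flight--;
                next = bio_list_pop(&dev->pending);
                if (next)
                        dev->in_flight++;
                spin_unlock_irqrestore(&dev->lock, flags);
                if (next)
                        block_submit_to_hw(dev, next);
        }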

>
> Thanks for the suggestion.  I haven't tried blk_stop_queue and
> blk_start_queue yet, but I hope they will work.  I also like the
> possibility of preventing the upper layers from sending me further
> bios when my queue depth is full, then letting them know once my
> driver can accommodate new bios - essentially telling them "I'm busy
> right now, don't send me more bios".  Is that the general idea behind
> the two functions you mentioned?

Yup, that's what those functions are for. But they only stop the
processing of requests, not their generation. I hope you get the point
I'm trying to make.

>
> I also noticed that the two kernel APIs you mentioned either set or
> clear a certain request queue flag.  I'd like to think that the upper
> layers (generic_make_request etc.) check that flag first to decide
> whether they can dispatch bios to my driver - is that right?

Right, but that's for the queue. See blk_queue_stopped: if that flag
is set, processing won't happen right now, i.e. the request_function
won't be called. It has no effect on the make_request path, since that
is not in your hands. Let the requests be queued, and process them at
your discretion.
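
For completeness, both calls must be made with the queue lock held,
roughly like this (sketch only):

        spin_lock_irqsave(q->queue_lock, flags);
        blk_stop_queue(q);   /* sets QUEUE_FLAG_STOPPED; request_fn won't run */
        spin_unlock_irqrestore(q->queue_lock, flags);

        /* ... later, when the driver can take more work ... */

        spin_lock_irqsave(q->queue_lock, flags);
        blk_start_queue(q);  /* clears the flag and restarts dispatch */
        spin_unlock_irqrestore(q->queue_lock, flags);
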
>>
>>
>> >
>> > What are those error codes, and will they work for my intended function?
>> > Thanks!

I thought -EIOCBQUEUED might help, but I don't think it would. You can
try it, though.
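
For the record, on your kernel the completion call takes the errno
directly, so the valid "error codes" are simply zero for success or a
negative errno:

        bio_endio(bio, 0);     /* successful completion */
        bio_endio(bio, -EIO);  /* report an unrecoverable I/O error */

bio_endio itself doesn't validate the value; whatever you pass ends up
in the submitter's bi_end_io callback (dio_bio_end_aio in your trace),
and it's that callback which must cope with it.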

>>
>> -EIO should work, but first let's find out why you got the crash.
>>
> Thanks.  I hope we get to the bottom of why it failed and crashed.
> Let me know if you have questions.
>
>>
>> > _______________________________________________
>> > Kernelnewbies mailing list
>> > Kernelnewbies@xxxxxxxxxxxxxxxxx
>> > http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>> >
>>
>>
>>
>> --
>>         ---P.K.S
>
> Alvin



-- 
        ---P.K.S

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@xxxxxxxxxxxxxxxxx
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies



