Re: wip-msgr2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Dec 15, 2020 at 2:14 PM Luis Henriques <lhenriques@xxxxxxx> wrote:
>
> Luis Henriques <lhenriques@xxxxxxx> writes:
>
> > Ilya Dryomov <idryomov@xxxxxxxxx> writes:
> >
> >> On Mon, Dec 14, 2020 at 4:55 PM Luis Henriques <lhenriques@xxxxxxx> wrote:
> >>>
> >>> Ilya Dryomov <idryomov@xxxxxxxxx> writes:
> >>>
> >>> > Hello,
> >>> >
> >>> > I've pushed wip-msgr2 and opened a dummy PR in ceph-client:
> >>> >
> >>> >   https://github.com/ceph/ceph-client/pull/22
> >>> >
> >>> > This set has been through a over a dozen krbd test suite runs with no
> >>> > issues other than those with the test suite itself.  The diffstat is
> >>> > rather big, so I didn't want to spam the list.  If someone wants it
> >>> > posted, let me know.  Any comments are welcome!
> >>>
> >>> That's *awesome*!  Thanks for sharing, Ilya.  Obviously this will need a
> >>> lot of time to digest but a quick attempt to do a mount using a v2 monitor
> >>> is just showing me a bunch of:
> >>>
> >>> libceph: mon0 (1)192.168.155.1:40898 socket closed (con state V1_BANNER)
> >>>
> >>> Note that this was just me giving it a try with a dummy vstart cluster
> >>> (octopus IIRC), so nothing that could be considered testing.  I'll try to
> >>> find out what I'm doing wrong in the next couple of days or, worst case,
> >>> after EOY vacations.
> >>
> >> Hi Luis,
> >>
> >> This is because the kernel continues to default to msgr1.  The socket
> >> gets closed by the mon right after it sees msgr1 banner and you should
> >> see "peer ... is using msgr V1 protocol" error in the log.
> >>
> >> For msgr2, you need to select a connection mode using the new ms_mode
> >> option:
> >>
> >>   ms_mode=legacy        - msgr1 (default)
> >>   ms_mode=crc           - crc mode, if denied fail
> >>   ms_mode=secure        - secure mode, if denied fail
> >>   ms_mode=prefer-crc    - crc mode, if denied agree to secure mode
> >>   ms_mode=prefer-secure - secure mode, if denied agree to crc mode
> >
> > Ah, right.  I should have took a quick look at the patches first to check
> > for any new parameters.  Thanks for pointing me at that, I'll retry my
> > test using ms_mode.
>
> Maybe you're already aware of this, but here it goes: I'm seeing the
> following BUG when doing a mount using ms_mode=secure:
>
> [  159.648716] ==================================================================
> [  159.652328] BUG: KASAN: slab-out-of-bounds in prepare_head_secure_small+0x101/0x110
> [  159.656203] Write of size 16 at addr ffff8881014eb5c0 by task kworker/0:1/12
> [  159.659764]
> [  159.660600] CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.10.0-rc2+ #55
> [  159.663985] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
> [  159.669423] Workqueue: ceph-msgr ceph_con_workfn
> [  159.671559] Call Trace:
> [  159.672657]  dump_stack+0x9a/0xcc
> [  159.674115]  ? prepare_head_secure_small+0x101/0x110
> [  159.676127]  print_address_description.constprop.0+0x1c/0x220
> [  159.678455]  ? prepare_head_secure_small+0x101/0x110
> [  159.680334]  ? prepare_head_secure_small+0x101/0x110
> [  159.682245]  kasan_report.cold+0x1f/0x37
> [  159.683730]  ? queue_zeros+0x141/0x1c0
> [  159.685069]  ? prepare_head_secure_small+0x101/0x110
> [  159.686849]  check_memory_region+0x145/0x1a0
> [  159.688306]  memset+0x20/0x40
> [  159.689318]  prepare_head_secure_small+0x101/0x110
> [  159.691034]  ? gcm_crypt+0x190/0x190
> [  159.692266]  ? memset+0x20/0x40
> [  159.693349]  ? encode_preamble+0xc0/0xd0
> [  159.694591]  __prepare_control+0x3f1/0x450
> [  159.695463]  ? prepare_read_preamble+0x190/0x190
> [  159.696450]  ? kasan_unpoison_shadow+0x33/0x40
> [  159.697439]  ? ceph_kvmalloc+0x7e/0x110
> [  159.698369]  __handle_control+0x1115/0x2120
> [  159.699294]  ? process_hello+0x510/0x510
> [  159.700130]  ? lock_chain_count+0x20/0x20
> [  159.700995]  ? lock_acquire+0x1b1/0x580
> [  159.701780]  ? create_object.isra.0+0x239/0x4e0
> [  159.702776]  ? lockdep_enabled+0x39/0x50
> [  159.703667]  ? find_held_lock+0x85/0xa0
> [  159.704448]  ? __kmalloc+0x16d/0x330
> [  159.705231]  ? mark_held_locks+0x65/0x90
> [  159.706020]  ? lockdep_enabled+0x39/0x50
> [  159.706856]  ? ceph_kvmalloc+0x7e/0x110
> [  159.707588]  ceph_con_v2_try_read+0x13dc/0x1e00
> [  159.708456]  ? populate_out_iter+0x14b0/0x14b0
> [  159.709306]  ? __mutex_lock+0x2e9/0xb70
> [  159.710000]  ? trace_hardirqs_on+0x3e/0x110
> [  159.710770]  ? ceph_con_workfn+0x41/0xb00
> [  159.711503]  ? lock_acquire+0x1b1/0x580
> [  159.712187]  ? process_one_work+0x4b2/0xaa0
> [  159.712935]  ? lock_release+0x410/0x410
> [  159.713617]  ? lock_downgrade+0x380/0x380
> [  159.714353]  ceph_con_workfn+0x2df/0xb00
> [  159.715048]  process_one_work+0x599/0xaa0
> [  159.715729]  ? pwq_dec_nr_in_flight+0x110/0x110
> [  159.716489]  ? rwlock_bug.part.0+0x60/0x60
> [  159.717187]  worker_thread+0x2bc/0x780
> [  159.717832]  ? process_one_work+0xaa0/0xaa0
> [  159.718542]  kthread+0x1de/0x200
> [  159.719087]  ? kthread_create_worker_on_cpu+0xd0/0xd0
> [  159.719892]  ret_from_fork+0x22/0x30
> [  159.720488]
> [  159.720744] Allocated by task 12:
> [  159.721285]  kasan_save_stack+0x1b/0x40
> [  159.721901]  __kasan_kmalloc.constprop.0+0xc2/0xd0
> [  159.722678]  ceph_kvmalloc+0x68/0x110
> [  159.723249]  alloc_conn_buf+0x42/0xd0
> [  159.723817]  __handle_control+0x10bb/0x2120
> [  159.724456]  ceph_con_v2_try_read+0x13dc/0x1e00
> [  159.725149]  ceph_con_workfn+0x2df/0xb00
> [  159.725755]  process_one_work+0x599/0xaa0
> [  159.726382]  worker_thread+0x2bc/0x780
> [  159.726957]  kthread+0x1de/0x200
> [  159.727445]  ret_from_fork+0x22/0x30
> [  159.727976]
> [  159.728216] The buggy address belongs to the object at ffff8881014eb580
> [  159.728216]  which belongs to the cache kmalloc-96 of size 96
> [  159.730011] The buggy address is located 64 bytes inside of
> [  159.730011]  96-byte region [ffff8881014eb580, ffff8881014eb5e0)
> [  159.731648] The buggy address belongs to the page:
> [  159.732333] page:00000000892842f3 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1014eb
> [  159.733656] flags: 0x8000000000000200(slab)
> [  159.734264] raw: 8000000000000200 ffffea000415eac0 0000001900000019 ffff888100041780
> [  159.735355] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
> [  159.736437] page dumped because: kasan: bad access detected
> [  159.737229]
> [  159.737454] Memory state around the buggy address:
> [  159.738130]  ffff8881014eb480: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
> [  159.739533]  ffff8881014eb500: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
> [  159.740898] >ffff8881014eb580: 00 00 00 00 00 00 00 00 04 fc fc fc fc fc fc fc
> [  159.742221]                                            ^
> [  159.743035]  ffff8881014eb600: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
> [  159.744053]  ffff8881014eb680: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
> [  159.745091] ==================================================================
>
> Looks like the memset in prepare_head_secure_small() is cleaning behind
> the base limits.  Unfortunately, I didn't really had time to dig deeper
> into this.

Ah, I disabled KASAN for some performance testing and didn't turn
it back on.  This doesn't actually corrupt any memory because the
96-byte object that gets allocated is big enough.  In fact, the
relevant code used to request 96 bytes independent of the connection
mode until I changed it to follow the on-wire format more strictly.

This frame is 68 bytes in plane mode and 96 bytes in secure mode
but we are requesting 68 bytes in both modes.  The following should
fix it:

diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
index 5e38c847317b..11fd47b36fc8 100644
--- a/net/ceph/messenger_v2.c
+++ b/net/ceph/messenger_v2.c
@@ -1333,7 +1333,8 @@ static int prepare_auth_signature(struct
ceph_connection *con)
        void *buf;
        int ret;

-       buf = alloc_conn_buf(con, head_onwire_len(SHA256_DIGEST_SIZE, false));
+       buf = alloc_conn_buf(con, head_onwire_len(SHA256_DIGEST_SIZE,
+                                                 con_secure(con)));
        if (!buf)
                return -ENOMEM;

Thanks,

                Ilya



[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux