Re: WARNING in netlbl_cipsov4_add

On Fri, Apr 23, 2021 at 6:47 AM Vegard Nossum <vegard.nossum@xxxxxxxxxx> wrote:
> Hi Paul,
>
> This syzbot report reproduces in mainline for me and it looks like
> you're the author/maintainer of this code, so I'm just adding some info
> to hopefully aid the preparation of a fix:

Hi Vegard,

Yes, you've come to the right place; thank you for your help in
tracking this down.  Some comments and initial thoughts below ...

> On 2021-02-20 08:05, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:

...

> Running strace on the reproducer says:
>
> socket(PF_NETLINK, SOCK_RAW, NETLINK_GENERIC) = 3
> socket(PF_NETLINK, SOCK_RAW, NETLINK_GENERIC) = 4
> sendto(4,
> "(\0\0\0\20\0\5\0\0\0\0\0\0\0\0\0\3\0\0\0\21\0\2\0NLBL_CIPSOv4\0\0\0\0",
> 40, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 40
> recvfrom(4,
> "\234\0\0\0\20\0\0\0\0\0\0\0\f\r\0\0\1\2\0\0\21\0\2\0NLBL_CIPSOv4\0\0\0\0\6\0\1\0\24\0\0\0\10\0\3\0\3\0\0\0\10\0\4\0\0\0\0\0\10\0\5\0\f\0\0\0T\0\6\0\24\0\1\0\10\0\1\0\1\0\0\0\10\0\2\0\v\0\0\0\24\0\2\0\10\0\1\0\2\0\0\0\10\0\2\0\v\0\0\0\24\0\3\0\10\0\1\0\3\0\0\0\10\0\2\0\n\0\0\0\24\0\4\0\10\0\1\0\4\0\0\0\10\0\2\0\f\0\0\0",
> 4096, 0, NULL, NULL) = 156
> recvfrom(4,
> "$\0\0\0\2\0\0\0\0\0\0\0\f\r\0\0\0\0\0\0(\0\0\0\20\0\5\0\0\0\0\0\0\0\0\0",
> 4096, 0, NULL, NULL) = 36
> sendmsg(3, {msg_name(0)=NULL,
> msg_iov(1)=[{"T\0\0\0\24\0\1\0\0\0\0\0\0\0\0\0\1\0\0\0,\0\10\200\34\0\7\200\10\0\5\0\3608)
> \10\0\6\0\0\0\0\0\10\0\6\0\0\0\0\0\f\0\7\200\10\0\5\0\0\0\0\0\4\0\4\200\10\0\1\0\0\0\0\0\10\0\2\0\1\0\0\0",
> 84}], msg_controllen=0, msg_flags=0}, 0) = 84
>
> We're ending up in netlbl_cipsov4_add() with CIPSO_V4_MAP_TRANS, so it
> calls netlbl_cipsov4_add_std() where this is the problematic allocation:
>
> doi_def->map.std->lvl.local = kcalloc(doi_def->map.std->lvl.local_size,
>                                        sizeof(u32),
>                                        GFP_KERNEL);
>
> It looks like there is already a check on the max size:
>
> if (nla_get_u32(nla_b) >
>      CIPSO_V4_MAX_LOC_LVLS)
>          goto add_std_failure;
> if (nla_get_u32(nla_b) >=
>      doi_def->map.std->lvl.local_size)
>       doi_def->map.std->lvl.local_size =
>               nla_get_u32(nla_b) + 1;
>
> However, the limit is quite generous:
>
> #define CIPSO_V4_INV_LVL              0x80000000
> #define CIPSO_V4_MAX_LOC_LVLS         (CIPSO_V4_INV_LVL - 1)
>
> so maybe a fix would just lower this to something that agrees with the
> page allocator?

Hmm, I agree that from a practical point of view the limit does seem
high.  The issue is that I'm not sure we have an easy way to determine
an appropriate local limit, since it depends on the LSM and, in some
cases, on the LSM's loaded policy.  On the plus side, you need
privilege to get this far in the code, so the impact is minimized,
although we should still look into catching this prior to the
WARN_ON_ONCE() in __alloc_pages_nodemask().  If nothing else it would
let the fuzzers keep making progress instead of dying here.

We could drop CIPSO_V4_MAX_LOC_LVLS to an arbitrary value, or better
yet make it a sysctl (or similar), but that doesn't feel right and I'd
prefer not to create another runtime config knob if we don't have to.
Is there a safe/stable way to ask the allocator what size is *too*
big?  That might be a better solution, as we could catch it in the
CIPSO code and return an error before ever calling kcalloc().  I'm not
a mm expert, but looking through include/linux/slab.h I wonder if we
could just compare the allocation size against KMALLOC_MAX_SIZE, i.e.
(1UL << KMALLOC_SHIFT_MAX)?  Or is that still too big?
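
To make that concrete, here is a rough sketch of the kind of check I
have in mind in netlbl_cipsov4_add_std(), using KMALLOC_MAX_SIZE as
the cap.  Treat it as an illustration only; whether that's the right
bound, and exactly where the check belongs, is the open question:

	/* Reject level maps the slab allocator could never satisfy;
	 * KMALLOC_MAX_SIZE is the largest kmalloc()-able request, so
	 * cap the element count at KMALLOC_MAX_SIZE / sizeof(u32). */
	if (doi_def->map.std->lvl.local_size >
	    KMALLOC_MAX_SIZE / sizeof(u32))
		goto add_std_failure;

	doi_def->map.std->lvl.local =
		kcalloc(doi_def->map.std->lvl.local_size,
			sizeof(u32), GFP_KERNEL);
	if (doi_def->map.std->lvl.local == NULL)
		goto add_std_failure;

That way the request fails cleanly with an error before the page
allocator ever sees an impossible size, and the WARN_ON_ONCE() never
fires.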

> At first glance it may appear like there is a similar issue with
> doi_def->map.std->lvl.cipso_size, but that one looks restricted to a
> saner limit of CIPSO_V4_MAX_REM_LVLS == 255. It's probably better to
> double check both in case I read this wrong.

This one is a bit easier: that limit comes from the CIPSO protocol
itself (the level carried on the wire is a single octet), so we really
shouldn't change it.

FWIW, I would expect that we would have a similar issue with the
CIPSO_V4_MAX_LOC_CATS check in the same function.  My initial thinking
is that we can solve it in the same manner as the
CIPSO_V4_MAX_LOC_LVLS fix.
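
For reference, the local category handling in netlbl_cipsov4_add_std()
has the same shape as the level handling (quoting from memory here, so
double-check against the tree), which is why I expect the same kind of
fix to apply:

	/* NLBL_CIPSOV4_A_MLSCATLOC handling, analogous to the level
	 * case above: the only bound on the later kcalloc() size is
	 * the very generous CIPSO_V4_MAX_LOC_CATS. */
	if (nla_get_u32(nla_b) > CIPSO_V4_MAX_LOC_CATS)
		goto add_std_failure;
	if (nla_get_u32(nla_b) >= doi_def->map.std->cat.local_size)
		doi_def->map.std->cat.local_size =
			nla_get_u32(nla_b) + 1;

Whatever cap we settle on for the level maps should be applied to the
category maps at the same time.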

-- 
paul moore
www.paul-moore.com




