Re: [RFC PATCH v2 0/2] Randomization of address chosen by mmap.

Ilya Smith <blackzert@xxxxxxxxx> · Tue, 27 Mar 2018 16:51:08 +0300

> On 27 Mar 2018, at 10:24, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> 
> On Mon 26-03-18 22:45:31, Ilya Smith wrote:
>> 
>>> On 26 Mar 2018, at 11:46, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>>> 
>>> On Fri 23-03-18 20:55:49, Ilya Smith wrote:
>>>> 
>>>>> On 23 Mar 2018, at 15:48, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>>>>> 
>>>>> On Thu, Mar 22, 2018 at 07:36:36PM +0300, Ilya Smith wrote:
>>>>>> Current implementation doesn't randomize address returned by mmap.
>>>>>> All the entropy ends with choosing mmap_base_addr at the process
>>>>>> creation. After that mmap build very predictable layout of address
>>>>>> space. It allows to bypass ASLR in many cases. This patch make
>>>>>> randomization of address on any mmap call.
>>>>> 
>>>>> Why should this be done in the kernel rather than libc?  libc is perfectly
>>>>> capable of specifying random numbers in the first argument of mmap.
>>>> Well, there is following reasons:
>>>> 1. It should be done in any libc implementation, what is not possible IMO;
>>> 
>>> Is this really so helpful?
>> 
>> Yes, ASLR is one of very important mitigation techniques which are really used 
>> to protect applications. If there is no ASLR, it is very easy to exploit 
>> vulnerable application and compromise the system. We can’t just fix all the 
>> vulnerabilities right now, thats why we have mitigations - techniques which are 
>> makes exploitation more hard or impossible in some cases.
>> 
>> Thats why it is helpful.
> 
> I am not questioning ASLR in general. I am asking whether we really need
> per mmap ASLR in general. I can imagine that some environments want to
> pay the additional price and other side effects, but considering this
> can be achieved by libc, why to add more code to the kernel?

I believe the kernel is the only right place for it. By adding these 200+ lines
of code we give this feature to every user: on desktop, on server, on IoT
device, on SCADA, etc. But if only glibc implements ‘user-mode ASLR’, IoT and
SCADA devices will never get it.

>>> 
>>>> 2. User mode is not that layer which should be responsible for choosing
>>>> random address or handling entropy;
>>> 
>>> Why?
>> 
>> Because of the following reasons:
>> 1. To get random address you should have entropy. These entropy shouldn’t be 
>> exposed to attacker anyhow, the best case is to get it from kernel. So this is
>> a syscall.
> 
> /dev/[u]random is not sufficient?

Using /dev/[u]random takes 3 syscalls: open, read, close. This is a performance
issue.
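
To make the cost concrete, here is a minimal sketch of what a libc would have
to do for every randomized hint if it took entropy from /dev/urandom. The
function names urandom_u64() and page_align_hint() are made up for this mail,
they are not part of any libc or of the patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Read 8 random bytes from /dev/urandom into *out.
 * Returns 0 on success, -1 on failure. */
int urandom_u64(uint64_t *out)
{
    int fd = open("/dev/urandom", O_RDONLY);      /* syscall 1 */
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, out, sizeof(*out));      /* syscall 2 */
    close(fd);                                    /* syscall 3 */
    return n == (ssize_t)sizeof(*out) ? 0 : -1;
}

/* Turn raw entropy into a page-aligned value usable as an mmap hint.
 * page_size must be a non-zero power of two. */
uintptr_t page_align_hint(uint64_t rnd, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (uintptr_t)(rnd & ~(page_size - 1));
}
```

Keeping the descriptor open would save the open/close pair, but then every
process has to carry that descriptor around.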

> 
>> 2. You should have memory map of your process to prevent remapping or big
>> fragmentation. Kernel already has this map.
> 
> /proc/self/maps?

Not every system has /proc, and parsing /proc/self/maps is expensive, so it is
a performance issue. libc would have to do it on every mmap. There is also a
possible race here: the application may mmap/unmap memory with a native syscall
while another thread is reading maps.
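
A rough sketch of the check a libc would need before each mmap, assuming /proc
is mounted. range_is_free() is a hypothetical helper written for this mail. It
walks the whole file on every call, and the answer may already be stale when it
returns, which is exactly the race mentioned above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Returns 1 if [start, start + len) overlaps no existing mapping,
 * 0 if it overlaps one, -1 if /proc/self/maps cannot be opened. */
int range_is_free(uintptr_t start, size_t len)
{
    assert(len > 0);

    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    char *line = NULL;
    size_t cap = 0;
    int is_free = 1;

    /* Each line begins with "<start>-<end>" in hex. */
    while (getline(&line, &cap, f) != -1) {
        unsigned long lo, hi;
        if (sscanf(line, "%lx-%lx", &lo, &hi) != 2)
            continue;
        if (start < hi && start + len > lo) {
            is_free = 0;
            break;
        }
    }

    free(line);
    fclose(f);
    return is_free;
}
```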

>> You will got another one in libc.
>> And any non-libc user of mmap (via syscall, etc) will make hole in your map.
>> This one also decrease performance cause you any way call syscall_mmap 
>> which will try to find some address for you in worst case, but after you already
>> did some computing on it.
> 
> I do not understand. a) you should be prepared to pay an additional
> price for an additional security measures and b) how would anybody punch
> a hole into your mapping? 
> 

I was talking about any code that calls mmap directly, without the libc wrapper.

>> 3. The more memory you use in userland for these proposal, the easier for
>> attacker to leak it or use in exploitation techniques.
> 
> This is true in general, isn't it? I fail to see how kernel chosen and
> user chosen ranges would make any difference.

My point here was that libc would have to keep a representation of the memory
map as a tree, and this tree increases the attack surface. It could stay hidden
in the kernel, as it is right now.

> 
>> 4. It is so easy to fix Kernel function and so hard to support memory
>> management from userspace.
> 
> Well, on the other hand the new layout mode will add a maintenance
> burden on the kernel and will have to be maintained for ever because it
> is a user visible ABI.

That’s why I sent this patch as an RFC and would like to discuss this ABI here.
I made the randomize_va_space parameter allow disabling randomization for the
whole system. The PF_RANDOMIZE flag may disable randomization for a specific
process (or process groups?). For architectures I’ve made
info.random_shift = 0, so if your arch has a small address space you may disable
shifting. I would also like to add a sysctl to allow processes/groups to change
this value, so that some processes can have bigger shifts than others. Let’s
discuss it, please.

> 
>>>> 3. Memory fragmentation is unpredictable in this case
>>>> 
>>>> Off course user mode could use random ‘hint’ address, but kernel may
>>>> discard this address if it is occupied for example and allocate just before
>>>> closest vma. So this solution doesn’t give that much security like 
>>>> randomization address inside kernel.
>>> 
>>> The userspace can use the new MAP_FIXED_NOREPLACE to probe for the
>>> address range atomically and chose a different range on failure.
>>> 
>> 
>> This algorithm should track current memory. If he doesn’t he may cause
>> infinite loop while trying to choose memory. And each iteration increase time
>> needed on allocation new memory, what is not preferred by any libc library
>> developer.
> 
> Well, I am pretty sure userspace can implement proper free ranges
> tracking…

I think we need to know what libc developers will say about implementing ASLR
in user mode. I am pretty sure they will say ‘never’ or ‘some day’. And the
problem of ASLR will stay forever.
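
For reference, the user-space probing approach suggested above could be
sketched roughly as below. This is only an illustration under several
assumptions: a 64-bit Linux target, a made-up probe window (PROBE_BASE,
PROBE_SPAN), a made-up retry bound (MAX_PROBES), and random() as a placeholder
that is NOT a safe entropy source. mmap_random() is a hypothetical name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000  /* asm-generic value, Linux >= 4.17 */
#endif

#define MAX_PROBES 16                 /* hypothetical bound, avoids looping forever */
#define PROBE_BASE (1ULL << 32)       /* hypothetical window start */
#define PROBE_SPAN (1ULL << 36)       /* hypothetical window size in bytes */

/* Try to map len bytes at a random page-aligned address.
 * Returns the mapping, or NULL if every probe failed. */
void *mmap_random(size_t len, int prot)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0 && len > 0);

    for (int i = 0; i < MAX_PROBES; i++) {
        /* Placeholder entropy: a real implementation needs a secure source. */
        uint64_t slot = (uint64_t)random() % (PROBE_SPAN / (uint64_t)page);
        void *want = (void *)(uintptr_t)(PROBE_BASE + slot * (uint64_t)page);

        /* One syscall per probe; an occupied range fails instead of
         * being replaced. */
        void *got = mmap(want, len, prot,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE,
                         -1, 0);
        if (got == MAP_FAILED)
            continue;
        if (got == want)
            return got;

        /* Older kernels ignore the flag and treat the address as a hint. */
        munmap(got, len);
    }
    return NULL;
}
```

Every failed probe is one more syscall, and without tracking free ranges the
only protection against looping is a bound like MAX_PROBES.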

Thanks,
Ilya