Re: [PATCH 2/2] DSPBRIDGE: add checking 128 byte alignment for dsp cache line size

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 13, 2009 at 5:18 PM, Hiroshi DOYU <Hiroshi.DOYU@xxxxxxxxx> wrote:
> From: ext Felipe Contreras <felipe.contreras@xxxxxxxxx>
> Subject: Re: [PATCH 2/2] DSPBRIDGE: add checking 128 byte alignment for dsp cache line size
> Date: Wed, 13 May 2009 14:50:26 +0200
>
> [...]
>
>> >> >> If we are not going to have a read-only/write-only flag then I'm
>> >> >> against adding the alignment check. It will only force user-space to
>> >> >> do memory copies unnecessarily. That would kill performance
>> >> >> drastically. NAK!
>> >> >
>> >> > At least, better than allowing user to crash kernel.
>> >>
>> >> The kernel would crash even with that check because it's the software
>> >> running in the *DSP* side that is causing the corruption.
>> >>
>> >> It sounds you are still not convinced by that so I'll write a test
>> >> application to make sure my assumption is correct.
>> >
>> > You misunderstood.
>> >
>> > There are 2 patterns when kernel crashes from ARM cache flush and DSP
>> > cache flush(*1).
>>
>> Let's suppose that's true, if you enforce ARM cache flushes to be
>> aligned (pattern 1) the DSP cache flushes (pattern 2) can still
>> corrupt the memory. Therefore even with your check the potential for
>> memory corruption is still there.
>>
>> But it doesn't seem it's true; I modified my dsp-dummy test to detect
>
> We are not talking about the specific application program/case, but
> the general risk that kernel can be crashed by userland with incorrect
> parameters in dspbridge.

The purpose of the test application is to quickly test different
scenarios... tell me which scenario will trigger the corruption you
are talking about, and I'll modify the test app to reproduce it.

>> memory corruption by constantly modifying the data before and after
>> the buffer sent to the DSP. Map, Flush and Invalidate are unaligned in
>> the ARM side, and not surprisingly there's memory corruption. *But* if
>> I modify the DSP socket node to don't flush/invalidate the first/last
>> 128 bytes of the buffer then there's no memory corruption.
>>
>> So that's proof Flush/Invalidate on ARM side don't cause memory
>> corruption.
>
> No, the above doesn't prove that ARM side doesn't cause memory
> corruption(pattern 1).
>
> Currently dspbridge doesn't have any parameter checking/validation
> from "bridge_ioctl()" to "v7_dma_*_range()". *Any* value/address can
> be passed to "v7_dma_*_range()" from userland. It means that any data
> which resides on cache can be , for exmaple, invalidated by userland
> spontaneously. That is why I mentioned the necessity of "VMA" checking
> to prevent this.

If user-space manually invalidates some random memory areas
spontaneously what's the worst that could happen? Bad performance?

The problem with corruption is not the cache invalidation, it's the
contents of the main memory.

> For pattern 2, if a buffer is aligned on DSP cache size, then just
> adding VMA checking can prevent userland from corrupting other
> process/kernel address space. Also same kind of VMA checking for DSP
> mmaping can prevent other cases too. So I think that, at least, we can
> protect kernel address from crash with having VMA and alignment
> checking. I'll work on that if we can agree on this.

pattern 2 is triggered by the software running on the DSP. Why are you
talking about VMA stuff that happens in the linux kernel on *ARM*
side?

>> Do you want to see the code?
>
> Yes, I want!! it would be quite helpful to test something easily;)

Ok, I pushed a branch:
http://github.com/felipec/dsp-dummy/tree/corruption-test

Now, I ran a few more tests and it turns out 'pattern 2' is not enough
to trigger the corruption, but neither is 'pattern 1'. The corruption
happens only when both happen at the same time.

So, the DSP writes on a wrong memory area, then flushes the cache, so
it goes the main memory, but the ARM side already updated the memory
on it's cache and it's different from the main memory. Then ARM side
invalidates it's cache and *bang* the corruption happens.

Now is ARM cache invalidation the problem? No, it's the DSP cache
flush. Even if there isn't a manual ARM cache invalidation the cache
will receive the main memory update at some point, wouldn't it? Maybe
a cache miss.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Arm (vger)]     [ARM Kernel]     [ARM MSM]     [Linux Tegra]     [Linux WPAN Networking]     [Linux Wireless Networking]     [Maemo Users]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux