Re: interrupt/tasklet issue in custom driver on recent kernels

Peter Teoh <htmldeveloper@xxxxxxxxx> · Mon, 28 Sep 2009 12:08:44 -0400

On Mon, Sep 28, 2009 at 11:10 AM, Mulyadi Santosa
<mulyadi.santosa@xxxxxxxxx> wrote:
> On Mon, Sep 28, 2009 at 2:32 PM, Jason Nymble <jason.nymble@xxxxxxxxx> wrote:
>> 2.6.25 Changelog :
>> commit 9af993a92623e022c176459fa6607a564b9a7eaf
>> Author: Ingo Molnar <mingo@xxxxxxx>
>> Date:   Wed Jan 30 13:34:09 2008 +0100
>>
>>    x86: make ioremap() UC by default
>>

If UC (uncacheable) is the default, there are fewer coherency
problems: accesses bypass the cache, and ordering conflicts between
cores can be resolved with memory barriers.   The tradeoff is
performance.   So if you are sure how to handle the SMP/multicore
side, and the behavior of the region is well understood, then WB
(write-back) is the way to make full use of the cache.

Sometimes a third option, write-combining (WC), is used.   Here the
L1/L2/L3 caches are bypassed, and writes are instead collected in a
separate buffer and combined before being burst out to the I/O
device.   E.g., for Matrox (from Documentation/fb/matroxfb.txt):

    nomtrr   - disables write combining on frame buffer. This slows down driver but
               there is reported minor incompatibility between GUS DMA and XFree
               under high loads if write combining is enabled (sound dropouts).
    mtrr     - enables write combining on frame buffer. It speeds up video accesses
               much. It is default. You must have MTRR support enabled in kernel
               and your CPU must have MTRR (f.e. Pentium II have them).

and many other drivers, mostly graphics (paths relative to drivers/):

./gpu/drm/i915/i915_gem.c:
	drm_core_ioremap_wc(&ring->map, dev);

./gpu/drm/i915/i915_dma.c:
		drm_core_ioremap_wc(&dev_priv->ring.map, dev);
	drm_core_ioremap_wc(&dev_priv->hws_map, dev);

./gpu/drm/i915/intel_fb.c:
	info->screen_base = ioremap_wc(dev->agp->base + obj_priv->gtt_offset,

./gpu/drm/radeon/r600_cp.c:
		drm_core_ioremap_wc(dev_priv->cp_ring, dev);
		drm_core_ioremap_wc(dev_priv->ring_rptr, dev);
		drm_core_ioremap_wc(dev->agp_buffer_map, dev);
		drm_core_ioremap_wc(&dev_priv->gart_info.mapping, dev);

./gpu/drm/radeon/radeon_cp.c:
		drm_core_ioremap_wc(dev_priv->cp_ring, dev);
		drm_core_ioremap_wc(dev_priv->ring_rptr, dev);
		drm_core_ioremap_wc(dev->agp_buffer_map, dev);
			drm_core_ioremap_wc(&dev_priv->gart_info.mapping, dev);

./gpu/drm/r128/r128_cce.c:
		drm_core_ioremap_wc(dev_priv->cce_ring, dev);
		drm_core_ioremap_wc(dev_priv->ring_rptr, dev);
		drm_core_ioremap_wc(dev->agp_buffer_map, dev);

./net/myri10ge/myri10ge.c:
	mgp->sram = ioremap_wc(mgp->iomem_base, mgp->board_span);

>>    Yes! A mere 120 c_p_a() fixing and rewriting patches later,
>>    we are now confident that we can enable UC by default for
>>    ioremap(), on x86 too.
>>
>>    Every other architectures was doing this already. Doing so
>>    makes Linux more robust against MTRR mixups (which might go
>>    unnoticed if BIOS writers test other OSs only - where PAT
>>    might override bad MTRRs defaults).

Both PAT (Page Attribute Table) and the MTRRs (Memory Type Range
Registers) are used to specify the caching attributes of memory.

cat /proc/mtrr
reg00: base=0x080000000 ( 2048MB), size= 2048MB, count=1: uncachable
reg01: base=0x000000000 (    0MB), size= 4096MB, count=1: write-back
reg02: base=0x100000000 ( 4096MB), size= 1024MB, count=1: write-back

and in dmesg:

    [    0.000000] MTRR default type: uncachable
    [    0.000000] MTRR fixed ranges enabled:
    [    0.000000]   00000-9FFFF write-back
    [    0.000000]   A0000-DFFFF uncachable
    [    0.000000]   E0000-EFFFF write-through
    [    0.000000]   F0000-FFFFF write-protect
    [    0.000000] MTRR variable ranges enabled:
    [    0.000000]   0 base 080000000 mask F80000000 uncachable
    [    0.000000]   1 base 000000000 mask F00000000 write-back
    [    0.000000]   2 base 100000000 mask FC0000000 write-back
    [    0.000000]   3 disabled
    [    0.000000]   4 disabled
    [    0.000000]   5 disabled
    [    0.000000]   6 disabled
    [    0.000000]   7 disabled
    [    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
    [    0.000000] e820 update range: 0000000080000000 - 0000000100000000 (usable) ==> (reserved)
    [    0.000000] last_pfn = 0x7ff90 max_arch_pfn = 0x400000000
    [    0.000000] initial memory mapped : 0 - 20000000
    [    0.000000] init_memory_mapping: 0000000000000000-000000007ff90000
    [    0.000000]  0000000000 - 007ff90000 page 4k

PAT is per-page, whereas MTRRs are per physical address range, and
only a small number of variable ranges is available (8 on this
machine, going by the dmesg output above).   As far as I know, MTRRs
are on the way out, to be replaced by PAT.   But since the two are
programmed independently, they may conflict with each other.

>>
>>    Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
>>    Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>
> Could somebody shed a light on what "more robust against MTRR mixups"
> here means?
>
>

-- 
Regards,
Peter Teoh
