On May 14, 2007, at 8:56 AM, Erik Mouw wrote:
On Sat, May 12, 2007 at 10:46:21AM -0400, Michael Cashwell wrote:
On May 12, 2007, at 4:49 AM, Erik Mouw wrote:
It *is* running in kernel mode, and that's also what the Oops
says: SVC_32 is ARM supervisor mode (aka kernel mode). It also
says that the problem is accessing user mode memory. Without
having seen the source, I guess you're probably directly
dereferencing a userland address without using copy_from_user().
Interesting. The faulting instruction is the driver attempting to
read the endianness mode from a CPU peripheral register. The
address is 0xfffeec34 and the code (after preprocessing; and from
memory) is essentially:
    unsigned int emode = *((volatile unsigned int *) 0xfffeec34);
That sounds like a broken driver to me. ARM CPU endianness is a
compile time decision, not a run time one. Trying to detect it at
runtime is broken: if the module is compiled LE, testing for that
will always tell you the system runs LE because an LE module will
not even run on a BE system.
I wasn't clear. This is an OMAP processor. That's an ARM core
combined with a DSP and a unit called the Traffic Controller (a
memory controller that marshals accesses by them to a unified memory
space). The endianness in question is a mode in the Traffic Controller
that controls a byte/word-swapping engine to allow the ARM to be
little-endian while the DSP is big-endian such that each "sees"
physical memory natively. The ARM core itself in these processors is
LE only. I assume this was to decrease its gate count.
Just remove the test.
It then just faults a few lines later when it tries to activate the
DSP clock which is another near-by peripheral register. I'd already
tried that and it behaved predictably.
Oh, btw the use of volatile is considered harmful.
Sorry, but that's not universally true.
It is true when dealing with memory shared across multiple threads of
execution (and perhaps multiple processors). Volatile is not
sufficient to protect such shared data and once one has put proper
protection in place (mutexes) adding volatile is then extraneous and
performance-robbing. All quite true.
However, your reference:
See http://lkml.org/lkml/2007/5/11/193 (this patch or an improved
version will likely be merged in the future).
explicitly exempts my case. As I noted in my original post, the addresses I'm
accessing are for memory-mapped hardware that isn't bound by the
critical regions created by the various means to hold off preemption
(which mutexes and the like all boil down to).
When dealing with hardware it's necessary to tell gcc not to optimize
away dereferences that look superfluous. Applying volatile to such a
pointer tells the compiler that it is not to infer the location's
content based on previous reads or writes and that it is required to
emit code to actually perform each dereference as written. Getting
the memory accesses out to the bus in program order can require
further trickery on superscalar CPUs [e.g., eieio on PPC], but
convincing the compiler not to eliminate any is still a required
first step.
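
To make the point above concrete, here is a minimal user-space sketch. The names `fake_status_reg`, `reg_read32`, `reg_write32` and `read_twice_xor` are hypothetical, and an ordinary variable stands in for the memory-mapped register so the code can be compiled anywhere; a real driver would point at a properly mapped device address instead.

```c
#include <stdint.h>

/* Stand-in for a hardware register. Assumption for this demo only:
 * on real hardware this would be a mapped device address. */
static uint32_t fake_status_reg;

/* Hypothetical helper: read through a volatile-qualified pointer so
 * the compiler must emit one load for every call. */
static uint32_t reg_read32(const volatile uint32_t *reg)
{
    return *reg;
}

/* Hypothetical helper: write through a volatile-qualified pointer so
 * the store cannot be dropped as "dead". */
static void reg_write32(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;
}

/* Two back-to-back reads. Because both accesses are volatile, the
 * compiler may not merge them into a single load, even though nothing
 * in this function changes the value in between. */
static uint32_t read_twice_xor(const volatile uint32_t *reg)
{
    uint32_t first = *reg;
    uint32_t second = *reg;
    return first ^ second;
}
```

Note that this only constrains the compiler. As said above, ordering on the bus is a separate problem that volatile does not solve.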
...
I've looked there and will pursue it but my problem is that the
driver code is written with the expectation that these addresses
(constants) can be accessed as virtual addresses. The 1:1 mapping
with the physical addresses (and the no-cache attribute) exist and
work and seem to agree with how the rest of the mach-omap & ARM
code works in the kernel. Indeed, this driver can rely on these
mappings but only when running in a kernel thread.
There is no 1:1 mapping in kernel and there was also never a 1:1
mapping in 2.4 kernels.
We must be looking at different kernels or exist in different
universes. "No" and "never" are big words that would appear to be
unsupported.
Consider arch/arm/mach-omap1/io.c which clearly creates such a
mapping for the DSP (which happens to be the peripheral I'm working
with). The defines for OMAP16XX_DSP_BASE and _START in
include/asm-arm/arch-omap/omap16xx.h explicitly make VA == PA. And it used to be
that a direct dereference worked and was done routinely by kernel
code in 2.4.
There were some changes in the ARM kernel mapping during 2.5
development; it could be that your CPU registers were 1:1 mapped in
2.4 and no longer are. Most ARM platforms have special
macros to access CPU registers that deal with the mapping.
Ah! Now we're getting somewhere! Thanks! This prompted me to
investigate further how the 2.6 platform-specific code did this. More
below.
...If I were controlling private hardware then a mapping where
VA != PA would be OK. But as noted the driver code was written
expecting to have access to the CPU-peripherals at their native
addresses and this assumption is totally in keeping with the other
CPU-specific parts of the kernel.
That assumption is flawed. There is not, and never has been, a 1:1
VA:PA mapping in the kernel.
As noted above, that's demonstrably incorrect for at least some
peripherals, including the one I'm working on.
Wrapping all of those accesses such that they use a computed base
VA instead would be a substantial rewrite. I'd be willing to do
that, but all of this worked under 2.4.20 so I really just keep
coming back to the conclusion that I'm missing something fundamental.
I think it worked in 2.4.20 by sheer luck.
True and had the 2.4 kernel code itself not done direct accesses it's
likely the driver I'm porting to 2.6 would not have originally been
written similarly.
But following your notes above about wrapper macros I looked deeper
into the 2.6 source (in areas where I hadn't needed to touch) and
found the magic set of accessor calls. These seem to boil down to
memory dereferences but with an offset added that was not there (or
was not needed or used rigorously) in 2.4. I've seen claims since 2.4
that accessors were always used and direct access has always been
frowned upon but that's plainly not the case in the 2.4 ARM source.
That aside, having discovered the accessors and after significant
work to retool the driver source to use them (in particular dealing
with its penchant for assigning to a dereferenced pointer as an lvalue,
whereas the accessor needs function-call syntax) I finally was
able to get the code to work.
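
For anyone facing the same conversion, here is a rough user-space sketch of the shape of the change. Every name and constant in it (`DEMO_IO_PHYS_BASE`, `demo_io_address`, `demo_readl`, `demo_writel`, `demo_set_bits`) is made up for illustration and is not the kernel's actual API; an array stands in for the register window, and the address translation is an array index where the kernel would apply its fixed I/O offset.

```c
#include <stdint.h>

/* Hypothetical constants for the demo: a pretend physical base and a
 * 64 KiB window of 32-bit registers. */
#define DEMO_IO_PHYS_BASE 0xfffe0000u
#define DEMO_IO_WORDS     0x4000u

/* Array standing in for the device register window. */
static uint32_t demo_io_window[DEMO_IO_WORDS];

/* Translate a "physical" register address into something that can be
 * dereferenced. In a kernel this is where the platform's I/O offset
 * would be applied; here we simply index into the array. */
static volatile uint32_t *demo_io_address(uint32_t phys)
{
    return &demo_io_window[(phys - DEMO_IO_PHYS_BASE) / sizeof(uint32_t)];
}

/* Accessors with function-call syntax instead of a raw dereference. */
static uint32_t demo_readl(uint32_t phys)
{
    return *demo_io_address(phys);
}

static void demo_writel(uint32_t value, uint32_t phys)
{
    *demo_io_address(phys) = value;
}

/* Old style:  *(volatile uint32_t *)REG |= mask;
 * The dereferenced pointer was an lvalue, so read-modify-write was one
 * statement. With accessors it becomes an explicit read plus write. */
static void demo_set_bits(uint32_t phys, uint32_t mask)
{
    demo_writel(demo_readl(phys) | mask, phys);
}
```

The tedious part of the port is exactly that last function: every place the old code assigned through a dereferenced constant has to be rewritten as a read, a modify, and a write.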
It's interesting to note that the implementation of these accessors
still uses volatile. I'm perfectly content for it to vanish from my
code but it's still in there.
Thanks for the assistance.
-Mike