Re: [PATCH 0/10] Sparse linker

"Alexey Zaytsev" <alexey.zaytsev@xxxxxxxxx> · Fri, 5 Sep 2008 13:49:29 +0400

On Fri, Sep 5, 2008 at 1:24 AM, Christopher Li <sparse@xxxxxxxxxxx> wrote:
> On Thu, Sep 4, 2008 at 1:21 PM, Alexey Zaytsev <alexey.zaytsev@xxxxxxxxx> wrote:
>> Mostly ack here, but I still think the C code has two advantages over
>> binaries: It's easy to read, and it's an easy way to get the shared
>> library filled with the data, see below.
>
> Nothing stops you from having a parsing tool that generates a readable
> format from the object dump. But using the C source as the primary way to
> dump objects is letting the tail wag the dog. The on-disk format should
> be optimized to be easy for the checker to read, rather than for a human.
>
>> The huge disadvantage is the time and the memory it takes to compile
>> the C code.
>
> And the run time dependency of gcc.
>
>> Here I have to disagree. Loading the data from an .so might actually be the
>> most efficient method. See, the bulk of the data in the .so is simply mmap'ed
>> read-only, with only the GOT being read-write, and when mapping with
>> RTLD_LAZY, the pointers are resolved only when you follow them, completely
>> transparently to us. You don't need the fine-grained control, the OS just does
>> the right thing for you. And if the checker needs to look at the bulk
>> of the data,
>
> Are you sure?
>
> Quoting the man page:
> ===================
> RTLD_LAZY
>    Perform lazy binding. Only resolve symbols as the code that
> references them is executed. If the symbol is never referenced, then
> it is never resolved. (Lazy binding is only performed for function
> references; references to variables are always immediately bound when
> the library is loaded.)
> ===================
>
> Your symbols are stored as DATA nodes, not functions. You never EXECUTE
> your sparse object code. RTLD_LAZY has ZERO effect on them. All the symbols
> have to be bound immediately. How could you tell which data pointer is lazily
> bound, given that any data value is possible in the pointer?
>

Confirmed, I was wrong.
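
To make the point concrete, here is a minimal sketch of what a dumped symbol chain might look like as C data. The struct and variable names (sym_node, node_a, chain_length) are made up for illustration and are not what the sparse dumper actually emits. As far as I understand it, when this is built as a position-independent shared object on a typical ELF system, every pointer initializer below becomes a dynamic relocation that the loader applies when the library is mapped, whichever flag is passed to dlopen, because nothing is ever "called" that could trigger a lazy fixup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout, for illustration only. */
struct sym_node {
    const char *name;             /* pointer: needs a load-time fixup in a PIC .so */
    const struct sym_node *next;  /* pointer: same */
};

/* Pure data, no functions: there is no PLT entry to go through,
 * so lazy binding has nothing to hook into. */
static const struct sym_node node_c = { "c", NULL };
static const struct sym_node node_b = { "b", &node_c };
const struct sym_node node_a = { "a", &node_b };

/* Walk the chain by following plain data pointers. */
size_t chain_length(const struct sym_node *n)
{
    size_t len = 0;

    for (; n != NULL; n = n->next) {
        assert(n->name != NULL);  /* every node in this sketch is named */
        len++;
    }
    return len;
}
```

Calling chain_length(&node_a) just dereferences pointers that must already hold valid addresses, which is why they cannot be left unresolved until first use.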

>> it can dlopen with RTLD_NOW. When multiple different checkers are being run
>> over the .so, the bulk of memory is shared between the processes, which I
>> think matters a lot. Memory is cheap, but now the number of cores
>> is growing.
>> E.g. if you've got 4 cores and 4 gigs of RAM, it's only one gig per
>> core, and wasting
>> 300 megabytes per process just to load the data doesn't look like a good idea.
>
> Even if they are mmapped, every symbol has to be touched up, so the pages
> need to be swapped in and COWed. COW memory can't be shared between
> processes at all. This is against the tradition of sparse being a small
> and neat tool.

And also here.
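
The copy-on-write effect can be sketched with a private file mapping, which is roughly how a loader maps writable library data. This is only an illustration under assumptions: the scratch file name and the function name are hypothetical, and a single byte write stands in for the loader patching a pointer. Once the page is written, the process holds its own copy and the backing file (the part that could have been shared) is left untouched.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <string.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if a write through a MAP_PRIVATE mapping stayed private
 * (backing file unchanged), 0 if not, -1 on a setup error. */
int private_write_stays_private(void)
{
    char path[] = "/tmp/cow-demo-XXXXXX";  /* hypothetical scratch file */
    const char original[] = "pointer-slot";
    char on_disk[sizeof original];
    char *map;
    ssize_t n;
    int fd, result;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);  /* the file stays alive until fd is closed */

    if (write(fd, original, sizeof original) != (ssize_t)sizeof original) {
        close(fd);
        return -1;
    }

    map = mmap(NULL, sizeof original, PROT_READ | PROT_WRITE,
               MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }
    assert(map[0] == 'p');  /* before the write, the mapping mirrors the file */

    map[0] = 'P';  /* stands in for a relocation: the page is now a private copy */

    /* Re-read the file itself: it must still hold the original bytes. */
    n = pread(fd, on_disk, sizeof on_disk, 0);
    result = n == (ssize_t)sizeof on_disk &&
             memcmp(on_disk, original, sizeof original) == 0 &&
             map[0] == 'P';

    munmap(map, sizeof original);
    close(fd);
    return result;
}
```

With one such touched page per relocated pointer, most of the data ends up as per-process private copies rather than shared pages.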