Re: [PATCH v3] ld.so.8: Describe glibc Hardware capabilities

On 24/10/23 04:59, Stefan Puiu wrote:
> Hi Adhemerval,
> 
> On Fri, Oct 20, 2023 at 8:50 PM Adhemerval Zanella Netto
> <adhemerval.zanella@xxxxxxxxxx> wrote:
>>
>>
>>
>> On 19/10/23 16:10, Stefan Puiu wrote:
>>> Hi and sorry to jump in so late in the discussion,
>>
>> Thanks for the review!
> 
> Well, I'm glad to nitpi... um, help :).
> 
>>
>>>
>>> On Thu, Oct 19, 2023 at 8:23 PM Adhemerval Zanella
>>> <adhemerval.zanella@xxxxxxxxxx> wrote:
>>>>
>>>> It was added on glibc 2.33 as a way to improve path search, since
>>>> legacy hardware capabilities combination scheme do not scale
>>>> properly with new hardware support.  The legacy support was removed
>>>> on glibc 2.37, so it is the only scheme currently supported.
>>>
>>> I would suggest (caveat: non-native English speaker here):
>>>
>>> s/It was added on glibc/The feature was added in glibc/
>>> s/improve path search/improve the path search/
>>> s/since legacy/since the legacy/
>>> s/hardware capabilities combination scheme do not/hardware
>>> capabilities scheme does not/
>>> s/was removed on glibc/was removed in glibc/
>>
>> Ack.
>>
>>>
>>>>
>>>> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@xxxxxxxxxx>
>>>> ---
>>>>  man8/ld.so.8 | 48 +++++++++++++++++++++++++++++++++++++++++++++++-
>>>>  1 file changed, 47 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/man8/ld.so.8 b/man8/ld.so.8
>>>> index cf03cb85e..5b02ae88f 100644
>>>> --- a/man8/ld.so.8
>>>> +++ b/man8/ld.so.8
>>>> @@ -208,6 +208,14 @@ The objects in
>>>>  .I list
>>>>  are delimited by colons.
>>>>  .TP
>>>> +.BI \-\-glibc-hwcaps-mask " list"
>>>> +only search built-in subdirectories if in
>>>> +.IR list .
>>>> +.TP
>>>> +.BI \-\-glibc-hwcaps-prepend " list"
>>>> +Search glibc-hwcaps subdirectories in
>>>> +.IR list .
>>>> +.TP
>>>>  .B \-\-inhibit\-cache
>>>>  Do not use
>>>>  .IR /etc/ld.so.cache .
>>>> @@ -808,7 +816,7 @@ as a temporary workaround to a library misconfiguration issue.)
>>>>  .I lib*.so*
>>>>  shared objects
>>>>  .SH NOTES
>>>> -.SS Hardware capabilities
>>>> +.SS Legacy Hardware capabilities (from glibc 2.5 to glibc 2.37)
>>>>  Some shared objects are compiled using hardware-specific instructions which do
>>>>  not exist on every CPU.
>>>>  Such objects should be installed in directories whose names define the
>>>> @@ -843,6 +851,44 @@ z900, z990, z9-109, z10, zarch
>>>>  .B x86 (32-bit only)
>>>>  acpi, apic, clflush, cmov, cx8, dts, fxsr, ht, i386, i486, i586, i686, mca, mmx,
>>>>  mtrr, pat, pbe, pge, pn, pse36, sep, ss, sse, sse2, tm
>>>> +.SS glibc Hardware capabilities (from glibc 2.33)
>>>> +The legacy hardware capabilities combinations has the drawback where
>>>> +each feature name incur in
>>>> +cascading extra paths added on the search path list,
>>>
>>> IMO, this part could use some rephrasing. How about:
>>> The legacy hardware capabilities support has the drawback that each
>>> feature grows the number of paths added to the search list.
>>
>> The main problem was that it did not only grow, it grew quadratically, since
>> each new capability is combined with the others.  This was mitigated because
>> the capabilities actually used were filtered by the features the
>> processor/kernel advertises.  So maybe:
>>
>>   The legacy hardware capabilities support has the drawback
>>   it requires multiple search paths due the combined supported capabilities,
>>   and each new feature grows the number of paths added to the search list
>>   in quadratic manner.
> 
> Maybe:
> The legacy hardware capabilities support has the drawback that each
> new feature added grows the search path exponentially, because it has
> to be added to every combination of the other existing features.
> 

Ok, it works for me.
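
To make the growth concrete, here is a minimal Python sketch (not the
actual glibc code; legacy_subdirs and legacy_search_path are names I made
up for illustration).  It assumes the legacy scheme enumerates every
combination of the supported capabilities, which is what the i686/sse2
example in the patch shows:

```python
def legacy_subdirs(caps):
    """Return every combination of caps, most specific first."""
    if not caps:
        return [[]]  # only the plain library directory is left
    rest = legacy_subdirs(caps[1:])
    # Each existing combination appears twice: with and without caps[0].
    return [[caps[0]] + combo for combo in rest] + rest


def legacy_search_path(caps):
    """Join the combinations the way the man page example writes them."""
    return ":".join("/".join(combo) for combo in legacy_subdirs(caps))


print(legacy_search_path(["i686", "sse2"]))
print(legacy_search_path(["newcap", "i686", "sse2"]))
print(len(legacy_subdirs(["newcap", "i686", "sse2"])))
```

Each extra capability doubles the number of entries, so n capabilities
give 2^n subdirectories to probe.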

> And then you have the sse2 example below, which I think clarifies the
> point very well.
> 
>>
>>>
>>> Also, maybe this would better fit under the legacy capabilities section?
>>
>> Indeed, I will move to there.
>>
>>>
>>>> +adding a lot of overhead on
>>> I think "adding a lot of overhead to" sounds better here. "Adding to"
>>> instead of "adding on".
>>>
>>>> +.B ld.so
>>>> +during library resolution.
>>>> +For instance, on x86 32-bit, if the hardware
>>>> +supports
>>>> +.B i686
>>>> +and
>>>> +.BR sse2
>>>> +, the resulting search path will be
>>>> +.BR i686/sse2:i686:sse2:. .
>>>> +A new capability
>>>> +.B newcap
>>>> +will set the search path to
>>>> +.BR newcap/i686/sse2:newcap/i686:newcap/sse2:newcap:i686/sse2:i686:sse2: .
>>>> +.PP
>>>> +glibc 2.33 added a new hardware capability scheme,
>>>> +where each ABI can define
>>>
>>> Maybe s/each ABI/each ABI version/? I'm not familiar with the feature,
>>> just guessing from the examples below; they were very helpful, IMO.
>>
>> I think it would be better to use 'architecture' here.
> 
> Maybe 'architecture level' could work, based on Florian's mail to
> llvm-dev? Architecture is too broad, I think, and Florian says that
> the levels don't correspond to micro-architectures, since in the
> x86-64 case the progression is not so clear.

Ack, architecture level I think works better here indeed.

> 
> Anyway, the links were very helpful. Maybe you can leave a pointer to
> them somewhere, at least in the change description?

I will add them as a comment.

> 
>>
>>
>>>
>>>> +a set of paths based on expected hardware support.
>>>
>>> a set of paths where to find the expected hardware support?
>>>
>>> This is based on how I (mis)understood the feature, though, if that's
>>> wrong, then the above might also be wrong. I guess the x86-64 glibc
>>> would install the x86-64-v2, -v3 and -v4 directories on disk; when
>>> running a program, glibc can then check what the current CPU supports
>>> - say if it supports x86-64-v3, it loads the contents of that folder?
>>
>> They are search paths, so glibc itself does not install them.  The
>> glibc Hardware capabilities search paths are constructed from a
>> pre-defined list (which only a handful of architectures actually define)
>> that is matched against what the hardware supports.
>> The initial patchset that implemented this feature has
>> more details [1].
> 
> By "glibc installs them", I meant that I would expect them to be part
> of glibc, is that not the case?

Not really, these are extra paths the loader might consult when
loading shared libraries (either at startup or by dlopen).  A package
manager, or something analogous, is expected to build and install the
optimized libraries in the expected folders.

> 
>>
>> It has the advantage that the glibc-hwcaps paths are not combined
>> with each other, so there is no quadratic increase when a new
>> path is added.
>>
>> So, let's say you have an x86-64-v3 chip (Haswell or Excavator) [2].
>> The resulting search path will be
>>
>>   glibc-hwcaps/x86-64-v3:glibc-hwcaps/x86-64-v2:
>>
>> It is also supported by ldconfig, which checks every path defined in
>> ld.so.conf, both on its own and combined with the glibc-hwcaps
>> subdirectories.  So if you have "/lib/x86_64-linux-gnu" in ld.so.conf,
>> ldconfig will check all possible subpaths based on the glibc-hwcaps list:
>>
>>   /lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v4
>>   /lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v3
>>   /lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v2
>>   /lib/x86_64-linux-gnu/
>>
>> It only adds a candidate if the file exists.  For instance,
>> if you have:
>>
>>   /lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v4/libsomething.so
>>   /lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v2/libsomething.so
>>   /lib/x86_64-linux-gnu/libsomething.so
>>
>> ldconfig will set up an ld.so.cache with all the entries, but
>> ld.so will only select x86-64-v4/libsomething.so if the CPU supports
>> x86-64-v4 (Skylake, Zen4); otherwise it will fall back to
>> x86-64-v2/libsomething.so if the CPU supports x86-64-v2 (Nehalem, Jaguar),
>> or to libsomething.so as the x86_64 baseline.
> 
> Thanks, that makes it much clearer.
> 
>>
>> So maybe a better description would be
>>
>>   glibc 2.33 added a new hardware capability scheme,
>>   where each architecture might define
>>   a set of paths based on expected hardware support.
>>   Each path is added on the search path list
>>   depending of the hardware of the machine.
>>   Each path is independent and not combined together,
>>   so it does have the drawbacks of legacy scheme.
> 
> How about:
> glibc 2.33 added a new hardware capability scheme, where under each
> CPU architecture, certain levels can be defined, grouping support for
> certain features or special instructions. Each architecture level has
> a fixed set of paths that it adds to the dynamic linker search list,
> depending on the hardware of the machine. Since each new architecture
> level is not combined with previously existing ones, the new scheme
> does not have the drawback of growing the dynamic linker search list
> uncontrollably.

Works for me, thanks.
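
For comparison with the legacy scheme, a rough Python sketch of the
selection described above (again, not the real loader logic; the function
names and the in-memory set of files are made up for illustration).  It
assumes the levels are cumulative, so a CPU at x86-64-v3 may also use
x86-64-v2, matching the search path example in this thread:

```python
# Level names as listed in the proposed man page text, highest priority first.
X86_64_LEVELS = ["x86-64-v4", "x86-64-v3", "x86-64-v2"]


def hwcaps_subdirs(cpu_level, levels=X86_64_LEVELS):
    """Subdirectories usable at cpu_level, then '' for the baseline."""
    if cpu_level not in levels:
        return [""]  # baseline CPU: no glibc-hwcaps subdirectory applies
    usable = levels[levels.index(cpu_level):]
    return ["glibc-hwcaps/" + name for name in usable] + [""]


def select_library(libdir, soname, cpu_level, existing):
    """Return the first candidate present in existing, or None."""
    for sub in hwcaps_subdirs(cpu_level):
        parts = [libdir.rstrip("/"), sub, soname]
        candidate = "/".join(p for p in parts if p)
        if candidate in existing:
            return candidate
    return None


installed = {
    "/lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v4/libsomething.so",
    "/lib/x86_64-linux-gnu/glibc-hwcaps/x86-64-v2/libsomething.so",
    "/lib/x86_64-linux-gnu/libsomething.so",
}
print(hwcaps_subdirs("x86-64-v3"))
print(select_library("/lib/x86_64-linux-gnu", "libsomething.so",
                     "x86-64-v3", installed))
```

The list grows by one entry per level instead of doubling, and a v3
machine ends up with the x86-64-v2 copy because no v3 build is installed.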

> 
>>   .PP
>>   For instance, on x86 64-bit, if the hardware supports
>>   .B x86_64-v3
>>   (for instance Intel Haswell or AMD Excavator)
>>   , the resulting search path will be
>>   .BR glibc-hwcaps/x86-64-v3:glibc-hwcaps/x86-64-v2:.
> 
> I think it would be useful if the existing levels are defined
> somewhere; maybe Alex can suggest the proper manpage.

For x86_64, these are in fact defined by the ABI itself [1].
The powerpc and s390x names are based on IBM chip names,
but those also map to specific ISA levels.  I will add a
comment about it.

https://gitlab.com/x86-psABIs/x86-64-ABI/-/blob/master/x86-64-ABI/low-level-sys-info.tex

> 
> Regards,
> Stefan.
> 
>>   The following paths are currently supported, in priority order.
>>   .TP
>>   .B PowerPC (64-bit little-endian only)
>>   power10, power9
>>   .TP
>>   .B s390 (64-bit only)
>>   z16, z15, z14, z13
>>   .TP
>>   .B x86 (64-bit only)
>>   x86-64-v4, x86-64-v3, x86-64-v2
>>   .PP
>>   glibc 2.37 removed support for the legacy hardware capabilities.
>>
>>>
>>>> +Each path is added depending of the hardware of the machine,
>>>> +and they are not combined together.
>>>> +They also have priority over the legacy hardware capabilities.
>>>> +The following paths are currently supported.
>>>> +.TP
>>>> +.B PowerPC (64-bit little-endian only)
>>>> +power9, power10
>>>> +.TP
>>>> +.B s390 (64-bit only)
>>>> +z13, z14, z15, z16
>>>> +.TP
>>>> +.B x86 (64-bit only)
>>>> +x86-64-v2, x86-64-v3, x86-64-v4
>>>> +.PP
>>>> +The glibc 2.37 removed support for the legacy hardware capabilities.
>>> s/The glibc/glibc
>>>
>>> Regards,
>>> Stefan.
>>>
>>>> +.
>>>>  .SH SEE ALSO
>>>>  .BR ld (1),
>>>>  .BR ldd (1),
>>>> --
>>>> 2.34.1
>>>>
>>
>> [1] https://sourceware.org/pipermail/libc-alpha/2020-June/115250.html
>> [2] https://lists.llvm.org/pipermail/llvm-dev/2020-July/143289.html


