Re: BN_MUL_MONT for ARM64 v8

Andy,
   1:2.5 is pretty good in my opinion for ARM!

   We will check out Mongoose.

   Hmm - will try to get to the bottom of those cache misses (at a lower priority).

Thanks,
-vijay

   

On Tue, Feb 7, 2017 at 11:07 AM, Andy Polyakov <appro@xxxxxxxxxxx> wrote:
> A72 is running at 1GHz compared to x86 at 2.1GHz. So that should hopefully
> get down to ~1:5.

And Mongoose will take you to ~1:2.5 (scaled to same frequency that is).
Which I'd say is a fair result. Well, still could have been a bit
better, but it's not unreasonable given ISA differences. Keep in mind
that presented x86_64 result is for code utilizing Intel-specific code
extensions.
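
For readers following the arithmetic, "scaled to same frequency" can be sketched as below. This is only an illustration: the function name and the raw slowdown value of 10.5 are hypothetical and are not measurements from this thread; only the 1GHz and 2.1GHz clock figures come from the quoted message.

```python
def per_clock_slowdown(raw_slowdown, arm_ghz, x86_ghz):
    """Convert a raw throughput ratio into a per-clock ratio.

    raw_slowdown is x86 throughput divided by ARM throughput as measured
    (hypothetical input). Multiplying by arm_ghz / x86_ghz removes the
    part of the gap that is due purely to the difference in clock rate.
    """
    return raw_slowdown * arm_ghz / x86_ghz


# Illustrative input only: a raw 1:10.5 gap with ARM at 1GHz and x86 at 2.1GHz.
scaled = per_clock_slowdown(10.5, arm_ghz=1.0, x86_ghz=2.1)
print(f"per-clock ratio is about 1:{scaled:.1f}")
```

The same conversion applies to any pair of clock rates; it says nothing about memory or cache effects, which have to be looked at separately.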

> There is no L3 cache on the A72 eval board and performance counters do
> show 9x more DRAM accesses for ARM compared to x86.

This is unexpected, because it takes *fewer* references to memory to
perform it on ARMv8, since it has a larger register bank. And the cache
requirement is not high enough for L3 to kick in... But in any case memory
is not the bottleneck here...

--
openssl-users mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-users
