Kamlesh Gurudasani <kamlesh@xxxxxx> writes:

...

> Hi Eric, thanks for your detailed and valuable inputs.
>
> As per your suggestion, we did some profiling.
>
> Use case is to calculate crc32/crc64 for file input from user space.
>
> Instead of directly implementing PMULL based CRC64, we made first comparison between
> Case 1.
> CRC32 (splice() + kernel space SW driver)
> https://gist.github.com/ti-kamlesh/5be75dbde292e122135ddf795fad9f21
>
> Case 2.
> CRC32(mmap() + userspace armv8 crc32 instruction implementation)
> (tried read() as well to get contents of file, but that lost to mmap() so not mentioning number here)
> https://gist.github.com/ti-kamlesh/002df094dd522422c6cb62069e15c40d
>
> Case 3.
> CRC64 (splice() + MCRC64 HW)
> https://gist.github.com/ti-kamlesh/98b1fc36c9a7c3defcc2dced4136b8a0
>
>
> Overall, overhead of userspace + af_alg + driver in (Case 1) and
> (Case 3) is ~0.025s, which is constant for any file size.
> This is calculated using real time to calculate crc -
> driver time (time spent inside init() + update() + final()) = overhead ~0.025s
>
>
>
> +-------------------+-----------------------------+-----------------------+------------------------+------------------------+
> |                   |                             |                       |                        |                        |
> | File size         | 120mb(ideal size for us)    | 20mb                  | 15mb                   | 5mb                    |
> +===================+=============================+=======================+========================+========================+
> |                   |                             |                       |                        |                        |
> | CRC32 (Case 1)    | Driver time 0.155s          | Driver time 0.0325s   | Driver time 0.019s     | Driver time 0.0062s    |
> |                   | real time 0.18s             | real time 0.06s       | real time 0.04s        | real time 0.03s        |
> |                   | overhead 0.025s             | overhead 0.025s       | overhead 0.021s        | overhead ~0.023s       |
> +-------------------+-----------------------------+-----------------------+------------------------+------------------------+
> |                   |                             |                       |                        |                        |
> | CRC32 (Case 2)    | Real time 0.30s             | Real time 0.05s       | Real time 0.04s        | Real time 0.02s        |
> +-------------------+-----------------------------+-----------------------+------------------------+------------------------+
> |                   |                             |                       |                        |                        |
> | CRC64 (Case 3)    | Driver time 0.385s          | Driver time 0.0665s   | Driver time 0.0515s    | Driver time 0.019s     |
> |                   | real time 0.41s             | real time 0.09s       | real time 0.08s        | real time 0.04s        |
> |                   | overhead 0.025s             | overhead 0.025s       | overhead ~0.025s       | overhead ~0.021s       |
> +-------------------+-----------------------------+-----------------------+------------------------+------------------------+
>
> Here, if we consider similar numbers for crc64 PMULL implementation as
> crc32 (case 2), we save good number of cpu cycles using mcrc64
> in case of files bigger than 5-10mb as most of the time is being spent in HW offload.
>
> Regards,
> Kamlesh

Hi Eric,

Please let me know if the above numbers make sense to you and whether I
should send the next revision.

Regards,
Kamlesh
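
P.S. For reference, here is a minimal sketch (not part of the driver or of the
test gists) of how the overhead row in the table is derived, i.e.
overhead = real time - driver time, using the numbers quoted above. The names
`timings`, `overhead` and `driver_throughput_mb_s` are only for illustration,
and the sketch assumes "mb" in the table means megabytes. The computed
differences may not match the overhead row exactly, presumably because the
real times are reported to two decimals.

```python
# Timings copied from the table above, in seconds.
# Layout: case -> {file size in MB (assumed megabytes): (driver_time, real_time)}
timings = {
    "CRC32 (Case 1)": {
        120: (0.155, 0.18),
        20: (0.0325, 0.06),
        15: (0.019, 0.04),
        5: (0.0062, 0.03),
    },
    "CRC64 (Case 3)": {
        120: (0.385, 0.41),
        20: (0.0665, 0.09),
        15: (0.0515, 0.08),
        5: (0.019, 0.04),
    },
}


def overhead(driver_s, real_s):
    """Time spent outside the driver (userspace + af_alg path), in seconds."""
    return real_s - driver_s


def driver_throughput_mb_s(size_mb, driver_s):
    """Data rate seen inside init() + update() + final(), in MB/s."""
    return size_mb / driver_s


if __name__ == "__main__":
    for case, rows in timings.items():
        for size_mb, (driver_s, real_s) in rows.items():
            print(
                f"{case} {size_mb:>3} MB: "
                f"overhead {overhead(driver_s, real_s):.4f}s, "
                f"driver throughput "
                f"{driver_throughput_mb_s(size_mb, driver_s):.0f} MB/s"
            )
```

Case 2 is left out on purpose, since only the real time was measured there
and no driver time is available to subtract.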