Re: [PATCH] raid6: fix the input of raid6 algorithm

"liuzhengyuan" <liuzhengyuan@xxxxxxxxxx> · Wed, 24 Aug 2016 15:58:14 +0800

Oh, get_random_*() is really expensive. Thanks for the tip. The boot log from my aarch64 machine, shown below,
indicates that it took about 0.6 seconds to fill the disk data.

  [    0.172831] DMA: preallocated 256 KiB pool for atomic allocations
  [    0.788664] raid6: int64x1  gen()   121 MB/s
  [    0.856613] raid6: int64x1  xor()    74 MB/s
  [    0.924665] raid6: int64x2  gen()   166 MB/s
  [    0.992846] raid6: int64x2  xor()    95 MB/s
  [    1.060681] raid6: int64x4  gen()   290 MB/s
  [    1.128774] raid6: int64x4  xor()   160 MB/s
  [    1.196933] raid6: int64x8  gen()   238 MB/s
  [    1.264937] raid6: int64x8  xor()   148 MB/s
  [    1.332878] raid6: neonx1   gen()   256 MB/s
  [    1.400975] raid6: neonx1   xor()   130 MB/s
  [    1.468951] raid6: neonx2   gen()   333 MB/s
  [    1.537085] raid6: neonx2   xor()   181 MB/s
  [    1.605042] raid6: neonx4   gen()   451 MB/s
  [    1.673121] raid6: neonx4   xor()   289 MB/s
  [    1.741143] raid6: neonx8   gen()   452 MB/s
  [    1.809151] raid6: neonx8   xor()   277 MB/s
  [    1.809154] raid6: using algorithm neonx8 gen() 452 MB/s
  [    1.809157] raid6: .... xor() 277 MB/s, rmw enabled
  [    1.809160] raid6: using intx1 recovery algorithm

I replaced get_random_*() with a local PRNG based on the well-known
linear congruential generator (LCG). The patch looks like this:

  +/* use the linear congruential bit. */
  +static int32_t get_random_number_by_lcb(void)
  +{
  +        static int32_t seed = 1;
  +        int32_t ret = 0;
  +        ret = ((seed * 1103515245) + 12345) & 0x7fffffff;
  +        seed = ret;
  +        return ret;
  +}

   /* Try to pick the best algorithm */
   /* This code uses the gfmul table as convenient data set to abuse */
  @@ -229,8 +238,8 @@ int __init raid6_select_algo(void)
          for (i = 0; i < disks-2; i++) {
                  dptrs[i] = disk_ptr + PAGE_SIZE*i;
  -               for (j = 0; j < PAGE_SIZE; j++)
  -                       get_random_bytes(dptrs[i]+j, 1);
  +               for (j = 0; j < PAGE_SIZE; j = j + 4)
  +                       *(int32_t *)(dptrs[i]+j) = get_random_number_by_lcb();
          }

          dptrs[disks-2] = disk_ptr + PAGE_SIZE*(disks-2);
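
For anyone who wants to try the generator outside the kernel, here is a minimal
userspace sketch of the same recurrence. It is not part of the patch: the names
lcg_next(), lcg_fill() and SKETCH_PAGE_SIZE are hypothetical, the state is
passed in rather than kept in a static variable, and the arithmetic is done on
unsigned 32-bit values so the multiplication wraps instead of relying on signed
overflow behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Same constants as in the patch, but with unsigned arithmetic so the
 * multiplication wraps modulo 2^32 before the result is masked to 31 bits.
 */
static uint32_t lcg_next(uint32_t *state)
{
    *state = (*state * 1103515245u + 12345u) & 0x7fffffffu;
    return *state;
}

/* Fill len bytes (must be a multiple of 4) one 32-bit word at a time. */
static void lcg_fill(void *buf, size_t len, uint32_t *state)
{
    unsigned char *p = buf;
    size_t j;

    assert(buf != NULL && len % sizeof(uint32_t) == 0);
    for (j = 0; j < len; j += sizeof(uint32_t)) {
        uint32_t v = lcg_next(state);

        /* memcpy sidesteps alignment and aliasing concerns of a cast store. */
        memcpy(p + j, &v, sizeof(v));
    }
}
```

A caller would declare "uint32_t state = 1;" and call
"lcg_fill(page, SKETCH_PAGE_SIZE, &state);" once per data page. Note that with
the 0x7fffffff mask the top bit of every 32-bit word is always zero, and the
low-order bits of a power-of-two-modulus LCG repeat with short periods, so the
data is only pseudo-random filler, which may or may not matter for this
benchmark.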

The boot log with this patch is shown below; it took about 0.08 seconds.

  [    0.172858] DMA: preallocated 256 KiB pool for atomic allocations
  [    0.256673] raid6: int64x1  gen()   121 MB/s
  [    0.324484] raid6: int64x1  xor()    73 MB/s
  [    0.392606] raid6: int64x2  gen()   166 MB/s
  [    0.460309] raid6: int64x2  xor()    92 MB/s
  [    0.528368] raid6: int64x4  gen()   290 MB/s
  [    0.596401] raid6: int64x4  xor()   156 MB/s
  [    0.664601] raid6: int64x8  gen()   238 MB/s
  [    0.732609] raid6: int64x8  xor()   148 MB/s
  [    0.800523] raid6: neonx1   gen()   256 MB/s
  [    0.868730] raid6: neonx1   xor()   129 MB/s
  [    0.936741] raid6: neonx2   gen()   334 MB/s
  [    1.004717] raid6: neonx2   xor()   202 MB/s
  [    1.072692] raid6: neonx4   gen()   451 MB/s
  [    1.140763] raid6: neonx4   xor()   260 MB/s
  [    1.208842] raid6: neonx8   gen()   452 MB/s
  [    1.276887] raid6: neonx8   xor()   277 MB/s
  [    1.276890] raid6: using algorithm neonx8 gen() 452 MB/s
  [    1.276894] raid6: .... xor() 277 MB/s, rmw enabled
  [    1.276897] raid6: using intx1 recovery algorithm
  [    1.276941] ACPI: Interpreter disabled.
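
The 0.6 and 0.08 second figures are the gap between the "DMA: preallocated"
line and the first raid6 result in each log. A small sketch of that arithmetic
follows; the helper names ts_us() and gap_us() are hypothetical and only the
timestamps come from the logs.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a dmesg-style timestamp, split into seconds and microseconds. */
static int64_t ts_us(int64_t sec, int64_t usec)
{
    return sec * 1000000 + usec;
}

/* Gap in microseconds between two timestamps taken from the same log. */
static int64_t gap_us(int64_t before_us, int64_t after_us)
{
    assert(after_us >= before_us);
    return after_us - before_us;
}

/*
 * Without the patch: gap_us(ts_us(0, 172831), ts_us(0, 788664)) is 615833 us.
 * With the patch:    gap_us(ts_us(0, 172858), ts_us(0, 256673)) is  83815 us.
 */
```

Consecutive raid6 results are about 68 ms apart in both logs, so part of each
gap is presumably the first int64x1 gen() benchmark itself rather than the
data fill.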

I'm not familiar with spurious D$ conflicts or CPU cache behavior. What do you
think of this PRNG, and is there anything else I need to do?

------------------ Original ------------------
From:  "H. Peter Anvin"<hpa@xxxxxxxxx>;
Date:  Tue, Aug 23, 2016 11:53 AM
To:  "liuzhengyuan"<liuzhengyuan@xxxxxxxxxx>;
Cc:  "shli"<shli@xxxxxxxxxx>; "linux-raid"<linux-raid@xxxxxxxxxxxxxxx>; "fenghua.yu"<fenghua.yu@xxxxxxxxx>; "linux-kernel"<linux-kernel@xxxxxxxxxxxxxxx>; "liuzhengyuang521"<liuzhengyuang521@xxxxxxxxx>;
Subject:  Re: [PATCH] raid6: fix the input of raid6 algorithm

Do you have any idea how long this takes to run?  People are already complaining about the boot time penalty.  get_random_*() is quite expensive and is overkill...
-- 
Sent from my Android device with K-9 Mail. Please excuse brevity and formatting.