On Thu, Jul 9, 2020 at 6:27 AM Yufen Yu <yuyufen@xxxxxxxxxx> wrote: > > > > On 2020/7/9 7:55, Song Liu wrote: > > On Wed, Jul 8, 2020 at 6:15 AM Yufen Yu <yuyufen@xxxxxxxxxx> wrote: > >> > >> > >> > >> On 2020/7/3 7:00, Song Liu wrote: > >>> On Thu, Jul 2, 2020 at 5:05 AM Yufen Yu <yuyufen@xxxxxxxxxx> wrote: > >>>> > >>>> Hi, all > >>>> > >>>> For now, STRIPE_SIZE is equal to the value of PAGE_SIZE. That means, RAID5 > >>>> will issue each bio to disk at least 64KB when PAGE_SIZE is 64KB in arm64. > >>>> However, filesystem usually issue bio in the unit of 4KB. Then, RAID5 may > >>>> waste resource of disk bandwidth. > >>>> > >>>> To solve the problem, this patchset try to set stripe_size as a configuare > >>>> value. The default value is 4096. We will add a new sysfs entry and set it > >>>> by writing a new value, likely: > >>>> > >>>> echo 16384 > /sys/block/md1/md/stripe_size > >>> > >>> Higher level question: do we need to support page size that is NOT 4kB > >>> times power > >>> of 2? Meaning, do we need to support 12kB, 20kB, 24kB, etc. If we only > >>> supports, 4kB, > >>> 8kB, 16kB, 32kB, etc. some of the logic can be simpler. > >> > >> Yeah, I think we just support 4kb, 8kb, 16kb, 32kb... is enough. > >> But Sorry that I don't know what logic can be simpler in current implementation. > >> I mean it also need to allocate page, and record page offset. > > > > I was thinking about replacing multiplication/division with bit > > operations (shift left/right). > > But I am not very sure how much that matters in modern ARM CPUs. Would you mind > > running some benchmarks with this? > > To test multiplication/division and bit operation, I write a simple test case: > > $ cat normal.c > > int page_size = 65536; > int stripe_size = 32768; //32KB > > int main(int argc, char *argv[]) > { > int i, j, count; > int page, offset; > > if (argc != 2) > return -1; > > count = atol(argv[1]); > > for (i = 0; i < count; i++) { > for (j = 0; j < 4; j++) { > page = page_size / stripe_size; > offset = j * stripe_size; > } > } > } > > $ cat shift.c > > int page_shift = 16; //64KB > int stripe_shift = 15; //32KB > > int main(int argc, char *argv[]) > { > int i, j, count; > int page, offset; > > if (argc != 2) > return -1; > > count = atol(argv[1]); > > for (i = 0; i < count; i++) { > for (j = 0; j < 4; j++) { > page = 1 << (page_shift - stripe_shift); > offset = j << stripe_shift; > } > } > } > > Test them on a arm64 server, the result show there is a minor > performance gap between multiplication/division and shift operation. > > [root@localhost shift]# time ./normal 104857600 > > real 0m1.199s > user 0m1.198s > sys 0m0.000s > > [root@localhost shift]# time ./shift 104857600 > > real 0m1.166s > user 0m1.166s > sys 0m0.000s > > For our implementation, page address and page offset are just computed > when allocate stripe_size. After that, we just use the recorded value > in sh->dev[i].page and sh->dev[i].offset. So, I think current implementation > may not cause much overhead. Sounds good. Let's keep this part as-is. Thanks, Song