AW: raid5: direct IO and md_wakeup_thread

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> Von: NeilBrown [neilb@xxxxxxx]
> Gesendet: Montag, 8. September 2014 08:46
> An: Markus Stockhausen
> Cc: linux-raid@xxxxxxxxxxxxxxx
> Betreff: Re: raid5: direct IO and md_wakeup_thread
> 
> On Fri, 5 Sep 2014 17:52:53 +0000 Markus Stockhausen
> <stockhausen@xxxxxxxxxxx> wrote:
> 
> > Hi Neil,
> >
> > I hope you are doing better after the flu. Maybe you find some time
> > to explain the following codings in raid5.c to me. At least one of
> > the patchs seems to be signed off by you. The RAID5 I'm using has
> > default settings.
> >
> > 1) Generally speaking of md_wakup_thread() calls. Does it make
> > any sense to call them the from within raid5d()? Or when being
> > implemented at locations that might be called from the
> > make_request() path too - for simplicity no further caller destinction
> > was programmed.
> 
> I doesn't make sense to all md_wakeup_thread() from somewhere that is *only*
> called within raid5d().  However most of raid5d() is handle_stripe() and
> handle_stripe() is called from other places.  So it could make sence to call
> md_wakeup_thread from within handle_stripe().
> 
> make_request() very likely wants to call md_wakeup_thread().... though I'm
> wondering if I understand that part of the question properly.
> 
> >
> > 1) In case of a single direct I/O writer to the /dev/md0 device the
> > following coding will always fire the md_wakeup_thread(). That
> > is one reason for alternating raid5d loops one with data to
> > process and the next without. Is that corner case wanted?
> 
> Is this a problem?  We often do things just in case they are needed,
> providing the cost isn't too great.
> 
> If there was a measurable cost it might make sense to
> 
>     clear_bit(THREAD_WAKEUP, &mddev->thread->flags);
> 
> in raid5d() at the top of the while loop in raid5d().  We would need to move
> the md_check_recovery() call to the end of the function then.
> 
> >
> > static void handle_stripe(struct stripe_head *sh)
> >         ...
> >         ops_run_io(sh, &s);
> >
> >         if (s.dec_preread_active) {
> >                 /* We delay this until after ops_run_io so that if make_request
> >                  * is waiting on a flush, it won't continue until the writes
> >                  * have actually been submitted.
> >                  */
> >                 atomic_dec(&conf->preread_active_stripes);
> >                 if (atomic_read(&conf->preread_active_stripes) <
> >                     IO_THRESHOLD)
> >                         md_wakeup_thread(conf->mddev->thread); */
> >         }
> >
> > 2) The wakeup_thread() in the following preread path seems
> > to be unnecesary or will a double call wake the thread twice?
> 
> Unnecessary in some cases, necessary in others.  Tracking exactly when it is
> necessary would be complex and error prone.
> 
> Are you just trying to understand the code, or is there something that you
> think is a problem that could be fixed?

HI Neil,

thanks for the explanations. I tried to understand the reason for the
"bad" benchmark numbers during my direct IO write tests for my rmw
patch. I only choose it to avoid caching side effects. Even with 64 
threads there is still a large gap to the theoretical maximum. So I 
reduced the test to 1 writer and took some time to have a look how 
raid5.c handles sync/direct writes. See my other post with only 5MB/s
4K direct writes to ramdisk backed raid5 md. 

With that insight I started to compare it to raid1.c. After some 
iterations it came to my mind that the empty raid5d() runs between
two sync writes might give extra latency and so I looked for corners 
where md_wakeup_thread() might be unnecessary. The result are
the questions above.

I'll give the clear_bit(THREAD_WAKEUP,...) a try.

Markus
****************************************************************************
Diese E-Mail enthält vertrauliche und/oder rechtlich geschützte
Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail
irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und
vernichten Sie diese Mail. Das unerlaubte Kopieren sowie die unbefugte
Weitergabe dieser Mail ist nicht gestattet.

�ber das Internet versandte E-Mails können unter fremden Namen erstellt oder
manipuliert werden. Deshalb ist diese als E-Mail verschickte Nachricht keine
rechtsverbindliche Willenserklärung.

Collogia
Unternehmensberatung AG
Ubierring 11
D-50678 Köln

Vorstand:
Kadir Akin
Dr. Michael Höhnerbach

Vorsitzender des Aufsichtsrates:
Hans Kristian Langva

Registergericht: Amtsgericht Köln
Registernummer: HRB 52 497

This e-mail may contain confidential and/or privileged information. If you
are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail. Any
unauthorized copying, disclosure or distribution of the material in this
e-mail is strictly forbidden.

e-mails sent over the internet may have been written under a wrong name or
been manipulated. That is why this message sent as an e-mail is not a
legally binding declaration of intention.

Collogia
Unternehmensberatung AG
Ubierring 11
D-50678 Köln

executive board:
Kadir Akin
Dr. Michael Höhnerbach

President of the supervisory board:
Hans Kristian Langva

Registry office: district court Cologne
Register number: HRB 52 497

****************************************************************************

[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux