Hi Neil, I wanted to ask how the mentoring you're willing to offer would work, on your opinion? I had much fun developing for that *other* operating system kernel, and I am looking forward to do things for Linux kernel too some day. My pain at present, is the lack of kernel printouts during various config sequences. I have been trying to follow, for example, the ADD_NEW_DISK sequence and the slot_store() sequence, to understand how it detects whether a full-sync or not will be required. And there are many possibilities before you figure out the right "if" (non-persistent arrays, containers, external metadata....). I am used to heavily print-around all config (non-IO-path) sequences (such as ADD_NEW_DISK). This also helps if a particular step in the sequence takes a long time, due to a slow IO or something like that. What will be the pros/cons of doing that in md on your opinion? Thanks! Alex. On Sun, Sep 25, 2011 at 12:10 PM, NeilBrown <neilb@xxxxxxx> wrote: > On Sat, 24 Sep 2011 23:57:58 +0200 Aapo Laine <aapo.laine@xxxxxxxxxxxxx> > wrote: > >> On 09/23/11 11:15, NeilBrown wrote: >> > On Fri, 23 Sep 2011 02:09:36 -0600 Thomas Fjellstrom<thomas@xxxxxxxxxxxxx> >> > wrote: >> >> I forgot to say, but: Thank you very much :) for the help, and your tireless >> >> work on md. >> >> >> > >> > You've very welcome .... but I felt I needed to respond to that word >> > "tireless". >> > The truth is that I am getting rather tired of md .... if anyone knows anyone >> > who wants to get into kernel development and is wondering where to start - >> > please consider whispering 'the md driver' in their ear. Plenty to do, great >> > mentoring possibilities, and competent linux kernel engineers with good >> > experience are unlikely to have much trouble finding a job ;-) >> > >> > NeilBrown >> >> Whoa this is shocking news! > > Hopefully not too shocking... I'm not planning on leaving md any time soon. > I do still enjoy working on it. > But it certainly isn't as fresh and new as it was 10 years again. It would > probably do both me and md a lot of good to have someone with new enthusiasm > and new perspectives... > > >> >> Firstly then, thank you so much for your excellent work up to now. Linux >> has what I believe to be the best software raid of all operating systems >> thanks to you. Excellent in both features, and reliability i.e. quality >> of code. >> >> And also the support through the list was great. I found so many >> problems solved already just by looking at the archives... so many >> people with their arse saved by you. >> >> I think everybody here is sorry to see you willing to go. >> >> Now the bad news... Regarding the MD takeover, there I think I see a >> problem... >> The MD code is very tersely commented, compared to its complexity! > > That is certainly true, but seems to be true across much of the kernel, and > probably most code in general (though I'm sure such a comment will lead to > people wanting to tell me their favourite exceptions ... so I'll start with > "TeX"). > > This is one of the reasons I offered "mentoring" to any likely candidate. > > > >> >> - there is not much explanation of overall strategies, or the structure >> of code. Also the flow of data between the functions is not much >> explained most of the times. >> >> - It's not obvious to me what is the entry point for kernel processes >> related to MD arrays, how are they triggered and where do they run... >> E.g. in the past I tried to understand how did resync work, but I >> couldn't. I thought there was a kernel process controlling resync >> advancement, but I couldn't really find the boundaries of code inside >> which it was executing. > > md_do_sync() is the heart of the resync process. it calls into the > personality's sync_request() function. > > The kernel thread is started by md_check_recovery() if it appears to be > needed. md_check_recovery() is regularly run by each personality's main > controlling thread. > >> >> - it's not clear what the various functions do or in what occasion they >> are called. Except from their own name, most of them have no comments >> just before the definition. > > How about this: > - you identify some functions for which the purpose or use isn't clear > - I'll explain to you when/how/why they are used > - You create a patch which adds comments which explains it all > - I'll apply that patch. > > deal?? > >> >> - the algoritms within the functions are very long and complex, but only >> rarely they are explained by comments. I am now seeing pieces having 5 >> levels of if's nested one inside the other, and there are almost no >> comments. > > I feel your pain. I really should factor out the deeply nested levels into > separate functions. Sometimes I have done that but there is plenty more do > to. Again, I would be much more motivated to do this if I were working with > someone who would be directly helped by it. So if you identify specific > problems, it'll be a lot easier for me to help fix them. > > >> >> - last but not least, variables have very short names, and for most of >> them, it is not explained what they mean. This is mostly for local >> variables, but sometimes even for the structs which go into metadata >> e.g. in struct r1_private_data_s most members do not have an >> explanation. This is pretty serious, to me at least, for understanding >> the code. > > Does this help? > > diff --git a/drivers/md/raid1.h b/drivers/md/raid1.h > index e0d676b..feb44ad 100644 > --- a/drivers/md/raid1.h > +++ b/drivers/md/raid1.h > @@ -28,42 +28,67 @@ struct r1_private_data_s { > mddev_t *mddev; > mirror_info_t *mirrors; > int raid_disks; > + > + /* When choose the best device for a read (read_balance()) > + * we try to keep sequential reads one the same device > + * using 'last_used' and 'next_seq_sect' > + */ > int last_used; > sector_t next_seq_sect; > + /* During resync, read_balancing is only allowed on the part > + * of the array that has been resynced. 'next_resync' tells us > + * where that is. > + */ > + sector_t next_resync; > + > spinlock_t device_lock; > > + /* list of 'r1bio_t' that need to be processed by raid1d, whether > + * to retry a read, writeout a resync or recovery block, or > + * anything else. > + */ > struct list_head retry_list; > - /* queue pending writes and submit them on unplug */ > - struct bio_list pending_bio_list; > > - /* for use when syncing mirrors: */ > + /* queue pending writes to be submitted on unplug */ > + struct bio_list pending_bio_list; > > + /* for use when syncing mirrors: > + * We don't allow both normal IO and resync/recovery IO at > + * the same time - resync/recovery can only happen when there > + * is no other IO. So when either is active, the other has to wait. > + * See more details description in raid1.c near raise_barrier(). > + */ > + wait_queue_head_t wait_barrier; > spinlock_t resync_lock; > int nr_pending; > int nr_waiting; > int nr_queued; > int barrier; > - sector_t next_resync; > - int fullsync; /* set to 1 if a full sync is needed, > - * (fresh device added). > - * Cleared when a sync completes. > - */ > - int recovery_disabled; /* when the same as > - * mddev->recovery_disabled > - * we don't allow recovery > - * to be attempted as we > - * expect a read error > - */ > > - wait_queue_head_t wait_barrier; > + /* Set to 1 if a full sync is needed, (fresh device added). > + * Cleared when a sync completes. > + */ > + int fullsync > > - struct pool_info *poolinfo; > + /* When the same as mddev->recovery_disabled we don't allow > + * recovery to be attempted as we expect a read error. > + */ > + int recovery_disabled; > > - struct page *tmppage; > > + /* poolinfo contains information about the content of the > + * mempools - it changes when the array grows or shrinks > + */ > + struct pool_info *poolinfo; > mempool_t *r1bio_pool; > mempool_t *r1buf_pool; > > + /* temporary buffer to synchronous IO when attempting to repair > + * a read error. > + */ > + struct page *tmppage; > + > + > /* When taking over an array from a different personality, we store > * the new thread here until we fully activate the array. > */ > >> >> - ... maybe more I can't think of right now ... >> >> Your code is of excellent quality, as I wrote, I wish there were more >> programmers like you, but if you now want to leave, THEN I start to be >> worried! Would you please comment it (much) more before leaving? Fully >> understanding your code I think is going to take other people a lot of >> time otherwise, and you might not find a replacement easily and/or s/he >> might do mistakes. > > I'm not planning on leaving - not for quite some time anyway. > But I know the code so well that it is hard to see which bits need > documenting, and what sort of documentation would really help. > I would love it if you (or anyone) would review the code and point to parts > that particularly need improvement. > > >> >> There were times in the past when I had ideas and I wanted to contribute >> code, but when I looked inside MD and tried to understand where should I >> put my changes, I realized I wasn't able to understand what current code >> was doing. Maybe I am not a good enough C programmer, but I was able to >> change things in other occasions. > > Don't be afraid to ask... But sometimes you do need a bit of persistence > though. :-) Not always easy to find time for that. > > >> >> >> I hope you won't get these critiques bad... > > Not at all. > >> and thanks for all your efforts, really, in the name of, I think, everybody. >> Aapo L. > > Thanks for your valuable feedback. > Being able to see problems is of significant value. One of the reasons that > I pay close attention to this list is because it shows me where the problems > with md and mdadm are. People often try things that I would never even dream > of trying (because I know they won't work). See this helps me know where the > code and be improved - either so what they try does work, or so it fails more > gracefully and helpfully. > > Thanks, > NeilBrown > -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html