On Thu, 24 Apr 2014 10:33:31 -0700 Dan Williams <dan.j.williams@xxxxxxxxx> wrote: > On Thu, Apr 24, 2014 at 1:02 AM, NeilBrown <neilb@xxxxxxx> wrote: > > On Thu, 24 Apr 2014 00:38:01 -0700 Dan Williams <dan.j.williams@xxxxxxxxx> > > wrote: > > > >> On Thu, Apr 24, 2014 at 12:24 AM, NeilBrown <neilb@xxxxxxx> wrote: > >> > On Wed, 23 Apr 2014 23:18:49 -0700 Dan Williams <dan.j.williams@xxxxxxxxx> > >> > wrote: > >> > > >> >> Initial md / block boilerplate for the Intel (R) Smart Response > >> >> Technology compatibility driver. Supports reading the packed metadata > >> >> and parsing it into a cache lookup tree. > >> >> > >> >> Cc: Dave Jiang <dave.jiang@xxxxxxxxx> > >> >> Cc: Artur Paszkiewicz <artur.paszkiewicz@xxxxxxxxx> > >> >> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx> > >> > > >> > It would really help in reviewing this to have a glossary. > >> > > >> > There are frames and segments and sectors and pages. > >> > > >> > I hope sectors are 512 bytes and pages are PAGE_SIZE, which may or may not be > >> > 4096. > >> > > >> > And there are 16 sectors per frame, so I guess space is allocated in the > >> > cache in 8K aligned frames ?? > >> > > >> > There are 64 segments per page so if pages did happen to be 4096 bytes, that > >> > makes 64 bytes per segment. What are they? > >> > > >> > There is a list somewhere of 32byte frame descriptors which is read into a > >> > single vmalloced region (why? you keep page pointers, so why not read it into > >> > separate pages?) > >> > How is this organised? I might be able to work that out from the code, but > >> > I'd rather not. > >> > > >> > Please don't make me guess, I'm not good at it. > >> > > >> > I guess it didn't help that diff out the header after the code. I got bored > >> > before I got there and didn't read all to words, so maybe some answers are in > >> > there. They don't really stand out though. > >> > >> No, they don't. Let me throw together a proper cheat sheet. > > > > Thanks. > > > > > >> > >> > You've chosen '8' for the 'level' number. > >> > >> Hmm, ok. I use -12 in the mdadm bits, I neglected to go back and fix > >> up the kernel. > >> > >> > As this is an array which doesn't have redundancy, I'd rather a number <= 0. > >> > I think there are places where I assume >=1 has redundancy and understands > >> > spares etc. > >> > > >> > Should conf->count be a kref??? Just a thought, not a requirement. > >> > >> Doesn't kref == 0 imply object destroyed? It's a count of pending > >> metadata events. > > > > kref means that in a kobject. Elsewhere it means whatever you want. > > > > mpb_read_endio would kref_put(&conf->ref, release) > > > > where release would get the conf and wake_up(&conf->eventq); > > > > It probably isn't a big win.. > > Yes, but I should train my brain that kref != object lifetime, it's > just a ref... > > ...or is it? > static inline void kref_get(struct kref *kref) > { > /* If refcount was 0 before incrementing then we have a race > * condition when this kref is freeing by some other thread right now. > * In this case one should use kref_get_unless_zero() > */ > WARN_ON_ONCE(atomic_inc_return(&kref->refcount) < 2); > } > > Seems we would need kref_get_zero_ok(), right? That doesn't sound like a good idea, the only value in using kref would be if it made the code more obvious. That wouldn't :-( So I looked more closely at the code, and now wonder why 'count' is in 'conf' at all. You don't really need a global count of requests at all. There two places where it is used. One is in isrt_mpb_read() where a single bio is submitted and waited for. That could use submit_bio_wait() (not 'should', just 'could'). The other is in isrt_read_packed_md() where multiple bios are submitted and then waited for. In both cases the counter (and wait queue) could be on the stack (like the completion is in submit_bio_wait(). struct multi_complete { atomic_t count; wait_queue_head_t wait; unsigned long state; }; static void multi_read_endio(struct bio *bio, int error) { struct multi_complete *mc = bio->private; if (error || !test_bit(BIO_UPTODATE(&bio->bi_flags))) set_bit(ISRT_ERROR, &mc->state); if (atomic_dec_and_test(&mc->count)) wake_up(&mc->wait); bio_put(bio); } isrt_read_packed_md(struct mddev *mdev) { ... struct multi_complete mc = { ATOMIC_INIT(0), __WAIT_QUEUE_HEAD_INITIALIZER(mc.wait), 0}; .... atomic_inc(&mc,count); bio->bi_private = &mc; submit_bio(READ, bio); ..... wait_event(&mc.wait, atomic_read(&mc.count)==0); .... } It isn't really a big improvement, so maybe it isn't worth it. But it does make it obvious that we are only waiting for the reads that we just submitted. With the current code I see conf->count and wonder what else we could be waiting for. Up to you - maybe just leave it as it is. (But please use 'READ', not '0' as the first arg to submit_bio). NeilBrown
Attachment:
signature.asc
Description: PGP signature