Hi jan, > >Hi, > >On Mon 22-10-18 10:34:48, Fisher wrote: >> Recently I was testing my storage's performance and found that there >> were periodic performance drops when I ran sequential write benchmark. >> After profiling the duration in each step, I found that the dropping >> performance was due to wait_on_bit_io(&bh->b_state, BH_Shadow, >> TASK_UNINTERRUPTIBLE) in do_get_write_access(). And this is what made >> me confused, if my understanding was right, I thought >> buffer_shadow(bh) stands for buffer not being copied-out that's why we >should wait. > >No buffer_shadow() just means that some version of the buffer is being written >to the journal. > >> But why don't we do copy-out in jbd2_journal_write_metadta_buffer()? >> and if we do do the copy-out, does that mean we don't have to >> set_buffer_shadow because it refers to buffer not copied-out? > >The code in do_get_write_access() checks whether the buffer already has >frozen data (by checking jh->b_frozen_data) and if not, if it is just being written >to the journal (buffer_shadow() check in there). If it is, we have to wait because >we cannot modify the version of the buffer that goes to the journal. > >> I made a test, when a buffer_head goes into >> jbd2_journal_write_metadta_buffer(), as long as it belongs to >> metadata, then force it to do copy-out and do not set_buffer_shadow, >> then there will be no periodic performance drops. Is this test reasonable? > >Well, it does trade of CPU overhead (you know copy buffer for each >transaction) for the latency (no need to wait for buffer writeout when >redirtying buffer). Usually I wouldn't consider this worth it but obviously it >depends on the workload... I also run into this issue. In do_get_write_access(), there are such comments: /* * There is one case we have to be very careful about. If the * committing transaction is currently writing this buffer out to disk * and has NOT made a copy-out, then we cannot modify the buffer * contents at all right now. The essence of copy-out is that it is * the extra copy, not the primary copy, which gets journaled. If the * primary copy is already going to disk then we cannot do copy-out ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ I wonder why we can not do copy-out here? We can not visit this buffer while it's going to disk? * here. */ if (buffer_shadow(bh)) { JBUFFER_TRACE(jh, "on shadow: sleep"); jbd_unlock_bh_state(bh); wait_on_bit_io(&bh->b_state, BH_Shadow, TASK_UNINTERRUPTIBLE); goto repeat; } Regards, Xiaoguang Wang > > Honza >-- >Jan Kara <jack@xxxxxxxx> >SUSE Labs, CR