Hi, On Sat 30-03-19 14:49:46, Steve Magnani wrote: > On 3/25/19 11:42 AM, Jan Kara wrote: > > Hi! > > > > On Sat 23-03-19 15:14:05, Steve Magnani wrote: > > > I have been hunting a UDF bug that occasionally results in generation > > > of an Allocation Extent Descriptor with an incorrect tagLocation. So > > > far I haven't been able to see a path through the code that could > > > cause that. But, I noticed some inconsistency in locking during > > > AED generation and wonder if it could result in random corruption. > > > > > > The function udf_update_inode() has this general pattern: > > > > > > bh = udf_tgetblk(...); // calls sb_getblk() > > > lock_buffer(bh); > > > memset(bh->b_data, 0, inode->i_sb->s_blocksize); > > > // <snip>other code to populate FE/EFE data in the block</snip> > > > set_buffer_uptodate(bh); > > > unlock_buffer(bh); > > > mark_buffer_dirty(bh); > > > > > > This I can understand - the lock is held for as long as the buffer > > > contents are being assembled. > > > > > > In contrast, udf_setup_indirect_aext(), which constructs an AED, > > > has this sequence: > > > > > > bh = udf_tgetblk(...); // calls sb_getblk() > > > lock_buffer(bh); > > > memset(bh->b_data, 0, inode->i_sb->s_blocksize); > > > > > > set_buffer_uptodate(bh); > > > unlock_buffer(bh); > > > mark_buffer_dirty_inode(bh); > > > > > > // <snip>other code to populate AED data in the block</snip> > > > > > > In this case the population of the block occurs without > > > the protection of the lock. > > > > > > Because the block has been marked dirty, does this mean that > > > writeback could occur at any point during population? > > Yes. Thanks for noticing this! > > > > > There is one path through udf_setup_indirect_aext() where > > > mark_buffer_dirty_inode() gets called again after population is > > > complete, which I suppose could heal a partial writeout, but there is > > > also another path in which the buffer does not get marked dirty again. > > Generally, we add new extents to the created indirect extent which dirties > > the buffer and that should fix the problem. But you are definitely right > > that the code is suspicious and should be fixed. Will you send a patch? > > I did a little archaeology to see how the code evolved to this point. It's > been like this a long time. > > I also did some research to understand why filesystems use lock_buffer() > sometimes but not others. For example, the FAT driver never calls it. I ran > across this thread from 2011: > > https://lkml.org/lkml/2011/5/16/402 > > ...from which I conclude that while it is correct in a strict sense to hold > a lock on a buffer any time its contents are being modified, performance > considerations make it preferable (or at least reasonable) to make some > modifications without a lock provided it's known that a subsequent write-out > will "fix" any potential partial write out before anyone else tries to read > the block. Understood but UDF (and neither FAT) are really that performance critical. If you look for performance, you'd certainly pick a different filesystem. UDF is mainly for data interchange so it should work reasonably for copy-in copy-out style of workloads, the rest isn't that important. So there correctness and simplicity is preferred over performance. > I doubt that UDF sees common use with DIF/DIX block devices, > which might make a decision in favor of performance a little easier. Since > the FAT driver doesn't contain Darrick's proposed changes I assume a > decision was made that performance was more important there. > > Certainly the call to udf_setup_indirect_aext() from udf_add_aext() meets > those criteria. But udf_table_free_blocks() may not dirty the AED block. > > So if this looks reasonable I will resend as a formal patch: > > --- a/fs/udf/inode.c 2019-03-30 11:28:38.637759458 -0500 > +++ b/fs/udf/inode.c 2019-03-30 11:33:00.357761250 -0500 > @@ -1873,9 +1873,6 @@ int udf_setup_indirect_aext(struct inode > return -EIO; > lock_buffer(bh); > memset(bh->b_data, 0x00, sb->s_blocksize); > - set_buffer_uptodate(bh); > - unlock_buffer(bh); > - mark_buffer_dirty_inode(bh, inode); > aed = (struct allocExtDesc *)(bh->b_data); > if (!UDF_QUERY_FLAG(sb, UDF_FLAG_STRICT)) { > @@ -1890,6 +1887,9 @@ int udf_setup_indirect_aext(struct inode > udf_new_tag(bh->b_data, TAG_IDENT_AED, ver, 1, block, > sizeof(struct tag)); > + set_buffer_uptodate(bh); > + unlock_buffer(bh); > + > nepos.block = neloc; > nepos.offset = sizeof(struct allocExtDesc); > nepos.bh = bh; > @@ -1913,6 +1913,8 @@ int udf_setup_indirect_aext(struct inode > } else { > __udf_add_aext(inode, epos, &nepos.block, > sb->s_blocksize | EXT_NEXT_EXTENT_ALLOCDECS, 0); > + /* Make sure completed AED gets written out */ > + mark_buffer_dirty_inode(nepos.bh, inode); Why do you mark the buffer as dirty only here? I'd just mark it dirty after unlocking. If __udf_add_aext() or udf_write_aext() modify the buffer, they will mark it as dirty as well... Thanks! Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR