Hi! [added linux-fsdevel@xxxxxxxxxxxxxxx to CC since that's more appropriate than LKML] On Sun 21-01-18 13:08:56, Pali Rohár wrote: > Hi! In attachment you can find two minimalistic UDF filesystem images. > Strictly speaking they are not compliant to Ecma-167 which requires to > have at least 258 blocks for filesystem, but Linux kernel has no problem > to read also such small UDF filesystems. First one is "udf_with_td", has > 89 blocks and udf.ko reads it without problem. Second one is > "udf_without_td", has 88 blocks and udf.ko reject it with error message: > > UDF-fs: linux/fs/udf/misc.c:223:udf_read_tagged: location mismatch block 1, tag 0 != 1 > UDF-fs: warning (device loop1): udf_fill_super: No fileset found > > The only difference between these two images is presence/absence of > Terminating Descriptor (TD). > > Looking at the code in function udf_process_sequence(), I found out that > udf.ko does not read Partition Descriptor when filesystem does not > contain Terminating Descriptor. This is of course wrong as without > reading Partition Descriptor it is not possible to locale root directory > of the UDF filesystem. > > In that function is following code: > > if (vds[VDS_POS_PARTITION_DESC].block) { > /* > * We rescan the whole descriptor sequence to find > * partition descriptor blocks and process them. > */ > for (block = vds[VDS_POS_PARTITION_DESC].block; > block < vds[VDS_POS_TERMINATING_DESC].block; > block++) { > ret = udf_load_partdesc(sb, block); > if (ret < 0) > return ret; > } > } > > Array vds is initialized to zeros, so when Partition Descriptor is not > present then vds[VDS_POS_TERMINATING_DESC].block is zero and so above > for loop is immediately stopped. > > As a quick fix I applied following patch: > > --- a/fs/udf/super.c > +++ b/fs/udf/super.c > @@ -1620,6 +1620,7 @@ static noinline int udf_process_sequence( > unsigned int indirections = 0; > > memset(vds, 0, sizeof(struct udf_vds_record) * VDS_POS_LENGTH); > + vds[VDS_POS_TERMINATING_DESC].block = lastblock; > > /* > * Read the main descriptor sequence and find which descriptors > > which forces above loop to call udf_load_partdesc() also when TD is not > available. Yes, but this will not do the right thing in case there's VDP (volume descriptor pointer) descriptor. More on this below. > Note that this problem happen also with large enough UDF filesystems > without Terminating Descriptor, not only for those small. > > And I have not found any restrictions in UDF or ECMA-167 specification > that Terminating Descriptor is required. You are right that TD does not seem to be required and thus the code is buggy. > ========== > > But I'm not sure if above for loop is correct at all. > > In function udf_process_sequence() is code for handling TAG_IDENT_VDP, > which allow to scanning nested Volume Descriptor Sequences. And each > identifier descriptor handles (increasing) volDescSeqNum except > TAG_IDENT_PD which is for Partition Descriptor. Partition Descriptor is > always used one which was processed first: > > case TAG_IDENT_PD: /* ISO 13346 3/10.5 */ > curr = &vds[VDS_POS_PARTITION_DESC]; > if (!curr->block) > curr->block = block; > break; > > But only the last processed Terminating Descriptor is used: > > case TAG_IDENT_TD: /* ISO 13346 3/10.9 */ > ... > vds[VDS_POS_TERMINATING_DESC].block = block; > ... > break; > > And if nested Volume Descriptor Sequences "jumps" to block with smaller > number and in that nested sequence is Terminating Descriptor, it would > have smaller block number as Partition Descriptor block number (read > before nested jump) and so loop for calling udf_load_partdesc() would be > stopped at beginning too. Yeah, strictly speaking this is possible but that would have to be really evil udf image creator ;) But yes, we ought to handle that. > Also question is, which Partition Descriptor should be loaded? First > processed? Last processed? One with smallest or biggest Sequence Number? > Or all which are found, including all in nested sequence? So partition descriptors are "special". You can have multiple partition descriptiors in the descriptor sequence. "Prevailing descriptors" (this is the term from ECMA 167 specification - it means descriptors that should be taken into account) are those with highest sequence number found for a partition with a particular partition number. So prevailing descriptor for partition 0 can have different sequence number than prevailing descriptor for partition 1. But still you are correct that processing of PD descriptors is hosed both wrt descriptor sequence numbers, VPD descriptor, and missing TD descriptors. I'll fix that up (but probably get to it only after my vacation next week). Thanks for spotting this! Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR