Hi Sergei, On 16 Oct 2014, at 21:03, Sergei Antonov <saproj@xxxxxxxxx> wrote: > On 16 October 2014 17:36, Anton Altaparmakov <anton@xxxxxxxxxx> wrote: >> On 16 Oct 2014, at 16:07, Sergei Antonov <saproj@xxxxxxxxx> wrote: >>> I looked at your code and do not quite understand the handling of >>> initialized size. You read it into a local copy while holding the >>> lock, but then after the lock is released it may change and, I'm >>> afraid, cause the same problems you mentioned. >> >> Correct. That is why FIBMAP is a stupid interface! However initialized_size can only ever go up and not down so in the worst case you will get result zero, i.e. hole instead of the now real sector number you could access. >> >> The only time it can shrink is when the file size shrinks and I am afraid if anyone is using FIBMAP against a file that can be truncated they will corrupt the file system! > > I've tried to merge all your arguments in my head :) and concluded > that a practical importance of the check against initialized_size is > exaggerated :). I am afraid you are showing a serious lack of understanding of how NTFS works as a file system! I completely disagree with your statement. It is perfectly valid for an application to open() an UNCHANGING file and then run FIBMAP against it and then read from it directly (example applications: lilo and grub). It is completely NOT ok for the FIBMAP to return uninitialized sectors and the application to then read random garbage from disk instead of the zeroes that are meant to be there because you did not pay attention to the fact that initialized_size is less than file size and thus failed to return 0 instead of the sector number! It has nothing to do with the fact that it might change. You must read zero beyond initialized_size. The whole point of it is that the area above it has never been written thus must be considered to be zero on disk no matter what old disk contents there might be from when the sectors were in use for something else/by some other file or directory or other metadata or even a previous file system that might have been there before. > It is not a reliable protection (API safety issues) implemented in a > way that it may not always work (limited locking scope). Can be an > indicator of programmer's diligence, though - "you see, I care about > it!". Sorry for this ramble. Your patch will do the job. The locking is there because initialized_size is 64-bit and this cannot be read on 32-bit without locking. Surely you knew that already? If not you need to do some more studying of 64-bit memory accesses on 32-bit systems... (-; E.g. have a look at i_size_read()/i_size_write() and inode_{get,add,sub,set}_bytes()... And if you really don't understand it: a 32-bit CPU has to use two separate "read 32-bit memory location" instructions to read a 64-bit value so reading the lower 32-bits and higher 32-bits can happen a long time apart (if process is preempted between the two read instructions for example) and in the meantime a different process can write a new value in which case when the second read instruction finally executes it reads the high 32-bits from the new value and suddenly you have read random junk - the low 32-bits of old value and high 32-bits of new value and you think that is a valid 64-bit value which it is not. Thus you must lock to prevent this from happening hence why all 64-bit accesses of shared data are locked in Linux kernel (at least on 32-bit CPUs). >>> What is an example of a platform with 32-bit sector_t? Interesting to know. >> >> Any 32-bit system that does not configure the kernel for 64-bit sector_t. Sorry can't remember the .config option name at the moment. Would need to look it up. > > Thanks. Found it. It is CONFIG_LBDAF. So it is, thanks for finding it. It is not an intuitive name! >>> Is it really possible that ntfs_bmap() is called for non-data >>> attributes? In other words, what other attributes are accessible as >>> files? Just curious. >> >> I admit in current kernel driver it cannot. It might in future - reparse points for example are very odd in many ways and though they are often indicated as directories rather than files they could not always be and what attributes a reparse point has is anyones guess - depends on the reparse point. >> >> Then there are system files that do not have an unnamed $DATA attribute at all so in a way the check should probably be extended to ensure the attribute is $DATA _and_ unnamed but again anyone using FIBMAP on a system file is asking to cause problems... > > Aren't those files handled by 'ntfs_mst_aops' (the one without a bmap > function) rather than 'ntfs_normal_aops'? Only directories and attributes containing B-trees are handled by mst_aops. Not all system files are B-trees and often there are both B-trees and data attributes and the latter are usually not mst protected thus normal_aops are used. Best regards, Anton -- Anton Altaparmakov <anton at tuxera.com> (replace at with @) Senior Kernel Engineer, Tuxera Inc., http://www.tuxera.com/ Linux NTFS maintainer -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html