On 2022/11/28 23:15, Jan Kara wrote:
> On Mon 28-11-22 21:01:07, Zhang Yi wrote:
>> On 2022/11/28 18:11, Jan Kara wrote:
>>> On Thu 24-11-22 21:57:44, Zhang Yi wrote:
>>>> The block layer will check and suppress flush bio if the device write
>>>> cache is not enabled, so the journal barrier will not go into effect
>>>> even if user specifies 'barrier=1' mount option. It's dangerous if the
>>>> write cache state is false negative, and we cannot distinguish such
>>>> case easily. So just give an info and an inquire interface to let
>>>> sysadmin know the barrier is suppressed for the case of write cache is
>>>> not enabled.
>>>>
>>>> Signed-off-by: Zhang Yi <yi.zhang@xxxxxxxxxx>
>>>
>>> Hum, so have you seen a situation when write cache information is incorrect
>>> in the block layer? Does it happen often enough that it warrants extra
>>> sysfs file?
>>>
>>
>> Thanks for response. Yes, it often happens on some SCSI devices with RAID
>> card, the disks below the RAID card enabled write cache, but the RAID driver
>> declare the write cache was disabled when probing, and the RAID card seems
>> cannot guarantee data writing back to disk medium on power failure. So the
>> ext4 filesystem will probably be corrupted at the next startup. It's
>> difficult to distinguish it's a hardware or a software problem.
>> I am not familiar with the RAID card. So I don't know why the cache state
>> is incorrect (maybe incorrect configured or firmware bug).
>
> OK, thanks for info. I believe usually you're expected to disable write
> cache on the disks themselves and leave caching to the RAID card. But I'm
> not an expert here and it's a bit beside the point anyway ;)
>
>>> After all you should be able to query what the block layer thinks about the
>>> write cache - you definitely can for SCSI devices, I'm not sure about
>>> others. So you can have a look there. Providing this info in the filesystem
>>> seems like doing it in the wrong layer - I don't see anything jbd2/ext4
>>> specific here...
>>>
>>
>> Yes, the best way is to figure out the RAID card problem.
>> This patch is not to aim to fix something in ext4. The reason why I want to add
>> this in ext4 is just give a hint from the fs barrier's point of view, it show the
>> barrier's running state at mount time, could help us to delimit the cache problem
>> more easily when we found ext4 corruption after power failure. Before this patch,
>> we could do that through SCSI probing info and /sys/block/sda/queue/write_cache
>> (maybe some others?), it's not quite clear.
>>
>> [ 2.520176] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>>
>> [root@localhost ~]# cat /sys/block/sda/queue/write_cache
>> write back
>
> Yes. /sys/block/<device>/queue/write_cache is what you should query to find
> whether barriers will be ignored or not. My point is - you need this for
> ext4, now if you start using XFS filesystem you'd need similar patch for
> XFS and then if you transition to btrfs you'd need this for btrfs as well
> and all this duplication is there because you are querying through the
> filesystem a property of the underlying block device. So why not ask the
> block device directly?
>
> I understand it may be more *convenient* to grab the information from the
> filesystem given the infrastructure you have for gathering filesystem
> information. But carrying around various sysfs files has its cost as well.
>

OK, it's fine, let's keep querying the block layer.

Thanks,
Yi.
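
For anyone following up on the conclusion above (ask the block device directly), here is a minimal Python sketch of that query. The path /sys/block/<device>/queue/write_cache and the "write back" value come from the thread; the function names `write_cache_mode` and `flush_suppressed` and the `sysfs_root` parameter are hypothetical, introduced only for illustration. Treating any value other than "write back" as "flushes suppressed" is an assumption based on the patch description, not a statement of the exact kernel logic.

```python
from pathlib import Path


def write_cache_mode(device, sysfs_root="/sys/block"):
    """Return the block layer's view of the device write cache.

    Reads <sysfs_root>/<device>/queue/write_cache and returns its
    content with surrounding whitespace stripped, e.g. "write back".
    `sysfs_root` is a hypothetical parameter so the function can be
    pointed at a fake tree for testing.
    """
    path = Path(sysfs_root) / device / "queue" / "write_cache"
    return path.read_text().strip()


def flush_suppressed(mode):
    """Guess whether flush requests would be dropped for this mode.

    Assumption: the block layer only passes flushes through when it
    believes a volatile write-back cache is present, so anything other
    than "write back" is reported as suppressed.
    """
    return mode != "write back"
```

Calling `write_cache_mode("sda")` on the machine from the log above would be expected to return the same string that `cat` printed there. Note that this only reports what the block layer believes; it cannot detect the false-negative case described earlier, where a RAID driver declares the cache disabled while the disks behind it still cache writes.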