Re: hfsplus BUG(), kmap and journalling.

Hi Hin-Tak,

On Thu, 2012-10-18 at 17:55 +0100, Hin-Tak Leung wrote:
> Hi,
>
> While looking at a few of the older BUG() traces I consistently get when
> running du on a somewhat large directory with lots of small files and
> small directories, I noticed that they tend to have two sleeping "?
> hfs_bnode_read()" towards the top. As it is a very small and simple
> function which just reads a b-tree node record - sometimes only a few
> bytes between a kmap/kunmap, I see that it might just be the number of
> simultaneous kmap() being run. So I put a mutex around it just to make
> sure only one copy of hfs_bnode_read() is run at a time.

Yeah, you touch on a very important problem. The hfsplus driver needs to
be reworked to stop using kmap()/kunmap(), because kmap() is slow,
theoretically deadlock-prone and deprecated. The alternative is
kmap_atomic()/kunmap_atomic(), but that requires looking closely at every
use of kmap() in the hfsplus driver.
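
As a rough sketch of what such a conversion might look like for a
routine in the style of hfs_bnode_read(): the code below is a userspace
model, not the driver code. struct page, kmap_atomic() and
kunmap_atomic() are stubbed so it compiles outside the kernel, and
bnode_model and bnode_read_atomic() are hypothetical names. The points
it tries to show are: map one page at a time, do nothing that can sleep
while the page is mapped, and pass the mapped address (not the page) to
kunmap_atomic().

```c
/* Userspace model only: struct page, kmap_atomic() and kunmap_atomic()
 * below are stand-ins so the sketch compiles outside the kernel. */
#include <assert.h>
#include <string.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1 << PAGE_SHIFT)

struct page { char data[PAGE_SIZE]; };

static int atomic_maps;	/* number of live atomic mappings in the model */

static void *kmap_atomic(struct page *page)
{
	atomic_maps++;
	return page->data;
}

/* Takes the mapped address, not the struct page, unlike kunmap(). */
static void kunmap_atomic(void *addr)
{
	(void)addr;
	assert(atomic_maps > 0);
	atomic_maps--;
}

/* Hypothetical trimmed-down node: just the page array and start offset. */
struct bnode_model {
	struct page **page;
	int page_offset;
};

/* Copy len bytes starting at off within the node, one mapping at a time. */
static void bnode_read_atomic(struct bnode_model *node, void *buf,
			      int off, int len)
{
	struct page **pagep;
	char *dst = buf;
	char *vaddr;
	int l;

	off += node->page_offset;
	pagep = node->page + (off >> PAGE_SHIFT);
	off &= PAGE_SIZE - 1;

	while (len > 0) {
		l = PAGE_SIZE - off;
		if (l > len)
			l = len;
		vaddr = kmap_atomic(*pagep);
		memcpy(dst, vaddr + off, l);	/* nothing here may sleep */
		kunmap_atomic(vaddr);
		dst += l;
		len -= l;
		off = 0;
		pagep++;
	}
}
```

The real work in the driver is checking each kmap() site for code that
can sleep between map and unmap. Those sites cannot be converted
mechanically like this.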

The mutex is useless. It simply hides the issue.

> This seems to make it much harder to get a BUG() - I needed to run du
> a few times over and over to get it again. Of course it might just be
> a mutex slowing the driver down to make it less likely to get
> confused, but as I read that the number of simultaneous kmap() in the
> kernel is limited, I think I might be on to something.
> Also this shifts the problem onto multiple copies of "?
> hfsplus_bmap()". (which also kmap()/kunmap()'s, but much more
> complicated).

Exactly: the mutex only hides the issue.

> I thought of doing hfsplus_kmap()/etc. (which seems to have existed a
> long time ago but was removed!), but this might cause deadlocks since
> some of the hfsplus code is kmapping/kunmapping all the time, and
> recursively. So a better way might be just to make sure only one
> instance of some of the routines runs at a time, i.e. multiple mutexes.
> This is both ugly and sounds like voodoo though. Also I am not sure
> why the existing mutexes, which protect some of the internal
> structures, don't protect against too many kmaps. (Maybe they
> protect "writes", but not against too many simultaneous reads.)
> So does anybody have an idea how many kmaps are allowed and how to tell
> that I am close to my machine's limit?

As far as I understand, hfsplus_kmap() doesn't do anything useful. What is
really needed is to rework the kmap()/kunmap() usage rather than add mutexes.

Could you try to fix this issue? :-)

> Also a side note on the Netgear journalling code: I see that it
> journals the volume header, some of the special files (the catalog,
> allocation bitmap, etc), but (1) it has some code to journal the
> attribute file, but it was actually non-functional, since without
> Vyacheslav's recent patches, the linux kernel doesn't even read/write
> that correctly, let alone doing *journalled* read/write correctly, (2)
> there is a part which tries to do data-page journalling, but it seems
> to be wrong - or at least, not quite working. (this I found while I
> was looking at some curious warning messages and how they come about).
> Luckily that code just bails out when it gets confused - i.e. it does
> non-journalled writes, rather than writing wrong journal to disk. So
> it doesn't harm data under routine normal use. (i.e. mount/unmount
> cleanly).
> But that got me worrying a bit about inter-operability: it is probably
> unsafe to use Linux to replay the journal written by Mac OS X, and
> vice versa. i.e. if you have a dual boot machine, or a portable disk
> that you use between two OSes, if it disconnects/unplugs/crashes under
> one OS, it is better to plug it right back and let the same OS
> replay the journal, then unmount cleanly before using it under the
> other OS.

The journal should be replayed on every mount if it contains valid
transactions. An HFS+ volume shouldn't be mounted without replaying the
journal. Otherwise, the partition can end up corrupted. Just imagine: you
mount an HFS+ partition with a non-empty journal and then add some data
to the volume. That means you modify metadata. If you then mount that
HFS+ volume under Mac OS X, the journal will be replayed and the
metadata will be corrupted.
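
The mount-time rule above could be sketched roughly as follows. This is
only an illustration, not code from any driver: journal_header_model,
mount_action and choose_mount_action() are hypothetical names, and the
sketch assumes the HFS+ journal header convention that the journal is
empty when its start and end offsets are equal.

```c
#include <stdint.h>

/* Subset of the on-disk journal header: only the fields this sketch needs. */
struct journal_header_model {
	uint64_t start;	/* offset of the oldest valid transaction */
	uint64_t end;	/* offset just past the newest transaction */
};

enum mount_action {	/* hypothetical names */
	MOUNT_RW,		/* journal empty: safe to mount */
	MOUNT_REPLAY_THEN_RW,	/* replay pending transactions first */
	MOUNT_REFUSE,		/* cannot replay: do not touch the volume */
};

/* Assumption: start == end means no pending transactions. */
static int journal_is_empty(const struct journal_header_model *jh)
{
	return jh->start == jh->end;
}

static enum mount_action
choose_mount_action(const struct journal_header_model *jh, int can_replay)
{
	if (journal_is_empty(jh))
		return MOUNT_RW;
	return can_replay ? MOUNT_REPLAY_THEN_RW : MOUNT_REFUSE;
}
```

The important case is the last one: a driver that cannot replay the
journal should not modify a volume whose journal still holds
transactions.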

With the best regards,
Vyacheslav Dubeyko.

> I'll be interested in hearing any tips on finding out kmap's limit at
> run time, if anybody has any idea...
> 
> Hin-Tak
