Hi, [CCing more people] 2015-01-16 23:39 GMT+09:00 Paul Osmialowski <p.osmialowsk@xxxxxxxxxxx>: > This enhancement of i2c API is designed to address following problem > caused by circular lock dependency: > > -> #1 (prepare_lock){+.+.+.}: > [ 2.730502] [<c0061e50>] __lock_acquire+0x3c0/0x8a4 > [ 2.735970] [<c0062868>] lock_acquire+0x6c/0x8c > [ 2.741090] [<c04a2724>] mutex_lock_nested+0x68/0x464 > [ 2.746733] [<c0395e84>] clk_prepare_lock+0x40/0xe8 > [ 2.752201] [<c0397698>] clk_unprepare+0x18/0x28 > [ 2.757409] [<c034cbb0>] s3c24xx_i2c_xfer+0xc8/0x124 > [ 2.762964] [<c03457e0>] __i2c_transfer+0x74/0x8c > [ 2.768259] [<c0347234>] i2c_transfer+0x78/0xec > [ 2.773380] [<c02a177c>] regmap_i2c_read+0x48/0x64 > [ 2.778761] [<c029d5c0>] _regmap_raw_read+0xa8/0xfc > [ 2.784230] [<c029d920>] _regmap_bus_read+0x28/0x48 > [ 2.789699] [<c029ce00>] _regmap_read+0x74/0xdc > [ 2.794819] [<c029d0ec>] _regmap_update_bits+0x24/0x70 > [ 2.800549] [<c029e348>] regmap_update_bits+0x40/0x5c > [ 2.806190] [<c024c3c4>] _regulator_do_disable+0x68/0x7c > [ 2.812093] [<c024f4d8>] _regulator_disable+0x90/0x12c > [ 2.817822] [<c024f5a8>] regulator_disable+0x34/0x60 > [ 2.823377] [<c0363070>] mmc_regulator_set_ocr+0x74/0xdc > [ 2.829279] [<c03783e8>] sdhci_set_power+0x38/0x20c > [ 2.834748] [<c0379c10>] sdhci_do_set_ios+0x180/0x450 > [ 2.840389] [<c0379f00>] sdhci_set_ios+0x20/0x2c > [ 2.845597] [<c0364978>] mmc_set_initial_state+0x5c/0xbc > [ 2.851500] [<c0364f48>] mmc_power_off+0x2c/0x60 > [ 2.856708] [<c0365c00>] mmc_rescan+0x268/0x27c > [ 2.861829] [<c003a128>] process_one_work+0x18c/0x3f8 > [ 2.867471] [<c003a400>] worker_thread+0x38/0x2f8 > [ 2.872766] [<c003f66c>] kthread+0xe4/0x104 > [ 2.877540] [<c000f0a8>] ret_from_fork+0x14/0x2c > [ 2.882749] > -> #0 (&map->mutex){+.+...}: > [ 2.886828] [<c0061224>] validate_chain.isra.25+0x3bc/0x548 > [ 2.892990] [<c0061e50>] __lock_acquire+0x3c0/0x8a4 > [ 2.898459] [<c0062868>] lock_acquire+0x6c/0x8c > [ 2.903580] [<c04a2724>] mutex_lock_nested+0x68/0x464 > [ 2.909222] [<c029ce9c>] regmap_read+0x34/0x5c > [ 2.914257] [<c039a994>] max_gen_clk_is_prepared+0x1c/0x38 > [ 2.920333] [<c0396ec4>] clk_unprepare_unused_subtree+0x64/0x98 > [ 2.926842] [<c0396f78>] clk_disable_unused+0x80/0xd8 > [ 2.932484] [<c00089d0>] do_one_initcall+0xac/0x1f0 > [ 2.937953] [<c068bd44>] do_basic_setup+0x90/0xc8 > [ 2.943248] [<c068be00>] kernel_init_freeable+0x84/0x120 > [ 2.949150] [<c0491248>] kernel_init+0x8/0xec > [ 2.954097] [<c000f0a8>] ret_from_fork+0x14/0x2c > [ 2.959307] > [ 2.959307] other info that might help us debug this: > [ 2.959307] > [ 2.967293] Possible unsafe locking scenario: > [ 2.967293] > [ 2.973194] CPU0 CPU1 > [ 2.977708] ---- ---- > [ 2.982221] lock(prepare_lock); > [ 2.985520] lock(&map->mutex); > [ 2.991248] lock(prepare_lock); > [ 2.997063] lock(&map->mutex); > [ 3.000276] > [ 3.000276] *** DEADLOCK *** > > Apparently regulator and clock try to acquire lock which is protecting the > same regmap. Communication over i2c requires clock to be started. Both things > require access to the same regmap in order to complete. I stumbled upon this issue (and reported it) quite long time ago already, but apparently nobody cared too much (including myself, unfortunately). Please see [1] for details. [1] https://lkml.org/lkml/2014/7/2/171 We have here a typical ABBA deadlock scenario, between I2C clock providers and other (logical) devices on the same (physical) I2C device, for which a regmap is used to share the registers between drivers. Remaining factor here is I2C controller driver, which must perform clock gating in I2C ops to trigger this deadlock. The problem is that any operation on such I2C clock will first try to acquire clk_prepare_mutex and then, through driver's callback, access the regmap, acquiring its mutex (then an I2C transfer will happen, but it irrelevant in this context). On opposite side we have drivers for other functionality exposed by this I2C device, which will access the regmap, acquiring its mutex and causing I2C transfers to happen. The key here is that I2C transfers might require some clocks to be prepared, so clk_prepare() might get called from this context and cause a deadlock, because clk_prepare_mutex might have been already acquired by another context, waiting for regmap mutex, which has been already acquired by this context. Now, for the solution, the approach proposed by Paul, as far as I could understand it by reading the code (it's definitely lacking a cover letter with detailed explanation), should solve the issue by enforcing correct locking order at regmap level. However I wonder if we really need a heavy solution like this or we could just make I2C drivers not require clock preparation in I2C transfers, as suggested by Peter De Schrijver in [1], which should fix the issue as well. So, the question is, do we actually have hardware that _really_ requires _actual_ preparation or all the clk_prepare_enable()s in I2C drivers (at least in i2c-s3c2410) are just to simplify the code? Best regards, Tomasz -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html