> On May 14, 2018, at 17:14, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>
> On Mon, May 14, 2018 at 11:05 AM, Yan, Zheng <zyan@xxxxxxxxxx> wrote:
>>
>>> On May 14, 2018, at 16:37, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>>>
>>> On Sat, May 12, 2018 at 1:38 AM, Yan, Zheng <zyan@xxxxxxxxxx> wrote:
>>>>
>>>>
>>>>> On May 11, 2018, at 20:06, Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
>>>>>
>>>>> On Fri, May 11, 2018 at 11:12 AM, Yan, Zheng <zyan@xxxxxxxxxx> wrote:
>>>>>> this avoid force umount getting stuck at ceph_osdc_sync()
>>>>>>
>>>>>> Signed-off-by: "Yan, Zheng" <zyan@xxxxxxxxxx>
>>>>>> ---
>>>>>> fs/ceph/super.c                 |  1 +
>>>>>> include/linux/ceph/osd_client.h |  5 ++++-
>>>>>> net/ceph/osd_client.c           | 43 ++++++++++++++++++++++++++++++++++++-----
>>>>>> 3 files changed, 43 insertions(+), 6 deletions(-)
>>>>>>
>>>>>> diff --git a/fs/ceph/super.c b/fs/ceph/super.c
>>>>>> index 3c1155803444..40664e13cc0f 100644
>>>>>> --- a/fs/ceph/super.c
>>>>>> +++ b/fs/ceph/super.c
>>>>>> @@ -793,6 +793,7 @@ static void ceph_umount_begin(struct super_block *sb)
>>>>>>  	if (!fsc)
>>>>>>  		return;
>>>>>>  	fsc->mount_state = CEPH_MOUNT_SHUTDOWN;
>>>>>> +	ceph_osdc_abort_requests(&fsc->client->osdc, -EIO);
>>>>>>  	ceph_mdsc_force_umount(fsc->mdsc);
>>>>>>  	return;
>>>>>>  }
>>>>>> diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
>>>>>> index b73dd7ebe585..f61736963236 100644
>>>>>> --- a/include/linux/ceph/osd_client.h
>>>>>> +++ b/include/linux/ceph/osd_client.h
>>>>>> @@ -347,6 +347,7 @@ struct ceph_osd_client {
>>>>>>  	struct rb_root linger_map_checks;
>>>>>>  	atomic_t num_requests;
>>>>>>  	atomic_t num_homeless;
>>>>>> +	int abort_code;
>>>>>
>>>>> Why osdc->abort_code and all __submit_request() hunks are needed?
>>>>> If we are in a forced umount situation, no new I/Os should be accepted
>>>>> anyway.
>>>>
>>>> No code guarantees that ceph_writepages_start()/writepage_nounlock() are
>>>> not being executed when user does forced umount. They may start new
>>>> osd requests after forced umount.
>>>
>>> I haven't traced through forced umount steps, but it seems like there
>>> must be a point where we stop accepting requests and attempt to quiesce
>>> the state.
>>>
>>
>> To support this, cephfs code needs to introduce a rwsem, read lock the rwsem before calling ceph_osdc_wait_request(). Besides, the rwsem can not be used in the cases of blocking osdc functions (such as ceph_osdc_readpages). So I think it’s better to implement this in libceph.
>>
>>
>>> The patch talks about avoiding getting stuck in ceph_osdc_sync().
>>> Is it guaranteed that no new OSD requests can be started after it
>>> completes?
>>>
>>
>> No, it doesn’t.
>
> What is the point of calling ceph_osdc_sync() in the first place then?

Sorry, I was wrong about where the hang occurs. It’s at:

[<0>] io_schedule+0xd/0x30
[<0>] wait_on_page_bit_common+0xc6/0x130
[<0>] __filemap_fdatawait_range+0xbd/0x100
[<0>] filemap_fdatawait_keep_errors+0x15/0x40
[<0>] sync_inodes_sb+0x1cf/0x240
[<0>] sync_filesystem+0x52/0x90
[<0>] generic_shutdown_super+0x1d/0x110
[<0>] ceph_kill_sb+0x28/0x80 [ceph]
[<0>] deactivate_locked_super+0x35/0x60
[<0>] cleanup_mnt+0x36/0x70
[<0>] task_work_run+0x79/0xa0
[<0>] exit_to_usermode_loop+0x62/0x70
[<0>] do_syscall_64+0xdb/0xf0
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] 0xffffffffffffffff

>
> Thanks,
>
> Ilya
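
For readers following the abort_code question quoted earlier in this thread, here is a minimal user-space sketch (not the libceph code) of why a sticky abort code can matter. The struct and function names (client, request, submit_request, abort_requests) are hypothetical stand-ins, and locking is left out for brevity. The only assumption is that a submitter racing with forced umount should get the error back right away rather than queue a request that nothing will complete.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are NOT the libceph structures. */
struct request {
	int result;            /* 0 while in flight, negative errno once failed */
	int completed;         /* set once the request has been completed */
	struct request *next;  /* link in the client's in-flight list */
};

struct client {
	int abort_code;            /* 0 = accepting requests, else sticky error */
	struct request *inflight;  /* singly linked list of pending requests */
};

static void complete_request(struct request *req, int result)
{
	req->result = result;
	req->completed = 1;
}

/* Submit a request; once aborted, fail it immediately instead of queueing. */
static void submit_request(struct client *c, struct request *req)
{
	req->result = 0;
	req->completed = 0;
	req->next = NULL;

	if (c->abort_code) {
		complete_request(req, c->abort_code);
		return;
	}

	req->next = c->inflight;
	c->inflight = req;
}

/* Fail everything in flight and remember the error for later submitters. */
static void abort_requests(struct client *c, int err)
{
	assert(err < 0);
	c->abort_code = err;

	while (c->inflight) {
		struct request *req = c->inflight;

		c->inflight = req->next;
		req->next = NULL;
		complete_request(req, err);
	}
}
```

With this shape, a request submitted after abort_requests(&c, -EIO) completes immediately with -EIO, so a waiter never blocks on it. Without the sticky code, a late writeback path could queue a new request after the in-flight list had already been drained.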