Re: [v3.13][v3.14][Regression] kthread: make kthread_create() killable

Oleg Nesterov wrote:
> If we need the urgent hack to fix the regression, then I suggest to change
> scsi_host_alloc() temporary until mptsas (or whatever) is fixed.

Device initialization can legitimately take longer than 30 seconds; that is
not a hang. It is systemd that needs to be fixed.

> --- x/drivers/scsi/hosts.c
> +++ x/drivers/scsi/hosts.c
> @@ -447,8 +447,18 @@ struct Scsi_Host *scsi_host_alloc(struct
>  	dev_set_name(&shost->shost_dev, "host%d", shost->host_no);
>  	shost->shost_dev.groups = scsi_sysfs_shost_attr_groups;
>  
> -	shost->ehandler = kthread_run(scsi_error_handler, shost,
> -			"scsi_eh_%d", shost->host_no);
> +	/*
> +	 * HUGE COMMENT. and kthread_create() needs s/ENOMEM/EINTR/.
> +	 */
> +	for (;;) {
> +		shost->ehandler = kthread_run(scsi_error_handler, shost,
> +						"scsi_eh_%d", shost->host_no);
> +		if (!IS_ERR(shost->ehandler) || PTR_ERR(shost->ehandler) != -EINTR)
> +			break;
> +		clear_thread_flag(TIF_SIGPENDING);
> +	}
> +	recalc_sigpending();
> +
>  	if (IS_ERR(shost->ehandler)) {
>  		printk(KERN_WARNING "scsi%d: error handler thread failed to spawn, error = %ld\n",
>  			shost->host_no, PTR_ERR(shost->ehandler));
> 
> 

I think we need a slightly different version that takes the TIF_MEMDIE flag
into account in the caller of kthread_create(), because the purpose of commit
786235ee is "try to die as soon as possible if chosen by the OOM killer".

	for (;;) {
		shost->ehandler = kthread_run(scsi_error_handler, shost,
					      "scsi_eh_%d", shost->host_no);
		if (PTR_ERR(shost->ehandler) != -EINTR ||
		    test_thread_flag(TIF_MEMDIE))
			break;
		clear_thread_flag(TIF_SIGPENDING);
	}
	recalc_sigpending();
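
To make the exit conditions of this loop explicit, here is a small userspace
model of it. This is only a sketch, not kernel code: model_task,
model_kthread_run() and the MODEL_* constants are stand-ins invented for
illustration, and restoring the saved flag is a crude substitute for
recalc_sigpending().

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_EINTR  4	/* local stand-in for EINTR */
#define MODEL_ENOMEM 12	/* local stand-in for ENOMEM */

/* Hypothetical stand-in for the current thread's flags. */
struct model_task {
	bool sigpending;	/* models TIF_SIGPENDING */
	bool memdie;		/* models TIF_MEMDIE */
};

/*
 * Stand-in for kthread_run(): returns 0 on success or a negative error.
 * It fails with -MODEL_EINTR while a signal is pending, which is how the
 * killable wait is assumed to behave here.
 */
static int model_kthread_run(const struct model_task *t, bool out_of_memory)
{
	if (out_of_memory)
		return -MODEL_ENOMEM;
	if (t->sigpending)
		return -MODEL_EINTR;
	return 0;
}

/* The retry loop above, with the kernel calls replaced by the model. */
static int model_spawn_ehandler(struct model_task *t, bool out_of_memory,
				int *attempts)
{
	bool had_signal = t->sigpending;
	int ret;

	assert(attempts);
	*attempts = 0;
	for (;;) {
		ret = model_kthread_run(t, out_of_memory);
		(*attempts)++;
		/* Stop on success, on a non-EINTR error, or if OOM-killed. */
		if (ret != -MODEL_EINTR || t->memdie)
			break;
		t->sigpending = false;	/* clear_thread_flag(TIF_SIGPENDING) */
	}
	t->sigpending = had_signal;	/* crude recalc_sigpending() */
	return ret;
}
```

In this model, a thread with a pending signal but without the OOM-victim flag
retries once and succeeds, while an OOM victim gives up on the first -EINTR.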

But I have two concerns.

  (1) Changing the return code from -ENOMEM to -EINTR may not be sufficient.

      If kmalloc(GFP_KERNEL) in kthread_create_on_node() does something that
      calls recalc_sigpending(), TIF_SIGPENDING will be set again during the
      second call to kthread_run(). Since the second call happens only when
      the current thread has already received SIGKILL (from something other
      than the OOM killer), wait_for_completion_killable() will then return
      -EINTR immediately. This may form an infinite busy loop.

      As I think it is difficult to prove that kmalloc(GFP_KERNEL) never sets
      the TIF_SIGPENDING flag, we would need to call
      clear_thread_flag(TIF_SIGPENDING) immediately before
      wait_for_completion_killable() and call recalc_sigpending() immediately
      after it. Is this better than handling SIGKILL (from something other
      than the OOM killer) on the first call to kthread_run()?

  (2) I don't like scattering test_thread_flag(TIF_MEMDIE) checks around,
      because other drivers might also receive SIGKILL from systemd's
      30-second timeout.
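
The busy loop in (1) can be illustrated with a userspace model. This is only
a sketch under an unproven assumption, namely that the allocation path may
call recalc_sigpending(). All sim_* names are invented for illustration, and
the loop is capped so that the model terminates.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_EINTR 4	/* local stand-in for EINTR */

/* Hypothetical stand-in for the current thread's signal state. */
struct sim_task {
	bool sigkill_queued;	/* a SIGKILL is queued for the thread */
	bool sigpending;	/* models TIF_SIGPENDING */
};

/* Models recalc_sigpending(): the flag follows the queued signal. */
static void sim_recalc_sigpending(struct sim_task *t)
{
	t->sigpending = t->sigkill_queued;
}

/*
 * Models kthread_create_on_node().
 * alloc_recalcs:     assume kmalloc() ends up calling recalc_sigpending().
 * clear_before_wait: clear the flag right before the killable wait and
 *                    recalculate right after it, as suggested in (1).
 */
static int sim_kthread_create(struct sim_task *t, bool alloc_recalcs,
			      bool clear_before_wait)
{
	int ret;

	if (alloc_recalcs)
		sim_recalc_sigpending(t);
	if (clear_before_wait)
		t->sigpending = false;
	/* Models wait_for_completion_killable(). */
	ret = t->sigpending ? -SIM_EINTR : 0;
	if (clear_before_wait)
		sim_recalc_sigpending(t);
	return ret;
}

/*
 * Caller-side retry loop. Returns the number of calls made until one did
 * not fail with -SIM_EINTR, or -1 if max_calls was reached (busy loop).
 */
static int sim_retry_loop(struct sim_task *t, bool alloc_recalcs,
			  bool clear_before_wait, int max_calls)
{
	int calls = 0;

	assert(max_calls > 0);
	while (calls < max_calls) {
		calls++;
		if (sim_kthread_create(t, alloc_recalcs,
				       clear_before_wait) != -SIM_EINTR) {
			sim_recalc_sigpending(t);
			return calls;
		}
		t->sigpending = false;	/* clear_thread_flag(TIF_SIGPENDING) */
	}
	sim_recalc_sigpending(t);
	return -1;
}
```

With alloc_recalcs set and clear_before_wait unset, the model never gets past
-SIM_EINTR and hits the cap. With clear_before_wait set, the first call goes
through and the pending flag is restored afterwards.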