TCMU timeout causes kernel panic

Hi Andy,

I've received several kernel panic reports that occur when userspace
connects, disconnects, and then connects again after a while.

So I am looking into how TCMU handles expired commands, but I cannot
find the proper way to do it.

I assume tcmu_device_timedout() is triggered once a command has been
pending for 30 seconds (as defined by TCMU_TIME_OUT). It then:
1. Handles all requests completed by userspace, through
tcmu_handle_completions().
2. Runs tcmu_check_expired_cmd() for every existing command.

I noticed a bug in the TCMU code: the command deadline was compared to
jiffies incorrectly, so the cleanup code was never invoked. I started
to see kernel panics after fixing that bug, which seems to have exposed
the underlying problem.
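
For reference, a minimal userspace sketch of the wrap-safe deadline check that time_after() performs. The faulty comparison is not shown in this mail, so the sketch only illustrates why the form used in tcmu_check_expired_cmd() works; model_time_after(), naive_after() and cmd_expired() are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Userspace model of the kernel's time_after(a, b): true when a is later
 * than b, even if the counter wrapped in between. Like the kernel macro,
 * it relies on the unsigned difference being reinterpreted as signed.
 */
static bool model_time_after(unsigned long a, unsigned long b)
{
        return (long)(b - a) < 0;
}

/* Naive comparison, shown only for contrast; it breaks across a wrap. */
static bool naive_after(unsigned long a, unsigned long b)
{
        return a > b;
}

/* Hypothetical helper: has a command with this deadline expired at now? */
static bool cmd_expired(unsigned long now, unsigned long deadline)
{
        return model_time_after(now, deadline);
}
```

With a deadline just below ULONG_MAX and a counter that has already wrapped to a small value, cmd_expired() reports the command as expired while naive_after() does not.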

In tcmu_check_expired_cmd(), which is invoked for every command in
tcmu_device_timedout():

static int tcmu_check_expired_cmd(int id, void *p, void *data)
{
        struct tcmu_cmd *cmd = p;

        if (test_bit(TCMU_CMD_BIT_EXPIRED, &cmd->flags))
                return 0;

        if (!time_after(jiffies, cmd->deadline))
                return 0;

        set_bit(TCMU_CMD_BIT_EXPIRED, &cmd->flags);
        target_complete_cmd(cmd->se_cmd, SAM_STAT_CHECK_CONDITION);
        cmd->se_cmd = NULL;

        kmem_cache_free(tcmu_cmd_cache, cmd);

        return 0;
}

Here TCMU_CMD_BIT_EXPIRED is set, and then cmd is freed? The cmd also
remains in udev->commands, which means it can still be accessed by
tcmu_handle_completions() the next time it runs. So userspace may come
back and refer to a freed command. The bit is then checked in
tcmu_handle_completion(), which can result in accessing already freed
memory.
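
To make the hazard concrete, here is a small userspace model, not the TCMU code itself, in which an expired command is unlinked from the lookup table before it is freed, so a late completion finds nothing instead of a dangling pointer. struct model_cmd, struct model_dev and the model_*() functions are hypothetical stand-ins for struct tcmu_cmd and udev->commands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CMDS 8

/* Hypothetical stand-in for struct tcmu_cmd. */
struct model_cmd {
        int id;
};

/* Hypothetical stand-in for the udev->commands lookup table. */
struct model_dev {
        struct model_cmd *commands[MAX_CMDS];
};

/* Allocate a command and publish it under the given id. */
static struct model_cmd *model_add(struct model_dev *dev, int id)
{
        struct model_cmd *cmd;

        if (id < 0 || id >= MAX_CMDS || dev->commands[id])
                return NULL;
        cmd = calloc(1, sizeof(*cmd));
        if (!cmd)
                return NULL;
        cmd->id = id;
        dev->commands[id] = cmd;
        return cmd;
}

static struct model_cmd *model_find(struct model_dev *dev, int id)
{
        if (id < 0 || id >= MAX_CMDS)
                return NULL;
        return dev->commands[id];
}

/* Timeout path: unlink first so no later lookup returns the pointer. */
static void model_expire(struct model_dev *dev, int id)
{
        struct model_cmd *cmd = model_find(dev, id);

        if (!cmd)
                return;
        dev->commands[id] = NULL;
        free(cmd);
}

/* Completion path: true only if the command was still live. */
static bool model_complete(struct model_dev *dev, int id)
{
        struct model_cmd *cmd = model_find(dev, id);

        if (!cmd)
                return false;   /* already expired and freed; ignore */
        dev->commands[id] = NULL;
        free(cmd);
        return true;
}
```

In this model a completion that arrives after model_expire() is simply ignored. It says nothing yet about the ring space the command occupied, which is a separate problem.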

I cannot figure out how this is supposed to work:

1. I think a command should be freed by the kernel once it has expired;
otherwise expired commands can linger and take up ring space forever.

2. But userspace may still have access to the parts of the cmd ring and
data ring allocated to the command. How should we clean them up? It
seems we can neither simply wait for userspace (since it may never use
them) nor clean them up ourselves (since userspace may still use them).
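
One possible middle ground, sketched here as a userspace model and not as the actual TCMU implementation: fail the command toward the initiator on timeout, but keep its slot marked as expired until either userspace completes it or the device is torn down. All names (struct ring_model, ring_submit() and so on) are hypothetical, and ring_reset() assumes the kernel can tell when userspace has detached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 4

enum slot_state { SLOT_FREE, SLOT_PENDING, SLOT_EXPIRED };

struct ring_model {
        enum slot_state slot[RING_SLOTS];
        size_t in_use;  /* slots userspace may still read or write */
};

/* Claim the first free slot; returns its index, or -1 if the ring is full. */
static int ring_submit(struct ring_model *r)
{
        for (int i = 0; i < RING_SLOTS; i++) {
                if (r->slot[i] == SLOT_FREE) {
                        r->slot[i] = SLOT_PENDING;
                        r->in_use++;
                        return i;
                }
        }
        return -1;
}

/* Timeout path: the initiator is failed, but the slot stays reserved. */
static void ring_expire(struct ring_model *r, int i)
{
        assert(i >= 0 && i < RING_SLOTS);
        if (r->slot[i] == SLOT_PENDING)
                r->slot[i] = SLOT_EXPIRED;
}

/* Completion path: reclaim the slot; true means the result is still wanted. */
static bool ring_complete(struct ring_model *r, int i)
{
        bool wanted;

        assert(i >= 0 && i < RING_SLOTS);
        if (r->slot[i] == SLOT_FREE)
                return false;
        wanted = (r->slot[i] == SLOT_PENDING);
        r->slot[i] = SLOT_FREE;
        r->in_use--;
        return wanted;
}

/* Teardown path: userspace is gone, so every slot can be reclaimed. */
static void ring_reset(struct ring_model *r)
{
        for (int i = 0; i < RING_SLOTS; i++)
                r->slot[i] = SLOT_FREE;
        r->in_use = 0;
}
```

The trade-off in this model is that an expired slot keeps counting against in_use until userspace answers or ring_reset() runs, so a handler that never comes back still pins ring space until teardown.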

Any thoughts?

--Sheng