Andrew Vasquez wrote:
On Thu, 31 Jul 2008, Vladislav Bolkhovitin wrote:
The changes above are large (170k diffs so far), and at this point are
being run-through our testing. The hope is to get the changes
upstream during one of the next two merge windows.
Given the infrustructure mods and our focus on that front, if there's
something small and contained you can offer above what I've proposed
we'll be interested in reviewing any patches you'd push forward.
Then, I believe, my patch should go in as a temporal measure. I don't think
we should crash users for 2 more major releases.
So the 'online' check concerns you? So, add an 'unloading' flag, set
it on remove_one(), save a copy of dpc_thread at qla2xxx_wake_dpc(),
then wake_up() saved value if not 'unloading'. Of course the direct
wake_up() in request_acqusition should be modded to call
qla2xxx_wake_dpc(). We're not talking about some huge window here.
My main concern is not about the 'online' flag, but about that your
patch doesn't fix the race, which it's intended to fix, because:
1. It doesn't handle possibility of CPU to reorder commands. Hence, it
could be possible that other CPUs can see 'online' flag set to 0 *after*
dpc_thread set to NULL, despite in the code "ha->flags.online = 0"
precede "ha->dpc_thread = NULL". To address that the corresponding
memory barriers around ha->flags.online assignment and reading are needed.
2. More importantly, it doesn't handle possibility for task_struct, to
which dpc_thread field refers, to be destroyed exactly between lines
if (ha->flags.online && t)
wake_up_process(t);
or inside wake_up_process(). Hence, already freed and possibly reused
data will be accessed by wake_up_process() with all corresponding bad
consequences.
The same is true for my
other patch "Proposed protection of fcports field of struct scsi_qla_host"
as well, because without it there should be no big problems to crash the
driver via sysfs.
We're still looking through your patches... So how exactly would it
be 'be no big problems to crash the driver via sysfs.'???
Example scenario: one or many applications read in a loop any sysfs
entry, which go over fcports list, and at the same time DPC thread adds
to it new entries. If CPU can reorder commands (most modern CPUs can) it
is possible that list_for_each_entry() will access partially inserted
entry => NULL pointer dereference.
To see better what I mean look at the difference between list_add() and
list_add_rcu() as well as between list_for_each_entry() and
list_for_each_entry_rcu().
Vlad
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html