Re: remark and RFC

Nix <nix@xxxxxxxxxxxxx> · Thu, 17 Aug 2006 00:43:57 +0100

On 16 Aug 2006, Molle Bestefich murmured woefully:
> Peter T. Breuer wrote:
>> > The comm channel and "hey, I'm OK" message you propose doesn't seem
>> > that different from just hot-adding the disks from a shell script
>> > using 'mdadm'.
>>
>> [snip speculations on possible blocking calls]
> 
> You could always try and see.
> Should be easy to simulate a network outage.

Blocking calls are not the problem. Deadlocks are.

The problem is that forking a userspace process necessarily involves
kernel memory allocations (for the task struct, the process's memory
map, and possibly text pages if the necessary pieces of mdadm are not in
the page cache). If your swap is on the remote RAID array, you can't
necessarily carry out those allocations: freeing memory for them may
mean swapping out to the very array you are trying to bring back.
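
To make the loop explicit, here is a toy model of it. This is only a
sketch of the dependency chain, not kernel code: the node names are my
own labels, and it assumes the machine is short enough on memory that an
allocation has to wait for pages to be swapped out.

```python
# Toy model only: each key needs every item in its list to make progress.
# The node names are illustrative labels, not kernel identifiers.
NEEDS = {
    "hot-add disk via mdadm": ["fork mdadm"],
    "fork mdadm": ["allocate kernel memory"],
    # Assumption: memory is tight, so allocating means swapping out first.
    "allocate kernel memory": ["free pages by swapping out"],
    "free pages by swapping out": ["write to remote RAID array"],
    # The array is degraded until the disk is hot-added again.
    "write to remote RAID array": ["hot-add disk via mdadm"],
}


def find_cycle(needs, start):
    """Depth-first search from start; return the first dependency loop found."""
    path = []

    def visit(node):
        if node in path:
            # We came back to something we are still waiting on: a loop.
            return path[path.index(node):] + [node]
        path.append(node)
        for dep in needs.get(node, []):
            loop = visit(dep)
            if loop:
                return loop
        path.pop()
        return None

    return visit(start)


if __name__ == "__main__":
    loop = find_cycle(NEEDS, "hot-add disk via mdadm")
    print(" -> ".join(loop) if loop else "no deadlock in this model")
```

If you point "free pages by swapping out" at a local disk instead, the
loop disappears, which is the whole point: the trouble is specific to
having swap on the array you are trying to recover.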

Note that the same deadlock is currently triggered by sending and
receiving network packets, which is why swapping over NBD is a bad idea
at present. That is being fixed right now, because until it is fixed you
can't reliably run a machine with all its storage on iSCSI, for
instance. However, the deadlock is only fixable for kernel allocations,
because the amount of memory those need is bounded in several ways. The
same can't be done for userspace allocations, so you can never rely on
userspace working in this situation.

-- 
`We're sysadmins. We deal with the inconceivable so often I can clearly 
 see the need to define levels of inconceivability.' --- Rik Steenwinkel