On 2017/6/14 21:08, John Garry wrote:
> On 14/06/2017 10:04, wangyijing wrote:
>>>>  static void notify_ha_event(struct sas_ha_struct *sas_ha, enum ha_event event)
>>>>  {
>>>> +	struct sas_ha_event *ev;
>>>> +
>>>>  	BUG_ON(event >= HA_NUM_EVENTS);
>>>>
>>>> -	sas_queue_event(event, &sas_ha->pending,
>>>> -		&sas_ha->ha_events[event].work, sas_ha);
>>>> +	ev = kzalloc(sizeof(*ev), GFP_ATOMIC);
>>>> +	if (!ev)
>>>> +		return;
>>> GFP_ATOMIC allocations can fail and then no events will be queued *and* we
>>> don't report the error back to the caller.
>>>
>> Yes, it's really a problem, but I don't find a better solution, do you have some suggestion ?
>>
>
> Dan raised an issue with this approach, regarding a malfunctioning PHY which spews out events. I still don't think we're handling it safely. Here's the suggestion:
> - each asd_sas_phy owns a finite-sized pool of events
> - when the event pool becomes exhausted, libsas stops queuing events (obviously) and disables the PHY in the LLDD
> - upon attempting to re-enable the PHY from sysfs, libsas first checks that the pool is still not exhausted
>
> If you cannot find a good solution, then let us know and we can help.

Hi John and Dan,

Which event did you see on the malfunctioning PHY? If it is PORTE_BROADCAST_RCVD, then since libsas always calls sas_revalidate_domain() for every PORTE_BROADCAST_RCVD, how about keeping a single broadcast pending (not queued in the workqueue) and discarding the others? If the event is of another type, things may become knotty.

>
> John
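
For discussion, the per-PHY event pool suggested in the quoted message could look roughly like the sketch below. It is a user-space model only, not libsas code: struct phy_model, struct phy_event, PHY_EVENT_POOL_SIZE and all the function names are hypothetical, locking is omitted, and the "enabled" flag merely stands in for asking the LLDD to disable the PHY.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pool size; a real value would need tuning. */
#define PHY_EVENT_POOL_SIZE 4

/* Hypothetical stand-in for a queued PHY/port event. */
struct phy_event {
	int type;	/* event code, e.g. a PORTE_* value */
	bool in_use;	/* slot is currently held by a pending event */
};

/* Hypothetical stand-in for the per-PHY state that owns the pool. */
struct phy_model {
	struct phy_event pool[PHY_EVENT_POOL_SIZE];
	bool enabled;
};

static void phy_init(struct phy_model *phy)
{
	size_t i;

	for (i = 0; i < PHY_EVENT_POOL_SIZE; i++)
		phy->pool[i].in_use = false;
	phy->enabled = true;
}

/*
 * Take a free slot for a new event. If the pool is exhausted, stop
 * queuing and mark the PHY disabled; the caller sees NULL.
 */
static struct phy_event *phy_event_get(struct phy_model *phy, int type)
{
	size_t i;

	if (!phy->enabled)
		return NULL;

	for (i = 0; i < PHY_EVENT_POOL_SIZE; i++) {
		if (!phy->pool[i].in_use) {
			phy->pool[i].in_use = true;
			phy->pool[i].type = type;
			return &phy->pool[i];
		}
	}

	phy->enabled = false;	/* models disabling the PHY in the LLDD */
	return NULL;
}

/* Return a slot to the pool once its event has been processed. */
static void phy_event_put(struct phy_event *ev)
{
	ev->in_use = false;
}

/* Re-enable request (e.g. from sysfs): refuse while the pool is still full. */
static bool phy_try_enable(struct phy_model *phy)
{
	size_t i;

	for (i = 0; i < PHY_EVENT_POOL_SIZE; i++) {
		if (!phy->pool[i].in_use) {
			phy->enabled = true;
			return true;
		}
	}
	return false;
}
```

With a fixed pool there is no GFP_ATOMIC allocation on the notify path, so the failure mode becomes "pool exhausted, PHY disabled" rather than a silently dropped event. How pending events get drained and the slots returned is left open here.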