On 01/19/2017 12:34 AM, Song Liu wrote:
>
> Media health monitoring is very important for large scale distributed storage systems.
> Traditionally, enterprise storage controllers maintain event logs for attached storage
> devices. However, these controller managed logs do not scale well for large scale
> distributed systems.
>
> While designing a more flexible and scalable event logging system, we think it is better
> to build the log in the block layer. Block level event logging covers all major storage media
> (SCSI, SATA, NVMe), and thus minimizes redundant work for different protocols.
>
> In this LSF/MM, we would like to discuss the following topics with the community:
> 1. Mechanism for drivers to report events (or errors) to the block layer.
>    Basically, we will need a traceable function for the drivers to report errors
>    (most likely right before calling end_request or bio_endio).
>
> 2. Which mechanism (ftrace, BPF, etc.) is preferred for the event logging?
>
> 3. How should we categorize different events?
>    Currently, there is existing code that translates ATA errors (ata_to_sense_error)
>    and NVMe errors (nvme_trans_status_code) to SCSI sense codes. So we can
>    leverage the SCSI Key Code Qualifier for event categorization.
>
> 4. Detailed discussions on data structure for event logging.
>
> We will be able to show a prototype implementation during LSF/MM.
>
Very good topic; I'm very much in favour of it.

That ties in rather nicely with my multipath redesign, where I've added
a notifier chain for block events.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Teamlead Storage & Networking
hare@xxxxxxx                                   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)