On Wed, Mar 9, 2016 at 8:55 PM, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote: > On Thu, 2016-02-25 at 11:38 +0800, Ming Lei wrote: >> On Thu, Feb 25, 2016 at 7:28 AM, John David Anglin < >> dave.anglin@xxxxxxxx> wrote: >> > On 2016-02-24, at 4:36 PM, Helge Deller wrote: >> > >> > > Maybe Dave has more luck, otherwise I'll continue to try to get >> > > some info. >> > >> > I tried your patch on the commit in linux-block which first failed >> > to boot. As with Helge, the >> > system crashed and no useful data was output on console. I then >> > applied following patch >> > to give some extra segments and tired again: >> > >> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c >> > index b1a2631..b421f03 100644 >> > --- a/drivers/scsi/scsi_lib.c >> > +++ b/drivers/scsi/scsi_lib.c >> > @@ -595,6 +595,11 @@ static int scsi_alloc_sgtable(struct >> > scsi_data_buffer *sdb, int nents, bool mq) >> > >> > BUG_ON(!nents); >> > >> > + /* Provide extra entries in case of split. */ >> > + nents += 8; >> > + if (nents > SCSI_MAX_SG_SEGMENTS) >> > + nents = SCSI_MAX_SG_SEGMENTS; >> > + >> >> Yeah, this is needed for sake of safety. >> >> > if (mq) { >> > if (nents <= SCSI_MAX_SG_SEGMENTS) { >> > sdb->table.nents = nents; >> > >> > The attached file shows the crash in first boot. The second boot >> > was successful and various output >> > was generated by your check code. >> >> From the following log(just select one simple, and looks all are >> similar) in >> 2nd boot, the bi_phys_segments is figured out as one by block core >> , which is wrong because the max segment size is 64k according to >> your investigation in the below link, but the whole req/bio is >> 192k(4k*48). >> >> http://www.spinics.net/lists/linux-parisc/msg06749.html >> >> Looks weird, it shouldn't have happened because >> blk_bio_segment_split() >> does respect the max segment size limit. >> >> BTW, what is the scsi driver for the device? >> >> blk_rq_map_sg: merge bug: 3 1, extra_len 0, dma_drain 0 >> check_bvec: dump bvec for 000000007e53c5f0(f:24490000, t:1) >> 0: 0 4096 245852 000000007e2c4c40 >> 1: 0 4096 245853 000000007e2c4c40 >> 2: 0 4096 245854 000000007e2c4c40 >> 3: 0 4096 245855 000000007e2c4c40 >> 4: 0 4096 245856 000000007e2c4c40 >> 5: 0 4096 245857 000000007e2c4c40 >> 6: 0 4096 245858 000000007e2c4c40 >> 7: 0 4096 245859 000000007e2c4c40 >> 8: 0 4096 245860 000000007e2c4c40 >> 9: 0 4096 245861 000000007e2c4c40 >> 10: 0 4096 245862 000000007e2c4c40 >> 11: 0 4096 245863 000000007e2c4c40 >> 12: 0 4096 245864 000000007e2c4c40 >> 13: 0 4096 245865 000000007e2c4c40 >> 14: 0 4096 245866 000000007e2c4c40 >> 15: 0 4096 245867 000000007e2c4c40 >> 16: 0 4096 245868 000000007e2c4c40 >> 17: 0 4096 245869 000000007e2c4c40 >> 18: 0 4096 245870 000000007e2c4c40 >> 19: 0 4096 245871 000000007e2c4c40 >> 20: 0 4096 245872 000000007e2c4c40 >> 21: 0 4096 245873 000000007e2c4c40 >> 22: 0 4096 245874 000000007e2c4c40 >> 23: 0 4096 245875 000000007e2c4c40 >> 24: 0 4096 245876 000000007e2c4c40 >> 25: 0 4096 245877 000000007e2c4c40 >> 26: 0 4096 245878 000000007e2c4c40 >> 27: 0 4096 245879 000000007e2c4c40 >> 28: 0 4096 245880 000000007e2c4c40 >> 29: 0 4096 245881 000000007e2c4c40 >> 30: 0 4096 245882 000000007e2c4c40 >> 31: 0 4096 245883 000000007e2c4c40 >> 32: 0 4096 245884 000000007e2c4c40 >> 33: 0 4096 245885 000000007e2c4c40 >> 34: 0 4096 245886 000000007e2c4c40 >> 35: 0 4096 245887 000000007e2c4c40 >> 36: 0 4096 245888 000000007e2c4c40 >> 37: 0 4096 245889 000000007e2c4c40 >> 38: 0 4096 245890 000000007e2c4c40 >> 39: 0 4096 245891 000000007e2c4c40 >> 40: 0 4096 245892 000000007e2c4c40 >> 41: 0 4096 245893 000000007e2c4c40 >> 42: 0 4096 245894 000000007e2c4c40 >> 43: 0 4096 245895 000000007e2c4c40 >> 44: 0 4096 245896 000000007e2c4c40 >> 45: 0 4096 245897 000000007e2c4c40 >> 46: 0 4096 245898 000000007e2c4c40 >> 47: 0 4096 245899 000000007e2c4c40 > > > We've provided all the information you asked for, what's the next step > on this, or do we have to unwind the bio splitting code with reverts > until it starts working? John, Helge, and I did discuss the problem for a while privately, and looks it is related with compiler. Last time, I sent one patch which can make the issue disappear, but the main change is just invovled with the below: struct bio_vec { struct page *bv_page; - unsigned int bv_len; + unsigned int bv_seg:8; + unsigned int bv_len:24; unsigned int bv_offset; }; Maybe John and Helge have some update recently? The logic in blk_bio_segment_split() is correct, and it does respect the max segment size limit. Thanks, Ming Lei -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html