Re: After memory pressure: can't read from tape anymore

On Friday, 03.12.2010, at 12:10 -0600, James Bottomley wrote:
> On Fri, 2010-12-03 at 18:03 +0100, Lukas Kolbe wrote:
> > On Friday, 03.12.2010, at 09:06 -0600, James Bottomley wrote:
> > > On Fri, 2010-12-03 at 16:59 +0200, Kai Mäkisara wrote:
> > > > On 12/03/2010 02:27 PM, FUJITA Tomonori wrote:
> > > > >
> > > > > > Can we make enlarge_buffer a bit friendlier to the memory allocator?
> > > > > >
> > > > > > His problem is that the driver can't allocate 2 MB with the hardware
> > > > > > limit of 128 segments.
> > > > > >
> > > > > > enlarge_buffer tries to use ST_MAX_ORDER and if the allocation (256 kB
> > > > > > page) fails, enlarge_buffer fails. Could it try a smaller order instead?
> > > > >
> > > > > Not tested at all.
> > > > >
> > > > >
> > > > > diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
> > > > > index 5b7388f..119544b 100644
> > > > > --- a/drivers/scsi/st.c
> > > > > +++ b/drivers/scsi/st.c
> > > > > @@ -3729,7 +3729,8 @@ static int enlarge_buffer(struct st_buffer * STbuffer, int new_size, int need_dm
> > > > > >   		b_size = PAGE_SIZE << order;
> > > > > >   	} else {
> > > > > >   		for (b_size = PAGE_SIZE, order = 0;
> > > > > > -		     order < ST_MAX_ORDER && b_size < new_size;
> > > > > > +		     order < ST_MAX_ORDER &&
> > > > > > +			     max_segs * (PAGE_SIZE << order) < new_size;
> > > > > >   		     order++, b_size *= 2)
> > > > > >   			;  /* empty */
> > > > >   	}
> > > > 
> > > > You are correct. The loop does not work at all as it should. Years ago,
> > > > the strategy was to start with blocks as big as possible to minimize the
> > > > number of s/g segments. Nowadays the segments must all be of the same
> > > > size and the old logic is not applicable.
> > > > 
> > > > I have not tested the patch either but it looks correct.
> > > > 
> > > > Thanks for noticing this bug. I hope this helps the users. The question
> > > > about the number of s/g segments is still valid for the direct i/o case,
> > > > but that is a matter of optimization, not of whether one can read/write.
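
For illustration, the effect of the loop change in the quoted patch can be
sketched outside the kernel. This is a minimal user-space sketch, not the
driver code: the function names and the SKETCH_* constants are hypothetical,
and the values (4 kB pages, a maximum order of 6, i.e. 256 kB chunks) are
assumptions taken from the figures mentioned in the thread.

```c
#include <assert.h>

/* Assumed values for illustration only: 4 kB pages and a maximum order
 * of 6, so the largest chunk is 256 kB as mentioned in the thread. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_MAX_ORDER 6

/* Pre-patch loop: keep growing the chunk until a single chunk covers the
 * whole request or the order cap is reached. */
int order_before_patch(unsigned long new_size)
{
	unsigned long b_size;
	int order;

	for (b_size = SKETCH_PAGE_SIZE, order = 0;
	     order < SKETCH_MAX_ORDER && b_size < new_size;
	     order++, b_size *= 2)
		;  /* empty */
	return order;
}

/* Patched loop: stop as soon as max_segs chunks of the current order are
 * together large enough for the request. */
int order_after_patch(unsigned long new_size, unsigned long max_segs)
{
	int order;

	assert(max_segs > 0);
	for (order = 0;
	     order < SKETCH_MAX_ORDER &&
		     max_segs * (SKETCH_PAGE_SIZE << order) < new_size;
	     order++)
		;  /* empty */
	return order;
}
```

Under these assumptions, a 2097152-byte request with 128 segments ends up at
order 6 (256 kB chunks) with the old condition and at order 2 (16 kB chunks)
with the patched one, which should be considerably easier to allocate on a
fragmented system.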
> > > 
> > > Realistically, though, this will only increase the probability of making
> > > an allocation work, we can't get this to a certainty.
> > > 
> > > Since we fixed up the infrastructure to allow arbitrary length sg lists,
> > > perhaps we should document what cards can actually take advantage of
> > > this (and how to do so, since it's not set automatically on boot).  That
> > > way users wanting tapes at least know what the problems are likely to be
> > > and how to avoid them in their hardware purchasing decisions.  The
> > > corollary is that we should likely have a list of not recommended cards:
> > > if they can't go over 128 SG elements, then they're pretty much
> > > unsuitable for modern tapes.
> > 
> > Are you implying here that the LSI SAS1068E is unsuitable to drive two
> > LTO-4 tape drives? Or is it 'just' a problem with the driver?
> 
> The information seems to be the former.  There's no way the kernel can
> guarantee physical contiguity of memory as it operates.  We try to
> defrag, but it's probabilistic, not certain, so if we have to try to
> find a physically contiguous buffer to copy into for an operation like
> this, at some point that allocation is going to fail.
> 
> The only way to be certain you can get a 2MB block down to a tape device
> is to be able to transmit the whole thing as a SG list of fully
> discontiguous pages.  On a system with 4k pages, that requires 512 SG
> entries.  From what I've heard Kashyap say, that can't currently be done
> on the 1068 because of firmware limitations (I'm not entirely clear on
> this, but that's how it sounds to me ... if there is a way of making
> firmware accept more than 128 SG elements per SCSI command, then it is a
> fairly simple driver change).
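
The arithmetic in the paragraph above can be checked with a small sketch.
The helper names below are hypothetical and only restate the numbers from
the thread (2 MB blocks, 4 kB pages, a limit of 128 SG elements); this is
not code from the st or mptsas drivers.

```c
#include <assert.h>
#include <stddef.h>

/* Number of SG entries needed if every entry maps exactly one page,
 * i.e. the fully discontiguous worst case. */
size_t sg_entries_for_pages(size_t block_bytes, size_t page_bytes)
{
	assert(page_bytes > 0);
	return (block_bytes + page_bytes - 1) / page_bytes;  /* round up */
}

/* Smallest physically contiguous segment size (in bytes) that lets a block
 * fit into at most max_segs equally sized segments. */
size_t min_segment_bytes(size_t block_bytes, size_t max_segs)
{
	assert(max_segs > 0);
	return (block_bytes + max_segs - 1) / max_segs;  /* round up */
}
```

With these helpers, sg_entries_for_pages(2097152, 4096) gives 512, and
min_segment_bytes(2097152, 128) gives 16384: with only 128 SG elements,
every segment of a 2 MB block has to be at least 16 kB of physically
contiguous memory, which is exactly the kind of allocation that can fail
under memory pressure.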

Well, 2MB blocksizes actually do work - bacula is reporting a blocksize
of ~2MB for each drive while writing to it. It is only after there has been
memory pressure and a new tape has been inserted that it is *not* possible
anymore to write to the tape with these blocksizes, and dmesg shows one of
these every time bacula tries to read from or write to a tape:

[101883.958351] st0: Can't allocate 2097152 byte tape buffer.
[103901.666608] st0: Can't allocate 10249541 byte tape buffer.

No idea why it's trying 10MB, though.

I tested with the patch from Fujita. The following message, which showed up
before applying the patch, does not appear anymore:

[158544.348411] st: append_to_buffer offset overflow.

It didn't help with the not-being-able-to-write-after-memory-pressure
problem, though.

>  This isn't something we can work around
> in the driver because the transaction can't be split ... it has to go
> down as a single WRITE command with a single output data buffer.
> 
> The LSI 1068 is an upgradeable firmware system, so it's always possible
> LSI can come up with a firmware update that increases the size (this
> would also require a corresponding driver change), but it doesn't sound
> to be something that can be done in the driver alone.

If only LSI's website were a little clearer on where to find updated
firmware and what the latest version is :/.

-- 
Lukas