Hello,

Note that I see exactly your errors (in a non-Ceph environment) with both
Samsung 845DC EVO and Intel DC S3610.
Though I need to stress things quite a bit to make it happen.

Also setting nobarrier did alleviate it, but didn't fix it 100%, so I guess
something still issues flushes at some point.

From where I stand LSI/Avago are full of it.
Not only does this problem NOT happen with any onboard SATA chipset I have
access to, their task abort and reset is what actually impacts things
(several seconds to recover), not whatever insignificant delay caused by
the SSDs.

Christian

On Tue, 8 Sep 2015 11:35:38 +1200 Richard Bade wrote:

> Thanks guys for the pointers to this Intel thread:
>
> https://communities.intel.com/thread/77801
>
> It looks promising. I intend to update the firmware on disks in one
> node tonight and will report back after a few days to a week on my
> findings.
>
> I've also posted to that forum and will update there too.
>
> Regards,
>
> Richard
>
>
> On 5 September 2015 at 07:55, Richard Bade <hitrich@xxxxxxxxx> wrote:
>
> > Hi Everyone,
> >
> > We have a Ceph pool that is entirely made up of Intel S3700/S3710
> > enterprise SSD's.
> >
> > We are seeing some significant I/O delays on the disks causing a “SCSI
> > Task Abort” from the OS. This seems to be triggered by the drive
> > receiving a “Synchronize cache command”.
> >
> > My current thinking is that setting nobarriers in XFS will stop the
> > drive receiving a sync command and therefore stop the I/O delay
> > associated with it.
> >
> > In the XFS FAQ it looks like the recommendation is that if you have a
> > Battery Backed raid controller you should set nobarriers for
> > performance reasons.
> >
> > Our LSI card doesn’t have battery backed cache as it’s configured in
> > HBA mode (IT) rather than Raid (IR). Our Intel s37xx SSD’s do have a
> > capacitor backed cache though.
> >
> > So is it recommended that barriers are turned off as the drive has a
> > safe cache (I am confident that the cache will write out to disk on
> > power failure)?
> >
> > Has anyone else encountered this issue?
> >
> > Any info or suggestions about this would be appreciated.
> >
> > Regards,
> >
> > Richard
> >

--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
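
For anyone wanting to confirm which XFS mounts on a node actually carry the nobarrier option discussed in this thread, here is a rough Python sketch. It assumes the usual /proc/mounts layout (device, mount point, filesystem type, comma-separated options) and that the kernel reports the option literally as "nobarrier". The function name and the sample mount lines are made up for illustration and are not taken from anyone's cluster.

```python
def xfs_barrier_status(mounts_text):
    """Map each XFS mount point to True if 'nobarrier' is among its options.

    Assumes /proc/mounts-style lines:
    device mount_point fs_type options dump pass
    """
    status = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not XFS.
        if len(fields) < 4 or fields[2] != "xfs":
            continue
        options = fields[3].split(",")
        status[fields[1]] = "nobarrier" in options
    return status


# Hypothetical sample input, for illustration only.
SAMPLE_MOUNTS = (
    "/dev/sda1 /var/lib/ceph/osd/ceph-0 xfs rw,noatime,nobarrier,inode64 0 0\n"
    "/dev/sdb1 /var/lib/ceph/osd/ceph-1 xfs rw,noatime,inode64 0 0\n"
    "/dev/sdc1 / ext4 rw,relatime 0 0\n"
)

for mount_point, nobarrier in sorted(xfs_barrier_status(SAMPLE_MOUNTS).items()):
    state = "nobarrier set" if nobarrier else "barriers left at default"
    print(f"{mount_point}: {state}")
```

On a live system the same function could be fed the contents of /proc/mounts instead of the sample string. It only reports the mount option. It says nothing about whether something else in the stack still issues cache flushes, which is what the reply above suspects.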