Re: Still having major MVSAS issues.

 Hello all,

I also have severe problems with mvsas, but have managed to make it at least usable. My config is an AOC-SASLP-MV8 with an HP SAS expander; WD and Seagate SATA disks are connected to the expander in a Norco 4020 case that does not have any onboard expanders. The kernel version is 2.6.33 with the latest Srinivas patch. I also don't use RAID; each of the 13 disks has its own filesystem.

My experiences are:
The only filesystem that works is JFS. XFS and btrfs crash the controller, making all the disks unreadable, and that can only be fixed by rebooting. (mkfs succeeds; the crashes happen only after mounting or while fscking.) JFS works fine as long as I access it via Samba or copy and move files with cp and mv. If I try file operations with the Thunar file manager, the controller crashes. smartctl has not caused any problems so far and works fine.

Attached are some kernel logs captured from those crashes.

Kind regards,
Konstantinos Skarlatos





On 6/6/2010 3:13 PM, Jelle de Jong wrote:
Dear Srini,

I spent a few weeks gathering information and did some intensive
testing over the last few days.

Srinivas Naga Venkatasatya Pasagadugula wrote, on 06-05-10 08:01:
1. Is this problem seen only with WD SATA drives? (I don't have WD SATA drives to reproduce this issue.)
2. Does the problem occur with directly attached SATA drives, with drives connected through expanders, or both?
3. Could you please provide the "dmesg" log or "/var/log/messages" log?
4. What is the capacity of the SATA drives connected to the controller?
5. Does your HBA have the 6440 chipset?
With a few tricks I managed to boot my OS from the mvsas controller. I
have eleven different SATA disks attached through a 4-port mini-SAS
backplane, without an expander, to the two Marvell 88SE63xx/64xx mvsas
controllers in my system.

I managed to create five mdadm raid1 arrays without adding the actual
active sync device (so one disk for each array). I built my LVM systems
on top of this and did a lot of file transfers for testing.
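(For anyone wanting to reproduce this test setup, it can be sketched roughly as follows; the device names and volume sizes are examples, not the ones actually used:)

```shell
# Create a degraded raid1: one active disk, the second slot left "missing"
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 missing

# Build LVM on top of the degraded array
pvcreate /dev/md0
vgcreate testvg /dev/md0
lvcreate -L 100G -n testlv testvg

# Later, add a second disk; this starts the resync that triggers
# the mvsas failures described below
mdadm /dev/md0 --add /dev/sdb1
```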

This worked stably enough: there are HDIO_GET_IDENTITY errors during
boot and operation, but the hard disks seem to be working.

So, to determine whether the issues are related to a particular brand of
hard disk, I started adding WDC, Hitachi, SAMSUNG, Maxtor and Seagate
disks to their respective raid1 arrays; the disks are of different sizes
(320GB, 500GB, 1TB).

The sync starts and then fails, either immediately or a while later,
with failures in the mvsas driver. I attached a failure example.

The failures are severe: I have lost entire lvm2 volumes
and raid arrays during testing.

Do you also have a SuperMicro AOC-SASLP-MV8 controller for testing?

I would love to use the controllers in production, but they are
currently unstable. I hope this information helps to solve the mvsas
issues.

With kind regards,

Jelle de Jong

------------[ cut here ]------------
WARNING: at drivers/ata/libata-core.c:5186 ata_qc_issue+0x31f/0x330 [libata]()
Hardware name:
Modules linked in: ipv6 hwmon_vid jfs cpufreq_powersave fan cpufreq_ondemand edac_core powernow_k8 firewire_ohci psmouse firewire_core freq_table serio_raw pcspkr k8temp thermal crc_itu_t evdev edac_mce_amd skge processor button i2c_nforce2 sg forcedeth i2c_core fuse rtc_cmos rtc_core rtc_lib ext2 mbcache dm_crypt dm_mod ses enclosure sd_mod usb_storage ohci_hcd mvsas libsas sata_sil ehci_hcd scsi_transport_sas sata_nv usbcore pata_amd sata_via ata_generic pata_via pata_acpi libata scsi_mod
Pid: 3308, comm: smartctl Not tainted 2.6.33-ARCH #1
Call Trace:
 [<ffffffff810528c8>] warn_slowpath_common+0x78/0xb0
 [<ffffffff8105290f>] warn_slowpath_null+0xf/0x20
 [<ffffffffa002c14f>] ata_qc_issue+0x31f/0x330 [libata]
 [<ffffffffa0006fae>] ? scsi_init_sgtable+0x4e/0x90 [scsi_mod]
 [<ffffffffa0033cd0>] ? ata_scsi_pass_thru+0x0/0x2f0 [libata]
 [<ffffffffa00310c6>] ata_scsi_translate+0xa6/0x180 [libata]
 [<ffffffffa0000b10>] ? scsi_done+0x0/0x20 [scsi_mod]
 [<ffffffffa0000b10>] ? scsi_done+0x0/0x20 [scsi_mod]
 [<ffffffffa0034369>] ata_sas_queuecmd+0x139/0x2b0 [libata]
 [<ffffffffa00f3098>] sas_queuecommand+0x98/0x300 [libsas]
 [<ffffffffa0000c25>] scsi_dispatch_cmd+0xf5/0x230 [scsi_mod]
 [<ffffffffa0006ba2>] scsi_request_fn+0x322/0x3e0 [scsi_mod]
 [<ffffffff811b72bd>] __generic_unplug_device+0x2d/0x40
 [<ffffffff811bcbf8>] blk_execute_rq_nowait+0x68/0xb0
 [<ffffffff811bccc1>] blk_execute_rq+0x81/0xf0
 [<ffffffff811b4d0b>] ? blk_rq_bio_prep+0x2b/0xd0
 [<ffffffff811bc866>] ? blk_rq_map_kern+0xd6/0x150
 [<ffffffffa0007ee7>] scsi_execute+0xf7/0x160 [scsi_mod]
 [<ffffffffa0033167>] ata_cmd_ioctl+0x177/0x320 [libata]
 [<ffffffffa0033467>] ata_sas_scsi_ioctl+0x157/0x2b0 [libata]
 [<ffffffffa00f25f7>] sas_ioctl+0x47/0x50 [libsas]
 [<ffffffffa0002225>] scsi_ioctl+0xd5/0x390 [scsi_mod]
 [<ffffffffa0134d3e>] sd_ioctl+0xce/0xe0 [sd_mod]
 [<ffffffff811be35f>] __blkdev_driver_ioctl+0x8f/0xb0
 [<ffffffff811be82e>] blkdev_ioctl+0x22e/0x820
 [<ffffffff8114fdf7>] block_ioctl+0x37/0x40
 [<ffffffff81131ac8>] vfs_ioctl+0x38/0xd0
 [<ffffffff81131c70>] do_vfs_ioctl+0x80/0x560
 [<ffffffff811cff46>] ? __up_read+0xa6/0xd0
 [<ffffffff81077c29>] ? up_read+0x9/0x10
 [<ffffffff811321d1>] sys_ioctl+0x81/0xa0
 [<ffffffff8100a002>] system_call_fastpath+0x16/0x1b
---[ end trace 115ad6bf347654e7 ]---
 sdm: sdm1
 sdn: sdn1
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
INFO: task smbd:3348 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
smbd          D ffff88000180f948     0  3348   2949 0x00000000
 ffff88002dfe38b8 0000000000000086 0000000000000000 ffffffffa000692b
 000000013d124bd0 0000000000011250 ffff88003caff938 ffff88003cf36690
 000000010061522b ffff88002dfe3fd8 ffff88002dfe2000 ffff88002dfe2000
Call Trace:
 [<ffffffffa000692b>] ? scsi_request_fn+0xab/0x3e0 [scsi_mod]
 [<ffffffff810dbd00>] ? sync_page+0x0/0x50
 [<ffffffff8135970e>] io_schedule+0x6e/0xb0
 [<ffffffff810dbd3d>] sync_page+0x3d/0x50
 [<ffffffff81359d32>] __wait_on_bit_lock+0x52/0xb0
 [<ffffffff810dbce2>] __lock_page+0x62/0x70
 [<ffffffff81073090>] ? wake_bit_function+0x0/0x40
 [<ffffffff810dc3d9>] do_read_cache_page+0x159/0x180
 [<ffffffffa02e11e0>] ? metapage_readpage+0x0/0x180 [jfs]
 [<ffffffff810dc434>] read_cache_page_async+0x14/0x20
 [<ffffffff810dc449>] read_cache_page+0x9/0x20
 [<ffffffffa02e1d85>] __get_metapage+0x95/0x5a0 [jfs]
 [<ffffffffa02d4fb5>] diRead+0x155/0x200 [jfs]
 [<ffffffffa02c8d08>] jfs_iget+0x38/0x160 [jfs]
 [<ffffffffa02cb461>] jfs_lookup+0x71/0x140 [jfs]
 [<ffffffff81110000>] ? calculate_sizes+0x220/0x4a0
 [<ffffffff81359d53>] ? __wait_on_bit_lock+0x73/0xb0
 [<ffffffff8135a45d>] ? __mutex_lock_slowpath+0x26d/0x370
 [<ffffffff8112bafb>] do_lookup+0x1db/0x270
 [<ffffffff8112e127>] link_path_walk+0x6b7/0xf10
 [<ffffffff810e1b28>] ? free_hot_page+0x28/0x90
 [<ffffffff8112eb1c>] path_walk+0x5c/0xc0
 [<ffffffff8112ecb3>] do_path_lookup+0x53/0xa0
 [<ffffffff8112f8f2>] user_path_at+0x52/0xa0
 [<ffffffff8115f01e>] ? locks_free_lock+0x3e/0x60
 [<ffffffff8115fb74>] ? fcntl_setlk+0x64/0x350
 [<ffffffff811260a7>] vfs_fstatat+0x37/0x70
 [<ffffffff81126206>] vfs_stat+0x16/0x20
 [<ffffffff8112622f>] sys_newstat+0x1f/0x50
 [<ffffffff81131370>] ? sys_fcntl+0x160/0x5d0
 [<ffffffff8100a002>] system_call_fastpath+0x16/0x1b
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
sd 10:0:1:0: [sdc] Unhandled error code
sd 10:0:1:0: [sdc] Result: hostbyte=0x00 driverbyte=0x06
sd 10:0:1:0: [sdc] CDB: cdb[0]=0x28: 28 00 30 b4 cb ff 00 04 00 00
end_request: I/O error, dev sdc, sector 817155071
sd 10:0:12:0: [sdm] Unhandled error code
sd 10:0:12:0: [sdm] Result: hostbyte=0x00 driverbyte=0x06
sd 10:0:12:0: [sdm] CDB: cdb[0]=0x2a: 2a 00 6e 04 66 c8 00 04 00 00
end_request: I/O error, dev sdm, sector 1845782216
sd 10:0:2:0: [sdd] Unhandled error code
sd 10:0:2:0: [sdd] Result: hostbyte=0x00 driverbyte=0x06
sd 10:0:2:0: [sdd] CDB: cdb[0]=0x2a: 2a 00 ae 97 38 ff 00 00 08 00
end_request: I/O error, dev sdd, sector 2929146111
sd 10:0:2:0: [sdd] Unhandled error code
sd 10:0:2:0: [sdd] Result: hostbyte=0x00 driverbyte=0x06
sd 10:0:2:0: [sdd] CDB: cdb[0]=0x28: 28 00 68 6a 4d 57 00 00 40 00
end_request: I/O error, dev sdd, sector 1751797079
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
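(As an aside for anyone reading the log: the sector numbers reported by end_request can be cross-checked against the CDBs, since bytes 2-5 of a READ(10) (0x28) or WRITE(10) (0x2a) CDB carry the logical block address in big-endian order. For example:)

```shell
# LBA from "sd 10:0:1:0: [sdc] CDB: 28 00 30 b4 cb ff 00 04 00 00"
printf '%d\n' 0x30b4cbff   # 817155071, matching "dev sdc, sector 817155071"

# LBA from "sd 10:0:12:0: [sdm] CDB: 2a 00 6e 04 66 c8 00 04 00 00"
printf '%d\n' 0x6e0466c8   # 1845782216, matching "dev sdm, sector 1845782216"
```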
