Hi all,

I am currently working on a project that uses TCMU in production, and I recently hit an issue that involves both TCMU and the loopback device. After connecting to TCMU successfully, I immediately noticed that the cmd ring and the data ring filled up very quickly. I am using TCMU as the backend and the loopback device as the frontend.

If I understand correctly, hw_max_sectors should be the maximum number of sectors (in 512-byte units or in block size? more likely in block size) that one SCSI command can use, and hw_queue_depth is the maximum number of commands that the hardware can have queued at any given moment. Both parameters would affect the size of the ring buffer in TCMU, since I assume the ring buffer should be able to hold all the commands and data sent from the block layer.

Currently we have:

    #define CMDR_SIZE (16 * 4096)
    #define DATA_SIZE (257 * 4096)
    #define TCMU_RING_SIZE (CMDR_SIZE + DATA_SIZE)

So the cmd ring size is 64k and the data ring size is about 1028k.

And we define by default:

    dev->dev_attrib.hw_block_size = 512;
    dev->dev_attrib.hw_max_sectors = 128;
    dev->dev_attrib.hw_queue_depth = 128;

So the data ring should be at least 512 * 128 * 128 = 8M, or, if I change the block size to 4k, 64M, which is not really acceptable. No wonder I always see the data ring buffer full when I enable `debug` for target_core_user.c.

I have also measured the size of the SCSI commands. The largest tcmu_cmd I saw in the cmd ring buffer was 336 bytes, which results in 336 * 128 = ~43k, which seems good enough for the 64k preallocation.

But I still saw tons of `cmd ring full` and `data ring full` messages after increasing the data ring buffer beyond 8M. So I began to wonder what was really happening here.

I tried reducing hw_max_sectors and hw_queue_depth, but these two parameters do not seem to have any effect. I checked the code for them, but I was unable to find where they are used. So it seems these two parameters do not actually control the size?
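As a rough illustration of the sizing arithmetic above, here is a minimal sketch. The helper names are hypothetical and not taken from the kernel. It assumes the worst case where every queued command carries the maximum transfer, and that hw_max_sectors counts blocks of hw_block_size, which is itself one of the open questions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Ring sizes as quoted above from target_core_user.c. */
#define CMDR_SIZE (16 * 4096)
#define DATA_SIZE (257 * 4096)

/* Compile-time check of the "64k" and "1028k" figures. */
static_assert(CMDR_SIZE == 64 * 1024, "cmd ring is 64 KiB");
static_assert(DATA_SIZE == 1028 * 1024, "data ring is 1028 KiB");

/*
 * Hypothetical helper: worst-case data bytes in flight if every one of
 * queue_depth commands transfers max_sectors blocks of block_size bytes.
 */
static inline uint64_t data_bytes_needed(uint64_t block_size,
                                         uint64_t max_sectors,
                                         uint64_t queue_depth)
{
    return block_size * max_sectors * queue_depth;
}

/*
 * Hypothetical helper: worst-case cmd ring bytes if every queued command
 * entry is as large as the largest one observed.
 */
static inline uint64_t cmd_bytes_needed(uint64_t max_cmd_size,
                                        uint64_t queue_depth)
{
    return max_cmd_size * queue_depth;
}

/* True if the worst-case requirement fits in a ring of ring_size bytes. */
static inline bool fits(uint64_t needed, uint64_t ring_size)
{
    return needed <= ring_size;
}
```

With the defaults quoted above, data_bytes_needed(512, 128, 128) is 8388608 bytes (8M), which does not fit in DATA_SIZE, while cmd_bytes_needed(336, 128) is 43008 bytes, which does fit in CMDR_SIZE.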
After TCMU is connected to userspace, I checked the device's properties, and they match the loopback driver's config exactly:

    static struct scsi_host_template tcm_loop_driver_template = {
        .show_info = tcm_loop_show_info,
        .proc_name = "tcm_loopback",
        .name = "TCM_Loopback",
        .queuecommand = tcm_loop_queuecommand,
        .change_queue_depth = scsi_change_queue_depth,
        .eh_abort_handler = tcm_loop_abort_task,
        .eh_device_reset_handler = tcm_loop_device_reset,
        .eh_target_reset_handler = tcm_loop_target_reset,
        .can_queue = 1024,
        .this_id = -1,
        .sg_tablesize = 256,
        .cmd_per_lun = 1024,
        .max_sectors = 0xFFFF,
        .use_clustering = DISABLE_CLUSTERING,
        .slave_alloc = tcm_loop_slave_alloc,
        .module = THIS_MODULE,
        .use_blk_tags = 1,
        .track_queue_depth = 1,
    };

The real queue depth is 1024, which means the cmd ring needs to be at least 344k (336 bytes * 1024), and max_sectors is ... 0xFFFF (64k), which can be in 512-byte units or in block size. With 1k commands in the queue, the TCMU ring buffer would need to be 64M * 512 (or block size).

So I tweaked the config of the loopback driver instead, and everything finally works as expected.

My questions are:

1. Are hw_max_sectors and hw_queue_depth meant to control the max_sectors and queue_depth of the device? If so, how are they supposed to work? And is there a way to configure them from userspace if they do work? It seems they would have to be validated so that they do not exceed the ring buffer size.

2. Or is it the loopback device that we should really configure from userspace? I can configure queue_depth from userspace through */sys/bus/scsi/devices/<SCSI device>/queue_depth*, though I don't know how to configure max_sectors yet. I could probably try "/sys/block/xxx/queue/max_sectors_kb", but I don't know if that is the expected way.

Please correct me if I got anything wrong. I would really like to get this piece of code working flawlessly, and I would be happy to contribute, but I am not sure what the correct way to fix it is.

Thanks in advance.
--Sheng