On Thu, May 26, 2022 at 12:30 PM Dan Carpenter <dan.carpenter@xxxxxxxxxx> wrote: > > Hello Oded Gabbay, > > The patch ac0ae6a96aa5: "habanalabs: add gaudi asic-dependent code" > from May 11, 2020, leads to the following Smatch static checker > warning: > > drivers/misc/habanalabs/gaudi/gaudi.c:5568 gaudi_parse_cb_mmu() > error: 'parser->user_cb_size' from user is not capped properly > > drivers/misc/habanalabs/gaudi/gaudi.c > 5526 static int gaudi_parse_cb_mmu(struct hl_device *hdev, > 5527 struct hl_cs_parser *parser) > 5528 { > 5529 u64 handle; > 5530 u32 patched_cb_size; > 5531 struct hl_cb *user_cb; > 5532 int rc; > 5533 > 5534 /* > 5535 * The new CB should have space at the end for two MSG_PROT pkt: > 5536 * 1. A packet that will act as a completion packet > 5537 * 2. A packet that will generate MSI interrupt > 5538 */ > 5539 if (parser->completion) > 5540 parser->patched_cb_size = parser->user_cb_size + > 5541 sizeof(struct packet_msg_prot) * 2; > 5542 else > 5543 parser->patched_cb_size = parser->user_cb_size; > 5544 > 5545 rc = hl_cb_create(hdev, &hdev->kernel_mem_mgr, hdev->kernel_ctx, > 5546 parser->patched_cb_size, false, false, > 5547 &handle); > 5548 > 5549 if (rc) { > 5550 dev_err(hdev->dev, > 5551 "Failed to allocate patched CB for DMA CS %d\n", > 5552 rc); > 5553 return rc; > 5554 } > 5555 > 5556 parser->patched_cb = hl_cb_get(&hdev->kernel_mem_mgr, handle); > 5557 /* hl_cb_get should never fail */ > 5558 if (!parser->patched_cb) { > 5559 dev_crit(hdev->dev, "DMA CB handle invalid 0x%llx\n", handle); > 5560 rc = -EFAULT; > 5561 goto out; > 5562 } > 5563 > 5564 /* > 5565 * The check that parser->user_cb_size <= parser->user_cb->size was done > 5566 * in validate_queue_index(). > > We are looking at cs_ioctl_default(). > > This comment is wrong. There is no check for this in validate_queue_index(). > There is a check in get_cb_from_cs_chunk() but that function is only > called when is_kernel_allocated_cb is true. > > I feel like we should check if "chunk->cb_size > cb->size" on all user > input. I think it is required. But even if it's not, it would make the > code easier for Smatch to understand. Hi Dan, The code is indeed confusing due to the move between common and asic-specific code. But actually we are protected, because you will never reach gaudi_parse_cb_mmu() if the CB was not allocated by the kernel. The reason is because for Gaudi, we only parse CBs that go to what we call "External Queues", which are assigned the queue type QUEUE_TYPE_EXT. If you will look at validate_queue_index(), it assigns the is_kernel_allocated_cb property based on the queue type. The logic there is a bit complex because it is dependent on the queue properties, but what happens is that if the user submitted a work for an external queue in Gaudi, is_kernel_allocated_cb will be assigned "true" in this function. And that will cause the get_cb_from_cs_chunk() to check the size. And in gaudi_cs_parser(), we only call gaudi_parse_cb_mmu() if it is an external queue (In Gaudi a queue is either QUEUE_TYPE_EXT or QUEUE_TYPE_INT). So the comment is wrong and I will fix it. But because this is the data-plane, I prefer not to add an additional check. I hope this makes sense. Thanks, Oded > > 5567 */ > --> 5568 memcpy(parser->patched_cb->kernel_address, > 5569 parser->user_cb->kernel_address, > 5570 parser->user_cb_size); > ^^^^^^^^^^^^^^^^^^^^ > Otherwise *boom* user controlled buffer overflow. > > 5571 > 5572 patched_cb_size = parser->patched_cb_size; > 5573 > 5574 /* Validate patched CB instead of user CB */ > 5575 user_cb = parser->user_cb; > 5576 parser->user_cb = parser->patched_cb; > 5577 rc = gaudi_validate_cb(hdev, parser, true); > 5578 parser->user_cb = user_cb; > 5579 > 5580 if (rc) { > 5581 hl_cb_put(parser->patched_cb); > 5582 goto out; > 5583 } > 5584 > 5585 if (patched_cb_size != parser->patched_cb_size) { > 5586 dev_err(hdev->dev, "user CB size mismatch\n"); > 5587 hl_cb_put(parser->patched_cb); > 5588 rc = -EINVAL; > 5589 goto out; > 5590 } > 5591 > 5592 out: > 5593 /* > 5594 * Always call cb destroy here because we still have 1 reference > 5595 * to it by calling cb_get earlier. After the job will be completed, > 5596 * cb_put will release it, but here we want to remove it from the > 5597 * idr > 5598 */ > 5599 hl_cb_destroy(&hdev->kernel_mem_mgr, handle); > 5600 > 5601 return rc; > 5602 } > > regards, > dan carpenter