Hi Brian, On Wed, Sep 02, 2020 at 01:38:59PM -0400, Brian Foster wrote: > On Wed, Sep 02, 2020 at 10:19:23PM +0800, Gao Xiang wrote: ... > > However, each log record could still have crafted > > h_len and cause log record buffer overrun. So let's > > check h_len for each log record as well instead. > > > > Is this something you've observed or attempted to reproduce, or is this > based on code inspection? Thanks for your review. based on code inspection, the logic seems straight-forward in xlog_do_recovery_pass() ... dbp = xlog_alloc_buffer(log, BTOBB(h_size)); ^ here uses h_size from the tail block if (!dbp) { kmem_free(hbp); return -ENOMEM; } if (tail_blk > head_blk) { while (blk_no < log->l_logBBsize) { xlog_bread xlog_valid_rec_header xlog_recover_process } } while (blk_no < head_blk) { xlog_bread xlog_valid_rec_header xlog_recover_process } in xlog_recover_process() crc = xlog_cksum(log, rhead, dp, be32_to_cpu(rhead->h_len)); ^here ... and also xlog_recover_process_data() end = dp + be32_to_cpu(rhead->h_len); ... while ((dp < end) && num_logops) { ohead = (struct xlog_op_header *)dp; (all things around dp/ohead if num_logops is crafted as well. ... } > > > - if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0 || hlen > INT_MAX)) > > + if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0)) > > Why is the second part of the check removed? if hlen <= hsize (hsize > 0) then hlen will be <= INT_MAX > > > return -EFSCORRUPTED; > > + > > + if (hsize && XFS_IS_CORRUPT(log->l_mp, > > + hsize < be32_to_cpu(rhead->h_size))) > > + return -EFSCORRUPTED; > > + hsize = be32_to_cpu(rhead->h_size); > > I'm a little confused why we take hsize as a parameter as well as read > it from the record header. If we're validating a particular record, > shouldn't we use the size as specified by that record? > > Also FWIW I think pulling bits of logic out of the XFS_IS_CORRUPT() > check makes this a little harder to read than just putting the entire > logic statement within the macro. It seems that is partially self-answered in the last part of the email. So move the response to the last of the email... > > > + > > + if (unlikely(hlen > hsize)) { > > I think we've made a point to avoid the [un]likely() modifiers in XFS as > they don't usually have a noticeable impact. I certainly wouldn't expect > it to in log recovery. Honestly, I really don't want to work on some topic about [un]likely, I did a long discussion with Dan Carpenter and a couple of other people last year, but *shrug* For this case just simply bacause XFS_IS_CORRUPT() has this annotation, and it seems xlog_valid_rec_header() logic will be changed in v3 if we leave the mkfs workaround logic as is. > > > + if (XFS_IS_CORRUPT(log->l_mp, hlen > log->l_mp->m_logbsize || > > + rhead->h_num_logops != cpu_to_be32(1))) > > + return -EFSCORRUPTED; > > + > > + xfs_warn(log->l_mp, > > + "invalid iclog size (%d bytes), using lsunit (%d bytes)", > > + hsize, log->l_mp->m_logbsize); > > + rhead->h_size = cpu_to_be32(log->l_mp->m_logbsize); > > I also find updating the header structure as such down in a "validation > helper" a bit obscured. also the same words at the last of the email... > > > + } > > + > > if (XFS_IS_CORRUPT(log->l_mp, > > blkno > log->l_logBBsize || blkno > INT_MAX)) > > return -EFSCORRUPTED; > ... > > @@ -3096,7 +3100,7 @@ xlog_do_recovery_pass( > > } > > rhead = (xlog_rec_header_t *)offset; > > error = xlog_valid_rec_header(log, rhead, > > - split_hblks ? blk_no : 0); > > + split_hblks ? blk_no : 0, h_size); > > if (error) > > goto bread_err2; > > > > @@ -3177,7 +3181,7 @@ xlog_do_recovery_pass( > > goto bread_err2; > > > > rhead = (xlog_rec_header_t *)offset; > > - error = xlog_valid_rec_header(log, rhead, blk_no); > > + error = xlog_valid_rec_header(log, rhead, blk_no, h_size); > > if (error) > > goto bread_err2; > > In these two cases we've already allocated the record header and data > buffers and we're walking through the log records doing recovery. Given > that, it seems like the purpose of the parameter is more to check the > subsequent records against the size of the current record buffer. That > seems like a reasonable check to incorporate, but I think the mkfs Yes > workaround logic is misplaced in a generic record validation helper. > IIUC that is a very special case that should only apply to the first > record in the log and only impacts the size of the buffer we allocate to > read in the remaining records. > > Can we rework this to leave the mkfs workaround logic as is and update > the validation helper to check that each record length fits in the size > of the buffer we've decided to allocate? I'd also suggest to rename the > new parameter to something like 'bufsize' instead of 'h_size' to clarify > what it actually means in the context of xlog_valid_rec_header(). Ok, that is fine. I will leave the mkfs workaround logic as is and rename to bufsize. Thanks, Gao Xiang > > Brian > > > > > -- > > 2.18.1 > > >