On 1/4/21 10:38 PM, Darrick J. Wong wrote:
On Fri, Dec 18, 2020 at 12:29:10AM -0700, Allison Henderson wrote:
Because xattrs can be over a page in size, we need to handle possible
krealloc errors to avoid warnings
Which warnings?
Sorry, I should have included it here. The warning is:
[ +0.000016] WARNING: CPU: 1 PID: 20255 at mm/page_alloc.c:3446
get_page_from_freelist+0x100b/0x1690
and if we look at that line number we have this snippet:
/*
* We most definitely don't want callers attempting to
* allocate greater than order-1 page units with __GFP_NOFAIL.
*/
WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
Signed-off-by: Allison Henderson <allison.henderson@xxxxxxxxxx>
---
fs/xfs/xfs_log_recover.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 97f3130..295a5c6 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2061,7 +2061,10 @@ xlog_recover_add_to_cont_trans(
old_ptr = item->ri_buf[item->ri_cnt-1].i_addr;
old_len = item->ri_buf[item->ri_cnt-1].i_len;
- ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL | __GFP_NOFAIL);
+ ptr = krealloc(old_ptr, len + old_len, GFP_KERNEL);
Does the removal of NOFAIL increase the likelihood that log recovery
will fail instead of looping around looking for more memory?
I suppose it would? But better to return the error code than proceed
with a NULL pointer. I would think it would be quickly proceeded with
questions of what else is causing memory pressure to build though.
Hm, what /are/ we doing here, anyway? I guess someone logged a gigantic
xattri item, which gets split across multiple log records, and now we're
trying to staple all that back together? And perhaps the xattri item is
larger than a ... page(?) which causes dmesg warnings when combined with
NOFAIL?
Effectively yes, this is coming from one of the new test cases I came up
with to test the replay. It progressively sets larger and larger attrs
and pulls the error tag to see that it replays correctly. Up to 64k
which I think is where ATTR_MAX_VALUELEN is. I figured since we are
opening up a means of logging as much, its something that we should be
testing. :-)
Allison
--D
+ if (ptr == NULL)
+ return -ENOMEM;
+
memcpy(&ptr[old_len], dp, len);
item->ri_buf[item->ri_cnt-1].i_len += len;
item->ri_buf[item->ri_cnt-1].i_addr = ptr;
--
2.7.4