On Sat, Aug 1, 2020 at 1:42 PM David Wysochanski <dwysocha@xxxxxxxxxx> wrote:
>
> On Thu, Jul 30, 2020 at 5:07 PM David Wysochanski <dwysocha@xxxxxxxxxx> wrote:
> >
> > On Thu, Jul 30, 2020 at 4:03 PM David Howells <dhowells@xxxxxxxxxx> wrote:
> > >
> > > David Wysochanski <dwysocha@xxxxxxxxxx> wrote:
> > >
> > > > To be honest I'm not sure about needing a call to fscache_use/unuse_cookie()
> > > > around the call to fscache_resize_cookie(). If the cookie has a
> > > > refcount of 1 when it is created, and a file is never opened, so
> > > > we never call fscache_use_cookie(), what might happen inside
> > > > fscache_resize_cookie()? The header on use_cookie() says
> > >
> > > I've have afs_setattr() doing use/unuse on the cookie around resize.
> > >
> > > David
> > >
> >
> > Got it and will be fixed in next series. Thanks!
>
> I am getting a reproducible use-after-free panic now. The panic
> manifests itself as a random backtrace but
> kasan report is below.
>
> Here is the patch I tried:
> https://github.com/DaveWysochanskiRH/kernel/commit/2c9e6e3f14380e76fd8cb0232c6b7dbab14f26a2
>
> Without that patch generic/014 passes as does most other xfstest
> generic tests, only 2 tests are failing now.
>
> I added kasan and got the below report:
> f32-node1 login: [ 116.724496] FS-Cache: Netfs 'nfs' registered for caching
> [ 117.567384] Key type dns_resolver registered
> [ 118.465342] NFS: Registering the id_resolver key type
> [ 118.474332] Key type id_resolver registered
> [ 118.476319] Key type id_legacy registered
> [ 119.370158] run fstests generic/014 at 2020-08-01 13:27:08
> [ 121.548415] ==================================================================
> [ 121.553037] BUG: KASAN: slab-out-of-bounds in
> cachefiles_shorten_content_map+0x257/0x280 [cachefiles]
> [ 121.556576] Read of size 1 at addr ffff8881db88e7c9 by task truncfile/5675
> [ 121.559207]
> [ 121.559861] CPU: 1 PID: 5675 Comm: truncfile Kdump: loaded Not
> tainted 5.8.0-rc3-d9c7f5201a4f-kasan+ #3
> [ 121.563505] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> [ 121.565780] Call Trace:
> [ 121.566905] dump_stack+0x91/0xc8
> [ 121.568301] print_address_description.constprop.0+0x1a/0x210
> [ 121.570616] ? _raw_spin_lock_irqsave+0x7d/0xc0
> [ 121.572429] ? _raw_write_unlock_bh+0x60/0x60
> [ 121.574143] ? cachefiles_shorten_content_map+0x257/0x280 [cachefiles]
> [ 121.576717] kasan_report.cold+0x37/0x7c
> [ 121.578328] ? __fscache_init_io_request+0x140/0x160 [fscache]
> [ 121.580625] ? cachefiles_shorten_content_map+0x257/0x280 [cachefiles]
> [ 121.583191] cachefiles_shorten_content_map+0x257/0x280 [cachefiles]

FWIW, the alleged erroneous access

[root@f32-node1 kernel]# eu-addr2line -e ./fs/cachefiles/cachefiles.ko cachefiles_shorten_content_map+0x257
fs/cachefiles/content-map.c:362:30

295 /*
296  * Expand the content map to a larger file size.
297  */
298 void cachefiles_expand_content_map(struct cachefiles_object *object, loff_t i_size)
299 {
300 	size_t size;
301 	u8 *map, *zap;
302
303 	size = cachefiles_map_size(i_size);
304
305 	_enter("%llx,%zx,%x", i_size, size, object->content_map_size);
306
307 	if (size <= object->content_map_size)
308 		return;
309
310 	map = kzalloc(size, GFP_KERNEL);
311 	if (!map)
312 		return;
313
314 	write_lock_bh(&object->content_map_lock);
315 	if (size > object->content_map_size) {
316 		zap = object->content_map;
317 		memcpy(map, zap, object->content_map_size);
318 		object->content_map = map;
319 		object->content_map_size = size;
320 	} else {
321 		zap = map;
322 	}
323 	write_unlock_bh(&object->content_map_lock);
324
325 	kfree(zap);
326 }
327
328 /*
329  * Adjust the content map when we shorten a backing object.
330  *
331  * We need to unmark any granules that are going to be discarded.
332  */
333 void cachefiles_shorten_content_map(struct cachefiles_object *object,
334 				    loff_t new_size)
335 {
336 	struct fscache_cookie *cookie = object->fscache.cookie;
337 	ssize_t granules_needed, bits_needed, bytes_needed;
338
339 	if (object->fscache.cookie->advice & FSCACHE_ADV_SINGLE_CHUNK)
340 		return;
341
342 	write_lock_bh(&object->content_map_lock);
343
344 	if (object->content_info == CACHEFILES_CONTENT_MAP) {
345 		if (cookie->zero_point > new_size)
346 			cookie->zero_point = new_size;
347
348 		granules_needed = new_size;
349 		granules_needed += CACHEFILES_GRAN_SIZE - 1;
350 		granules_needed /= CACHEFILES_GRAN_SIZE;
351 		bits_needed = round_up(granules_needed, 8);
352 		bytes_needed = bits_needed / 8;
353
354 		if (bytes_needed < object->content_map_size)
355 			memset(object->content_map + bytes_needed, 0,
356 			       object->content_map_size - bytes_needed);
357
358 		if (bits_needed > granules_needed) {
359 			size_t byte = (granules_needed - 1) / 8;
360 			unsigned int shift = granules_needed % 8;
361 			unsigned int mask = (1 << shift) - 1;
362 			object->content_map[byte] &= mask; /* KASAN access is here */
363 		}
364 	}
365
366 	write_unlock_bh(&object->content_map_lock);
367 }

> [ 121.585686] cachefiles_resize_object+0xc8/0x160 [cachefiles]
> [ 121.587946] __fscache_resize_cookie+0x10c/0x320 [fscache]
> [ 121.590296] nfs_setattr_update_inode+0x910/0xdf0 [nfs]
> [ 121.592407] nfs4_proc_setattr+0x352/0x450 [nfsv4]
> [ 121.594321] nfs_setattr+0x2f0/0x690 [nfs]
> [ 121.595962] notify_change+0x760/0xd50
> [ 121.597455] ? __down_timeout+0x20/0x20
> [ 121.598969] do_truncate+0xde/0x170
> [ 121.600362] ? file_open_root+0x1d0/0x1d0
> [ 121.601949] do_sys_ftruncate+0x1e5/0x2d0
> [ 121.603551] do_syscall_64+0x4d/0x90
> [ 121.604970] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 121.606941] RIP: 0033:0x7f45ace69bfb
> [ 121.608346] Code: Bad RIP value.
> [ 121.609615] RSP: 002b:00007ffd1a514988 EFLAGS: 00000202 ORIG_RAX:
> 000000000000004d
> [ 121.612539] RAX: ffffffffffffffda RBX: 000000000920470b RCX: 00007f45ace69bfb
> [ 121.615298] RDX: 000000000920470b RSI: 000000000920470b RDI: 0000000000000003
> [ 121.618033] RBP: 0000000000000003 R08: 000000000000005b R09: 00007f45acf32a40
> [ 121.620766] R10: fffffffffffff115 R11: 0000000000000202 R12: 000000005f25a5ed
> [ 121.623503] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [ 121.626229]
> [ 121.626852] Allocated by task 5675:
> [ 121.628252] save_stack+0x1b/0x40
> [ 121.629587] __kasan_kmalloc.constprop.0+0xc2/0xd0
> [ 121.631463] cachefiles_expand_content_map+0x70/0x1b0 [cachefiles]
> [ 121.633856] cachefiles_shape_request+0x356/0x910 [cachefiles]
> [ 121.636119] __fscache_shape_request+0xa1/0x180 [fscache]
> [ 121.638211] fscache_read_helper+0x1e9/0x2200 [fscache]
> [ 121.640263] fscache_read_helper_locked_page+0x6c/0x80 [fscache]
> [ 121.642625] __nfs_readpage_from_fscache+0x138/0x4a0 [nfs]
> [ 121.644768] nfs_readpage+0x651/0x970 [nfs]
> [ 121.646431] nfs_write_begin+0x3ff/0x960 [nfs]
> [ 121.648212] generic_perform_write+0x1b5/0x3e0
> [ 121.649960] nfs_file_write+0x36a/0x710 [nfs]
> [ 121.651679] new_sync_write+0x361/0x5e0
> [ 121.653201] vfs_write+0x14e/0x440
> [ 121.654536] ksys_write+0xdd/0x1a0
> [ 121.655889] do_syscall_64+0x4d/0x90
> [ 121.657297] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 121.659253]
> [ 121.659853] Freed by task 59:
> [ 121.661016] save_stack+0x1b/0x40
> [ 121.662328] __kasan_slab_free+0x12d/0x170
> [ 121.663924] slab_free_freelist_hook+0x66/0x110
> [ 121.665684] kfree+0xa5/0x210
> [ 121.666885] process_one_work+0x64d/0x1030
> [ 121.668503] worker_thread+0x562/0xf50
> [ 121.669973] kthread+0x326/0x3f0
> [ 121.671292] ret_from_fork+0x22/0x30
> [ 121.672682]
> [ 121.673291] The buggy address belongs to the object at ffff8881db88e780
> [ 121.673291] which belongs to the cache kmalloc-64 of size 64
> [ 121.677970] The buggy address is located 9 bytes to the right of
> [ 121.677970] 64-byte region [ffff8881db88e780, ffff8881db88e7c0)
> [ 121.682577] The buggy address belongs to the page:
> [ 121.684492] page:ffffea00076e2380 refcount:1 mapcount:0
> mapping:0000000000000000 index:0x0
> [ 121.687686] flags: 0x17ffffc0000200(slab)
> [ 121.689262] raw: 0017ffffc0000200 dead000000000100 dead000000000122
> ffff8881e8c0f600
> [ 121.692218] raw: 0000000000000000 0000000000200020 00000001ffffffff
> 0000000000000000
> [ 121.695181] page dumped because: kasan: bad access detected
> [ 121.697342]
> [ 121.697961] Memory state around the buggy address:
> [ 121.699825] ffff8881db88e680: 00 00 00 00 00 00 00 fc fc fc fc fc
> fc fc fc fc
> [ 121.702608] ffff8881db88e700: 00 00 00 00 00 00 fc fc fc fc fc fc
> fc fc fc fc
> [ 121.705391] >ffff8881db88e780: 00 00 00 00 00 00 00 00 fc fc fc fc
> fc fc fc fc
> [ 121.708160] ^
> [ 121.710337] ffff8881db88e800: 00 00 00 00 00 04 fc fc fc fc fc fc
> fc fc fc fc
> [ 121.713116] ffff8881db88e880: fb fb fb fb fb fb fb fb fc fc fc fc
> fc fc fc fc
> [ 121.715915] ==================================================================
> [ 121.718695] Disabling lock debugging due to kernel taint

--
Linux-cachefs mailing list
Linux-cachefs@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cachefs