Thanks for your feedback and we really value your opinion. Our project aims at finding rare data races quickly so diagnosing is not yet in our scope for now, but it definitely shouldn't be an excuse. We did some further analysis and we would like to report it as well. Hope this could be helpful. ------------------------------------------ Interleaving We observed two different interleavings of two instructions, as illustrated below. Interleaving 1 Writer Reader if (in_group_p(inode->i_gid)) // read value = 0 (initial value) inode->i_gid = attr->ia_gid // write value == 0xee00 ... if ((mask & ~mode & (MAY_READ | MAY_WRITE | MAY_EXEC)) == 0) return 0; Interleaving 2 Writer Reader inode->i_gid = attr->ia_gid // write value == 0xee00 if (in_group_p(inode->i_gid)) // read value = 0xee00 ... if ((mask & ~mode & (MAY_READ | MAY_WRITE | MAY_EXEC)) == 0) return 0; However, in both cases, the function acl_permission_check() returns will value 0, so we were not able to find any explicit impact it may have. If this attribute is lock-free semantically, then it should be fine. In fact, we also found another writer site that could race with the reader. We noticed that both two writers are protected so we were curious about the reader. ------------------------------------------ Another writer /tmp/tmp.B7zb7od2zE-5.3.11/extract/linux-5.3.11/fs/attr.c:185 165 * @inode: the inode to be updated 166 * @attr: the new attributes 167 * 168 * setattr_copy must be called with i_mutex held. 169 * 170 * setattr_copy updates the inode's metadata with that specified 171 * in attr. Noticeably missing is inode size update, which is more complex 172 * as it requires pagecache updates. 173 * 174 * The inode is not marked as dirty after this operation. The rationale is 175 * that for "simple" filesystems, the struct inode is the inode storage. 176 * The caller is free to mark the inode dirty afterwards if needed. 177 */ 178 void setattr_copy(struct inode *inode, const struct iattr *attr) 179 { 180 unsigned int ia_valid = attr->ia_valid; 181 182 if (ia_valid & ATTR_UID) 183 inode->i_uid = attr->ia_uid; 184 if (ia_valid & ATTR_GID) ==> 185 inode->i_gid = attr->ia_gid; 186 if (ia_valid & ATTR_ATIME) 187 inode->i_atime = timespec64_trunc(attr->ia_atime, 188 inode->i_sb->s_time_gran); 189 if (ia_valid & ATTR_MTIME) 190 inode->i_mtime = timespec64_trunc(attr->ia_mtime, 191 inode->i_sb->s_time_gran); 192 if (ia_valid & ATTR_CTIME) 193 inode->i_ctime = timespec64_trunc(attr->ia_ctime, 194 inode->i_sb->s_time_gran); 195 if (ia_valid & ATTR_MODE) { 196 umode_t mode = attr->ia_mode; 197 198 if (!in_group_p(inode->i_gid) && 199 !capable_wrt_inode_uidgid(inode, CAP_FSETID)) 200 mode &= ~S_ISGID; 201 inode->i_mode = mode; 202 } 203 } Thanks, Sishuai > On Nov 30, 2020, at 1:23 PM, Theodore Y. Ts'o <tytso@xxxxxxx> wrote: > > On Mon, Nov 30, 2020 at 05:55:20PM +0000, Gong, Sishuai wrote: >> Hi, >> >> We found a data race in linux kernel 5.3.11 that we are able to >> reproduce in x86 under specific interleavings. Currently, we are not >> sure about the consequence of this race so we would like to confirm >> with the community if this can be a harmful bug. > > What do you mean by "data race" in this context? If you have one > process setting a file's group id, while another process is trying to > open that same file and that open is depending on the process's group > access vs the file's group id, the open might succeed or fail > depending on whether the chgrp(2) is executed before or after the > open(2). > > I could see data race if in some other context of the file open, the > group id is read, and so the group id might be inconsistent during the > course of operating on a particular system call. In which case, we > might need to cache the group id value, or take some kind of lock, > etc. > > But if your automated tool can't distinguish whether or not this is > the case, I'll gently suggest that it's not particuarly useful.... > Maybe this is something that should be the subject of further > research? The whole point of automated tools, after all, is to save > developers' time. And not asking them to chase down potential > questions like "so is this real or not"? > > Cheers, > > - Ted > >> >> ------------------------------------------ >> Writer site >> >> /tmp/tmp.B7zb7od2zE-5.3.11/extract/linux-5.3.11/fs/ext4/inode.c:5599 >> 5579 error = PTR_ERR(handle); >> 5580 goto err_out; >> 5581 } >> 5582 >> 5583 /* dquot_transfer() calls back ext4_get_inode_usage() which >> 5584 * counts xattr inode references. >> 5585 */ >> 5586 down_read(&EXT4_I(inode)->xattr_sem); >> 5587 error = dquot_transfer(inode, attr); >> 5588 up_read(&EXT4_I(inode)->xattr_sem); >> 5589 >> 5590 if (error) { >> 5591 ext4_journal_stop(handle); >> 5592 return error; >> 5593 } >> 5594 /* Update corresponding info in inode so that everything is in >> 5595 * one transaction */ >> 5596 if (attr->ia_valid & ATTR_UID) >> 5597 inode->i_uid = attr->ia_uid; >> 5598 if (attr->ia_valid & ATTR_GID) >> ==> 5599 inode->i_gid = attr->ia_gid; >> 5600 error = ext4_mark_inode_dirty(handle, inode); >> 5601 ext4_journal_stop(handle); >> 5602 } >> 5603 >> 5604 if (attr->ia_valid & ATTR_SIZE) { >> 5605 handle_t *handle; >> 5606 loff_t oldsize = inode->i_size; >> 5607 int shrink = (attr->ia_size < inode->i_size); >> 5608 >> 5609 if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) { >> 5610 struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); >> 5611 >> 5612 if (attr->ia_size > sbi->s_bitmap_maxbytes) >> 5613 return -EFBIG; >> 5614 } >> 5615 if (!S_ISREG(inode->i_mode)) >> 5616 return -EINVAL; >> 5617 >> 5618 if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size) >> 5619 inode_inc_iversion(inode); >> >> ------------------------------------------ >> Reader site >> >> /tmp/tmp.B7zb7od2zE-5.3.11/extract/linux-5.3.11/fs/namei.c:306 >> 286 >> 287 return -EAGAIN; >> 288 } >> 289 >> 290 /* >> 291 * This does the basic permission checking >> 292 */ >> 293 static int acl_permission_check(struct inode *inode, int mask) >> 294 { >> 295 unsigned int mode = inode->i_mode; >> 296 >> 297 if (likely(uid_eq(current_fsuid(), inode->i_uid))) >> 298 mode >>= 6; >> 299 else { >> 300 if (IS_POSIXACL(inode) && (mode & S_IRWXG)) { >> 301 int error = check_acl(inode, mask); >> 302 if (error != -EAGAIN) >> 303 return error; >> 304 } >> 305 >> ==> 306 if (in_group_p(inode->i_gid)) >> 307 mode >>= 3; >> 308 } >> 309 >> 310 /* >> 311 * If the DACs are ok we don't need any capability check. >> 312 */ >> 313 if ((mask & ~mode & (MAY_READ | MAY_WRITE | MAY_EXEC)) == 0) >> 314 return 0; >> 315 return -EACCES; >> 316 } >> 317 >> ------------------------------------------ >> Writer calling trace >> >> - do_fchownat >> -- chown_common >> --- notify_change >> >> ------------------------------------------ >> Reader calling trace >> >> - do_execve >> -- __do_execve_file >> --- do_open_execat >> ---- do_filp_open >> ----- path_openat >> ------ link_path_walk >> ------- inode_permission >> -------- generic_permission >> >> >> >> Thanks, >> Sishuai >>