On Tue, Feb 8, 2011 at 10:54 AM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> On Tue, Feb 8, 2011 at 12:24 AM, Ted Ts'o <tytso@xxxxxxx> wrote:
>> On Mon, Feb 07, 2011 at 10:59:54PM +0200, Amir Goldstein wrote:
>>> it says alloc_sem protects against lazy init of adjacent groups
>>> and says nothing about protecting block group specific data structures...
>>>
>>> what am I missing???
>>
>> You're missing ext4_mb_load_buddy(), which takes grp->alloc_sem, and
>> which is released by ext4_mb_unload_buddy().  No, it's not the most
>> obvious code in the world...
>>
>
> OK Ted, you leave me no choice... I need to paste the code of mb_load_buddy():
>
> 1157         e4b->alloc_semp = &grp->alloc_sem;
> 1158
> 1159         /* Take the read lock on the group alloc
> 1160          * sem. This would make sure a parallel
> 1161          * ext4_mb_init_group happening on other
> 1162          * groups mapped by the page is blocked
> 1163          * till we are done with allocation
> 1164          */
> 1165 repeat_load_buddy:
> 1166         down_read(e4b->alloc_semp);
> 1167
> 1168         if (unlikely(EXT4_MB_GRP_NEED_INIT(grp))) {
> 1169                 /* we need to check for group need init flag
> 1170                  * with alloc_semp held so that we can be sure
> 1171                  * that new blocks didn't get added to the group
> 1172                  * when we are loading the buddy cache
> 1173                  */
> 1174                 up_read(e4b->alloc_semp);
> 1175                 /*
> 1176                  * we need full data about the group
> 1177                  * to make a good selection
> 1178                  */
> 1179                 ret = ext4_mb_init_group(sb, group);
> 1180                 if (ret)
> 1181                         return ret;
> 1182                 goto repeat_load_buddy;
> 1183         }
> 1184
>
> ext4_mb_load_buddy() *only* takes down_read(grp->alloc_sem),
> except for the first time after mount, in which ext4_mb_init_group() takes
> down_write(grp->alloc_sem), releases it, and then repeat_load_buddy label
> will re-take down_read(grp->alloc_sem).
>
> Essentially, this means that after time Ti(group), all users take only read
> access to grp->alloc_sem, which is kind of futile...
>
> Your statement that alloc_sem is needed certainly makes sense, but I just don't
> see it in the code.
> As un-obvious as the code may be, you cannot protect data structures
> without anyone taking write access to the semaphore on allocation routines.
> Also, I believe that buddy data structures are modified in
> ext4_mb_generate_buddy()
> under the protection of ext4_lock_group().
>
> So at the risk of having to buy you a beer on LFS I will repeat my
> nagging question:
> What am I missing???
>

I found out what I was missing (it's in the comment at line 1169 above).
I wrongly assumed that EXT4_GROUP_INFO_NEED_INIT_BIT is set only once in the
lifetime of an ext4_group_info. It may also be set when blocks are added to
an existing group from ext4_group_extend().

Still, I think that the use cases in which down_read(alloc_sem) is needed
are very unlikely() and can be covered with the following check:

        if (blocks_per_page > 2 || group == sbi->s_groups_count - 1)
                /* Synchronize init of adjacent group and adding of blocks to last group */
                e4b->alloc_semp = &grp->alloc_sem;
        else
                e4b->alloc_semp = NULL;

I will post a patch for review.

Amir.
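
P.S. To make the proposed condition easier to reason about, here is a
standalone userspace sketch of just the predicate. It is not the patch
itself: needs_alloc_sem() and the groups_count parameter (standing in for
sbi->s_groups_count) are hypothetical names made up for illustration, and
the comment on the first branch assumes that more than two blocks per page
means the buddy cache page also maps at least one other group.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustration only: this helper does not exist in fs/ext4/mballoc.c.
 * It mirrors the condition proposed above as a pure function so that
 * the individual cases can be checked in isolation.
 */
static bool needs_alloc_sem(unsigned int blocks_per_page,
                            unsigned int group,
                            unsigned int groups_count)
{
    /* Sanity check on the arguments of this sketch. */
    assert(groups_count > 0 && group < groups_count);

    /*
     * Assumption: with more than two blocks per page, the page also
     * maps another group, so a parallel lazy init of that neighbour
     * has to be excluded.
     */
    if (blocks_per_page > 2)
        return true;

    /* The last group is the one that can gain blocks on extend. */
    return group == groups_count - 1;
}
```

With these assumptions, needs_alloc_sem(1, 0, 16) is false, because the
page maps a single group and it is not the last one, while
needs_alloc_sem(1, 15, 16) and needs_alloc_sem(4, 0, 16) are both true.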