2012/2/23 Jeff Layton <jlayton@xxxxxxxxx>:
> On Thu, 23 Feb 2012 10:56:40 +0400
> Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:
>
>> 2012/2/22 Jeff Layton <jlayton@xxxxxxxxx>:
>> >
>> > Hmm...looking at my machine:
>> >
>> > (gdb) p sizeof(struct cifsLockInfo)
>> > $1 = 144
>> >
>> > So that's 144 bytes per struct...what's the upper limit on "count"?
>> >
>> > It seems like that could potentially be a large allocation, and we have
>> > to do this in the oplock break codepath.
>> >
>> > Instead of doing this, would it be better to allocate these when
>> > the VFS first calls ->lock, and attach them to a new member of
>> > file_lock.fl_u?
>>
>> We can have a different number of locks every time - so we can't
>> pre-allocate it.
>
> You could preallocate these:
>
> 1/ Do the allocation in cifs_setlk
> 2/ Attach it to a new fl_u member in the file_lock struct

I am not sure I got the idea. Do you mean to allocate the memory in
inode->i_flock?

> 3/ Add a fl_release_private op to free that object
>
> Of course, you will need to consider how to deal with lock splitting
> and merging if you do that...

Yes, how can we deal with this sort of thing? We could recalculate the
lock count after every cifs_setlk, but that's rather time-consuming.

If we use this struct:

struct lock_to_push {
	__u64 offset;
	__u64 length;
	__u32 pid;
	__u16 netfid;
	__u8 type;
};

its sizeof is 24 bytes. If we have, e.g., 10^5 locks, we end up with
2400000 bytes = 2.4 megabytes - I don't think this allocation plays any
big role in the total time needed to push 10^5 locks to the server.
What's the problem with allocating it here?

>
>> I can create a new struct for this only - without the extra
>> llist, blist, block_q fields - which can save enough memory.
>>
>
> That sounds preferable. We really don't want to allocate more memory
> than is needed. Plus, anytime you duplicate information at least one
> copy is going to be wrong at some point in time.
>
>> >
>> >>                 lock_flocks();
>> >>                 cifs_for_each_lock(cfile->dentry->d_inode, before) {
>> >> @@ -950,35 +962,38 @@ cifs_push_posix_locks(struct cifsFileInfo *cfile)
>> >>                         type = CIFS_RDLCK;
>> >>                 else
>> >>                         type = CIFS_WRLCK;
>> >> -
>> >> -               lck = cifs_lock_init(flock->fl_start, length, type,
>> >> -                                    cfile->netfid);
>> >> -               if (!lck) {
>> >> -                       rc = -ENOMEM;
>> >> -                       goto send_locks;
>> >> -               }
>> >> +               lck = &lcks[i];
>> >>                 lck->pid = flock->fl_pid;
>> >> -
>> >> -               list_add_tail(&lck->llist, &locks_to_send);
>> >> +               lck->netfid = cfile->netfid;
>> >> +               lck->length = length;
>> >> +               lck->type = type;
>> >> +               lck->offset = flock->fl_start;
>> >> +               i++;
>> >> +               if (i == count)
>> >> +                       break;
>> >>         }
>> >> -
>> >> -send_locks:
>> >>         unlock_flocks();
>> >>
>> >> -       list_for_each_entry_safe(lck, tmp, &locks_to_send, llist) {
>> >> +       if (i == count)
>> >> +               cERROR(1, "Can't push all brlocks!");
>> >> +
>> >
>> > This sort of thing always makes me a little worried. Sure, you're
>> > allocating a bunch of extra locks, but what's the potential effect
>> > of not being able to push all the locks here? The user is going to
>> > see this message and have no clue of what it means.
>> >
>> > Even I'm not sure what this means in practice. The VFS will have a
>> > record of this lock, but the server won't...
>> >
>>
>> I really think that there will be no such situation, because we hold
>> inode->lock_mutex - the locking state should stay the same.
>>
>
> I don't share that optimism.
>
> The lock_mutex protects the cinode->llist, but here you're walking the
> inode->i_flock list.
> It's certainly possible that it could change in
> between unlock_flocks and lock_flocks.
>

The lock_mutex now protects not only cinode->llist but also the locks
on the inode structure that are reached from CIFS code. The only thing
I missed is the posix_lock_file_wait call at the end of cifs_setlk - it
should be done under lock_mutex as well. Given that, there is no way to
increase the number of POSIX locks on the inode without taking
lock_mutex.

-- 
Best regards,
Pavel Shilovsky.
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
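
For reference, a minimal self-contained sketch of the size argument made
above. This is an assumed user-space translation, not the CIFS kernel
code: the __u64/__u32/__u16/__u8 fields of the proposed lock_to_push
struct are mapped to <stdint.h> equivalents, and the single calloc stands
in for the one-shot kernel allocation being discussed.

/*
 * Sketch to double-check the figures quoted in the thread: the proposed
 * struct should be 24 bytes per entry on a 64-bit build, so 10^5 locks
 * need roughly 2.4 MB in a single allocation.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct lock_to_push {
	uint64_t offset;	/* __u64 in the kernel struct */
	uint64_t length;	/* __u64 */
	uint32_t pid;		/* __u32 */
	uint16_t netfid;	/* __u16 */
	uint8_t  type;		/* __u8; struct padded to 8-byte alignment */
};

int main(void)
{
	unsigned long nlocks = 100000;	/* the 10^5 case from the thread */
	struct lock_to_push *lcks;

	printf("sizeof(struct lock_to_push) = %zu bytes\n", sizeof(*lcks));
	printf("%lu locks -> %lu bytes\n",
	       nlocks, (unsigned long)(nlocks * sizeof(*lcks)));

	/* one allocation for the whole array, mirroring the proposed patch */
	lcks = calloc(nlocks, sizeof(*lcks));
	if (!lcks)
		return 1;
	free(lcks);
	return 0;
}

On a typical 64-bit build this prints 24 bytes per entry and 2400000
bytes for 10^5 locks, matching the numbers given in the message.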