On Mon 12-04-10 18:08:10, Dmitry Monakhov wrote: > Jan Kara <jack@xxxxxxx> writes: > > >> Jan Kara <jack@xxxxxxx> writes: > >> > >> >> Eric Sandeen <sandeen@xxxxxxxxxx> writes: > >> >> > >> >> > Dmitry Monakhov wrote: > >> >> >> Eric Sandeen <sandeen@xxxxxxxxxx> writes: > >> >> >> > >> >> >>> Because we can badly over-reserve metadata when we > >> >> >>> calculate worst-case, it complicates things for quota, since > >> >> >>> we must reserve and then claim later, retry on EDQUOT, etc. > >> >> >>> Quota is also a generally smaller pool than fs free blocks, > >> >> >>> so this over-reservation hurts more, and more often. > >> >> >>> > >> >> >>> I'm of the opinion that it's not the worst thing to allow > >> >> >>> metadata to push a user slightly over quota. This simplifies > >> >> >>> the code and avoids the false quota rejections that result > >> >> >>> from worst-case speculation. > >> >> >> Hm.. Totally agree with issue description. And seem there is no another > >> >> >> solution except yours. > >> >> >> ASAIU alloc_nofail is called from places where it is impossible to fail > >> >> >> an allocation even if something goes wrong. > >> >> >> I ask because currently i'm working on EIO handling in alloc/free calls. > >> >> >> I've found that it is useless to fail claim/free procedures because > >> >> >> caller is unable to handle it properly. > >> >> >> It is impossible to fail following operation > >> >> >> ->writepage > >> >> >> ->dquot_claim_space (what to do if EIO happens?) > >> >> > > >> >> > Hm, if these start returning EIO then maybe my patch should be modified > >> >> > to treat EDQUOT differently than EIO ... assuming callers can handle > >> >> > the return at all. > >> >> > > >> >> > In other words, make NOFAIL really just mean "don't fail for EDQUOT" > >> >> Yes. agree So we have two types of errors > >> >> 1) expected errors: EDQUOT > >> >> 2) fatal errors: (EIO/ENOSPC/ENOMEM) > >> >> So we need two types of flags: > >> >> 1)FORCE (IMHO it is better name than you proposed) to allow exceed a > >> >> quota limit > >> >> 2)NOFAIL to allow ignore fatal errors. > >> >> > >> >> We still need NOFAIL, because for example if something is happens in > >> >> ->write_page() > >> >> ->dquot_claim() > >> >> update_quota() -> EIO /* update disk quota */ > >> >> update_bytes() /* update i_bytes count */ > >> >> It is obvious that write_page should fail because it is too late to > >> >> return the error to userspace, so data will probably lost which > >> >> is much more dramatic bug than quota inconsistency. > >> >> So the only options we have is to: > >> >> 1) Do not modify inode->i_bytes and return error which caller will > >> >> probably ignore. IMHO this is not good because result in > >> >> incorrect stat() > >> >> > >> >> 2) do as much as we can (as it happens for now), modify inode->i_bytes > >> >> and return positive error code to caller.(which signal what error > >> >> result in quota inconsystency only) > >> > Yes, agreed that 2) is a better solution. > >> > > >> >> This fatal errors handling logic i'll post on top of your patch-set. > >> >> But please change flag name from NOFAIL to FORCE. > >> > Hmm, do we really need to distinguish between your NOFAIL and FORCE? > >> > I mean there are places where we can handle quota failures (both EDQUOT > >> > or others) and places where we cannot and then we just want to go on as > >> > seamlessly as possible. So NOFAIL flag seems to be enough... > >> > Now I agree that in theory there can be some caller which might wish > >> > to seamlessly continue on EDQUOT and bail out on EIO but I'm not aware > >> > of such callsite currently so there's no immediate need for the flag. > >> > So Eric's patches seem to be fine to me as they are. What do you think? > >> FORCE but without NOFAIL will be useful for example in > >> write to fallocated area > >> extent split/convert (uninitialized=>initialized conversion) > >> It is not good idea to return EDQUOT from write to reserved area > >> due to metadata overhead, but it is easy to handle EIO from that method. > >> So IMHO two flags is not a fancy option, but reasonable design solution. > >> This design confirms to right for openvz's userbean counters. > >> The only thing i ask is to rename NOFAIL => FORCE. > > But as I wrote in my other email that callsite in ext4 really needs NOFAIL, > > not only FORCE as far as I can see. So Eric's patch is actually right, isn't > > it? > Ohh. Seems we are talking about the same thing but my explanation is > not clear (sorry for my crappy English) > The whole patch is definitely right. All places changed has NOFAIL > (it must succeed in 100% of cases)semantics. And it is reasonable to > accept it as is. > Later i'll add new FORCE flag and: > 1) convert all NOFAIL callers to (FORCE|NOFAIL) > 2) add new callers of FORCE. OK :) This is fine with me. I'd just slightly favour "NOLIMIT" instead of "FORCE" since it explains better what the flag does. Honza -- Jan Kara <jack@xxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html