Jan Kara wrote: > On Mon 12-04-10 18:08:10, Dmitry Monakhov wrote: >> Jan Kara <jack@xxxxxxx> writes: >> >>>> Jan Kara <jack@xxxxxxx> writes: >>>> >>>>>> Eric Sandeen <sandeen@xxxxxxxxxx> writes: >>>>>> >>>>>>> Dmitry Monakhov wrote: >>>>>>>> Eric Sandeen <sandeen@xxxxxxxxxx> writes: >>>>>>>> >>>>>>>>> Because we can badly over-reserve metadata when we >>>>>>>>> calculate worst-case, it complicates things for quota, since >>>>>>>>> we must reserve and then claim later, retry on EDQUOT, etc. >>>>>>>>> Quota is also a generally smaller pool than fs free blocks, >>>>>>>>> so this over-reservation hurts more, and more often. >>>>>>>>> >>>>>>>>> I'm of the opinion that it's not the worst thing to allow >>>>>>>>> metadata to push a user slightly over quota. This simplifies >>>>>>>>> the code and avoids the false quota rejections that result >>>>>>>>> from worst-case speculation. >>>>>>>> Hm.. Totally agree with issue description. And seem there is no another >>>>>>>> solution except yours. >>>>>>>> ASAIU alloc_nofail is called from places where it is impossible to fail >>>>>>>> an allocation even if something goes wrong. >>>>>>>> I ask because currently i'm working on EIO handling in alloc/free calls. >>>>>>>> I've found that it is useless to fail claim/free procedures because >>>>>>>> caller is unable to handle it properly. >>>>>>>> It is impossible to fail following operation >>>>>>>> ->writepage >>>>>>>> ->dquot_claim_space (what to do if EIO happens?) >>>>>>> Hm, if these start returning EIO then maybe my patch should be modified >>>>>>> to treat EDQUOT differently than EIO ... assuming callers can handle >>>>>>> the return at all. >>>>>>> >>>>>>> In other words, make NOFAIL really just mean "don't fail for EDQUOT" >>>>>> Yes. agree So we have two types of errors >>>>>> 1) expected errors: EDQUOT >>>>>> 2) fatal errors: (EIO/ENOSPC/ENOMEM) >>>>>> So we need two types of flags: >>>>>> 1)FORCE (IMHO it is better name than you proposed) to allow exceed a >>>>>> quota limit >>>>>> 2)NOFAIL to allow ignore fatal errors. >>>>>> >>>>>> We still need NOFAIL, because for example if something is happens in >>>>>> ->write_page() >>>>>> ->dquot_claim() >>>>>> update_quota() -> EIO /* update disk quota */ >>>>>> update_bytes() /* update i_bytes count */ >>>>>> It is obvious that write_page should fail because it is too late to >>>>>> return the error to userspace, so data will probably lost which >>>>>> is much more dramatic bug than quota inconsistency. >>>>>> So the only options we have is to: >>>>>> 1) Do not modify inode->i_bytes and return error which caller will >>>>>> probably ignore. IMHO this is not good because result in >>>>>> incorrect stat() >>>>>> >>>>>> 2) do as much as we can (as it happens for now), modify inode->i_bytes >>>>>> and return positive error code to caller.(which signal what error >>>>>> result in quota inconsystency only) >>>>> Yes, agreed that 2) is a better solution. >>>>> >>>>>> This fatal errors handling logic i'll post on top of your patch-set. >>>>>> But please change flag name from NOFAIL to FORCE. >>>>> Hmm, do we really need to distinguish between your NOFAIL and FORCE? >>>>> I mean there are places where we can handle quota failures (both EDQUOT >>>>> or others) and places where we cannot and then we just want to go on as >>>>> seamlessly as possible. So NOFAIL flag seems to be enough... >>>>> Now I agree that in theory there can be some caller which might wish >>>>> to seamlessly continue on EDQUOT and bail out on EIO but I'm not aware >>>>> of such callsite currently so there's no immediate need for the flag. >>>>> So Eric's patches seem to be fine to me as they are. What do you think? >>>> FORCE but without NOFAIL will be useful for example in >>>> write to fallocated area >>>> extent split/convert (uninitialized=>initialized conversion) >>>> It is not good idea to return EDQUOT from write to reserved area >>>> due to metadata overhead, but it is easy to handle EIO from that method. >>>> So IMHO two flags is not a fancy option, but reasonable design solution. >>>> This design confirms to right for openvz's userbean counters. >>>> The only thing i ask is to rename NOFAIL => FORCE. >>> But as I wrote in my other email that callsite in ext4 really needs NOFAIL, >>> not only FORCE as far as I can see. So Eric's patch is actually right, isn't >>> it? >> Ohh. Seems we are talking about the same thing but my explanation is >> not clear (sorry for my crappy English) >> The whole patch is definitely right. All places changed has NOFAIL >> (it must succeed in 100% of cases)semantics. And it is reasonable to >> accept it as is. >> Later i'll add new FORCE flag and: >> 1) convert all NOFAIL callers to (FORCE|NOFAIL) >> 2) add new callers of FORCE. > OK :) This is fine with me. I'd just slightly favour "NOLIMIT" instead > of "FORCE" since it explains better what the flag does. Chiming in at the end of a long thread, I'll just say I like NOLIMIT better, too :) -Eric > Honza -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html