On Fri, 2007-08-03 at 17:49 +0400, Evgeniy Polyakov wrote: > On Fri, Aug 03, 2007 at 02:27:52PM +0200, Peter Zijlstra (peterz@xxxxxxxxxxxxx) wrote: > > On Fri, 2007-08-03 at 14:57 +0400, Evgeniy Polyakov wrote: > > > > > For receiving situation is worse, since system does not know in advance > > > to which socket given packet will belong to, so it must allocate from > > > global pool (and thus there must be independent global reserve), and > > > then exchange part of the socket's reserve to the global one (or just > > > copy packet to the new one, allocated from socket's reseve is it was > > > setup, or drop it otherwise). Global independent reserve is what I > > > proposed when stopped to advertise network allocator, but it seems that > > > it was not taken into account, and reserve was always allocated only > > > when system has serious memory pressure in Peter's patches without any > > > meaning for per-socket reservation. > > > > This is not true. I have a global reserve which is set-up a priori. You > > cannot allocate a reserve when under pressure, that does not make sense. > > I probably did not cut enough details - my main position is to allocate > per socket reserve from socket's queue, and copy data there from main > reserve, all of which are allocated either in advance (global one) or > per sockoption, so that there would be no fairness issues what to mark > as special and what to not. > > Say we have a page per socket, each socket can assign a reserve for > itself from own memory, this accounts both tx and rx side. Tx is not > interesting, it is simple, rx has global reserve (always allocated on > startup or sometime way before reclaim/oom)where data is originally > received (including skb, shared info and whatever is needed, page is > just an exmaple), then it is copied into per-socket reserve and reused > for the next packet. Having per-socket reserve allows to have progress > in any situation not only in cases where single action must be > received/processed, and allows to be completely fair for all users, but > not only special sockets, thus admin for example would be allowed to > login, ipsec would work and so on... Ah, I think I understand now. Yes this is indeed a good idea! It would be quite doable to implement this on top of that I already have. We would need to extend the socket with a sock_opt that would reserve a specified amount of data for that specific socket. And then on socket demux check if the socket has a non zero reserve and has not yet exceeded said reserve. If so, process the packet. This would also quite neatly work for -rt where we would not want incomming packet processing to be delayed by memory allocations. - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html