Re: [PATCH] netfilter: ipvs: Add Maglev consistent hashing scheduler

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello, Julian

Thank you for your a lot of comments.

On Sun, Nov 19, 2017 at 06:04:35PM +0200, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Sun, 19 Nov 2017, Inju Song wrote:
> 
> > Implements the Google's Maglev hashing algorithm as a IPVS scheduler.
> > 
> > Basically it provides consistent hashing but offers some special
> > features about disruption and load balancing.
> > 
> >   1) minimal disruption: when the set of destinations changes,
> >      a connection will likely be sent to the same destination
> >      as it was before.
> > 
> >   2) load balancing: each destination will receive an almost
> >      equal number of connections.
> > 
> > Seel also for detail: [3.4 Consistent Hasing] in
> > https://static.googleusercontent.com/media/research.google.com/ko//pubs/archive/44824.pdf
> 
> 	 I see that the PDF shows implementation only for servers
> with same weight. It would be better to see implementation that
> considers dest->weight.
> 

I agree. In this case, weight value is same weight or zero.
So I will consider this case when I patch the source.

I have read your comments about memory allocation fail,
caching table, setting the entry table with NULLs and so on.

I will do fix it and send v2 patchsets.

> > Signed-off-by: Inju Song <inju.song@xxxxxxxxxxxxx>
> > ---
> >  net/netfilter/ipvs/ip_vs_mh.c | 402 ++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 402 insertions(+)
> >  create mode 100644 net/netfilter/ipvs/ip_vs_mh.c
> > 
> > diff --git a/net/netfilter/ipvs/ip_vs_mh.c b/net/netfilter/ipvs/ip_vs_mh.c
> > new file mode 100644
> > index 0000000..534a9f5
> > --- /dev/null
> > +++ b/net/netfilter/ipvs/ip_vs_mh.c
> 
> > +static inline int
> > +ip_vs_mh_populate(struct ip_vs_mh_state *s, struct ip_vs_service *svc,
> > +		  unsigned int **permutation)
> > +{
> > +	int i;
> > +	unsigned int *next;
> > +	struct ip_vs_mh_lookup *entry, *l;
> > +	struct list_head *p;
> > +	struct ip_vs_dest *dest;
> > +	int dcnt;
> > +	unsigned int n, c;
> > +
> > +	dcnt = svc->num_dests;
> 
> 	While calling del_dest svc->num_dests can be 0, this should
> be considered in ip_vs_mh_permutate and ip_vs_mh_populate. We should
> set the table with NULLs in such case.
> 
> > +	next = kcalloc(dcnt, sizeof(unsigned int), GFP_KERNEL);
> > +	entry = kcalloc(IP_VS_MH_LOOKUP_SIZE, sizeof(*entry),
> > +			GFP_KERNEL);
> 
> 	It is current drawback that add_dest, upd_dest and del_dest
> are considered as non-failing methods, we can not fail the operation
> if memory allocation fails, for example. This is bad for add_dest,
> not so for upd_dest and del_dest. In this regard, I think, the MH
> scheduler should not crash if allocation fails.
> 
> > +/* Assign all the hash buckets of the specified table with the service.
> > + */
> > +static int
> > +ip_vs_mh_reassign(struct ip_vs_mh_state *s, struct ip_vs_service *svc)
> > +{
> > +	int dcnt;
> > +	unsigned int **permutation;
> > +
> > +	/* flush all the hash entry before assigning mh entry */
> > +	ip_vs_mh_flush(s);
> 
> 	ip_vs_mh_reassign should not call ip_vs_mh_flush.
> ip_vs_mh_populate while copying the temp table to s->lookup[]
> should put the refcnt to old dests (as in SH:ip_vs_sh_reassign)
> and then to assign new value. NULL should be assigned if
> svc->num_dests = 0.
> 
> > +/* IPVS MH Scheduler structure */
> > +static struct ip_vs_scheduler ip_vs_mh_scheduler = {
> > +	.name =			"mh",
> > +	.refcnt =		ATOMIC_INIT(0),
> > +	.module =		THIS_MODULE,
> > +	.n_list	 =		LIST_HEAD_INIT(ip_vs_mh_scheduler.n_list),
> > +	.init_service =		ip_vs_mh_init_svc,
> > +	.done_service =		ip_vs_mh_done_svc,
> > +	.add_dest =		ip_vs_mh_dest_changed,
> > +	.del_dest =		ip_vs_mh_dest_changed,
> > +	.upd_dest =		ip_vs_mh_dest_changed,
> 
> 	As upd_dest is called only when weight changes, the SH and MH
> schedulers should not handle this method. Their goal is to fill
> the table with all added dests by ignoring health state (it is
> considered only on dest selection).
> 
> 	I guess, the correct configuration for SH and MH would be:
> 
> 1.1 Add all dests with specific weight, so that the schedulers can
> fill their table based on the weights.
> 1.2 Set weight to 0 for all servers that are not healthy. SH and MH
> should ignore these configuration events.
> 
> after the initial configuration:
> 
> 2. Traffic will ignore non-healthy servers, fallbacks to healthy ones.
> 3. If other servers become unavailable, change their weight to 0
> 
> Regards
> 
> --
> Julian Anastasov <ja@xxxxxx>

Regards

-- 
Inju Song
NAVER Corporation

--
To unsubscribe from this list: send the line "unsubscribe lvs-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystem Devel]     [Linux NFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]     [X.Org]

  Powered by Linux