Re: vblade-22-rc1 is first release candidate for version 22

On 15/06/2014 6:40 PM, Killer{R} wrote:
> Hello Catalin,
>
> Sunday, June 15, 2014, 6:30:09 PM, you wrote:
>
> CS> Hi again!
> CS> I like my long emails, don't I?
> Yep :)
>
> About the drop rate - it's not a purely theoretical assumption of mine that
> the drop rate must be minimized. I played with the WinAoE variable named
> OutstandingThreshold and found that I/O performance is best when it is
> near the buffer count specified on vblade's command line (or, in the case of
> FreeBSD, the value that actually determines the buffered packet count).
> There is also another aspect, still theoretical for me: if there are a lot of
> AoE targets/initiators sharing the same wire, it is definitely better to have
> the lowest possible resend rate.
That feature was not present in the old build, regrettably.
I'll post back on the drop rate when I have a chance to test this, maybe on 
Monday.
I was actually looking at adding more links or switching to 10G to get 
around this.
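
For anyone following along, the initiator-side cap Killer is describing
boils down to something like the sketch below. The names (aoe_target,
can_send and so on) are mine for illustration only; this is not WinAoE
or vblade code.

/* Sketch: never keep more requests in flight than the Buffer Count the
 * target advertised, so frames are not queued beyond what the target
 * (or anything buffering in between) will actually hold. */
struct aoe_target {
    int bufcnt;       /* Buffer Count advertised by the target */
    int outstanding;  /* requests sent but not yet answered */
};

static int
can_send(struct aoe_target *t)
{
    return t->outstanding < t->bufcnt;
}

static void
request_sent(struct aoe_target *t)
{
    t->outstanding++;
}

static void
response_received(struct aoe_target *t)
{
    if (t->outstanding > 0)
        t->outstanding--;
}

The sender then checks can_send() before putting another frame on the
wire, which is essentially what tuning OutstandingThreshold towards the
target's buffer count approximates by hand.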

> CS> On 15/06/2014 4:22 PM, Killer{R} wrote:
>>> Hello Catalin,
>>>
>>> Sunday, June 15, 2014, 4:08:15 PM, you wrote:
>>>
>>>>>>>> I would like to request two changes before release.
>>>>>>>> - An option to restrict the size of packets, overriding the automatic
>>>>>>>> detection of the MTU.
>>>>>>> You mean like if the MTU is 9000, you want the ability to tell the
>>>>>>> vblade to act like it's smaller, right?
>>>>> CS> Yes. That's the gist of it.
>>>>> CS> I believe there is some value in the ability to manually tweak the
>>>>> CS> maximum packet size used by vblade.
>>>>> But that's all on the initiator side. For example, WinAoE (and its
>>>>> forks ;) ) does MTU 'autodetection' instead of using Conf::scnt.
>>>>>
>>> CS> That's not entirely correct.
>>> CS> WinAoE indeed does a form of negotiation there - it will start at
>>> CS> (MTU/sector size) and will do reads of decreasing size, until it
>>> CS> receives a valid packet.
>>> CS> However! If you would kindly check ata.c:157 (on v22-rc1), any ATA
>>> CS> request for more than the supported packet size will be refused.
>>>
>>> That's also not entirely correct :) It increases the sector count from
>>> 1 until either the MTU limit or any kind of error from the target,
>>> including a timeout.
> CS> You're probably right there. I haven't looked at it recently. In any
> CS> event, the observation stands.
> CS> Changing the supported MTU in vblade will limit packets to that size (I
> CS> wouldn't have bothered with the FreeBSD MTU detection code if that
> CS> wasn't the case).
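
To make sure we mean the same thing: the probing Killer describes comes
down to a loop roughly like the one below on the initiator side. This is
a sketch with invented helpers (send_ata_read, wait_for_reply,
PROBE_TIMEOUT_MS, struct aoe_target), not the actual WinAoE code.

/* Sketch: grow the per-request sector count until the target refuses
 * the request or the reply times out, and keep the last size that
 * worked for all later I/O. */
static int
probe_max_sectors(struct aoe_target *tgt, int mtu)
{
    int scnt, best = 1;

    for (scnt = 1; scnt <= mtu / 512; scnt++) {     /* 512-byte sectors */
        if (send_ata_read(tgt, 0, scnt) < 0)        /* target refused */
            break;
        if (wait_for_reply(tgt, PROBE_TIMEOUT_MS) < 0)
            break;                                  /* error or timeout */
        best = scnt;                                /* this size worked */
    }
    return best;
}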
>>> However, in my investigation I found that it is also useful for the
>>> initiator to know the value vblade calls 'buffer count' - that is, the
>>> number of packets the initiator can send to the target knowing that it
>>> will likely process them all, because keeping more requests than this
>>> value outstanding sharply increases the drop (and resend) rate.
>>> I also implemented a kind of negotiation to detect this, by sending a
>>> 'congestion' extension command that does usleep(500000) and then
>>> responds to all commands received in the buffer. Compared with directly
>>> asking the target for its buffer count, this approach also detects any
>>> implicit buffering between the initiator and the target.
>>>
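A quick aside before the reply below: as I read the proposal, the target
side of that 'congestion' probe is roughly the following. The handler
and send_probe_reply() are hypothetical; I'm not suggesting this is, or
should be, actual vblade code.

#include <sys/socket.h>
#include <unistd.h>

/* Sketch: wait half a second so every probe frame the initiator fired
 * has either landed in the receive buffer or been dropped, then answer
 * whatever survived. The number of replies the initiator counts back is
 * the effective buffer depth between the two ends, implicit switch and
 * NIC buffering included. */
static void
handle_congestion_probe(int sock)
{
    unsigned char frame[9216];      /* room for a jumbo frame */
    ssize_t n;

    usleep(500000);
    while ((n = recv(sock, frame, sizeof frame, MSG_DONTWAIT)) > 0)
        send_probe_reply(sock, frame, n);   /* hypothetical helper */
}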
> CS> As per the AoE spec, messages in excess of Buffer Count are dropped.
> CS> Since vblade processes these synchronously, this happens at the network
> CS> buffer level. If using async I/O, you're responsible for that, in theory.
> CS> As far as I remember, WinAoE not only doesn't care about that, but
> CS> doesn't even request this information from the target.
> CS> If WinAoE limited the number of in-flight packets to what the target
> CS> says it should, we wouldn't actually be having this conversation, but
> CS> that would probably add latency, since the initiator would have to
> CS> wait for confirmation of at least one packet before sending another
> CS> one in excess of bufcnt (and, as I remember, WinAoE does not limit how
> CS> many packets it sends).
> CS> This would probably reduce throughput and increase the average response
> CS> time under even moderate load, but decrease the drop rate.
> CS> I'm not actually sure that the drop/resend rate is something to aim for
> CS> in itself. It's clearly desirable to minimise it, but not just for the
> CS> sake of the number.
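
For reference, the Buffer Count being discussed is the first field of
the Query Config response in the AoE spec. The layout is roughly the
following; the struct is my own sketch of the wire format, not a copy of
vblade's header, and all fields are big-endian on the wire.

#include <stdint.h>

/* Query Config response payload, per the AoE specification (sketch). */
struct aoe_cfg_rsp {
    uint16_t bufcnt;    /* Buffer Count: frames the target will queue */
    uint16_t firmware;  /* firmware version */
    uint8_t  scnt;      /* max sectors per ATA request */
    uint8_t  vercmd;    /* AoE version / config command nibbles */
    uint16_t cslen;     /* config string length */
    /* config string follows */
};

An initiator that cared would issue a Query Config, read bufcnt from the
response and cap its in-flight requests at that value; that is exactly
the step WinAoE skips.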
>
> CS> Regarding your proposed extension, I could see something like this
> CS> being valuable if the target could detect an increased drop rate and
> CS> inform the initiator to ease off, or to resend packets faster than the
> CS> default timeout. But since the target is not allowed to send
> CS> unsolicited packets to the initiator, a specific request would be
> CS> needed (say, when a large number of packets are outstanding), and that
> CS> raises the question: if those packets are being dropped, what is there
> CS> to stop the target's network stack from dropping our congestion
> CS> detection packet?
> CS> On that note, vblade could be taught to broadcast its load status
> CS> periodically or on a high drop rate, and initiators would notice that
> CS> and adapt, but I believe this raises some security concerns and would
> CS> also slightly slow the target, since it would need to query the kernel
> CS> for drop-rate information every few requests.
>
> CS> @Ed
> CS> Now, thanks to Killer's reference to the Buffer Count, I remember that
> CS> the FreeBSD code does not actually use it to allocate the network buffers.
> CS> Under Linux, following setsockopt with the default bufcnt, the receive
> CS> buffer would end up 24000 bytes long for an MTU of 1500 bytes, and
> CS> 144000 for a 9K MTU.
> CS> Under FreeBSD the code defaults to a 64K buffer. That makes the
> CS> effective bufcnt 43 on a 1500-byte MTU, but only 7 on a 9K MTU.
> CS> This could cause an increase in dropped packets and explain the decrease
> CS> in throughput I mentioned in a previous mail. I did not check for this
> CS> when testing.
> CS> I was not concerned because multiple instances of vblade on the same
> CS> interface would saturate the channel anyway, but now I'm starting to worry :)
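
To spell out the arithmetic: with the default bufcnt of 16 the Linux
sizing works out to 16 * 1500 = 24000 and 16 * 9000 = 144000 bytes,
while a fixed 64K buffer holds 65536 / 1500 = about 43 standard frames
but only 65536 / 9000 = about 7 jumbo frames. The rule amounts to
something like the sketch below (the idea, not the literal vblade
source):

#include <sys/socket.h>

/* Sketch: ask the kernel for enough receive-buffer space to hold
 * bufcnt full-MTU frames, matching the numbers quoted above. */
static int
size_rcvbuf(int sfd, int bufcnt, int mtu)
{
    int sz = bufcnt * mtu;

    return setsockopt(sfd, SOL_SOCKET, SO_RCVBUF, &sz, sizeof sz);
}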


_______________________________________________
Aoetools-discuss mailing list
Aoetools-discuss@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/aoetools-discuss



