Here is a follow-up after discussing the issue with the support staff at
callcentric.com. Their answer:

"Hello,

To be honest with you, what you're describing we'd consider a bug in
whatever SIP stack is being used here. The behavior you're describing is
very abnormal and not really the right way to implement this. When a
registration happens, there's no reason for the UA to start a register
"dialog" with 1 IP, and then continue it with another IP. To date I
don't remember seeing another UA behave this way. This is especially
true since when the initial REGISTER message is challenged, and the 407
received, and the REGISTER with authentication is sent, it is all part
of the same dialog and includes the same Call-ID and sequential CSeqs.

If you capture any common UA (hardware or software) such as ones from
Counterpath, SJLabs, Linksys, Grandstream, Snom, etc. you will not see
the behavior you are describing.

What should happen is that the UA requests a DNS record for the
registrar/proxy - first for DNS SRV, then falling back to A records -
and caches the IP address of the server(s) until the TTL expires. When
the TTL expires it should again do a DNS query.

The only reason for a UA to CHANGE the IP it is sending the requests to
is:

A. Timeout with no response from that IP; in which case it should pull
   (or actually already have cached) the next record (either the next
   for DNS SRV based on priority/weight, or the next in the list from A
   records - since every DNS server will return a randomly sorted list
   of A records); and send the request to the next IP.

B. The TTL expires for the DNS records.

Unless either A or B happens, there's no reason for a UA to switch to
sending its requests to another IP.

BTW, you mentioned also that:

"Asterisk seems to implement the "right" behaviour:
https://issues.asterisk.org/bug_view_advanced_page.php?bug_id...

This is somewhat true... Asterisk does seem to be getting better at
dealing with DNS SRV and A records.
Previously Asterisk just completely ignored weight/priority in DNS SRV
and just took the 1 record returned; it also ignored sending to the same
IP and would re-register every time to a new IP, as well as "forgetting"
which IP it had registered to previously, which causes/caused problems
with inbound calls. I wouldn't call Asterisk's implementation ideal, but
it does seem to be getting better; this is another reason we don't use
Asterisk in-house.

I haven't re-read the RFC to try to find a specific sentence/section
that contradicts your statement; but RFCs are unfortunately never as
strict as industry practice; and to date we've never seen a device that
acts in the way you are suggesting. I don't believe it is written (and I
don't remember it being) that each REGISTER request should perform a new
DNS lookup, and this also doesn't make a lot of sense in general.

Even if it is explicitly written this way, I don't think the industry
has interpreted it this way when implementing any of the popular SIP
stacks that are used by OEMs; so I don't think the onus is on us in this
case to address this abnormal behavior.

Thank you."

As a temporary fix in our application (sflphone), I implemented a
workaround where the user can choose to perform a DNS lookup before
registering and then use that same IP address for as long as the
software is running. Not pretty, but at least registration is no longer
a probabilistic operation.

Eight replies into this thread, we still haven't heard what Benny
Prijono thinks of this issue. I would be very interested to know.

I think that implementing an optional "non-RFC" mode (triggered
automatically or manually by the user) would make PJSIP way more robust
with a wide range of SIP proxies/registrars.

On Mon, 2009-07-20 at 09:23 +0200, Klaus Darilion wrote:
>
> Pierre-Luc Bacon wrote:
> > I think that we are facing a dilemma here.
> >
> > Deviating from the RFC would make PJSIP work properly in this case (which
> > might appear with a lot more VOIP providers doing load balancing).
> >
> > However, leaving it untouched makes the matter way more complex to deal
> > with in the application. A possible workaround, and maybe the only one I
> > see, would be to resolve the host name initially, and use that IP from
> > the moment the user launches the application to the end. Not so
> > pretty ...
> >
> > I'm not familiar with SIP load balancing, but should the servers in a
> > "good" load balancer infrastructure be able to forward 407 challenges
> > among themselves? That way, a server which didn't send that challenge
> > initially can answer back properly to that new REGISTER.
>
> I think it really depends on the software used and its configuration. E.g.
> if you use openser, you can configure it to allow nonce_reuse. Then the
> nonce is calculated statelessly and identically in all openser instances,
> and any proxy accepts a nonce which was generated by another proxy.
>
> Probably if a SIP proxy calculates the nonce statefully and does not share
> the nonce between the various proxies, it does not work.
>
> regards
> klaus
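To make the stateless case Klaus describes more concrete, here is a
minimal Python sketch of a nonce that any proxy holding a shared secret
can verify without keeping per-client state. This is not openser's
actual algorithm: SHARED_SECRET, NONCE_LIFETIME, make_nonce and
check_nonce are names made up for illustration, and HMAC-SHA256 over a
timestamp is only one plausible way to do it.

```python
import hashlib
import hmac
import time

# Hypothetical secret configured identically on every proxy in the cluster.
SHARED_SECRET = b"change-me"
# Hypothetical validity window for a nonce, in seconds.
NONCE_LIFETIME = 30


def make_nonce(secret=SHARED_SECRET, now=None):
    """Build a nonce of the form '<timestamp>.<HMAC-SHA256(timestamp)>'."""
    ts = str(int(time.time() if now is None else now))
    mac = hmac.new(secret, ts.encode(), hashlib.sha256).hexdigest()
    return f"{ts}.{mac}"


def check_nonce(nonce, secret=SHARED_SECRET, now=None):
    """Accept a nonce issued by any proxy sharing the secret, if still fresh."""
    ts, sep, mac = nonce.partition(".")
    if not sep:
        return False
    # Recompute the MAC locally; no lookup of stored state is needed.
    expected = hmac.new(secret, ts.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac.encode(), expected.encode()):
        return False
    try:
        issued = int(ts)
    except ValueError:
        return False
    current = time.time() if now is None else now
    return 0 <= current - issued <= NONCE_LIFETIME
```

With a scheme like this, a REGISTER carrying credentials could land on
a different proxy than the one that sent the 407 and still pass the
nonce check, provided both proxies hold the same secret and their
clocks roughly agree.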