On 2020-04-18 at 21:52, Daniel Lenski wrote: > On Sat, Apr 18, 2020 at 6:30 PM The Wanderer <wanderer@xxxxxxxxxxx> > wrote: > >> On 2020-04-18 at 20:14, Daniel Lenski wrote: >> >>> This is a useful log. Key interesting parts below. >> >> <large snip> >> >>> 5. This suggests a server misconfiguration. The server has sent >>> new ESP keys and started using them, but then it gets confused >>> and invalidates the ESP connection after the original interval >>> elapses. That's annoying, and we've seen it before… but what >>> OpenConnect should do to workaround it here is that it should >>> give up on ESP and attempt to fallback to HTTPS. Which it appears >>> to do: >>> >>> [2020-04-14 00:20:43] Connecting to HTTPS tunnel endpoint ... >>> [2020-04-14 00:20:43] SSL negotiation with foo.bar.baz >>> [2020-04-14 00:20:43] Connected to HTTPS on foo.bar.baz with ciphersuite >>> (TLS1.2)-(RSA)-(AES-256-GCM) >>> [2020-04-14 00:20:43] > GET >>> /ssl-tunnel-connect.sslvpn?authcookie=AUTHCOOKIE&user=foobar_user HTTP/1.1 >>> [2020-04-14 00:20:43] > >>> >>> … but what's completely puzzling here is that we don't see either of >>> the following messages prior to this attempted failover to HTTPS: >>> ESP detected dead peer >>> Failed to connect ESP tunnel; using HTTPS instead. >> >> That's because of a quirk of the way I invoked OpenConnect and did >> the redirection; those messages actually get printed to the >> terminal where the program was invoked from, and don't make it into >> the file. This is what I was referring to with > > I really don't understand how — For syntax: the example I gave below, with the dual shell scripts et cetera. For how this leads to that result: your guess is at least as good as mine. > or why — Pure accident. I don't actually think I care about splitting it apart that way at all. > you're logging this so that *some* of these messages go to the > terminal, and others *only* go to the log file, but in any case the > lines that were missing here are critical information. > >>>> In hindsight I realize there was one thing I could have done >>>> differently on this session invocation to be more sure I >>>> didn't miss any stderr, etc., messages - but I didn't do it, so >>>> I'm not entirely positive there's nothing missing. I hope all >>>> the important stuff is there. >> >> in the original attempt at sending this. >> >> The relevant messages from the terminal from one such session, >> starting from that point and continuing until exit, are: >> >> >> [2020-04-15 14:25:56] ESP detected dead peer >> >> [2020-04-15 14:26:01] Failed to connect ESP tunnel; using HTTPS instead. >> >> [2020-04-15 14:26:01] Got inappropriate HTTP GET-tunnel response: HTTP/1.1 502 Bad Gateway >> >> [2020-04-15 14:26:01] Invalid user name >> >> [2020-04-15 14:26:01] Logout failed. >> >> RTNETLINK answers: No such process >> >> RTNETLINK answers: No such process >> >> [2020-04-15 14:26:01] Unknown error; exiting. > > Well, this clarifies it. There's something screwed up with the > server, it appears: > > * The server gives you initial ESP configuration, and tells you "come > back and rekey 180 minutes later." > * OpenConnect gets new ESP keys 179 minutes later. > * Those new keys appear to work… for a minute. > * One minute after getting the new keys, or 180 minutes after getting > the original keys… the connection stops working. > * When OpenConnect attempts to fallback to HTTPS… the server rejects > the (supposed-to-be-unexpired) auth cookie. > * It therefore gives up and tries to log out. > > Unless you can show that some other client software manages to keep > the connection going >180 minutes, I have to assume that we're doing > the best we can here. As I believe I recall that 8.05 or possibly 8.02 was working, I've now downgraded to 8.05 and initiated another session. If it disconnects overnight, I'll repeat with 8.02 in the morning (my next mandatory on-shift is Monday). If that disconnects in its turn, then we can consider it confirmed that the server has changed its behavior in the meantime, and this has nothing to do with the upgrade and is not OpenConnect's problem. If one of those remains connected, however - especially if the logs show something else happening at this point - then it may be worth digging deeper here. >>> 6. Finally, one more strange thing in your log. Something (????) >>> causes it to attempt a logout immediately after trying to >>> fallback to HTTPS, but the logout *fails* (POST >>> https://foo.bar.baz/ssl-vpn/logout.esp -> "Invalid user name"). >> >> Yeah, this is the thing that seemed weirdest to me. I certainly did >> not do anything which should have led this logout attempt to happen >> - as I noted in my original mail, I was asleep at the time. >> >> I just checked one of the other logs I've been grabbing of these >> sessions in the meantime, and it does end with a similar logout >> interaction. > > I'm guessing this is because of an inconsistent encoding of the > domain parameter (or maybe something else). I've seen it happen > before when the server domain is "(empty_domain)". just posted a > patch to try to alleviate this issue. > https://gitlab.com/openconnect/openconnect/-/merge_requests/93 Would it be worth my trying this patch, potentially, or is that not likely to make any difference in my situation? >> What I'm doing here is invoking a shell script (as ordinary user) >> from the command line, which runs >> >> su - -c "~/bin/vpn-bar 2>&1 >> >> /home/ORIGINALUSER/tmp/vpn-bar-highverbosity.log" >> >> and that command line in turn calls a shell script which sits >> under /root/bin/, gets run as root, and contains the actual >> invocation of openconnect - including >> echo password | openconnect --passwd-on-stdin [other-options] >> and a custom routing script which adjusts the routing table so that >> only traffic which *needs* the VPN goes over it (that latter being >> why I need to run this as root, rather than pre-creating a tun* >> device node and telling openconnect to use it). > > Something like what vpn-slice does > (https://github.com/dlenski/vpn-slice/), but without needing a messy > intermediary script? Possibly. I don't think I'd run across this before, and it doesn't seem to be in Debian. At such point as I've upgraded again, I'll give this a try and see if I can ditch my handbrewed solution, which is kludgy and only mostly-effective in any case; I put it in place as a get-things-working to begin with, and then it was working Well Enough so I hadn't gone back to it before these disconnects started happening. -- The Wanderer The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man. -- George Bernard Shaw
Attachment:
signature.asc
Description: OpenPGP digital signature
_______________________________________________ openconnect-devel mailing list openconnect-devel@xxxxxxxxxxxxxxxxxxx http://lists.infradead.org/mailman/listinfo/openconnect-devel