GlobalProtect: rekey should be based on lifetime, not timeout

I propose that rekeying for the GlobalProtect ("gp") protocol be
based on the "lifetime" configuration value received from the gateway,
not on "timeout".  My reasons are (1) the GP gateway configuration
documentation and (2) my recent experimentation.

For reference, here is an example GP Gateway XML configuration
excerpt (from the "POST /ssl-vpn/getconfig.esp" response) showing the
relevant settings:

<response status="success">
  ...
  <lifetime>2592000</lifetime>
  <timeout>10800</timeout>
  <disconnect-on-idle>10800</disconnect-on-idle>
  ...
</response>
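
As a minimal sketch (Python for illustration; this is not OpenConnect's
own parser, which is written in C), the three values could be pulled
out of such a response as shown here.  The function name
parse_gp_timers and the trimmed sample document are assumptions; only
the element names and values come from the excerpt above.

```python
import xml.etree.ElementTree as ET

# Trimmed sample modeled on the excerpt above; a real response
# contains many more elements (elided as "..." in the excerpt).
SAMPLE = """<response status="success">
  <lifetime>2592000</lifetime>
  <timeout>10800</timeout>
  <disconnect-on-idle>10800</disconnect-on-idle>
</response>"""


def parse_gp_timers(xml_text):
    """Return the timer values (in seconds) present in a getconfig response."""
    root = ET.fromstring(xml_text)
    timers = {}
    for name in ("lifetime", "timeout", "disconnect-on-idle"):
        text = root.findtext(name)
        if text is not None:  # tolerate gateways that omit a value
            timers[name] = int(text)
    return timers


timers = parse_gp_timers(SAMPLE)
print(timers)
```

With the sample values, "lifetime" is 2592000 seconds (30 days) and
"timeout" is 10800 seconds (3 hours).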

1. GP Gateway Configuration Documentation

GP's documentation [1][2][3], specifically the section titled "Modify
the default timeout settings for endpoints," references three values:
Login Lifetime, Inactivity Logout, and Disconnect on Idle.  Login
Lifetime is similar in name to "lifetime" and Disconnect on Idle is
similar to "disconnect-on-idle", which leaves Inactivity Logout to
correspond to "timeout".  Inactivity Logout is the period after which
the Gateway logs the user out if it has not received a HIP check from
the endpoint.  The "timeout" value, if it truly is the Inactivity
Logout, seems unrelated to rekeying.

[1] https://docs.paloaltonetworks.com/globalprotect/7-1/globalprotect-admin/set-up-the-globalprotect-infrastructure/configure-globalprotect-gateways/configure-a-globalprotect-gateway.html
[2] https://docs.paloaltonetworks.com/globalprotect/8-0/globalprotect-admin/globalprotect-gateways/configure-a-globalprotect-gateway
[3] https://docs.paloaltonetworks.com/globalprotect/9-0/globalprotect-admin/globalprotect-gateways/configure-a-globalprotect-gateway

2. Recent Experimentation

Recently the GP Gateway I have access to was reconfigured so that
"lifetime" is longer than "timeout", where before they were the same.
This change allowed me more insight into the GP Gateway "black box",
its behavior, and interaction with OpenConnect.

tl;dr An ESP tunnel can last longer than "timeout" and up to
"lifetime" when "lifetime" > "timeout", showing that "lifetime" is
the pertinent value, not "timeout".  A GP Gateway terminates an ESP
tunnel (i.e. "ESP detected dead peer") shortly after rekeying, causing
OpenConnect to fall back to less optimal HTTPS tunneling; that alone
should be motivation to move from "timeout" to "lifetime" so as to
avoid rekeying.  Issuing a HIP check does not appear possible while
HTTPS tunneling (as currently implemented, i.e. "Failed to parse HTTP
response '^Z+<'"), which prematurely ends the session and reinforces
the case for using "lifetime" instead of "timeout".

When "lifetime" = "timeout" and rekeying at "timeout":
1. "timeout" minus 60 seconds elapses
2. rekey
3. ~30 seconds elapses
4. "ESP detected dead peer"
5. switch to HTTPS
6. ~30 seconds elapses
7. "SSL read error: The TLS connection was non-properly terminated.;
reconnecting."
8. try to reconnect
9. "Invalid authentication cookie"
10. "Cookie is no longer valid, ending session"
11. "Reconnect failed"

With "lifetime" > "timeout" and rekeying at "timeout":
1. "timeout" minus 60 seconds elapses
2. rekey
3. 1-2 minutes elapse
4. "ESP detected dead peer"
5. switch to HTTPS
6. HIP check (i.e. 1 hour elapses)
7. "Failed to parse HTTP response '^Z+<'"
8. "HIP check or report failed"

With the hypothesis that "timeout" is not relevant to rekeying, and
wanting to avoid rekeying and the subsequent "bad" behavior (i.e. the
GP Gateway ends the ESP tunnel, OpenConnect falls back to HTTPS, and
the session ends at the next HIP check), I modified OpenConnect to
rekey based on "lifetime" instead of "timeout".  I also removed the
logic to rekey 60 seconds prior to "lifetime".  Otherwise OpenConnect
rekeys repeatedly within the last 60 seconds, because during rekeying
the Gateway provides an updated "lifetime" reflecting how much time
remains.  (This doesn't happen with "timeout" because the Gateway
always reports "timeout" as a constant, i.e. the configured value.)
After that change my session with the GP Gateway stayed up and working
despite not rekeying at the shorter "timeout" value, and the
transition from ESP to HTTPS occurred only within seconds of
"lifetime".
With "lifetime" > "timeout" and rekeying at "lifetime":
1. "lifetime" elapses
2. rekey
3. "<lifetime>1</lifetime>"
4. immediately rekey
5. "<lifetime>1</lifetime>"
6. repeat 8 more times (for a total of 10 rekeys in 15 seconds with
the GP Gateway each time reporting a "lifetime" of 1 second)
7. "Failed to connect ESP tunnel; using HTTPS instead."
8. "Got inappropriate HTTP GET-tunnel response: HTTP/1.1 502 Bad Gateway"
9. finally give up and unsuccessfully try to log out (as "lifetime"
expired 20 seconds ago)

In my final implementation I set rekey to "lifetime" plus 60 seconds.
This avoids the GP Gateway's "longest 1 second ever"; instead the GP
Gateway disconnects us and OpenConnect fails much more gracefully
(i.e. see the behavior when "lifetime" = "timeout" and rekeying at
"timeout").
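
For comparison, the three rekey schedules discussed in this message
can be written side by side.  This is a hedged sketch, not the actual
patch: the function rekey_after, the policy names, and the constant
REKEY_MARGIN are made up for illustration.

```python
REKEY_MARGIN = 60  # seconds; the margin mentioned in this message


def rekey_after(lifetime, timeout, policy):
    """Seconds after which the client would rekey under each policy."""
    if policy == "timeout":
        # Previous behavior: 60 seconds before "timeout".
        return timeout - REKEY_MARGIN
    if policy == "lifetime":
        # First experiment: exactly at "lifetime".
        return lifetime
    if policy == "lifetime+margin":
        # Final choice: 60 seconds past "lifetime", so the gateway
        # ends the session before the client ever tries to rekey.
        return lifetime + REKEY_MARGIN
    raise ValueError(f"unknown policy: {policy}")


for policy in ("timeout", "lifetime", "lifetime+margin"):
    print(policy, rekey_after(2592000, 10800, policy))
```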

Corey
cwright@xxxxxxxxxxxxxxxx

_______________________________________________
openconnect-devel mailing list
openconnect-devel@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/openconnect-devel


