On Wed, 2020-01-15 at 22:04 +0100, michal.lowas-rzechonek@xxxxxxxxxxx wrote: > Hi Brian, > > On 01/15, Gix, Brian wrote: > > > Good point. Note that his is also possible in the current > > > implementation: since seq_num is applied to nonce with a 24bit mask, > > > it's going to wrap around. > > > > The full IV Index is in the nonce, and at 192 hours per IV index, will > > be unique for something like 1.4 million years. > > Yes it is, but that's not the point. > > At the moment, net->seq_num is a 32 bit value that *can* exceed 24bit > range, because mesh_net_next_seq_num() doesn't check ranges. So the > raw value can reach 0x1000000 and above. > > Now, this raw value is used in send_seg, passed to > mesh_crypto_packet_build, which effectively applies a 24bit mask in line > 640: Yes, we should definitely be sanity checking this, and not sending if SeqNum out of range. > > l_put_be32(seq, packet + 1); > packet[1] = (ctl ? CTL : 0) | (ttl & TTL_MASK); > > So this means that when: > > - the network is already in iv update (so that you can't increase the > iv_index, maybe you even started the procedure yourself because your > seq_num is above the threshold) > > - your sequence number is sufficiently large (because of the "repeated > crash" scenario described below) I think if we are repeatedly crashing, and it is causing a runaway sequence number increase, that being forbidden to send more packets is a natural consequence, and people should fix the code that was causing the crash in the first place. > > Then the actual value used in the nonce will be net->seq_num & 0xffffff, > which is something you have *already* used before. All of that happens > with the same IV index. > > > The over commit is calculated based on the usage rate, and the daemon > > would need to unexpectedly abort (not just ctrl-c or exit) for us to > > use the over-commit value > > Indeed, that's precisely what I'm talking about - repeated, unhandled > process terminations. We're trying to come up with the patch simply > because this situation has *already happened* on one of our instances. >