map records expiration problem / multi-references (conntrack)

Hello,

I am trying to implement a somewhat relaxed version of TCP connection
tracking in XDP/eBPF (to defend against DDoS attacks). To do it
correctly, records must expire after different timeouts depending on
their state, e.g. 20 seconds for the SYN state, 1 minute for the
established state, and 10 seconds after FIN/RST.
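
For illustration, the per-state timeouts can be written as a small
lookup. This is only a sketch: the enum and function names are
hypothetical, and the values are the example ones given above.

```c
#include <stdint.h>

/* Hypothetical relaxed TCP states; the names are illustrative only. */
enum ct_state {
    CT_SYN,          /* handshake started, not yet established */
    CT_ESTABLISHED,  /* handshake completed */
    CT_CLOSING,      /* FIN or RST seen */
};

/* Timeout in seconds for a given state (example values from the text). */
static inline uint32_t ct_timeout_sec(enum ct_state s)
{
    switch (s) {
    case CT_SYN:         return 20;
    case CT_ESTABLISHED: return 60;
    case CT_CLOSING:     return 10;
    }
    return 10; /* unknown state: assume the shortest timeout */
}

/* Absolute expiry time, in seconds, for a record touched at `now`. */
static inline uint64_t ct_expiry(uint64_t now, enum ct_state s)
{
    return now + ct_timeout_sec(s);
}
```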

Using the *_LRU map variants is NOT an option: since this is anti-DDoS,
an attacker could evict legitimate connections by flooding the map with
fresh ones, because those maps offer no explicit control over the
expiration policy.

In a classic programming environment this is simple: a conntrack record,
in addition to a `when_expire_unixtime` field, would have a LIST_ENTRY,
and whenever its expiry time changes it would be relinked from the old
time's list to the new one, with locks held on the record and on both
list heads. A per-second timer would then clean up the entire lists
whose time is in the past.
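
The classic scheme above can be sketched in plain userspace C as a
ring of per-second lists. This is a minimal sketch under assumptions:
the names (`struct wheel`, `wheel_set_expiry`, `wheel_sweep`) and the
slot count are hypothetical, locking is omitted entirely, and records
must be zero-initialized before first use.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: must be larger than the longest timeout, in seconds. */
#define WHEEL_SLOTS 128

struct ct_rec {
    struct ct_rec *prev, *next;  /* plays the role of LIST_ENTRY */
    uint64_t when_expire;        /* absolute expiry time, in seconds */
    int linked;                  /* nonzero while on some list */
};

struct wheel {
    struct ct_rec heads[WHEEL_SLOTS]; /* circular list head per second */
    uint64_t last_swept;              /* last second already cleaned */
};

static void wheel_init(struct wheel *w, uint64_t now)
{
    for (size_t i = 0; i < WHEEL_SLOTS; i++)
        w->heads[i].prev = w->heads[i].next = &w->heads[i];
    w->last_swept = now;
}

static void rec_unlink(struct ct_rec *r)
{
    if (!r->linked)
        return;
    r->prev->next = r->next;
    r->next->prev = r->prev;
    r->linked = 0;
}

/* Relink a record from its old time's list to the list for `when`.
 * A real implementation would hold locks on the record and both heads. */
static void wheel_set_expiry(struct wheel *w, struct ct_rec *r, uint64_t when)
{
    struct ct_rec *head = &w->heads[when % WHEEL_SLOTS];

    rec_unlink(r);
    r->when_expire = when;
    r->prev = head->prev;
    r->next = head;
    head->prev->next = r;
    head->prev = r;
    r->linked = 1;
}

/* Per-second timer body: empty every list whose second has passed.
 * Returns the number of expired records. */
static size_t wheel_sweep(struct wheel *w, uint64_t now)
{
    size_t expired = 0;

    while (w->last_swept < now) {
        w->last_swept++;
        struct ct_rec *head = &w->heads[w->last_swept % WHEEL_SLOTS];
        while (head->next != head) {
            rec_unlink(head->next); /* a real tracker would free it too */
            expired++;
        }
    }
    return expired;
}
```

The point of the sketch is that a state change costs one unlink and
one link, and the timer only ever touches records that are actually
due.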

But not in XDP/eBPF. I have run into multiple problems while trying
different ideas.

First, let's assume 100 million conntrack records. We can't have
a `bpf_timer` instance in every record; that would not scale to 100M.
So a single timer is still needed, as in the classic variant.

Also, there are no linked lists in eBPF, and no pointers from multiple
maps to the same object, so I came up with the idea of (ab)using an
LPM_TRIE as an "index" keyed by time and 4-tuple, with the value being
a bitset of the main maps (TCP, UDP, ...) in which the record must be
expired. Then I found that:

* `bpf_spin_lock` can't be held for several maps at once, and values
  could be modified by several threads in parallel (when modifying the
  old and new LPM values)
* `bpf_map_get_next_key()` is not available on the kernel side! So a
  single BPF timer callback can't loop over just the records it needs.
* the kernel helper `bpf_for_each_map_elem` is not available for
  LPM_TRIE, only for array/hash maps. This is very strange, since the
  existing get_next_key implementation would make for_each trivial to
  implement for *any* map type.

So it seems that *userland* must clean up those records, but doing it
through syscalls would give much worse performance, and
`BPF_MAP_TYPE_RINGBUF` is of no help here either...
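
For reference, a key for the time-keyed LPM_TRIE index mentioned above
might be laid out as in the sketch below. Everything here is an
assumption made for illustration: the struct and function names are
hypothetical, only IPv4 is covered, and the struct merely mirrors the
general shape of an LPM trie key (a 32-bit prefix length in bits
followed by the key bytes). Storing the expiry second first, in
big-endian order, means that all records expiring in the same second
share their first 32 bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical index key: prefix length in bits, then the key bytes. */
struct expire_key {
    uint32_t prefixlen;    /* number of significant bits in data[] */
    uint8_t  data[4 + 12]; /* expiry second + IPv4 4-tuple, big-endian */
};

/* Hypothetical value bits: which main maps hold the record. */
#define EXPIRE_IN_TCP (1u << 0)
#define EXPIRE_IN_UDP (1u << 1)

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)v;
}

/* Build a full-length key; addresses and ports are in host order. */
static void expire_key_fill(struct expire_key *k, uint32_t when,
                            uint32_t saddr, uint32_t daddr,
                            uint16_t sport, uint16_t dport)
{
    k->prefixlen = 8 * sizeof(k->data); /* all 128 bits significant */
    put_be32(k->data + 0, when);
    put_be32(k->data + 4, saddr);
    put_be32(k->data + 8, daddr);
    put_be16(k->data + 12, sport);
    put_be16(k->data + 14, dport);
}

/* True if both keys expire in the same second (same 32-bit prefix). */
static int expire_key_same_second(const struct expire_key *a,
                                  const struct expire_key *b)
{
    return memcmp(a->data, b->data, 4) == 0;
}
```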

The question is, how do I implement expiration properly in eBPF/XDP?
Have I missed anything?

-- 
WBR, @nuclight



