Max Kellermann wrote: > Why did you create separate functions for setting secs and usecs? > (set_alarm_expiration_secs and set_alarm_expiration_usecs); both > functions are not prototyped in a header file. Why not add something > like this to alarm.h: > > static inline void > set_alarm_expiration_secs(struct alarm_list *t, long tv_sec, long tv_usec) > { > t->tv.tv_sec = tv_sec; > t->tv.tv_usec = tv_usec; > } > > Since the alarm_list struct is public anyway, this looks more elegant > and creates smaller code. I have changed this. Thanks for the suggestion. I'd appreciate patches for this sort of cleanups. > The function do_alarm_run() assumes that all alarms are sorted by > their due time, but add_alarm() does not enforce this. Hm, add_alarm() always inserts new alarms at the end of the alarm list, so the list is sorted by their due time AFAICS. > I do not understand why you use random() to generate the next alarm > time in sync-alarm.c. This is part of the alarm-based synchronization approach. We send a synchronization message which talks about a conntrack entry each random(RefreshTime) seconds. This approach is, of course, spamming and CPU consuming but is simple and it resolves very well inconsistent situations among several replicas. The FTFW synchronization approach uses an ACK/NACK based protocol which requires less resources. > Why do you call INIT_LIST_HEAD() on all alarm_list objects? For > linux_list.h, that is only required on the sentinel (the global > variable "alarm_list" in this case which is already statically > initalized with the LIST_HEAD macro). Indeed, I have removed them. >>> I am using the most recent release, i.e. 0.9.5. I have no idea about >>> "alarm" or "persistent" mode, and I did not find any documentation on >>> this. I am using the "stats" example configuration from the tarball. >> Please, could you check out a working copy from SVN and tell me if the >> problem that you're reporting persists? > > With conntrackd and libnetfilter_conntrack from SVN r7196, the crash > does not occur anymore. Please tell me as soon as you release both, > so I can update the Debian package. > > But have a look at the strace: > > select(6, [4 5], NULL, NULL, {1, 0}) = 0 (Timeout) > > After that, it goes into an endless loop of: > > select(6, [4 5], NULL, NULL, {0, 0}) = 0 (Timeout) > > This is because select() modifies the timeout value, it contains the > rest time when select() returns. So timeout is zeroed after the first > select() because it times out, and do_alarm_run() never sets a new > timeout value. > > I suggest you pass a NULL timeout when there is no alarm, and ensure > that you always set the correct next_alarm value in do_alarm_run(). Fixed in SVN. Moreover, I have removed the ugly 1 sec wait start in the run() loop. Thanks for catching up this issue. As soon as we finish with this discussion, I plan to pass it to testing stage and then release a new version. -- "Los honestos son inadaptados sociales" -- Les Luthiers - To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html