On 7/7/15 10:24 AM, Andy Lutomirski wrote:
On Tue, Jul 7, 2015 at 9:17 AM, David Ahern <dsahern@xxxxxxxxx> wrote:
On 7/7/15 9:56 AM, Andy Lutomirski wrote:
Netlink is fine for these use cases (if they were related to the
netns, not the pid ns or user ns), and it works. It's still tedious
-- I bet that if you used a syscall, the user code would be
considerably shorter, though. :)
How would this be a problem if you used plain syscalls? The user
would make a request, and the syscall would tell the user that their
result buffer was too small if it was, in fact, too small.
It will be impossible to tell a user what sized buffer is needed. The size
is largely a function of the number of tasks and number of maps per thread
group and both of those will be changing. With the growing size of systems
(I saw sparc systems with 1024 cpus) the workload can be tens of thousands
of tasks each with a lot of maps (e.g., java workloads). That amounts to a
non-trivial amount of data that has to be pushed to userspace.
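
To make the sizing concern concrete, here is a minimal userspace sketch of the "tell the user the buffer was too small" contract. Everything in it is hypothetical: sim_dump() is a stand-in for a single-shot syscall (not a real one), and struct sim_state models a workload whose data grows between calls. It is only meant to show that a size hint can be stale by the time the caller retries, so a retry with exactly the reported size may never succeed, while a retry with some slack can.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of kernel-side data whose size keeps changing. */
struct sim_state {
    size_t needed;   /* bytes required right now */
    size_t growth;   /* bytes added after every call (new tasks/maps) */
};

/*
 * Hypothetical single-shot interface (NOT a real syscall): fills buf,
 * or returns -ERANGE and reports the size needed at the time of the call.
 */
static int sim_dump(struct sim_state *st, char *buf, size_t len, size_t *needed)
{
    size_t want = st->needed;

    st->needed += st->growth;        /* workload changes between calls */
    *needed = want;
    if (len < want)
        return -ERANGE;
    memset(buf, 'x', want);
    return 0;
}

/*
 * Retry with the reported size plus `slack` bytes of headroom.
 * Returns the number of attempts used, or -1 if max_tries ran out.
 */
static int dump_with_retry(struct sim_state *st, size_t slack,
                           int max_tries, size_t *got)
{
    size_t len = 16, needed = 0;
    char *buf = NULL;
    int tries;

    assert(max_tries > 0);
    for (tries = 1; tries <= max_tries; tries++) {
        char *tmp = realloc(buf, len);

        if (!tmp)
            break;
        buf = tmp;
        if (sim_dump(st, buf, len, &needed) == 0) {
            *got = needed;
            free(buf);
            return tries;
        }
        len = needed + slack;        /* the hint may already be stale */
    }
    free(buf);
    return -1;
}
```

With a state of { 64, 32 }, calling dump_with_retry() with slack 0 keeps chasing a moving target, while a slack of 64 succeeds on the second attempt in this simulation. How much slack is enough is exactly the thing the caller cannot know in advance.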
One of the benefits of the netlink approach is breaking the data across
multiple messages and picking up where you left off. That infrastructure is
already in place.
How does picking up where you left off work? I assumed the interface
was something along the lines of "give me information starting from
pid N", but maybe I missed something.
There are different use cases:
1. specific pid
2. specific thread group id
3. all child processes
4. all tasks in the system (perf record/top -a mode)
5. all tasks for a uid (another case, but not coded yet)
The big hitter is 4 in terms of data volume.
Essentially, the kernel-side change saves where you are when an skb
fills -- which task and which map; e.g., see the cb usage in patch 5. On
return, if the task is still there, continue dumping its data. If it has
disappeared, move on to the next.
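
As a rough illustration of the resume logic described above, here is a small userspace simulation. It is not the code from patch 5: struct sim_task, struct dump_cursor and dump_chunk() are hypothetical names, the task list is a plain array sorted by pid, and a fixed record count per chunk stands in for an skb filling up. The cursor plays the role of the state the kernel side keeps in cb between messages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task model: tasks are kept sorted by ascending pid. */
struct sim_task {
    int pid;
    int nmaps;
    int alive;
};

/* Saved position between chunks: which task and which map. */
struct dump_cursor {
    int pid;       /* task to resume at; -1 once the dump is complete */
    int map_idx;   /* next map to emit within that task */
};

/*
 * Emit up to `cap` map records (here just the owning pid) into out_pids,
 * starting from *cur. Returns the number of records emitted and updates
 * *cur. Callers start with {0, 0} and stop once cur->pid == -1.
 */
static int dump_chunk(const struct sim_task *tasks, int ntasks,
                      struct dump_cursor *cur, int *out_pids, int cap)
{
    int n = 0;

    assert(cap > 0);
    for (int i = 0; i < ntasks; i++) {
        const struct sim_task *t = &tasks[i];
        int m;

        if (!t->alive || t->pid < cur->pid)
            continue;                 /* gone, or already dumped */

        /* Resume mid-task only if this is the task we stopped in. */
        m = (t->pid == cur->pid) ? cur->map_idx : 0;
        for (; m < t->nmaps; m++) {
            if (n == cap) {           /* chunk full: remember position */
                cur->pid = t->pid;
                cur->map_idx = m;
                return n;
            }
            out_pids[n++] = t->pid;
        }
    }
    cur->pid = -1;                    /* nothing left to dump */
    cur->map_idx = 0;
    return n;
}
```

For example, with tasks {10, 3 maps}, {20, 2 maps}, {30, 4 maps} and a chunk capacity of 4, the first call stops inside pid 20 at map 1. If pid 20 exits before the next call, the second call skips it and continues with pid 30, which matches the "if it has disappeared, move on to the next" behaviour.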