On 12/15/22 6:34 AM, Daan De Meyer wrote:
On 12/10/22 11:35 AM, Daan De Meyer wrote:
These hooks allows intercepting bind(), connect(), getsockname(),
getpeername(), sendmsg() and recvmsg() for unix sockets. The unix
socket hooks get write access to the address length because the
address length is not fixed when dealing with unix sockets and
needs to be modified when a unix socket address is modified by
the hook. Because abstract socket unix addresses start with a
NUL byte, we cannot recalculate the socket address in kernelspace
after running the hook by calculating the length of the unix socket
path using strlen().
Yes, although we cannot calculate the socket path length with
strlen(). But we still have a method to find the path. In
unix_seq_show(), the unix socket path is calculated as below,
if (u->addr) { // under a hash table lock here
int i, len;
seq_putc(seq, ' ');
i = 0;
len = u->addr->len -
offsetof(struct sockaddr_un, sun_path);
if (u->addr->name->sun_path[0]) {
len--;
} else {
seq_putc(seq, '@');
i++;
}
for ( ; i < len; i++)
seq_putc(seq, u->addr->name->sun_path[i] ?:
'@');
}
Is it possible that we can use the above method to find the
address length so we won't need to pass uaddr_len to bpf program?
Since all other hooks do not need to uaddr_len, you could add some
new hooks for unix socket which can specially calculate uaddr_len
after the bpf program run.
I don't think we can. If we look at the definition of abstract unix
socket in the official man page:
abstract: an abstract socket address is distinguished (from a pathname socket) by the fact that sun_path[0] is a null byte ('\0'). The socket's address in this namespace is given by the additional bytes in sun_path that are covered by the specified length of the address structure. (Null bytes in
the name have no special significance.) The name has no connection with filesystem pathnames. When the address of an abstract socket is returned, the returned addrlen is greater than sizeof(sa_family_t) (i.e., greater than 2), and the name of the socket is contained in the first (addrlen -
sizeof(sa_family_t)) bytes of sun_path.
This specifically says that the address in the abstract namespace is
given by the additional bytes in sun_path that are covered by the
length of the address structure. If I understand correctly, that means
there's no way to derive the length from just the contents of the
sockaddr structure. We need
the actual length as specified by the caller to know which bytes
belong to the address. Note that it's valid for the abstract name to
contain Null bytes, so we cannot use those in any way or form to
detect whether further bytes belong to the address or not. It seems
valid to have an abstract name
consisting of 107 Null bytes in sun_path.
Okay, it looks like bpf program is able to set abstract name as well.
It would be good we have an example for this in selftest.
With abstract address setable by bpf program, I guess you are right,
we have to let user to explicitly tell us the address length.
I assume it is possible for user to write an address like below:
"a\0b\0"
addr_len = offsetof(struct sockaddr_un, sun_path) + 4
but actually it is illegal, right? We have to validate the
legality of sun_path/addr_len beyond unix_validate_addr(), right?
This is not actually illegal according to the man page I think, let's
look at the following quote from the man page:
Pathname sockets
When binding a socket to a pathname, a few rules should be observed for maximum portability and ease of coding:
* The pathname in sun_path should be null-terminated.
* The length of the pathname, including the terminating null byte, should not exceed the size of sun_path.
* The addrlen argument that describes the enclosing sockaddr_un structure should have a value of at least:
offsetof(struct sockaddr_un, sun_path)+strlen(addr.sun_path)+1
or, more simply, addrlen can be specified as sizeof(struct sockaddr_un).
So when doing a pathname based path, the address length is allowed to
be bigger than the actual path. So I don't think
we need to do any more validation than what is done by
unix_validate_addr(). The selftests are already using abstract
unix sockets because they don't need any cleanup.
What about smaller, address "abc", and the length is
offsetof(struct sockaddr_un) + 2
On Tue, 13 Dec 2022 at 21:54, Yonghong Song <yhs@xxxxxxxx> wrote:
On 12/13/22 3:36 AM, Daan De Meyer wrote:
On 12/10/22 11:35 AM, Daan De Meyer wrote:
These hooks allows intercepting bind(), connect(), getsockname(),
getpeername(), sendmsg() and recvmsg() for unix sockets. The unix
socket hooks get write access to the address length because the
address length is not fixed when dealing with unix sockets and
needs to be modified when a unix socket address is modified by
the hook. Because abstract socket unix addresses start with a
NUL byte, we cannot recalculate the socket address in kernelspace
after running the hook by calculating the length of the unix socket
path using strlen().
Yes, although we cannot calculate the socket path length with
strlen(). But we still have a method to find the path. In
unix_seq_show(), the unix socket path is calculated as below,
if (u->addr) { // under a hash table lock here
int i, len;
seq_putc(seq, ' ');
i = 0;
len = u->addr->len -
offsetof(struct sockaddr_un, sun_path);
if (u->addr->name->sun_path[0]) {
len--;
} else {
seq_putc(seq, '@');
i++;
}
for ( ; i < len; i++)
seq_putc(seq, u->addr->name->sun_path[i] ?:
'@');
}
Is it possible that we can use the above method to find the
address length so we won't need to pass uaddr_len to bpf program?
Since all other hooks do not need to uaddr_len, you could add some
new hooks for unix socket which can specially calculate uaddr_len
after the bpf program run.
I don't think we can. If we look at the definition of abstract unix
socket in the official man page:
abstract: an abstract socket address is distinguished (from a pathname socket) by the fact that sun_path[0] is a null byte ('\0'). The socket's address in this namespace is given by the additional bytes in sun_path that are covered by the specified length of the address structure. (Null bytes in
the name have no special significance.) The name has no connection with filesystem pathnames. When the address of an abstract socket is returned, the returned addrlen is greater than sizeof(sa_family_t) (i.e., greater than 2), and the name of the socket is contained in the first (addrlen -
sizeof(sa_family_t)) bytes of sun_path.
This specifically says that the address in the abstract namespace is
given by the additional bytes in sun_path that are covered by the
length of the address structure. If I understand correctly, that means
there's no way to derive the length from just the contents of the
sockaddr structure. We need
the actual length as specified by the caller to know which bytes
belong to the address. Note that it's valid for the abstract name to
contain Null bytes, so we cannot use those in any way or form to
detect whether further bytes belong to the address or not. It seems
valid to have an abstract name
consisting of 107 Null bytes in sun_path.
Okay, it looks like bpf program is able to set abstract name as well.
It would be good we have an example for this in selftest.
With abstract address setable by bpf program, I guess you are right,
we have to let user to explicitly tell us the address length.
I assume it is possible for user to write an address like below:
"a\0b\0"
addr_len = offsetof(struct sockaddr_un, sun_path) + 4
but actually it is illegal, right? We have to validate the
legality of sun_path/addr_len beyond unix_validate_addr(), right?
On Tue, 13 Dec 2022 at 06:20, Yonghong Song <yhs@xxxxxxxx> wrote:
On 12/10/22 11:35 AM, Daan De Meyer wrote:
These hooks allows intercepting bind(), connect(), getsockname(),
getpeername(), sendmsg() and recvmsg() for unix sockets. The unix
socket hooks get write access to the address length because the
address length is not fixed when dealing with unix sockets and
needs to be modified when a unix socket address is modified by
the hook. Because abstract socket unix addresses start with a
NUL byte, we cannot recalculate the socket address in kernelspace
after running the hook by calculating the length of the unix socket
path using strlen().
Yes, although we cannot calculate the socket path length with
strlen(). But we still have a method to find the path. In
unix_seq_show(), the unix socket path is calculated as below,
if (u->addr) { // under a hash table lock here
int i, len;
seq_putc(seq, ' ');
i = 0;
len = u->addr->len -
offsetof(struct sockaddr_un, sun_path);
if (u->addr->name->sun_path[0]) {
len--;
} else {
seq_putc(seq, '@');
i++;
}
for ( ; i < len; i++)
seq_putc(seq, u->addr->name->sun_path[i] ?:
'@');
}
Is it possible that we can use the above method to find the
address length so we won't need to pass uaddr_len to bpf program?
Since all other hooks do not need to uaddr_len, you could add some
new hooks for unix socket which can specially calculate uaddr_len
after the bpf program run.
This hook can be used when users want to multiplex syscall to a
single unix socket to multiple different processes behind the scenes
by redirecting the connect() and other syscalls to process specific
sockets.
---
include/linux/bpf-cgroup-defs.h | 6 +++
include/linux/bpf-cgroup.h | 29 ++++++++++-
include/uapi/linux/bpf.h | 14 ++++--
kernel/bpf/cgroup.c | 11 ++++-
kernel/bpf/syscall.c | 18 +++++++
kernel/bpf/verifier.c | 7 ++-
net/core/filter.c | 45 +++++++++++++++--
net/unix/af_unix.c | 85 +++++++++++++++++++++++++++++----
tools/include/uapi/linux/bpf.h | 14 ++++--
9 files changed, 204 insertions(+), 25 deletions(-)
[...]