On 9/28/2022 12:30 PM, Enzo Matsumiya wrote:
On 09/28, Tom Talpey wrote:
On 9/22/2022 6:32 PM, Enzo Matsumiya wrote:
Hi all,
This patch is the (apparent) correct way to fix the issues regarding
some
messages with invalid signatures. My previous patch
https://lore.kernel.org/linux-cifs/20220918235442.2981-1-ematsumiya@xxxxxxx/
was wrong because a) it covered up the real issue (*), and b) I could
never
again "reproduce" the "race" -- I had other patches in place when I
tested it,
and (thus?) the system memory was not in "pristine" conditions, so
it's very
likely that there's no race at all.
(*) Thanks a lot to Tom Talpey for pointing me to the right direction
here.
Since it sucks to be wrong twice, I'm sending this one as RFC because I
wanted to confirm some things:
1) Can I rely on the fact that status != STATUS_SUCCESS means no
variable
data in the message? I could only infer from the spec, but not
really
confirm.
Technically, I think the answer is "no" but practically speaking maybe
"yes".
Not all errors are actually errors however, e.g. STOPPED_ON_SYMLINK and
the additional error contexts which accompany it. We definitely don't
want to skip validation of those.
There's also STATUS_PENDING, but that's special because it isn't actually
a response, it's just a "hang on I'm busy" that carries no other
data.
Finally, MS-SMB2 lists a few others in section 2.2.2.2, all of which
need validation I think.
Sorry, I might have not been clear. By "variable data" I actually meant
data that's not going to be allocated in the requests' iovs, but rather
in rqst->rq_pages.
The variable buffers that are already expected (by PDU/spec), are AFAICS
always allocated either through cifs*buf_get() or some kmalloc() variant,
i.e. non-HIGHMEM.
async IO, though, is the only place I can see that allocates
rqst->rq_pages through iov_iter_get_pages_alloc2(), which is the source
of my concern, and the reason of my patch. Particularly, on async read
cases, the page data is pre-allocated with the size we *expect* to
receive/read from the server, but if there's an error in a READ
response, the response itself will contain no/incomplete data in the page
data, but since those pages are allocated, the signature computation will
account for it, and generate a mismatching signature.
Aha - ok, got it. And I've been looking into those codepaths too because
the RDMA code fails badly when I try to optimize the completion path.
Right now, the code waits for ALL smbdirect packets to be sent before
returning, which is ridiculously inefficient. It does this in both
client and server sending. But if I remove the wait, nothing works as
it appears the queued buffers are either freed, or unwound from the
stack. Completely and totally bogus.
I'm running more tests now and I'll make sure to test the other possible
errors in 2.2.2.2.
2) (probably only in async cases) When the signature fails, for
whatever
reason, we don't take any action. This doesn't seem right IMHO,
e.g. if the message is spoofed, we show a warning that the
signatures
doesn't match, but I would expect at least for the operation to
stop,
or maybe even a complete dis/reconnect.
I don't think so. The spec calls for dropping the message, and after
all it could have been simple packet corruption. The retries will sort
it out.
Absolutely positively do not print a message for each received packet,
good or bad.
For the good messages ok, of course, but for bad? Why not? Otherwise, e.g.,
spotting the misvalidation mentioned above would've taken much longer.
It's basic MITM. If there is such an attacker, it can trivially redirect
the client, by injecting an error undetected. Inspecting the payload
before doing the integrity check is just insecurely wrong.
Regarding the logging on signature validation, consider that an attacker
will therefore trivially be able to overrun the log, either DOSing the
system, or at the very least hiding other messages. Why alarm the
sysadmin, for something they are not likely to be able to act upon?
Bump a statistic, maybe.
To me, the kernel log is not informational, it's for critical actions.
For example, it requires root to even see it. Consider what you need to
tell such a person each time it happens. "Oops, I dropped a packet"
doesn't cut it. If it's a developer hint, then place it under CIFS FYI,
or some sort of opt-in diagnostic tracing channel.
Tom.
(btw I spotted that misvalidation while debugging my AES-GMAC series,
but the same behaviuour happens with AES-CMAC i.e. in current code)
3) For SMB1, I couldn't really use generic/465 as a real
confirmation for
this because it says "O_DIRECT is not supported". From reading the
code and 'man mount.cifs' I understood that this is supported,
so what
gives? Worth noting that I don't follow/use/test SMB1 too much.
The patch does work for other cases I tried though.
Honestly, I don't think we care. No amount of patching can possibly
save SMB1.
Ack :)
Tom.
Cheers,
Enzo
I hope someone can address my questions/concerts above, and please
let me
know if the patch is technically and semantically correct.
Patch follows.
Enzo
--------
When verifying a response's signature, the computation will go through
the iov buffer (header + response structs) and the over the page
data, to
verify any dynamic data appended to the message (usually on an SMB2 READ
response).
When the response is an error, however, specifically on async reads,
the page data is allocated before receiving the expected data. Being
"valid" data (from the signature computation POV; non-NULL, >0 pages),
it's included in the parsing and generates an invalid signature for the
message.
Fix this by checking if the status is non-zero, and skip the page data
if it is. The issue happens in all protocol versions, and this fix
applies to all.
This issue can be observed with xfstests generic/465.
Signed-off-by: Enzo Matsumiya <ematsumiya@xxxxxxx>
---
fs/cifs/cifsencrypt.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c
index 46f5718754f9..e3260bb436b3 100644
--- a/fs/cifs/cifsencrypt.c
+++ b/fs/cifs/cifsencrypt.c
@@ -32,15 +32,28 @@ int __cifs_calc_signature(struct smb_rqst *rqst,
int rc;
struct kvec *iov = rqst->rq_iov;
int n_vec = rqst->rq_nvec;
+ bool has_error = false;
/* iov[0] is actual data and not the rfc1002 length for SMB2+ */
if (!is_smb1(server)) {
+ struct smb2_hdr *shdr = (struct smb2_hdr *)iov[0].iov_base;
+
if (iov[0].iov_len <= 4)
return -EIO;
+
+ if (shdr->Status != 0)
+ has_error = true;
+
i = 0;
} else {
+ struct smb_hdr *hdr = (struct smb_hdr *)iov[1].iov_base;
+
if (n_vec < 2 || iov[0].iov_len != 4)
return -EIO;
+
+ if (hdr->Status.CifsError != 0)
+ has_error = true;
+
i = 1; /* skip rfc1002 length */
}
@@ -61,6 +74,9 @@ int __cifs_calc_signature(struct smb_rqst *rqst,
}
}
+ if (has_error)
+ goto out_final;
+
/* now hash over the rq_pages array */
for (i = 0; i < rqst->rq_npages; i++) {
void *kaddr;
@@ -81,6 +97,7 @@ int __cifs_calc_signature(struct smb_rqst *rqst,
kunmap(rqst->rq_pages[i]);
}
+out_final:
rc = crypto_shash_final(shash, signature);
if (rc)
cifs_dbg(VFS, "%s: Could not generate hash\n", __func__);