Hi Hans, > Adding Benjamin Berg who is one of the gnome-settings-daemon > maintainers to the Cc. :) As you might imagine, I've talked to him ;-) > > kernel-headers-5.11.11-300.fc34.x86_64 Right, not yet. > Meaning that basically all of Fedora 34 has been built with the "bad" > headers. Yay. But still, I think in this case it wouldn't help. > > > Userspace that is interested in the new byte will read 9 bytes. > > > > Which, unfortunately, doesn't address *this* particular case, because it > > uses gio and that will fill the buffer with arbitrary size? > > > > When you (Hans) say you saw in strace a read of size 8, did you mean the > > size passed to it, or the return size? I guess it must be the return > > size, and the size passed to it was way larger. > > No for some reason, for some later read calls gio was actually passing > 8 as size to the read() syscall. > Wait, that's weird? The (latest) gio code just calls g_io_channel_fill_buffer() and that doesn't care about the size you passed to g_io_channel_read_chars(), in fact it isn't even passed down, as you might expect for stream reads? > And g-s-d was compiled with headers > where sizeof(struct rfkill_event) was 9. This is/was the issue, g-s-d > would do 2 read(fd, buf, 8) calls and then take the first 9 bytes read > out of the 16 bytes it got to fill a single rfkill_event which is fine, > except that it uses the remaining 7 bytes as the first 7 bytes of the > next rfkill_event which it processes making that next event be completely > bogus. I understand the issue. I just didn't expect it would get solved when the headers contain a struct rkfill_event of size 8 again, since the kernel would still return 9 bytes. In effect, you'd still get the same issue, because the read in g_io_channel_fill_buffer() is 1024 (G_IO_NICE_BUF_SIZE). > > I do believe this really is a g-s-d bug though, it should not have > been using a "buffered" gio-channel on a /dev/foo node; and so far it > only got away with this by the rfkill_event size being a nice power of 2 > value. I agree, but ... kernel regressions and all that, right? > As I mentioned in an email to Benjamin, g-s-d should really switch > to making direct read() calls on the fd circumventing the gio-channel > read code all together: > > "Right, notice I just realized that even after my fix there still is an issue, > when running code compiled against new kernel headers gsd-rfkill will now > always expect 9 bytes per event. But when running on an older kernel that > will not be true. > > To fix this the code should probably just directly call read() itself > on the fd (only using g_io_channel for the polling) and then accept anything > between 8 bytes and sizeof(struct rfkill_event) as valid return value from the > read() call..." I don't think this is related to gio, you can still use g_io_channel_read_chars() since that just does the read for you: if (!channel->use_buffer) { gsize tmp_bytes; g_assert (!channel->read_buf || channel->read_buf->len == 0); status = channel->funcs->io_read (channel, buf, count, &tmp_bytes, error); if (bytes_read) *bytes_read = tmp_bytes; return status; } but checking that you actually got the *exact* size that you wanted is indeed wrong (as well). Sounds like I made the wrong call here - I was discussing this with Benjamin a while ago and decided *not* to add an ioctl to opt in to the new event type ... sounds like I should have. Notice that the problem which I described there will go away when compiled against even newer kernel-headers where sizeof(struct rfkill_event) is back to 8 again. Yes. So question, for code like g-s-d, which does not care about the new hard_block_reasons field. What would be the preferred / max compatibility way to do the reads. Also keeping in mind that there are "bad" kernel headers out there where sizeof(struct rfkill_event) == 9 ? I think that this would be best: ret = read(fd, &event, RFKILL_EVENT_SIZE_V1); if (ret == RFKILL_EVENT_SIZE_V1) { /* Valid event process it */ } This should produce the same code regardless of the kernel-headers version and should work on both old and new kernels, correct ? I believe so, yes. I don't think that rolling back the new extended-event support altogether will help. Since this has been out there for 2 released kernel versions now, I believe the best way to fix this is to fix userspace; and to fix userspace in such a way (at least g-s-d) that this problem cannot happen again. Fair enough. Thanks, johannes