On 08/10/2022 14:52, Ard Biesheuvel wrote:
> [...]
>> This is exactly what 842 (sw compress) is doing now. If that's
>> interesting and Kees agrees, and if nobody else plans on doing that, I
>> could work on it.
>>
>> Extra question (maybe silly on my side?): is it possible that
>> _compressed_ data is bigger than the original one? Isn't there any
>> "protection" in the compress APIs for that? In that case, it'd be a
>> pure waste of time / CPU cycles heheh
>>
>
> No, this is the whole point of those helper routines, as far as I can
> tell. Basically, if you put data that cannot be compressed losslessly
> (e.g., an H264 video) through a lossless compression routine, the
> resulting data will be bigger due to the overhead of the compression
> metadata.
>
> However, we are compressing ASCII text here, so using the uncompressed
> size as an upper bound for the compressed size is reasonable for any
> compression algorithm. And if dmesg output is not compressible, there
> must be something seriously wrong with it.
>
> So we could either just drop such input, or simply not bother
> compressing it if doing so would only take up more space. Given the
> low likelihood that we will ever hit this case, I'd say we just ignore
> those.
>
> Again, please correct me if I am missing something here (Kees?). Are
> there cases where we compress data that may be compressed already?

This is an interesting point of view, thanks for sharing! And it's possible to kinda test it - I did that in the past to test the maximum size of ramoops buffers, but I didn't output the values to compare compressed vs. uncompressed sizes (since I didn't need that info at the time).

The trick I used was: suppose I'm using lz4; I polluted dmesg with a huge amount of already-lz4-compressed garbage, then provoked a crash. I'll try it again to grab the sizes heheh
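
For what it's worth, the two behaviors discussed above - incompressible input growing slightly because of compression metadata, while repetitive ASCII log text shrinks a lot - are easy to demonstrate from userspace. A quick sketch, using Python's zlib purely as a stand-in for the kernel-side compressors (lz4, 842, etc.; the exact overhead differs per algorithm, but the effect is the same):

```python
import os
import zlib

# Repetitive ASCII text, similar in spirit to dmesg output.
text = b"[    0.000000] Booting Linux on physical CPU 0x0\n" * 1000

# High-entropy bytes, standing in for already-compressed garbage.
garbage = os.urandom(len(text))

for label, data in (("ascii text", text), ("random bytes", garbage)):
    out = zlib.compress(data)
    print(f"{label}: {len(data)} -> {len(out)} bytes")

# The ASCII text shrinks dramatically; the random bytes come out
# slightly *larger* than the input, since the compressor falls back
# to stored blocks plus header/checksum overhead.
```

So the "protection" really does have to live in the caller: compare the output size against the input size and keep the uncompressed copy if compression didn't help.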