On 01/08/2015 04:08 PM, Mimi Zohar wrote: > On Thu, 2015-01-08 at 12:19 -0600, Rob Landley wrote: >> >> But I am curious about how you propose to encode xattrs into the cpio >> format. (Which Al Viro chose because it's _simple_. There isn't really >> a >> controlling spec since Posix decided to deprecated it in 2001 and >> yank >> it from SUSv3 onwards. LSB extended several header fields to 8 hex >> digits instead of 6, but they still have 32 bit timestamps which seems >> a >> bit short-sighted. If you're going to define a new rev with a new >> magic >> number, there are a couple other things you might wanna fix...) > > Sounds like a good opportunity to make the other changes as well. We > can include the other changes in this patch set. Is this (initramfs) > the right mailing list for this discussion? I'd forgotten there was such a list until the email came in. :) > Do other people need to be included? In theory including the Austin Group (the posix committee mailing list) might be useful, but in practice they hear about stuff well after the fact, and they washed their hands of cpio over a decade ago (shortly after Linux started heavily using it in rpm, and a bit before initramfs). I note that there are two data formats of interest here: 1) the cpio file layout. 2) the list of files generated by gen_initramfs_list.sh and consumed by gen_init_cpio. The fact you're modifying gen_initramfs_list.sh seems to imply that you're changing the _second_ format as well as the first...? The second was never actually specified, but it does get used a lot by various build systems and breaking it would inconvenience people. (Plus I'd need to update the documentation, but I need to do that anyway.) Ss long as you're extending the format could you add a second 32 bits of time data that gets glued to the top half of a uint64_t, and then store the actual time value in microseconds (so time*1000000)? (I'd say "nanoseconds" but 63 bits of nanoseconds is 292 years, which is just short enough I'm uncomfortable with it. I'm just optimistic enough to think that might inconvenience somebody.) The other "huh" is the filesize, but 4 gigs per file seems unlikely to be more than initramfs needs any time soon? (It's possible that RPM might care in 15 years or so...) >> I ask because I maintain a new from-scratch cpio implementation >> (http://landley.net/hg/toybox/file/1571/toys/posix/cpio.c), so I'd >> presumably have to add your format extensions to this. Is there any >> sort >> of documentation on them? >> >> The toybox config Android is using has this cpio implementation >> enabled >> (see >> https://android.googlesource.com/platform/external/toybox/+/9250c95a8c47/Android.mk) >> so I'd rather like to get this sort of detail right... > > The xattr section, which follows the file name, is of the format: > <number of xattrs> { <xattr name> <xattr data size> <xattr data> } for > each xattr, terminated with a NULL byte and padded to a 4 byte boundary. where <value> is... 8 bytes of ascii hex digits like the header values? (Every cpio string is padded to a boundary. Sigh, lemme go read your patch...) Ok, 2/9, actual file format parsing. New magic string at the start "070703" for the new version. (Good, that was my first question: easy way to distinguish this from the previous format). - for (i = 0, s += 6; i < 12; i++, s += 8) { + for (i = 0; i < (!newcx ? 12 : 13); i++, s += 8) { You've tested this and the missing s+= 6 didn't cause problems? (Or did it move somewhere else...? Is that what the 1/9 did grabbing just the magic instead of the rest of the header data...?) > The header contains an additional field, before the checksum, containing > the xattr section length, including the NULL byte, but without the > padding. Ah, the old "4 bytes of padding to align to 4 bytes" silliness. (Even though you can't trust the null to _be_ there and have to set it yourself after the read.) I'm starting to remember this... Ok, different header magic, one new 8 hex digit field at the end of the header (before crc) containing "xattr_bufflen". The start of this buffer is an 8 digit hex "num_xattrs", which you iterate through and call strlen() on despite never having assured that the data you read in actually _does_ contain a null at the end (of the entire buffer). Then past that supposed null is another 8 digit hex "xattr_value_size", and that many bytes following you then send to sys_setxattr(). Except for the part about you trusting your input data waaaaay too much, seems reasonable? I have no idea what sys_setxattr() accepts, but presumably there's a man page for the system call... http://man7.org/linux/man-pages/man2/setxattr.2.html Ok, that's probably enough data to implement it. (Not sure why that man page isn't in my ubuntu 14.04 install which has manpages-dev installed? $ man setxattr No manual entry for setxattr > Note that gen_init_cpio does not include "security.evm" as it is file > system dependent. I have no idea what that is. Should I not include it in the command line cpio? (Or have a flag?) The last time I used extended attributes was on OS/2; my only non-academic interaction with selinux has been looking up how to switch it off. I still boggle that Fortune 500 bureaucracies include "must have a security system designed by the NSA based on the idea of complicating the system until there are no obvious holes, because after the Snowden leaks that's clearly a good idea" as part of their certification processes for reducing the risk of being unable to delegate blame. I'm also kind of impressed by the longevity of a hack the original Apple Lisa developers invented in 1981 because their OS didn't have an equivalent of the unix "file" command, and undoes the central unix "everything's a flat file" idea to bring us back to the structure records with magic meanings of the mainframe days. http://www.folklore.org/StoryView.py?project=Macintosh&story=The_Grand_Unified_Model_The_Finder.txt&sortOrder=Sort+by+Title Still, dictating what users can and can't do is policy. It exists, people use it... sigh. I'll try to take a stab at it some evening this week. > Mimi Rob -- To unsubscribe from this list: send the line "unsubscribe initramfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html