On 5/15/20 4:10 PM, Darrick J. Wong wrote: > On Fri, May 15, 2020 at 03:54:34PM -0500, Eric Sandeen wrote: >> On 5/15/20 3:48 PM, Darrick J. Wong wrote: >>> On Fri, May 15, 2020 at 02:14:17PM -0500, Eric Sandeen wrote: >>>> We validate commandline options for stripe unit and stripe width, and >>>> if a device returns nonsensical values via libblkid, the superbock write >>>> verifier will eventually catch it and fail (noisily and cryptically) but >>>> it seems a bit cleaner to just do a basic sanity check on the numbers >>>> as soon as we get them from blkid, and if they're bogus, ignore them from >>>> the start. >>>> >>>> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx> >>>> --- >>>> >>>> diff --git a/libfrog/topology.c b/libfrog/topology.c >>>> index b1b470c9..38ed03b7 100644 >>>> --- a/libfrog/topology.c >>>> +++ b/libfrog/topology.c >>>> @@ -213,6 +213,19 @@ static void blkid_get_topology( >>>> val = blkid_topology_get_optimal_io_size(tp); >>>> *swidth = val; >>>> >>>> + /* > > Tabs not spaces yeah I have no idea how that happened :P >>>> + * Occasionally, firmware is broken and returns optimal < minimum, >>>> + * or optimal which is not a multiple of minimum. >>>> + * In that case just give up and set both to zero, we can't trust >>>> + * information from this device. Similar to xfs_validate_sb_common(). >>>> + */ >>>> + if (*sunit) { >>>> + if ((*sunit > *swidth) || (*swidth % *sunit != 0)) { > > Why not combine these? > > if (*sunit != 0 && (*sunit > *swidth || *swidth % *sunit != 0)) { was making it look a little like the kernel sb checks but *shrug* > Aside from that the code looks fine I guess... > >>> I feel like we're copypasting this sunit/swidth checking logic all over >>> xfsprogs >> >> That's because we are! >> >>> and yet we're still losing the stripe unit validation whackamole >>> game. >> >> Need moar hammers! >> >>> In the end, we want to check more or less the same things for each pair >>> of stripe unit and stripe width: >>> >>> * integer overflows of either value >>> * sunit and swidth alignment wrt sector size >>> * if either sunit or swidth are zero, both should be zero >>> * swidth must be a multiple of sunit >>> >>> All four of these rules apply to the blkid_get_toplogy answers for the >>> data device, the log device, and the realtime device; and any mkfs CLI >>> overrides of those values. >>> >>> IOWs, is there some way to refactor those four rules into a single >>> validation function and call that in the six(ish) places we need it? >>> Especially since you're the one who played the last round of whackamole, >>> back in May 2018. :) >> >> So .... I would like to do that refactoring. I'd also like to fix this >> with some expediency, TBH... >> >> Refactoring is going to be a little more complicated, I fear, because sanity >> on "what came straight from blkid" is a little different from "what came from >> cmdline" and has slightly different checks than "how does it fit into the >> superblock we just read?" > > Admittedly I wondered if "refactor all these checks" would fall apart > because each tool has its own slightly different reporting and logging > requirements. You could make a checker function return an enum of what > it's mad about and each caller could either have a message catalogue or > just bail depending on the circumstances, but now I've probably > overengineered the corner case catching code. > >> This (swidth-vs-sunit-is-borken) is common enough that I wanted to just kill >> it with fire, and um ... make it all better/cohesive at some later date. >> >> I don't like arguing for expediency over beauty but well... here I am. > > :( Well, I guess it's not actually that urgent; we don't handle it well today but maybe I should resist the urge to do another spot-fix that could(?) be handled better.... i'll put this on the backburner & give it a bit more thought I guess. Thanks, -Eric