It's possible to express your example using lists if their entries are
allowed to overlap. I see that you wanted a way to express a matrix
(overlapping rules) with gluster's tree-like syntax as backdrop.

A polytree (a DAG with no undirected cycles) may be a better term than
matrix, i.e. when there are overlaps a node in the graph gets multiple
in-arcs.

Syntax aside, we seem to part on "where" to solve the problem: config
file or UX. I prefer that the UX have the logic to build the
configuration file, given how complex it can be. My preference would be
for the config file to be mostly "read only" with extremely simple
syntax.

I'll put some more thought into this; I believe this discussion has
illuminated some good points.

Brick: host1:/SSD1  SSD1
Brick: host1:/SSD2  SSD2
Brick: host2:/SSD3  SSD3
Brick: host2:/SSD4  SSD4
Brick: host1:/DISK1 DISK1

rule rack4:
  select SSD1, SSD2, DISK1

# some files should go on ssds in rack 4
rule A:
  option filter-condition *.lock
  select SSD1, SSD2

# some files should go on ssds anywhere
rule B:
  option filter-condition *.out
  select SSD1, SSD2, SSD3, SSD4

# some files should go anywhere in rack 4
rule C:
  option filter-condition *.c
  select rack4

# some files we just don't care
rule D:
  option filter-condition *.h
  select SSD1, SSD2, SSD3, SSD4, DISK1

volume:
  option filter-condition A,B,C,D

----- Original Message -----
From: "Jeff Darcy" <jdarcy@xxxxxxxxxx>
To: "Dan Lambright" <dlambrig@xxxxxxxxxx>
Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
Sent: Monday, June 23, 2014 7:11:44 PM
Subject: Re: Data classification proposal

> Rather than using the keyword "unclaimed", my instinct was to
> explicitly list which bricks have not been "claimed". Perhaps you
> have something more subtle in mind, it is not apparent to me from your
> response. Can you provide an example of why it is necessary and a list
> could not be provided in its place?
> If the list is somehow "difficult
> to figure out", due to a particularly complex setup or some such, I'd
> prefer a CLI/GUI build that list rather than having sysadmins
> hand-edit this file.

It's not *difficult* to make sure every brick has been enumerated by
some rule, and that there are no overlaps, but it's certainly tedious
and error prone. Imagine that a user has bricks in four machines, using
names like serv1-b1, serv1-b2, ..., serv4-b6. Accordingly, they've set
up rules to put serv1* into one set and serv[234]* into another set
(which is already more flexibility than I think your proposal gave
them). Now when they add serv5 they need an extra step to add it to the
tiering config, which wouldn't have been necessary if we supported
defaults. What percentage of users would forget that step at least
once? I don't know for sure, but I'd guess it's pretty high.

Having a CLI or GUI create configs just means that we have to add
support for defaults there instead. We'd still have to implement the
same logic, and they'd still have to specify the same thing. That just
seems like moving the problem around instead of solving it.

> The key-value piece seems like syntactic sugar - an "alias". If so,
> let the name itself be the alias. No notions of SSD or physical
> location need be inserted. Unless I am missing that it *is* necessary,
> I stand by that value judgement as a philosophy of not putting
> anything into the configuration file that you don't require. Can you
> provide an example of where it is necessary?

OK...

-----
Brick: SSD1
Brick: SSD2
Brick: SSD3
Brick: SSD4
Brick: DISK1

rack4: SSD1, SSD2, DISK1

filter A: SSD1, SSD2
filter B: SSD1, SSD2, SSD3, SSD4
filter C: rack4
filter D: SSD1, SSD2, SSD3, SSD4, DISK1

meta-filter: filter A, filter B, filter C, filter D

* some files should go on ssds in rack 4
* some files should go on ssds anywhere
* some files should go anywhere in rack 4
* some files we just don't care

Notice how the rules *overlap*.
We can't support that if our syntax only allows the user to express a
list (or list of lists). If the list is ordered by type, we can't also
support location-based rules. If the list is ordered by location, we
lose type-based rules instead.

Brick properties create a matrix, with an unknown number of dimensions
(e.g. security level, tenant ID, and so on as well as type and
location). The logical way to represent such a space for rule-matching
purposes is to let users define as many dimensions (keys) as they want
and as many values for each dimension as they want.

Whether the exact string "type" or "unclaimed" appears anywhere isn't
the issue. What matters is that the *semantics* of assigning properties
to a brick have to be more sophisticated than just assigning each a
position in a list, and we need a syntax that supports those semantics.
Otherwise we'll end up solving the same UX problems again and again
each time we add a feature that involves treating bricks or data
differently. Each time we'll probably do it a little differently and
confuse users a little more, if history is any guide. That's what I'd
rather avoid.
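
To make the key/value idea above concrete, here is a minimal Python
sketch of rule matching over brick properties. It is not GlusterFS code
or a proposed syntax. The property keys ("type", "rack"), the rack
value given to SSD3 and SSD4, and the function names are all
assumptions made for illustration; only the brick names, the four file
patterns and the four rule descriptions come from the examples in this
thread.

```python
from fnmatch import fnmatchcase

# Hypothetical inventory: each brick carries arbitrary key/value properties.
# The thread only says SSD1, SSD2 and DISK1 are in rack 4; "rack 5" for
# SSD3/SSD4 is an assumed placeholder.
BRICKS = {
    "SSD1": {"type": "ssd", "rack": "4"},
    "SSD2": {"type": "ssd", "rack": "4"},
    "SSD3": {"type": "ssd", "rack": "5"},
    "SSD4": {"type": "ssd", "rack": "5"},
    "DISK1": {"type": "disk", "rack": "4"},
}

# Each rule pairs a file-name pattern with the properties a brick must have.
# An empty requirement matches every brick ("we just don't care").
RULES = [
    ("*.lock", {"type": "ssd", "rack": "4"}),  # ssds in rack 4
    ("*.out", {"type": "ssd"}),                # ssds anywhere
    ("*.c", {"rack": "4"}),                    # anywhere in rack 4
    ("*.h", {}),                               # any brick
]


def select_bricks(required, bricks=BRICKS):
    """Return the sorted names of bricks whose properties satisfy `required`."""
    return sorted(
        name
        for name, props in bricks.items()
        if all(props.get(key) == value for key, value in required.items())
    )


def bricks_for_file(filename, rules=RULES, bricks=BRICKS):
    """Return the bricks selected by the first rule whose pattern matches."""
    for pattern, required in rules:
        if fnmatchcase(filename, pattern):
            return select_bricks(required, bricks)
    return []
```

Because the rules refer to properties rather than brick names, the
selections can overlap freely, and a newly added brick that is tagged
with properties is picked up by the existing rules without any of them
being edited.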