On Mon, 22 Jun 2020 at 08:14, Zbigniew Jędrzejewski-Szmek <zbyszek@xxxxxxxxx> wrote:
>
> On Mon, Jun 22, 2020 at 04:55:10AM +0200, clime wrote:
> > >> > > Hello Josh,
> > >> > >
> > >> > > you can change the artifact type while keeping the interface the same and
> > >> > > it would be a _HUGE_ win because it would make modularity finally
> > >> > > understandable for mere humans and better maintainable.
> > >> > >
> > >> > > Namely, modules should become rpms and therefore obey standard rpm rules.
> > >> >
> > >> > I'm not sure I entirely understand what you mean, but it sounds like
> > >> > you have some interesting ideas.
> > >> >
> > >> > I'm looking forward to seeing what you and the community can build
> > >> > from them, and how they could be brought into RHEL 10+! That kind of
> > >> > collaboration is what makes Fedora great.
> > >>
> > >> I know this probably won't change anything because this was mentioned
> > >> many times (by me at least) and nothing has changed but still...
> > >>
> > >> Currently, modules are essentially yum sub-repos, they are not really
> > >> "modules", instead they are collections of rpms that reinvent rpm-like
> > >> relations (obsoletes, requires, build-requires, etc.).
> > >>
> > >> There is no reason for this wheel reinvention. Modules (the
> > >> collections) can be simply squashed into an rpm by automation and this
> > >> resulting rpm can go to the modular repo together with other modules.
>
> I agree with this general idea, even if not with the exact implementation
> (comments below). In the past this was stated as "divorcing the build ordering
> mechanism from the rpm delivery mechanism". The fact that we have two layers
> of dependencies makes Modularity conceptually hard and destroys the interaction
> with the dependency solver. Also, if we disconnect the build and delivery
> mechanisms, we can iterate and improve both separately.
>
> > >> That way we don't have two types of objects with complex inter-relations
> > >> but only one with well-known behavior.
> > >>
> > >> I wonder if this is clear to everyone but nobody really cares or
> > >> doesn't really want to say it or I don't know.
> > >>
> > >> Is this clear to everyone? I mean either I am stating obvious stuff
> > >> that nobody really considers worth typing or idk.
> > >
> > >
> > > How would this work when there are optional rpms in the module?
> > >
> > > You do not need to install every rpm in e.g. the php module (different graphics/database backends) for that module to be useful, but every version of the module will have the rpm as an option which won't work outside a module of multiple rpms.
> >
> > Glad you asked, I wasn't precise...
> >
> > Well, I didn't mean everything always needs to be squashed, instead,
> > it would be an optional step in modulemd processing.
>
> So... if it's only optional, that means that the general case where
> squashing is not done needs to be solved anyway. And once you have
> solved the general case, what would the point of squashing be?
> Thus, I don't find squashing useful.
>
> > For some
> > use-cases (like delicately compiled postgresql server), you can create
> > a single rpm that contains all - postgresql-server, postgresql,
> > postgresql-libs compiled in a specific way, optionally with some
> > postgresql modules pre-included, so it would be let's say time-series
> > optimized postgresql. Here it makes sense to make a single rpm from it
> > - you install that and you are all set up for your use-case.
> >
> > Then there are language stacks where you might want to build things in
> > a specific order - there nothing really needs to be squashed (or
> > certain subset can if it makes sense) but you can still use modularity
> > to easily batch-build certain rpms. If there are runtime optional
> > deps, they can be described by Recommends/Suggests.
> >
> > Basically, once a "module" (the thing that comes from modulemd) is built,
> > it should be put into normal repos and the "module" boundary should be
> > forgotten (unless it is a single rpm), i.e. "module" is a build-time
> > thing, at install-time we just have standard packages with standard
> > deps.
>
> Yep.
>
> The unanswered question is what mechanism would be used to make sure that
> the rpms from the "module" are all installed. One option would be to
> somehow mangle rpm names, another option would be to add some kind of
> Provides/Requires, etc. But *some* mechanism is needed, because without
> that dnf would often pick other rpms.
>
> In Modularity the solution is that the rpms from the module shadow
> rpms with the same name from outside. That's probably the single
> feature of Modularity that causes the most problems.

Yeah, I have noticed that modularity causes quite a lot of trouble. The question is whether that trouble can be technically justified, i.e. by the benefit modularity will eventually bring. But I personally don't see it.

Actually (big revelation), I have never understood modularity, and I was even there at RH when it was just starting to be born. Probably one of the reasons why I didn't understand it at that point was that there seemed to be no clear specification floating around. We knew modularity should solve the "too fast, too slow" problem of distributions but it wasn't exactly clear how. It was a cool buzzword but nobody seemed to know what it meant (at least that was my view of the situation). I and perhaps others were thinking that it was meant to provide parallel availability+installability of the distribution software, but after some time it was made clear that this isn't the case and that the goal is just "parallel availability". OK, but even this isn't clear to me.

I mean, I understand the usefulness of modularity at build-time, where you can create a recipe (modulemd) that will build your packages with interdependencies in a predictable and automatic manner. I think that's a cool thing to have. But what about run-time (or install-time, in other words)? That's the part I don't understand. And I have spent quite some time trying to understand it but never managed. So it's possible that I have been missing something all these years...in that case, it would be great if somebody could shed some light on it for me. Here, I would really like to give modularity a chance.

From what I understand, the use of modularity at run-time is to provide rpm namespaces. The natural way to do this would be to use separate repositories à la COPR, where rpms are namespaced by repo ID, but I know one of the requirements of modularity was to use a single repo for those namespaces, with the argument that dnf is slow when working with a large number of repos...to me that reason always seemed quite artificial...something is slow...ok, then it can be made faster. I could understand it if we were talking about, let's say, thousands of modules - there I would believe that initiating a thousand (or a few times that many) new downloads of repo files might already have its price. So okay, if thousands of modules were the plan, then I could understand this argument.

But now comes an even more curious part. So...run-time modularity provides rpm namespacing, if I understand it correctly. Basically <module>/<stream>/<package_name>. The easy solution for this would be to put the namespace implicitly into the package name, like python does when there should be multiple pythons available, e.g. currently in CentOS7/EPEL7, there are python34-requests and python36-requests (I understand there will be a dot between major and minor at some point, so e.g. python3.6-requests, but that's another thing :)).
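
To make the two naming schemes concrete, here is a minimal Python sketch. The function names are made up for illustration and are not part of dnf, rpm, or modulemd; the mangling rule (drop the dot from the stream) is just an assumption that reproduces the python34-requests example.

```python
def namespaced_id(module, stream, package):
    """Hypothetical two-level namespace: <module>/<stream>/<package_name>."""
    return f"{module}/{stream}/{package}"


def mangled_name(module, stream, package):
    """Hypothetical name mangling: fold module and stream into the rpm name."""
    return f"{module}{stream.replace('.', '')}-{package}"


streams = ["3.4", "3.6"]

# With the namespace kept outside the name, every stream ships the same rpm name.
bare_names = {"requests" for _ in streams}

# With the namespace folded into the name, every stream gets its own rpm name.
mangled_names = {mangled_name("python", s, "requests") for s in streams}
```

In this toy view, `bare_names` collapses to a single rpm name while `mangled_names` keeps one name per stream, which is the whole difference between the two approaches as far as a single repo is concerned.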

So if we have different rpm names (because the namespace is already included in the rpm name itself), then there is no problem providing multiple variants of the "same package" (the same thing but intended e.g. for a different python interpreter) in the same repo. I would be willing to accept that this is a hacky solution or just a workaround (even though I am not sure it is). But even if I accept that it is just a hack and we need a more proper solution, I still have an issue in my mind.

Let's say we have this two-level namespacing (<module>/<stream>/) and it enables us to have a package of the same name two or more times in the same repo, and it enables us to avoid mangling the rpm names. Great, isn't it? Well...but what if those different variants of the same package are actually parallel-installable and a user would benefit from having them parallel-installable (because it's a dev working with different versions of the same language at the same time)? We can only install a package of a certain name once on the system, so that's why modularity lets us use just a single stream from all the available streams of a module, i.e. you can only switch between the individual streams; having several of them enabled at the same time is not possible.

So basically, modularity gives parallel availability but at the same time, it disables the option of parallel installability, which could be achieved through alternatives and some smart packaging for probably all the language stacks, if I understand correctly. I think that's too much of a limitation. To avoid it, we would need to keep an rpm DB per namespace (<module>/<stream>/) and these various DBs would be handled by dnf, which would basically mean that the rpm command itself wouldn't know everything that is installed on the system - hard to imagine that people would be alright with that.
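
The trade-off can be shown with a toy model in Python. This is only a sketch of the assumption stated above (one installed package per rpm name); `ToyRpmDb` is a hypothetical class and does not reflect how rpm or dnf are actually implemented.

```python
class ToyRpmDb:
    """Toy model: at most one installed package per rpm name."""

    def __init__(self):
        self.installed = {}  # rpm name -> stream it came from

    def install(self, name, stream):
        current = self.installed.get(name)
        if current is not None and current != stream:
            raise ValueError(f"{name} already installed from stream {current}")
        self.installed[name] = stream


# Same rpm name in two streams: only one stream can be installed at a time.
modular = ToyRpmDb()
modular.install("requests", "3.4")
try:
    modular.install("requests", "3.6")
    conflict = False
except ValueError:
    conflict = True

# Namespace carried in the rpm name: both variants coexist in one DB.
mangled = ToyRpmDb()
mangled.install("python34-requests", "3.4")
mangled.install("python36-requests", "3.6")
```

Under this model the second install into `modular` is rejected, while `mangled` ends up holding both variants, and a single DB still knows about everything that is installed.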

...Or we can bring the notion of namespaces into rpm itself (that's where my suggestion of a "Stream" rpm attribute comes from, but it could also be called just "Namespace"). But then there is the argument: "Why not just put the namespace into the rpm name itself?" I mean...I wouldn't mind having it as a separate attribute, but its usefulness would need to be discussed.

So even after almost five years, I don't really get where modularity is going or what it wants to achieve. I don't understand its use-case for any of Fedora, RHEL, and CentOS, because disabling parallel installability to allow parallel availability is imho not really an option.

But yeah...maybe I am missing some angle. In that case, please explain it to me, because I would really like to understand...

clime

>
> > dnf interface could be kept given that a "Stream" rpm property is
> > added. This is still a bit rough what I am saying but hopefully it
> > makes at least a bit of sense...
>
> Zbyszek
> _______________________________________________
> devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx