Re: [Modularity] XML format for in-repository modules

On Wed, Dec 7, 2022 at 8:23 AM Petr Pisar <ppisar@xxxxxxxxxx> wrote:
>
> Hello Fedora developers,
>
> I'd like to show you a proposal for a new XML format for the modular metadata
> that resides in YUM repositories.
>
> In short, I propose replacing the YAML syntax with XML, removing features
> that were never implemented or used, and providing a detailed specification
> that leaves little room for implementers' invention. The proposed
> specification is the "reduced" variant under
> <https://github.com/fedora-modularity/libmodulemd/tree/main/xml_specs>, for
> instance
> <https://github.com/fedora-modularity/libmodulemd/blob/main/xml_specs/reduced/overview.xml>.
>
> Bear in mind that this change is only about how modules are stored in YUM
> repositories which are fetched by DNF. It does not change how modules are
> defined by module maintainers (YAML modulemd-packager-v3 or modulemd-v2
> format), how they are built by MBS, or how they are handled by Bodhi.
>
> Those who should be concerned most are DNF5 developers and relengs producing
> composes.
>
>
> Long story:
>
> The original modulemd format had a noble property: the input format for MBS
> was the same as the output format. This is not true anymore because of the
> modulemd-packager-v3 format. It also makes validation difficult, as fields
> optional in the input format are mandatory in the output format, or vice versa.
>
> The original modulemd format drags YAML into the YUM repository, which is
> otherwise XML-only. That requires a YAML parser.
>
> The original modulemd format is not handled by DNF directly. Instead, DNF uses
> the libmodulemd library. That library is heavily based on glib. In fact, it
> embeds glib types into its API. Why do I mention it? Because the new DNF5 aims
> to eradicate glib, mostly to shrink container installations. librepo and
> libmodulemd are the last pieces with glib. Because it's impossible to remove
> glib from libmodulemd, there has to be a new library for parsing modular
> metadata. If there has to be a new library, there could also be a transition
> from YAML to XML, which would shrink the minimal installation further by
> removing libyaml.
>
> The original modulemd format possesses some features which nobody uses, or
> nobody implements, or which are implemented only partially. Do you remember
> the deprecation of intents from modularity
> <https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/thread/RXDP2WMPR3HHBRTQAKPSTRU6KABTJSMA/#RXDP2WMPR3HHBRTQAKPSTRU6KABTJSMA>?
> There are more things that can be removed to make the format and its parser
> simpler.
>
> Original format is not well specified. DNF and Satellite people complained
> a lot when they were implementing it. The specification looks more like an
> example. E.g. a module stream name is probably a string. An arbitrary string.
> With spaces, with new lines. I think you do not want to see a stream named
> " :\n". Well, DNF does not even allow you to identify a module like that.
> There is definitely room for tightening the format. But each change like that
> is technically an incompatible change. To materialize the change we need at
> least a new modulemd format version. But if we need a new format version, we
> can just as well come up with a completely new format.
>
> As you can see, there are good reasons to come up with a new in-repository
> format. Hence here it is
> <https://github.com/fedora-modularity/libmodulemd/tree/main/xml_specs>.
>
> I originally developed the XML format to be able to encode all features we
> have in the old YAML format. That's kept for your reference in the "complete"
> subdirectory
> <https://github.com/fedora-modularity/libmodulemd/tree/main/xml_specs/complete>.
>
> Then I removed all unnecessary features and put the result into the "reduced" subdirectory
> <https://github.com/fedora-modularity/libmodulemd/tree/main/xml_specs/reduced>.
>
> If you are interested in it, I recommend starting with the overview.xml file.
> It shows a skeleton of the format. It's so small I can quote it here:
>
> <index xmlns="http://fedoraproject.org/metadata/moduleindex" version="" revision="">
>     <module name="">
>         <stream name=""> <!-- DNF wants versions and contexts to differ in @summary etc. -->
>             <build version="" context="" static="" arch="" summary="" description="">
>                                     <!-- @static defaults to false. -->
>                 <dependency name="">
>                     <requires></requires>   <!-- Only one for modulemd-packager-v3 -->
>                     <conflicts></conflicts> <!-- Not supported by modulemd-packager-v3 -->
>                 </dependency>
>                 <dependency name=""/>   <!-- An unspecified stream.
>                                              Not supported by modulemd-packager-v3. -->
>                 <license>
>                     <module></module>
>                     <content></content>
>                 </license>
>                 <references community="" documentation="" tracker=""/>
>                 <profile name="" description="">
>                     <package></package>
>                 </profile>
>                 <api></api>
>                 <demodularized></demodularized>
>                 <nevra name="" epoch="" version="" release="" arch=""/>
>             </build>
>
>             <default-profile modified=""> <!-- @modified could be renamed to version -->
>                 <profile></profile> <!-- With a value replaces, missing unsets. -->
>             </default-profile>
>
>             <obsolete modified="" context=""> <!-- @modified in seconds since the epoch.
>                         Missing or empty @context means all contexts. -->
>                 <eol when="" message=""> <!-- Missing element means unsetting. -->
>                         <!-- @when in seconds since the epoch, missing means now. -->
>                     <replacement module="" stream=""/>
>                 </eol>
>             </obsolete>
>
>             <translation modified=""> <!-- @modified could be renamed to version -->
>                 <locale name=""> <!-- Each child is optional, but there
>                                       must be at least one. -->
>                     <build summary="" description=""/>  <!-- missing @summary, @description unsets -->
>                     <profile name="" description=""/>   <!-- missing @description unsets -->
>                     <obsolete context="" message=""/>   <!-- missing or empty @context means
>                             all contexts,
>                             missing @message unsets, unsupported in YAML. -->
>                 </locale>
>             </translation>
>         </stream>
>
>         <default-stream modified="" stream=""/> <!-- @modified could be renamed to version -->
>                                         <!-- Existing @stream sets a default,
>                                              missing or empty unsets. -->
>     </module>
>
> </index>
>
> As you can see, there are no separate documents for modules and default
> streams. Everything is kept inside one document. That enables
> properties (e.g. obsoletes or default profiles) pertaining to the same entity
> (e.g. a stream) to be placed together. That avoids repeating the
> identifiers (e.g. stream names) and makes the format more succinct and easier
> to query. That's especially important for DNF, which needs to quickly list
> modules and their streams, find the latest build, etc.
>
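
The querying point above can be illustrated with a minimal Python sketch.
It is not part of the proposal: the sample document, the module and stream
names, and the helpers latest_builds and default_streams are made up for
illustration, and it assumes that @version is a decimal integer where a
larger number means a newer build. Only the element names and the namespace
are taken from the quoted skeleton.

```python
import xml.etree.ElementTree as ET

# Namespace URI as quoted in the skeleton.
NS = {"m": "http://fedoraproject.org/metadata/moduleindex"}

# Hypothetical sample shaped like the skeleton; all values are invented.
SAMPLE = """\
<index xmlns="http://fedoraproject.org/metadata/moduleindex" version="1" revision="0">
  <module name="nodejs">
    <stream name="18">
      <build version="20221101" context="abc" arch="x86_64" summary="s" description="d"/>
      <build version="20221205" context="abc" arch="x86_64" summary="s" description="d"/>
    </stream>
    <stream name="20">
      <build version="20221201" context="def" arch="x86_64" summary="s" description="d"/>
    </stream>
    <default-stream modified="1670400000" stream="18"/>
  </module>
</index>
"""


def latest_builds(xml_text):
    """Map (module name, stream name) to the highest build @version."""
    root = ET.fromstring(xml_text)
    result = {}
    for module in root.findall("m:module", NS):
        for stream in module.findall("m:stream", NS):
            # Assumption: @version is an integer and larger means newer.
            versions = [int(b.get("version")) for b in stream.findall("m:build", NS)]
            if versions:
                result[(module.get("name"), stream.get("name"))] = max(versions)
    return result


def default_streams(xml_text):
    """Map module name to its default stream; missing or empty @stream is skipped."""
    root = ET.fromstring(xml_text)
    defaults = {}
    for module in root.findall("m:module", NS):
        element = module.find("m:default-stream", NS)
        if element is not None and element.get("stream"):
            defaults[module.get("name")] = element.get("stream")
    return defaults


if __name__ == "__main__":
    print(latest_builds(SAMPLE))
    print(default_streams(SAMPLE))
```

Because builds and the default stream sit under the same <module> element,
both lookups are a single walk over one tree, with no need to join
separate documents on module or stream names.
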
> An example.xml file shows how real data would look
> <https://github.com/fedora-modularity/libmodulemd/blob/main/xml_specs/reduced/example.xml>.
> You can see, for example, that time stamps are encoded as a number of seconds
> since the Unix epoch. That will save DNF from parsing e-mail date notations,
> handling time zones etc.
>
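
As a rough illustration of why epoch seconds are easy on the consumer, here
is a small Python sketch. The helper names parse_epoch and eol_when are
hypothetical, and the sketch assumes the attribute holds a plain decimal
integer. The "missing @when means now" rule comes from the comment in the
quoted skeleton; treating an empty string the same way is my assumption.

```python
from datetime import datetime, timezone


def parse_epoch(value):
    """Convert a seconds-since-the-epoch attribute value to an aware UTC datetime."""
    return datetime.fromtimestamp(int(value), tz=timezone.utc)


def eol_when(when, now=None):
    """Resolve <eol when="">: a missing (or, by assumption, empty) value means now."""
    if when:
        return parse_epoch(when)
    return now if now is not None else datetime.now(timezone.utc)


if __name__ == "__main__":
    print(parse_epoch("1670400000").isoformat())  # 2022-12-07T08:00:00+00:00
    print(eol_when(None).isoformat())
```

No date-string grammar or time-zone table is involved: one integer
conversion yields an unambiguous UTC instant.
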
> There is also a formal specification in the form of an XML Schema
> <https://github.com/fedora-modularity/libmodulemd/blob/main/xml_specs/reduced/schema.xsd>.
> And a tests subdirectory with a preliminary set of good and bad examples
> that, respectively, pass and fail validation.
>
> I'd be glad to hear any comments on the format.
>
>
> A grand plan for implementing and deploying this format is outlined in
> the top-level README.md
> <https://github.com/fedora-modularity/libmodulemd/blob/main/xml_specs/README.md>.
> Basically, support will be added to the createrepo_c tool to produce the XML
> data in YUM repositories. Then the format will be consumed by DNF5. (Just to
> clarify, the currently missing support for modules in DNF5 is not caused by
> this new XML format. DNF5 will support modules in the old YAML format soon
> through the libmodulemd library.) According to my consultation with the DNF
> team, DNF5 plans to prefer the XML format if both XML and YAML exist in
> a repository.
>

At first glance, this looks great! I'll try to dig into it more when I
get time, but I'm really happy to finally see this!



-- 
真実はいつも一つ!/ Always, there's only one truth!