Re: to batch or not to batch?

On Sat, Feb 17, 2018 at 11:15 PM, Zbigniew Jędrzejewski-Szmek
<zbyszek@xxxxxxxxx> wrote:
> Bodhi currently provides "batched updates" [1] which lump updates of
> packages that are not marked urgent into a single batch, released once
> per week. This means that after an update has graduated from testing,
> it may be delayed up to a week before it becomes available to users.
>
> Batching is now the default, but maintainers can push their updates
> to stable, overriding this default, and make the update available the
> next day.
>
> Batching is liked by some maintainers, but hated by others.
> Unfortunately, the positive effects of batching are strongly
> decreased when many packages are not batched. Thus, we should settle
> on a single policy — either batch as much as possible, or turn
> batching off. Having the middle ground of some batching is not very
> effective and still annoys people who don't like batching.

(snip)

> To summarize the ups (+) and downs (-):
>
> + batching reduces the number of times repository metadata is updated.
>   Each metadata update results in dnf downloading about 20-40 MB,
>   which is expensive and/or slow for users with low bandwidth.

This saving is negligible, because the metadata has to be updated
even if only one urgent security update is pushed to stable.
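
To make that concrete, here is a rough toy model (my own sketch, not
Bodhi or dnf code). It assumes a client pays for a metadata download
whenever the repo has changed since its last refresh; the push days
are made up for illustration.

```python
def metadata_refreshes(push_days, client_check_days):
    """Count how often a client has to re-download repo metadata.

    push_days: days on which anything was pushed to stable.
    client_check_days: days on which the client refreshes its metadata.
    A refresh costs a download only if the repo changed since the last check.
    """
    pushes = sorted(set(push_days))
    downloads = 0
    last_check = -1
    for day in sorted(set(client_check_days)):
        if any(last_check < p <= day for p in pushes):
            downloads += 1
        last_check = day
    return downloads


week = range(7)
batch_only = [1]                # hypothetical: a single weekly batch push
batch_plus_urgent = list(week)  # hypothetical: one urgent update every day

# A client that checks daily, with and without daily urgent pushes.
print(metadata_refreshes(batch_only, week))
print(metadata_refreshes(batch_plus_urgent, week))
```

In this model the weekly batch only saves downloads if nothing else
reaches stable in between, which is exactly the case that does not
hold when urgent updates bypass the batch.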

> + a constant stream of metadata updates also puts strain on our mirrors.
>
> + a constant stream of updates feels overwhelming to users, and a
>   predictable once-per-week batch is perceived as easier. In
>   particular corporate users might adapt to this and use it to
>   schedule an update of all machines at fixed times.

I'd rather see a small batch of updates more frequently than a
large batch that I won't bother to read through.

> + a batch of updates may be tested as one, and, at least in principle,
>   if users then install this batch as one, QA that was done on the
>   batch matches the user systems more closely, compared to QA testing
>   package updates one by one as they come in, and users updating them
>   at a slightly different schedule.

Well, is any such testing of the "batched state" being done, and if it
is, does it influence which packages get pushed to stable?

> - batching delays updates of packages between 0 and 7 days after
>   they have reached karma and makes it hard for people to immediately
>   install updates when they graduate from testing.

Maintainers can circumvent this delay by pushing directly to stable
instead of batched (which, however, renders the batched state
pointless).

> - some users (or maybe it's just maintainers?) actually prefer a
>   constant stream of small updates, and find it easier to read
>   changelogs and pinpoint regressions, etc. a few packages at a time.

I certainly belong to this group.

> - batching (when done on the "server" side) interferes with clients
>   applying their own batching policy. This has two aspects:
>   clients might want to pick a different day of the week or an
>   altogether different schedule,
>   clients might want to pick a different policy of updates, e.g. to
>   allow any updates for specific packages to go through, etc.
>
>   In particular gnome-software implements its own style of batching, where
>   it will suggest an update only once per week, unless there are security
>   updates.

Which further delays the distribution of stable updates by up to a
week (depending on the schedule of gnome-software, I didn't check
that). Together with the time spent in testing and in batched, that
makes a total of up to 3 weeks (!).

> Unfortunately there isn't much data on the effects of batching.
> Kevin posted some [2], as did the other Kevin [3] ;), but we certainly
> could use more detailed stats.
>
> One of the positive aspects of batching, the reduction in metadata
> downloads, might be obsoleted by improving download efficiency through
> delta downloads. A proof-of-concept has been implemented [4].

A simpler approach might be to just flush all batched updates to
stable whenever at least one update (possibly an urgent security
update) is being pushed to stable anyway. That way, the metadata
doesn't have to be downloaded for just one update, and all packages
reach stable sooner.
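
Roughly what I have in mind, as a minimal sketch. This is not how
Bodhi is implemented; the Update class, the state labels and the
package names are hypothetical and only illustrate the policy.

```python
from dataclasses import dataclass


@dataclass
class Update:
    name: str
    state: str  # hypothetical labels: "batched" or "stable_requested"


def updates_to_push(pending, is_batch_day):
    """Pick the updates for today's push to stable.

    If anything goes to stable anyway (or it is the regular batch day),
    the metadata changes regardless, so flush everything batched as well.
    """
    direct = [u for u in pending if u.state == "stable_requested"]
    batched = [u for u in pending if u.state == "batched"]
    if direct or is_batch_day:
        return direct + batched
    return []


pending = [
    Update("pkg-a", "batched"),
    Update("pkg-b", "batched"),
    Update("pkg-c", "stable_requested"),  # e.g. an urgent security update
]
print([u.name for u in updates_to_push(pending, is_batch_day=False)])
# prints ['pkg-c', 'pkg-a', 'pkg-b']
```

With nothing requested for stable and no batch day, the function
returns an empty list, so a quiet day still causes no metadata update.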

> The second positive aspect of batching, doing updates in batches at a
> fixed schedule, may just as well be implemented on the client side,
> although that does not recreate the testing on the whole batch, since
> now every client is doing it at a different time. It's not clear though
> if this additional testing is actually useful.

Well, the whole testing/installing batches of updates sounds a lot
like what Atomic Workstation is doing (which I really like).
However, forcing the same kind of process onto the current way of
doing things (with individual updates and packages) doesn't seem to
make anybody happy right now ...

> There's an open FESCo ticket to "adjust/drop/document" batching [5].
> That discussion has not been effective, because this issue has many
> aspects, and depending on priorities, the view on batching is likely to
> be different. FESCo is trying to gather more data and get a better
> understanding of what maintainers consider more important.
>
> Did I miss something on the plus or minus side? Or some good statistics?
> Does batching make Fedora seem more approachable to end-users?

For end users, batching client-side has the same benefit as batching
server-side.
Even bandwidth-constrained users don't benefit at the moment, since
updates are pushed to stable far more frequently than once a week.
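
A client-side policy along the lines of what is described above for
gnome-software (offer updates once per week, unless there are security
updates) could look something like this. It is only a sketch of the
idea, not gnome-software's actual logic; the function name and
parameters are made up.

```python
import datetime


def should_offer_updates(today, last_offered, has_security_update,
                         interval_days=7):
    """Decide on the client whether to prompt for updates today."""
    if has_security_update:
        return True   # security updates are never held back
    if last_offered is None:
        return True   # never offered before
    return (today - last_offered).days >= interval_days


today = datetime.date(2018, 2, 19)
four_days_ago = datetime.date(2018, 2, 15)

print(should_offer_updates(today, four_days_ago, has_security_update=False))
print(should_offer_updates(today, four_days_ago, has_security_update=True))
```

The interval is a client setting here, so every user or admin can pick
a schedule without the server having to hold anything back.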

> (this is a question in particular for Matthew Miller who pushed for batching.)
> Do the benefits of batching outweigh the downsides?

I don't think so (at least right now). With (significant!) changes and
improvements that could change.

> Should we keep batching as an interim measure until delta downloads are implemented?

Delta downloads of metadata would benefit both scenarios greatly, so
pushing for that seems like a good idea anyway.

> Should dnf offer smart batched updates like gnome-software?

No. Users who run dnf manually *want* to get updates.

> Should we encourage maintainers to allow their updates to be batched?

In theory, yes. However, even a single urgent security update per day
being pushed to stable renders the whole argument moot.


TL;DR:
- updates stuck in "batched" don't get flushed to stable even when
another update is being pushed to stable anyway (this could be fixed),
- batching updates saves little to no bandwidth (because urgent
security updates might be pushed independently of the batch schedule),
- batching updates increases the time between an update being
introduced to Fedora and reaching users to up to three weeks (!),
depending on the timing,
- end users who don't care about updates get notified once a week (or
sooner for security updates), and those who do care can check for and
install them manually (all done without server-side batching),
- power users / sysadmins wanting less frequent updates in bigger
batches can just run dnf update on their own schedule anyway?

-> I'm in favor of dropping the "batched" thing as it is currently implemented.

Fabio

> [1] https://github.com/fedora-infra/bodhi/issues/1157,
>     https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/thread/UDXVXLT7JXCY6N7NRACN4GBS3KA6D4M6/
> [2] https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/message/B6MMH3L36A2YXQ45Y4DUGMR4XIG7QKE5/
> [3] https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/message/F36YMWKDXBHAQWQOLDSYLYTMDF4WYHE6/
> [4] http://lists.rpm.org/pipermail/rpm-ecosystem/2018-February/000534.html
> [5] https://pagure.io/fesco/issue/1820
>
> Zbyszek
> _______________________________________________
> devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx