On Thu, Jun 02, 2022 at 03:20:17PM +0200, Peter Krempa wrote:
> On Thu, Jun 02, 2022 at 15:02:24 +0200, Erik Skultety wrote:
> > With GitLab cutting down on shared resource usage it's very likely that
> > following our measure to decrease the number of CI minutes we'll also
> > need to decrease our usage of storage. Start by decreasing artifact
> > expiration time to 1 day for jobs that are currently exceeding it (by a
> > lot -> 30 days). At the same time, define expiration on the integration
> > jobs' artifacts where there currently isn't one defined.
> > Although 1 day doesn't seem to be enough of a time period, given the
> > cadency of libvirt pipeline executions it should suffice giving
> > everyone/jobs enough time to download artifacts if needed.
> >
> > Signed-off-by: Erik Skultety <eskultet@xxxxxxxxxx>
> > ---
> >  .gitlab-ci.yml              | 4 ++--
> >  ci/integration-template.yml | 1 +
> >  2 files changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> > index 6a8b89729f..1b39047862 100644
> > --- a/.gitlab-ci.yml
> > +++ b/.gitlab-ci.yml
> > @@ -74,7 +74,7 @@ website:
> >      expose_as: 'Website'
> >      name: 'website'
> >      when: on_success
> > -    expire_in: 30 days
> > +    expire_in: 1 day
> >      paths:
> >        - website
> >
>
> Note that this automatically propagates into jobs run on other repos.
> Adding links to artifacts showing changes to a web page is very useful
> and thus retaining them only for 1 day will prevent reviewers from
> looking at them.
>
> For other use as our mirroring job and such it's probably fine, but we
> should keep this at least at 2 weeks unless you figure out how to set
> this based on the repository name.
>
> Said that I wanted to object that this is negligible even retaining a
> month of webpages but looking at the current state:
>
> There's 8 pages of CI runs in last month, 15 pipelines per page. That
> equates to 120 pipeline runs. With 7.7MiB per run that's 924MiB of
> mostly useless copies of the same thing.
>
> Unless we figure out how to change this per repo name, please modify the
> website job to minimum of 15 days to give reviewers some time., that's
> still halving the required space.

Since variable expansion apparently doesn't work with the expire_in clause
yet [1], the only thing that comes to mind is to define 2 jobs for the
website and use 'rules' to generate the correct one depending on whether
this is upstream or a fork. Yes, it's ugly, but it would work. On the other
hand, 924MiB isn't a tragedy.

>
> P.S:
>
> We have FAR bigger problems with retaining logs of all builds
> indefinitely. Based on my rough calculation 1 average run of our CI
> produces ~11MiB of logs (based on ~46GiB of total reported size of
> artifacts by gitlab, ~4000 ci runs on upstream)
>
> I've confirmed that logs are counted towards artifacts empirically by
> deleting all but 1 CI run in my repo which only has the webpage
> artifacts (7.7MiB unpacked), yet gitlab reported 28 MiB of total usage
> for artifacts.

Well, the worse news is that we cannot run a CI job to prune the artifacts
automatically, because one would need a personal access token for that. Why
a personal access token? Because Project/Group access tokens are apparently
unavailable on GitLab SaaS if you're not a paying customer (which is weird,
because most other features from Ultimate are enabled) [2], and in fact I
don't see the described setting under our group/project, just like this
guy [3].

The problem with Personal access tokens is that there are security
implications tied to them:
    - linked to a specific user
    - other users with high enough privileges could see the token
    - wrong settings can lead to leaking of that token, which could expose
      all repositories of that user

One possible solution would be to create a member service account with no
repositories. The password would only be available to the maintainers.
AFAIK you don't need full API access to purge pipelines, IOW read_api
permissions should suffice, which means this would not be such an awfully
ugly solution.

The proper solution would be to use CI/CD job tokens because these are
ephemeral by design, however they have no permission granularity settings
and so cannot be used with all of the API endpoints (purging artifacts
being one of them) :(.

The least favourable solution IMO (but 100% functional) would be for one of
the maintainers to set up a cron job on a private machine using their
personal access token, which nobody else could see, and purge them from
"the outside".

[1] https://docs.gitlab.com/ee/ci/variables/where_variables_can_be_used.html#gitlab-ciyml-file
[2] https://docs.gitlab.com/ee/user/project/settings/project_access_tokens.html
[3] https://forum.gitlab.com/t/project-access-token-isnt-visible/51701

In any case I'll put this patch on hold until we have a clear idea what the
best course of action is - we still have a month™.

Erik
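
For reference, a rough sketch of the two-job idea for the website artifacts
mentioned earlier in this mail. The job names, the hidden '.website_job'
template and the check against the 'libvirt' namespace are all assumptions
made for illustration and are untested against our pipeline; 15 days for
forks is simply the minimum Peter asked for:

```yaml
# Sketch only: names and the namespace test are assumptions, not the
# current contents of .gitlab-ci.yml.
.website_job:
  # ... the body of the existing 'website' job would move here ...
  artifacts:
    expose_as: 'Website'
    name: 'website'
    when: on_success
    paths:
      - website

# Upstream pipelines: nobody reviews these, so keep artifacts briefly.
website_upstream:
  extends: .website_job
  rules:
    - if: '$CI_PROJECT_NAMESPACE == "libvirt"'
  artifacts:
    expire_in: 1 day

# Forks: give reviewers time to look at the rendered pages.
website_fork:
  extends: .website_job
  rules:
    - if: '$CI_PROJECT_NAMESPACE != "libvirt"'
  artifacts:
    expire_in: 15 days
```

This relies on 'extends' merging the 'artifacts' hash key by key, so each
job only has to override expire_in. Any job that currently 'needs' the
website job by name would have to be adjusted as well.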