Revisit build artifact storage and retention

Description

We need to revisit how we store and manage build artifacts in our environment.

We need to do this to reach several goals:

  1. Stop having to frequently deal with running out of space on the Jenkins server

  2. Stop having to frequently deal with running out of space on the Resources server

  3. Make Jenkins load faster

  4. Make publishing of artifacts faster (it can take up to 20 minutes to publish to 'tested' at the moment)

  5. Make it so that finding artifacts is possible without knowing the exact details of the job that made them. We would like to be able to find artifacts by at least:

    • Knowing the build URL in Jenkins

    • Knowing the STDCI stage/project/branch/distro/arch/git hash combination.

    • Asking for STDCI stage/project/branch/distro/arch/latest artifact

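To make the lookup goal concrete, here is a purely illustrative sketch of how artifacts could be keyed by the STDCI coordinates listed above, with a "latest" alias for the most recent build. None of these names, values, or the layout itself are decided in this ticket; they are assumptions for illustration only.

```python
# Illustrative only: a hypothetical key scheme for locating artifacts by the
# STDCI coordinates from goal 5. All names and the layout are assumptions.
from pathlib import PurePosixPath

def artifact_path(stage, project, branch, distro, arch, git_hash):
    """Build a deterministic storage path from the STDCI coordinates."""
    return PurePosixPath(stage) / project / branch / distro / arch / git_hash

def latest_alias(stage, project, branch, distro, arch):
    """Alias that would resolve to the most recent git hash for this key."""
    return PurePosixPath(stage) / project / branch / distro / arch / "latest"

# Hypothetical values, just to show the resulting key shape:
print(artifact_path("check-patch", "ovirt-engine", "master", "el7", "x86_64", "ab12cd3"))
# check-patch/ovirt-engine/master/el7/x86_64/ab12cd3
```

With a scheme like this, both lookup modes in the goal fall out naturally: a known git hash resolves directly, and "latest" resolves without knowing any job details.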
We need to achieve the above without significantly harming the UX we provide. For example, users should still be able to find artifacts by navigating from links posted to Gerrit/GitHub to the Jenkins job result pages.

Activity

Barak Korren
April 24, 2018, 4:20 AM

Most of our storage space is taken up by RPMs, which are compressed archives that don't de-duplicate efficiently.

Do you have some data to back up this claim? If the same data is compressed by the same algorithm into different files, it should theoretically de-duplicate very well, as long as the de-duplication is done at the block level as opposed to the file level.

Evgheni Dereveanchin
April 24, 2018, 2:49 PM

I will do a test on RHEL 7.5 with master/tested which is a few hundred gigs to confirm how it performs.

Evgheni Dereveanchin
April 25, 2018, 8:40 AM
Edited

Here are the test results on master/tested, with a 160GB disk serving as the backend for a 480GB VDO volume (triple the size, per the official recommendations):

  1. vdostats --human-readable
    Device            Size    Used    Available  Use%  Space saving%
    /dev/mapper/vdo1  160.0G  107.9G  52.1G      67%   21%

  2. df -h
    Filesystem        Size  Used  Avail  Use%  Mounted on
    ...
    /dev/mapper/vdo1  480G  132G  349G   28%   /tmp/vdo

The "saving" value is at 21%, with 108GB of the backend volume consumed to store 132GB of data. Moreover, most of that saving comes from ISOs: while copying the "rpm" directory, the "saving" field was only around 3-4%.

After deleting the iso directory and running fstrim, VDO block usage dropped to 92GB while filesystem usage was around 103GB:

  1. vdostats --human-readable
    Device            Size    Used   Available  Use%  Space saving%
    /dev/mapper/vdo1  160.0G  92.4G  67.6G      57%   14%

  2. df -h
    Filesystem        Size  Used  Avail  Use%  Mounted on
    ...
    /dev/mapper/vdo1  480G  103G  378G   22%   /tmp/vdo
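The saving percentages above can be roughly cross-checked from the df and vdostats figures. This is only a sketch of the arithmetic: vdostats derives its "Space saving%" from its own logical/physical block accounting (including metadata), so the numbers won't match this simple ratio exactly.

```python
def space_saving_pct(logical_gb, physical_gb):
    """Approximate space saving: the fraction of logical data that did not
    need physical backend blocks. A rough cross-check, not vdostats' exact
    formula (which works on its own block counts and includes metadata)."""
    return 100.0 * (logical_gb - physical_gb) / logical_gb

# Before deleting ISOs: ~132GB of data on ~108GB of backend blocks
print(round(space_saving_pct(132, 107.9)))  # 18, in the ballpark of the reported 21%
# After deleting ISOs: ~103GB of data on ~92.4GB of backend blocks
print(round(space_saving_pct(103, 92.4)))   # 10, vs the reported 14%
```

Either way the conclusion holds: the effective savings are in the low double digits, and mostly attributable to ISOs rather than RPMs.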

Memory usage on this minimal system running nothing but VDO was around 520MB for this volume.
In my opinion, having ~15% savings isn't worth the increased risk of corruption and increased memory usage.

Barak Korren
April 25, 2018, 8:44 AM
Edited

I see, that means we'll need to actually delete data rather than retain it...

Barak Korren
June 12, 2018, 1:29 PM

Converted this ticket into an epic to track all artifact-retention-related activity.

Assignee

Unassigned

Reporter

Barak Korren

Blocked By

None

Components

Priority

Highest

Epic Name

Build artifact storage and retention