Revisit build artifact storage and retention

Description

We need to revisit how we store and manage build artifacts in our environment.

We need to do this to reach several goals:

  1. Stop having to frequently deal with running out of space on the Jenkins server

  2. Stop having to frequently deal with running out of space on the Resources server

  3. Make Jenkins load faster

  4. Make publishing of artifacts faster (it can take up to 20 minutes to publish to 'tested' at the moment)

  5. Make it so that finding artifacts is possible without knowing the exact details of the job that made them. We would like to be able to find artifacts by at least:

    • Knowing the build URL in Jenkins

    • Knowing the STDCI stage/project/branch/distro/arch/git hash combination.

    • Asking for STDCI stage/project/branch/distro/arch/latest artifact

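To make the lookup goal concrete, here is a purely illustrative sketch of how artifacts could be keyed by the STDCI coordinates listed above, with a "latest" alias for the most recent build. None of these names, values, or the layout itself are decided in this ticket; they are assumptions for illustration only.

```python
# Illustrative only: a hypothetical key scheme for locating artifacts by the
# STDCI coordinates from goal 5. All names and the layout are assumptions.
from pathlib import PurePosixPath

def artifact_path(stage, project, branch, distro, arch, git_hash):
    """Build a deterministic storage path from the STDCI coordinates."""
    return PurePosixPath(stage) / project / branch / distro / arch / git_hash

def latest_alias(stage, project, branch, distro, arch):
    """Alias that would resolve to the most recent git hash for this key."""
    return PurePosixPath(stage) / project / branch / distro / arch / "latest"

# Hypothetical values, just to show the resulting key shape:
print(artifact_path("check-patch", "ovirt-engine", "master", "el7", "x86_64", "ab12cd3"))
# check-patch/ovirt-engine/master/el7/x86_64/ab12cd3
```

With a scheme like this, both lookup modes in the goal fall out naturally: a known git hash resolves directly, and "latest" resolves without knowing any job details.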
We need to achieve the above without significantly harming the UX we provide. For example, users should still be able to find artifacts by navigating from links posted to Gerrit/GitHub to the Jenkins job result pages.

Activity

Barak Korren
April 24, 2018, 4:20 AM

Most of our storage space is taken up by RPMs, which are compressed archives that don't de-duplicate efficiently.

Do you have some data to back up this claim? If the same data is compressed by the same algorithm into different files, it should theoretically de-duplicate very well, as long as the de-duplication is done at the block level as opposed to the file level.

Evgheni Dereveanchin
April 24, 2018, 2:49 PM

I will do a test on RHEL 7.5 with master/tested which is a few hundred gigs to confirm how it performs.

Evgheni Dereveanchin
April 25, 2018, 8:40 AM
Edited

Here are the test results on master/tested, with a 160GB disk serving as the backend for a 480GB VDO volume (triple the size, per the official recommendations):

  1. vdostats --human-readable
    Device            Size    Used    Available  Use%  Space saving%
    /dev/mapper/vdo1  160.0G  107.9G  52.1G      67%   21%

  2. df -h
    Filesystem        Size  Used  Avail  Use%  Mounted on
    ...
    /dev/mapper/vdo1  480G  132G  349G   28%   /tmp/vdo

The "saving" value is at 21%, with 108GB of the backend volume consumed to store 132GB of data. Moreover, most of that saving comes from ISOs: while copying the "rpm" directory, the "saving" field was only around 3-4%.

After deleting the iso directory and running fstrim, VDO block usage dropped to 92GB while filesystem usage was around 103GB:

  1. vdostats --human-readable
    Device            Size    Used   Available  Use%  Space saving%
    /dev/mapper/vdo1  160.0G  92.4G  67.6G      57%   14%

  2. df -h
    Filesystem        Size  Used  Avail  Use%  Mounted on
    ...
    /dev/mapper/vdo1  480G  103G  378G   22%   /tmp/vdo
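The saving percentages above can be roughly cross-checked from the df and vdostats figures. This is only a sketch of the arithmetic: vdostats derives its "Space saving%" from its own logical/physical block accounting (including metadata), so the numbers won't match this simple ratio exactly.

```python
def space_saving_pct(logical_gb, physical_gb):
    """Approximate space saving: the fraction of logical data that did not
    need physical backend blocks. A rough cross-check, not vdostats' exact
    formula (which works on its own block counts and includes metadata)."""
    return 100.0 * (logical_gb - physical_gb) / logical_gb

# Before deleting ISOs: ~132GB of data on ~108GB of backend blocks
print(round(space_saving_pct(132, 107.9)))  # 18, in the ballpark of the reported 21%
# After deleting ISOs: ~103GB of data on ~92.4GB of backend blocks
print(round(space_saving_pct(103, 92.4)))   # 10, vs the reported 14%
```

Either way the conclusion holds: the effective savings are in the low double digits, and mostly attributable to ISOs rather than RPMs.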

Memory usage on this minimal system running nothing but VDO was around 520MB for this volume.
In my opinion, having ~15% savings isn't worth the increased risk of corruption and increased memory usage.

Barak Korren
April 25, 2018, 8:44 AM
Edited

I see, that means we'll need to actually delete data rather than retain it...

Barak Korren
June 12, 2018, 1:29 PM

Converted this ticket into an epic to track all artifact-retention-related activity.

Assignee

Unassigned

Reporter

Barak Korren

Blocked By

None

Components

Priority

Highest

Epic Name

Build artifact storage and retention