Jenkins job takes a very long time to download images from Fedora 28 mirror

Description

Looking at:
https://jenkins.ovirt.org/job/ovirt-appliance_master_build-artifacts-fc28-x86_64/148/consoleFull

Download of Fedora Boot ISO:

08:10:57 curl -L -O "http://download.fedoraproject.org/pub/fedora/linux/releases/28/Server/x86_64/os/images/boot.iso"
08:10:57   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
08:10:57                                  Dload  Upload   Total   Spent    Left  Speed

The download ran for almost 2 hours and then failed:

          83  583M   83  484M    0     0  82299      0  2:03:48  1:42:55  0:20:53 33902
09:53:57 curl: (92) HTTP/2 stream 1 was not closed cleanly: INTERNAL_ERROR (err 2)

Can you please check connectivity?

SANDRO BONAZZOLA

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA <https://www.redhat.com/>

sbonazzo@redhat.com
<https://red.ht/sig>

Activity

Barak Korren December 16, 2018 at 10:40 AM

Another option is to add a *.files feature to STDCI, as we're seeing more and more projects require arbitrary files in the build process rather than YUM packages.

Barak Korren December 16, 2018 at 10:37 AM

We can probably solve this from the developer's side by making sure they cache the ISO file on the slave and reuse the cached version if found.

We have a safe_download function that we use to download the Windows images for KubeVirt. We could use it here if we can make it a shared function in STDCI. (I wrote a patch to allow shared functions a while ago, but we never merged it since its tests were based on the 'version' function that we never finished.)

Removing the proxy will not solve the issue, since it was caused by the slave selecting a bad mirror in the first place. This would have happened without the proxy as well.
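The cache-and-reuse idea from this comment could look roughly like the sketch below. This is not the existing safe_download function; cached_fetch, the CACHE_DIR variable and the default cache path are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: print the path of a cached copy of URL, downloading
# it first only when no non-empty cached copy exists.
# Usage: cached_fetch URL [CACHE_DIR]
cached_fetch() {
    local url="$1"
    # Assumed default location; any directory that survives between jobs works.
    local cache_dir="${2:-${CACHE_DIR:-$HOME/.cache/ci-downloads}}"
    local cached="$cache_dir/${url##*/}"
    local tmp

    mkdir -p "$cache_dir" || return 1
    if [ -s "$cached" ]; then
        echo "Reusing cached file $cached" >&2
    else
        # Download under a temporary name so that an interrupted transfer
        # never leaves a truncated file that looks like a valid cache entry.
        tmp="$(mktemp "$cached.XXXXXX")" || return 1
        if ! curl -fsSL --retry 3 -o "$tmp" "$url"; then
            rm -f "$tmp"
            return 1
        fi
        mv -f "$tmp" "$cached"
    fi
    printf '%s\n' "$cached"
}
```

A build script would then call something like iso="$(cached_fetch "$ISO_URL")" and use "$iso" instead of downloading boot.iso on every run. Note that this sketch keys the cache on the file name only, so it assumes the file behind a given name never changes.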

Eyal Edri December 16, 2018 at 10:10 AM

I'd rather avoid adding another mirror of our own; we're already overloaded with services we maintain.
If we can use the HTTP workaround, let's do it. Otherwise, let's drop the proxy for those files and allow people to download directly from a faster mirror for Fedora/CentOS.

Former user December 13, 2018 at 4:27 PM

Looked through the squid logs and here are my findings:

We fetch from a redirecting URL, and the -L option just makes curl connect to the mirror that is sent as part of the HTTP 302 response.

This means that each time we fetch from download.fedoraproject.org it redirects to a mirror, and curl re-connects and fetches from there. For squid, different mirrors are completely different URLs, so it connects upstream and re-fetches the file whenever a given mirror URL is accessed for the first time.

Here's a test I did from PHX:

curl -v -L -O http://download.fedoraproject.org/pub/fedora/linux/releases/28/Server/x86_64/os/images/boot.iso
...

* Connected to download.fedoraproject.org (209.132.181.16) port 80 (#0)
> GET /pub/fedora/linux/releases/28/Server/x86_64/os/images/boot.iso HTTP/1.1
> User-Agent: curl/7.29.0
> Host: download.fedoraproject.org

< HTTP/1.1 302 Found
< Content-Length: 0
< Location: http://mirror.cs.princeton.edu/pub/mirrors/fedora/linux/releases/28/Server/x86_64/os/images/boot.iso
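To see which mirror the redirector hands out on a given run without downloading the ISO, the Location header can be extracted from the response headers. A minimal sketch, where redirect_target is a hypothetical helper name:

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the
# value of the first Location header, if any.
redirect_target() {
    # HTTP header lines end in CRLF, so strip the CR first.
    # Header names are case-insensitive, hence tolower().
    tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Possible usage (needs network access; -I asks for the headers only):
#   curl -sI http://download.fedoraproject.org/pub/fedora/linux/releases/28/Server/x86_64/os/images/boot.iso | redirect_target
```

Running this a few times in a row should show whether the chosen mirror changes between requests and whether it is an http:// or https:// URL.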

Moreover, the mirror returned can be HTTPS, in which case squid can't cache it at all and the data has to be fetched from upstream every time. It looks like this is what happened on December 9, since the failure contains an HTTP/2 error, and curl only negotiates HTTP/2 over TLS by default:

09:53:57 curl: (92) HTTP/2 stream 1 was not closed cleanly: INTERNAL_ERROR (err 2)

So the root cause was download.fedoraproject.org redirecting to a slow HTTPS mirror.

To fix this we need to either hard-code HTTP mirror URLs so that they are fetched through the proxy in caching mode, or have a mirror of our own that would contain boot.iso and other non-RPM artifacts we might need.
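The first option (hard-coding an HTTP mirror) could be sketched as follows. FEDORA_MIRROR_BASE and pinned_url are hypothetical names, and the Princeton mirror is used as the default only because it is the one that appeared in the test above, not because it is a recommended choice.

```shell
# Assumed default: the mirror seen in the 302 response above.
FEDORA_MIRROR_BASE="${FEDORA_MIRROR_BASE:-http://mirror.cs.princeton.edu/pub/mirrors/fedora/linux}"

# Hypothetical helper: build a URL on the pinned mirror for a path relative
# to the mirror base, refusing anything that is not plain HTTP because a
# caching proxy cannot store HTTPS responses.
pinned_url() {
    local url="${FEDORA_MIRROR_BASE%/}/${1#/}"
    case "$url" in
        http://*)
            printf '%s\n' "$url"
            ;;
        *)
            echo "refusing non-HTTP URL: $url" >&2
            return 1
            ;;
    esac
}
```

The build script could then run curl -O "$(pinned_url releases/28/Server/x86_64/os/images/boot.iso)" without -L, so every run requests the same URL and the proxy has a chance to serve it from its cache.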

Barak Korren December 12, 2018 at 10:40 AM

This is using the proxy:

00:01:44.031 Using proxified config /home/jenkins/workspace/ovirt-appliance_ovirt-4.3_build-artifacts-fc28-x86_64/jenkins/mock_configs/fedora-28-x86_64_proxied.cfg

Can we check if the proxy became slow or unresponsive?

Eyal Edri December 12, 2018 at 7:45 AM

Can you check if something is out of the ordinary?
How long does it usually take to download that image?

Cannot Reproduce

Details

Created December 10, 2018 at 6:59 AM
Updated August 29, 2019 at 2:12 PM
Resolved August 27, 2019 at 12:45 PM
