Jenkins job takes a very long time to download images from the Fedora 28 mirror
Description
Activity
Barak Korren December 16, 2018 at 10:40 AM
Another option is to add a *.files feature to STDCI, as we're seeing more and more projects require arbitrary files in the build process rather than YUM packages.
Barak Korren December 16, 2018 at 10:37 AM
We can probably solve this from the developer's side by making sure they cache the ISO file on the slave and reuse the cached version if found.
We have a safe_download function that we use to download the Windows images for kubevirt. We could use it here if we can make it a shared function in STDCI. (@Former user wrote a patch to allow shared functions a while ago, but we never merged it, since its tests were based on the 'version' function that we never finished.)
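To make the caching idea concrete, here is a minimal sketch of a download helper that reuses a copy already present on the slave. It is not the existing safe_download function; the name cached_download, the CACHE_DIR variable and its default path are all hypothetical and used only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: keep one copy of a downloaded file per slave and
# reuse it on later runs. Not part of STDCI; names are illustrative.
cached_download() {
    local url="$1" dest="$2"
    # Assumed cache location; override by exporting CACHE_DIR.
    local cache_dir="${CACHE_DIR:-/var/cache/ci-downloads}"
    local cached
    cached="$cache_dir/$(basename "$url")"

    mkdir -p "$cache_dir" || return 1
    if [[ ! -s "$cached" ]]; then
        # Download to a temporary file first, so an interrupted transfer
        # never leaves a truncated file in the cache.
        local tmp
        tmp="$(mktemp "$cache_dir/.partial.XXXXXX")" || return 1
        if ! curl --fail --location --silent --show-error --output "$tmp" "$url"; then
            rm -f "$tmp"
            return 1
        fi
        mv "$tmp" "$cached"
    fi
    # Hand the job its own copy so it cannot modify the cached file.
    cp "$cached" "$dest"
}
```

A job would call it as `cached_download "$ISO_URL" boot.iso`. The sketch keys the cache on the file name alone, so two releases that both ship a boot.iso would collide; a real implementation would probably key on the full URL or a checksum.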
@Eyal Edri removing the proxy will not solve the issue, since it was caused by the slave selecting a bad mirror in the first place. This would have happened without the proxy as well.
Eyal Edri December 16, 2018 at 10:10 AM
I'd rather avoid adding another mirror of our own; we're already overloaded with services we maintain.
If we can use the HTTP workaround, let's do it. Otherwise, let's drop the proxy for those files and let people download directly from a faster Fedora/CentOS mirror.
Former user December 13, 2018 at 4:27 PM
Looked through the squid logs and here are my findings:
We fetch from a redirecting URL, and the -L option just makes curl connect to the mirror that's sent as part of the HTTP 302 response.
This means that each time we fetch from download.fedoraproject.org, it redirects to a mirror and curl reconnects and fetches from there. For squid, requests to different mirrors are completely different URLs, so it just connects upstream and re-fetches the file if it's the first time that mirror is accessed.
Here's a test I did from PHX:
curl -v -L -O http://download.fedoraproject.org/pub/fedora/linux/releases/28/Server/x86_64/os/images/boot.iso
...
Connected to download.fedoraproject.org (209.132.181.16) port 80 (#0)
> GET /pub/fedora/linux/releases/28/Server/x86_64/os/images/boot.iso HTTP/1.1
> User-Agent: curl/7.29.0
> Host: download.fedoraproject.org
< HTTP/1.1 302 Found
< Content-Length: 0
< Location: http://mirror.cs.princeton.edu/pub/mirrors/fedora/linux/releases/28/Server/x86_64/os/images/boot.iso
Issue another request to this URL: 'http://mirror.cs.princeton.edu/pub/mirrors/fedora/linux/releases/28/Server/x86_64/os/images/boot.iso'
About to connect() to mirror.cs.princeton.edu port 80 (#1)
Trying 128.112.136.119...
Connected to mirror.cs.princeton.edu (128.112.136.119) port 80 (#1)
> GET /pub/mirrors/fedora/linux/releases/28/Server/x86_64/os/images/boot.iso HTTP/1.1
Moreover, the mirror returned can be HTTPS, in which case squid can't cache it at all and has to fetch the data every time. It looks like this is what happened on December 9, since the failed job's log contains an HTTP/2 error, and in practice HTTP/2 is only used over TLS:
09:53:57 curl: (92) HTTP/2 stream 1 was not closed cleanly: INTERNAL_ERROR (err 2)
So the root cause was download.fedoraproject.org redirecting to a slow HTTPS mirror.
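One way to repeat this check without downloading the ISO is to ask curl only for the redirect target and look at its scheme. The sketch below is an illustration under the assumptions of this analysis (squid caches plain HTTP and cannot cache HTTPS); the function names are made up for the example.

```shell
#!/bin/bash
# Print where a redirecting URL points, without following the redirect.
# %{redirect_url} is curl's write-out variable for the would-be next hop.
redirect_target() {
    curl --silent --head --output /dev/null --write-out '%{redirect_url}\n' "$1"
}

# Report whether the caching proxy could store responses from this URL,
# based on the scheme only: plain HTTP yes, HTTPS no.
proxy_cacheability() {
    case "$1" in
        http://*)  echo "cacheable" ;;
        https://*) echo "not cacheable" ;;
        *)         echo "unknown" ;;
    esac
}

# Example use (needs network access, so it is left commented out):
#   target="$(redirect_target "http://download.fedoraproject.org/pub/fedora/linux/releases/28/Server/x86_64/os/images/boot.iso")"
#   echo "$target: $(proxy_cacheability "$target")"
```

Running the example a few times should show different mirrors being handed out, which is why the proxy's cache rarely gets a hit.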
To fix this we need to either hard-code HTTP mirror URLs to fetch through the proxy in caching mode, or have a mirror of our own that would contain boot.iso and other non-RPM artifacts we might need.
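A rough sketch of the first option, pinning a plain-HTTP mirror, could look like the following. MIRROR_BASE and the function names are hypothetical, the Princeton mirror from the log above is used only as an example value, and the proxy itself is assumed to be configured through the environment (for instance http_proxy).

```shell
#!/bin/bash
# Hypothetical settings: pin one plain-HTTP mirror instead of using the
# download.fedoraproject.org redirector. The default is only an example.
MIRROR_BASE="${MIRROR_BASE:-http://mirror.cs.princeton.edu/pub/mirrors/fedora/linux}"
ISO_PATH="releases/28/Server/x86_64/os/images/boot.iso"

# Succeed only for plain-HTTP URLs, which the proxy is able to cache.
require_plain_http() {
    case "$1" in
        http://*) return 0 ;;
        *) echo "refusing non-HTTP URL (proxy cannot cache it): $1" >&2
           return 1 ;;
    esac
}

# Fetch the ISO from the pinned mirror into the current directory.
fetch_pinned_iso() {
    local url="$MIRROR_BASE/$ISO_PATH"
    require_plain_http "$url" || return 1
    # --proto-redir =http makes curl fail rather than follow a redirect
    # to HTTPS, so a misbehaving mirror cannot silently bypass the cache.
    curl --fail --location --proto-redir =http --remote-name "$url"
}
```

Because every job then requests the same URL, the second and later downloads can be served from squid's cache. The trade-off is that the pinned mirror becomes a single point of failure unless a fallback list is added.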
Barak Korren December 12, 2018 at 10:40 AM
This is using the proxy:
00:01:44.031 Using proxified config /home/jenkins/workspace/ovirt-appliance_ovirt-4.3_build-artifacts-fc28-x86_64/jenkins/mock_configs/fedora-28-x86_64_proxied.cfg
@Former user can we check if the proxy became slow or unresponsive?
Eyal Edri December 12, 2018 at 7:45 AM
@Barak Korren @Former user can you check if something is out of the ordinary?
How long does it usually take to download that image?
Looking at:
https://jenkins.ovirt.org/job/ovirt-appliance_master_build-artifacts-fc28-x86_64/148/consoleFull
Download of Fedora Boot ISO:
08:10:57 curl -L -O "http://download.fedoraproject.org/pub/fedora/linux/releases/28/Server/x86_64/os/images/boot.iso"
08:10:57   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
08:10:57                                  Dload  Upload   Total   Spent    Left  Speed
It took ~ 2 hours to get the download running and then:
 83  583M   83  484M    0     0  82299      0  2:03:48  1:42:55  0:20:53 33902
09:53:57 curl: (92) HTTP/2 stream 1 was not closed cleanly: INTERNAL_ERROR (err 2)
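As a sanity check on the progress line above, the reported total time follows directly from the file size and the average speed. The small sketch below assumes curl's "M" suffix means MiB (1024 * 1024 bytes); the function names are made up for the example.

```shell
#!/bin/bash
# Estimated total transfer time in seconds for a file of SIZE_MIB mebibytes
# at SPEED_BPS bytes per second (integer arithmetic, rounds down).
transfer_seconds() {
    local size_mib="$1" speed_bps="$2"
    echo $(( size_mib * 1024 * 1024 / speed_bps ))
}

# Format a number of seconds as H:MM:SS, the way curl prints it.
format_hms() {
    local s="$1"
    printf '%d:%02d:%02d\n' $(( s / 3600 )) $(( s % 3600 / 60 )) $(( s % 60 ))
}

# 583 MiB at an average of 82299 bytes/s:
format_hms "$(transfer_seconds 583 82299)"   # prints 2:03:48
```

In other words, the download averaged roughly 80 KiB/s, so the two-hour duration is explained by the low throughput from the selected mirror rather than by a stall.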
Can you please check connectivity?
–
SANDRO BONAZZOLA
MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
Red Hat EMEA <https://www.redhat.com/>
sbonazzo@redhat.com
<https://red.ht/sig>