gtar Cannot write: Input/output error (was: Change in ovirt-system-tests[master]: pytest: he: Port 008_restart_he_vm.py to pytest)

Description

Hi all,

On Tue, Nov 24, 2020 at 8:54 PM Code Review <gerrit@ovirt.org> wrote:
>
> From Jenkins CI <jenkins@ovirt.org>:
>
> Jenkins CI has posted comments on this change.
>
> Change subject: pytest: he: Port 008_restart_he_vm.py to pytest
> ......................................................................
>
>
> Patch Set 12: Continuous-Integration-1
>
> Build Failed
>
> https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/14018/ : FAILURE

https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/14018//artifact/check-patch.he-basic_suite_master.el8.x86_64/mock_logs/script/stdout_stderr.log

[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Extract appliance to
local VM directory]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "dest":
"/var/tmp/localvm3i6cjs21", "extract_results": {"cmd":
["/usr/bin/gtar", "--extract", "-C", "/var/tmp/localvm3i6cjs21", "-z",
"--show-transformed-names", "--sparse", "-f",
"/usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.4-20201124175128.1.el8.ova"],
"err": "/usr/bin/gtar:
images/1519176a-e693-4425-b47b-7acfdd7f180b/4904ac49-1535-47d9-9d52-b803df4f869d:
Cannot write: Input/output error\n/usr/bin/gtar:
images/1519176a-e693-4425-b47b-7acfdd7f180b/4904ac49-1535-47d9-9d52-b803df4f869d:
Cannot utime: Input/output error\n/usr/bin/gtar:
images/1519176a-e693-4425-b47b-7acfdd7f180b/4904ac49-1535-47d9-9d52-b803df4f869d:
Cannot change ownership to uid 0, gid 0: Input/output
error\n/usr/bin/gtar:
images/1519176a-e693-4425-b47b-7acfdd7f180b/4904ac49-1535-47d9-9d52-b803df4f869d:
Cannot change mode to rwxr-xr-x: Input/output error\n/usr/bin/gtar:
images/1519176a-e693-4425-b47b-7acfdd7f180b/4904ac49-1535-47d9-9d52-b803df4f869d.meta:
Cannot open: Input/output error\n/usr/bin/gtar: Exiting with failure
status due to previous errors\n", "out": "", "rc": 2}, "handler":
"TgzArchive", "msg": "failed to unpack
/usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.4-20201124175128.1.el8.ova
to /var/tmp/localvm3i6cjs21", "src":
"/usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.4-20201124175128.1.el8.ova"}

Similar failures have already happened several times recently. Could
this be a disk-space issue, or something similar?

Thanks,

>
>
> --
> To view, visit https://gerrit.ovirt.org/112273
> To unsubscribe, visit https://gerrit.ovirt.org/settings
>
> Gerrit-Project: ovirt-system-tests
> Gerrit-Branch: master
> Gerrit-MessageType: comment
> Gerrit-Change-Id: Ib510a1624ac5baad0f637a96919c5fd6040e89aa
> Gerrit-Change-Number: 112273
> Gerrit-PatchSet: 12
> Gerrit-Owner: Yedidyah Bar David <didi@redhat.com>
> Gerrit-Reviewer: Anonymous Coward (1001916)
> Gerrit-Reviewer: Anton Marchukov <amarchuk@redhat.com>
> Gerrit-Reviewer: Dafna Ron <dron@redhat.com>
> Gerrit-Reviewer: Dusan Fodor <dfodor@redhat.com>
> Gerrit-Reviewer: Gal Ben Haim <galbh2@gmail.com>
> Gerrit-Reviewer: Galit Rosenthal <grosenth@redhat.com>
> Gerrit-Reviewer: Jenkins CI <jenkins@ovirt.org>
> Gerrit-Reviewer: Marcin Sobczyk <msobczyk@redhat.com>
> Gerrit-Reviewer: Yedidyah Bar David <didi@redhat.com>
> Gerrit-Comment-Date: Tue, 24 Nov 2020 18:54:46 +0000
> Gerrit-HasComments: No
>


Didi

Activity

Shlomi Zidmi
December 1, 2020, 3:48 PM

Looks like no other Jenkins instance uses this node. Also, virsh list --all produces no output, so I guess there’s nothing to clean.

I’ll re-trigger the job manually to see what happens to the files.

Evgheni Dereveanchin
December 1, 2020, 2:11 PM

cannot open directory '/home/jenkins/workspace/ovirt-system-tests_standard-check-patch/ovirt-system-tests/ovirt-node': No such file or directory

Looks like something deleted the workspace while the job was running. Please check if this node is used by any other Jenkins instances or if there are leftover processes from terminating jobs that could cause this.

Also, please clean up any leftovers shown by virsh list --all, since this may just be a false positive if libvirt does not know that the workdir is already gone. If we can’t figure out what’s wrong with this node, let’s rebuild it from scratch using Foreman.

Shlomi Zidmi
December 1, 2020, 1:51 PM

There are a bunch of "file or directory not found" errors from around the time the build was running.

Attaching a .txt file with all errors.

Shlomi Zidmi
December 1, 2020, 1:20 PM

I took ovirt-srv22 back offline. This node has a 900GB disk and uses only 90GB, so I guess disk space is not the issue. I’ll check the logs further to see if we can find something suspicious there.
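
For reference, a minimal sketch of the kind of check described here, not the procedure actually run on the node. The helper names disk_report and classify_tar_errors are hypothetical. The idea is that a full disk would typically surface in tar's output as the ENOSPC message ("No space left on device" on Linux), whereas the log in the description shows the EIO message ("Input/output error"), which is consistent with disk space not being the cause.

```python
import errno
import os
import shutil


def disk_report(path):
    """Return (total_gb, used_gb, free_gb) for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    gb = 1024 ** 3
    return usage.total / gb, usage.used / gb, usage.free / gb


def classify_tar_errors(stderr_text):
    """Count how often the EIO and ENOSPC messages appear in tar's stderr.

    The message strings come from the local C library via os.strerror, so
    this only matches logs produced on a system with the same wording.
    """
    candidates = {
        "EIO": os.strerror(errno.EIO),        # I/O error
        "ENOSPC": os.strerror(errno.ENOSPC),  # disk full
    }
    return {name: stderr_text.count(msg) for name, msg in candidates.items()}


if __name__ == "__main__":
    # /var/tmp is where the failing task extracted the appliance.
    total, used, free = disk_report("/var/tmp")
    print(f"total={total:.1f}GB used={used:.1f}GB free={free:.1f}GB")
```

Feeding the "err" string from the failed Ansible task into classify_tar_errors would show whether any ENOSPC messages are present at all.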

Evgheni Dereveanchin
December 1, 2020, 9:51 AM

Apparently the build is failing again, so please take the faulty bare-metal nodes offline and verify what could be causing this. How much free space do we have? AFAIR the nodes should have at least 800GB disks.


Assignee

Shlomi Zidmi

Reporter

Yedidyah Bar David