gtar Cannot write: Input/output error (was: Change in ovirt-system-tests[master]: pytest: he: Port 008_restart_he_vm.py to pytest)

Description

Hi all,

On Tue, Nov 24, 2020 at 8:54 PM Code Review <gerrit@ovirt.org> wrote:
>
> From Jenkins CI <jenkins@ovirt.org>:
>
> Jenkins CI has posted comments on this change.
>
> Change subject: pytest: he: Port 008_restart_he_vm.py to pytest
> ......................................................................
>
>
> Patch Set 12: Continuous-Integration-1
>
> Build Failed
>
> https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/14018/ : FAILURE

https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/14018//artifact/check-patch.he-basic_suite_master.el8.x86_64/mock_logs/script/stdout_stderr.log

[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Extract appliance to
local VM directory]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "dest":
"/var/tmp/localvm3i6cjs21", "extract_results": {"cmd":
["/usr/bin/gtar", "--extract", "-C", "/var/tmp/localvm3i6cjs21", "-z",
"--show-transformed-names", "--sparse", "-f",
"/usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.4-20201124175128.1.el8.ova"],
"err": "/usr/bin/gtar:
images/1519176a-e693-4425-b47b-7acfdd7f180b/4904ac49-1535-47d9-9d52-b803df4f869d:
Cannot write: Input/output error\n/usr/bin/gtar:
images/1519176a-e693-4425-b47b-7acfdd7f180b/4904ac49-1535-47d9-9d52-b803df4f869d:
Cannot utime: Input/output error\n/usr/bin/gtar:
images/1519176a-e693-4425-b47b-7acfdd7f180b/4904ac49-1535-47d9-9d52-b803df4f869d:
Cannot change ownership to uid 0, gid 0: Input/output
error\n/usr/bin/gtar:
images/1519176a-e693-4425-b47b-7acfdd7f180b/4904ac49-1535-47d9-9d52-b803df4f869d:
Cannot change mode to rwxr-xr-x: Input/output error\n/usr/bin/gtar:
images/1519176a-e693-4425-b47b-7acfdd7f180b/4904ac49-1535-47d9-9d52-b803df4f869d.meta:
Cannot open: Input/output error\n/usr/bin/gtar: Exiting with failure
status due to previous errors\n", "out": "", "rc": 2}, "handler":
"TgzArchive", "msg": "failed to unpack
/usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.4-20201124175128.1.el8.ova
to /var/tmp/localvm3i6cjs21", "src":
"/usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.4-20201124175128.1.el8.ova"}

Other such failures already happened several times recently. Perhaps
some disk-space issue? Or something similar?

Thanks,

>
>
> –
> To view, visit https://gerrit.ovirt.org/112273
> To unsubscribe, visit https://gerrit.ovirt.org/settings
>
> Gerrit-Project: ovirt-system-tests
> Gerrit-Branch: master
> Gerrit-MessageType: comment
> Gerrit-Change-Id: Ib510a1624ac5baad0f637a96919c5fd6040e89aa
> Gerrit-Change-Number: 112273
> Gerrit-PatchSet: 12
> Gerrit-Owner: Yedidyah Bar David <didi@redhat.com>
> Gerrit-Reviewer: Anonymous Coward (1001916)
> Gerrit-Reviewer: Anton Marchukov <amarchuk@redhat.com>
> Gerrit-Reviewer: Dafna Ron <dron@redhat.com>
> Gerrit-Reviewer: Dusan Fodor <dfodor@redhat.com>
> Gerrit-Reviewer: Gal Ben Haim <galbh2@gmail.com>
> Gerrit-Reviewer: Galit Rosenthal <grosenth@redhat.com>
> Gerrit-Reviewer: Jenkins CI <jenkins@ovirt.org>
> Gerrit-Reviewer: Marcin Sobczyk <msobczyk@redhat.com>
> Gerrit-Reviewer: Yedidyah Bar David <didi@redhat.com>
> Gerrit-Comment-Date: Tue, 24 Nov 2020 18:54:46 +0000
> Gerrit-HasComments: No
>


Didi
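
Editorial note: the question above (disk space or something else?) can be partly answered from the error text itself. A full disk normally shows up in tar's stderr as "No space left on device" (ENOSPC), while the log here says "Input/output error" (EIO). The following is a minimal sketch, not part of ovirt-system-tests, that makes this distinction for an Ansible `FAILED! =>` line. The function name `classify_gtar_failure` and the shortened path `images/a/b` are hypothetical, and the sample is an abridged version of the log line quoted above.

```python
import json

# Abridged sample of the Ansible failure line from the report above; the
# real line is much longer and was wrapped by the mail client.
# "images/a/b" is a shortened, hypothetical stand-in for the image path.
SAMPLE = (
    '[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, '
    '"extract_results": {"err": "/usr/bin/gtar: images/a/b: Cannot write: '
    'Input/output error\\n/usr/bin/gtar: Exiting with failure status due to '
    'previous errors\\n", "out": "", "rc": 2}, "handler": "TgzArchive"}'
)


def classify_gtar_failure(line):
    """Return 'disk-full', 'io-error', or 'other' for an Ansible FAILED! line.

    Raises ValueError (from json.loads) if the line has no JSON payload.
    """
    # Everything after the marker is the JSON result of the failed task.
    _, _, payload = line.partition("FAILED! => ")
    result = json.loads(payload)
    stderr = result.get("extract_results", {}).get("err", "")
    if "No space left on device" in stderr:  # ENOSPC: the disk really is full
        return "disk-full"
    if "Input/output error" in stderr:  # EIO: the write itself failed
        return "io-error"
    return "other"


print(classify_gtar_failure(SAMPLE))
```

For the log in this report the sketch reports an I/O error rather than a full disk, which points toward the storage layer rather than capacity.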

Activity

Former user December 1, 2020 at 1:20 PM

I took ovirt-srv22 back offline. This node has a 900GB disk and uses only 90GB, so I guess disk space is not the issue. I’ll check the logs further to see if we can find anything suspicious there.
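
Editorial note: a disk-space check like the one described above can be scripted so that it also covers inode exhaustion, which a plain used/total figure does not show. This is a minimal sketch using only the Python standard library. The function name `space_report` is hypothetical, and `/var/tmp` is used only because it is the extraction target in the failing log. `os.statvfs` is available on Unix-like systems only.

```python
import os
import shutil


def space_report(path):
    """Summarize block and inode usage for the filesystem containing path."""
    usage = shutil.disk_usage(path)  # total, used, free in bytes
    st = os.statvfs(path)            # Unix only; also exposes inode counts
    used_pct = round(100 * usage.used / usage.total, 1) if usage.total else 0.0
    return {
        "used_pct": used_pct,
        "free_gb": round(usage.free / 1e9, 1),
        # Inodes available to unprivileged users; may be 0 on filesystems
        # that do not report inode counts.
        "free_inodes": st.f_favail,
    }


if __name__ == "__main__":
    print(space_report("/var/tmp"))
```

Low usage on both counts would support the conclusion that capacity is not the cause of the gtar failure.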

Former user December 1, 2020 at 9:51 AM

Apparently the build is failing again, so please take the faulty bare-metal nodes offline and verify what could be causing this. How much free space do we have? AFAIR the nodes should have at least 800GB disks.

Yedidyah Bar David December 1, 2020 at 6:39 AM

On Sun, Nov 29, 2020 at 2:39 PM Shlomi Zidmi (oVirt JIRA) <
jira@ovirt-jira.atlassian.net> wrote:

>
> [
> https://ovirt-jira.atlassian.net/browse/OVIRT-3063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=40955#comment-40955
> ]
>
> Shlomi Zidmi commented on OVIRT-3063:
> -------------------------------------
>
> Looks like recent builds ran without this error. Taking ovirt-srv22 back
> online to see how it behaves
>

Now it failed again, on ovirt-srv22:

https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/14124/

I am pretty certain that it didn't always fail. Even after it started
failing, it was sporadic.

>
> > gtar Cannot write: Input/output error (was: Change in
> ovirt-system-tests[master]: pytest: he: Port 008_restart_he_vm.py to pytest)
> >
> > ----------------------------------------------------------------------------------------------------------------------------------
> >
> > Key:
> > URL: https://ovirt-jira.atlassian.net/browse/OVIRT-3063
> > Project: oVirt - virtualization made easy
> > Issue Type: By-EMAIL
> > Reporter: Yedidyah Bar David
> > Assignee: Shlomi Zidmi
> >

Former user November 29, 2020 at 12:38 PM

Looks like recent builds ran without this error. Taking ovirt-srv22 back online to see how it behaves

Former user November 26, 2020 at 2:03 PM

Thanks for the extra info. I checked the build you shared (https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/14022/) and it looks like once again ovirt-srv22 is involved (check-patch-el8 ran on it).

Also, I can confirm the issue appeared on another node as well, ovirt-srv21:
https://jenkins.ovirt.org/blue/organizations/jenkins/ovirt-system-tests_standard-check-patch/detail/ovirt-system-tests_standard-check-patch/14074/pipeline/150/

So it seems like multiple nodes are involved in these failures.

Yedidyah Bar David November 25, 2020 at 1:19 PM

On Wed, Nov 25, 2020 at 3:00 PM Shlomi Zidmi (oVirt JIRA) <
jira@ovirt-jira.atlassian.net> wrote:

> I’m still reviewing this issue. I don’t think that’s a disk space issue
> since only 15% of the disk is being used. Also no info is returned from
> dmesg regarding any errors/failures.
>
> For now I’ve disabled the node (ovirt-srv22) on Jenkins until we figure
> this out
>

This happens also on other nodes, e.g.:

https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/14022/

https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/14022/consoleText

[2020-11-24T12:41:02.482Z] Running on node:
openshift-integ-tests-container-sp0sc (integ-tests-container el7)

https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/14022//artifact/check-patch.he-basic_suite_master.el8.x86_64/mock_logs/script/stdout_stderr.log

[ INFO ] TASK [ovirt.ovirt.hosted_engine_setup : Extract appliance to
local VM directory]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "dest":
"/var/tmp/localvmwdymnd_d", "extract_results": {"cmd":
["/usr/bin/gtar", "--extract", "-C", "/var/tmp/localvmwdymnd_d", "-z",
"--show-transformed-names", "--sparse", "-f",
"/usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.4-20201123175824.1.el8.ova"],
"err": "/usr/bin/gtar:
images/81ceaf89-2550-4724-9bdf-56c520fa14c0/e3728613-7016-4a4e-b95f-68145cd3c028:
Cannot write: Input/output error\n/usr/bin/gtar:
images/81ceaf89-2550-4724-9bdf-56c520fa14c0/e3728613-7016-4a4e-b95f-68145cd3c028:
Cannot utime: Input/output error\n/usr/bin/gtar:
images/81ceaf89-2550-4724-9bdf-56c520fa14c0/e3728613-7016-4a4e-b95f-68145cd3c028:
Cannot change ownership to uid 0, gid 0: Input/output
error\n/usr/bin/gtar:
images/81ceaf89-2550-4724-9bdf-56c520fa14c0/e3728613-7016-4a4e-b95f-68145cd3c028:
Cannot change mode to rwxr-xr-x: Input/output error\n/usr/bin/gtar:
images/81ceaf89-2550-4724-9bdf-56c520fa14c0/e3728613-7016-4a4e-b95f-68145cd3c028.meta:
Cannot open: Input/output error\n/usr/bin/gtar: Exiting with failure
status due to previous errors\n", "out": "", "rc": 2}, "handler":
"TgzArchive", "msg": "failed to unpack
/usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.4-20201123175824.1.el8.ova
to /var/tmp/localvmwdymnd_d", "src":
"/usr/share/ovirt-engine-appliance/ovirt-engine-appliance-4.4-20201123175824.1.el8.ova"}



Didi
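
Editorial note: since the failures are being correlated with the nodes that ran them, the node names can be pulled out of a job's consoleText instead of being read off by hand. This is a minimal sketch that assumes the "Running on node:" line format shown in the excerpt above. The function name `nodes_in_console` is hypothetical, and the sample reproduces the wrapped line from this comment.

```python
import re

# Matches the Jenkins console line quoted above. \s* also tolerates the
# line break that the mail client inserted after the colon.
NODE_RE = re.compile(r"Running on node:\s*(\S+)")

# The console line from this comment, including the wrapped line break.
SAMPLE = (
    "[2020-11-24T12:41:02.482Z] Running on node:\n"
    "openshift-integ-tests-container-sp0sc (integ-tests-container el7)"
)


def nodes_in_console(text):
    """Return every node name found in a console log, in order of appearance."""
    return NODE_RE.findall(text)


print(nodes_in_console(SAMPLE))
```

A single build can run on more than one node, so the function returns a list rather than a single name.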

Former user November 25, 2020 at 12:59 PM

I’m still reviewing this issue. I don’t think that’s a disk space issue since only 15% of the disk is being used. Also no info is returned from dmesg regarding any errors/failures.

For now I’ve disabled the node (ovirt-srv22) on Jenkins until we figure this out

Anton Marchukov November 25, 2020 at 10:58 AM

How is it going? Any update?

Former user November 24, 2020 at 11:09 PM

I’ve seen some issues on nodes labeled 80gb-disk before.

Could you please inspect them and rebuild if necessary?

https://jenkins.ovirt.org/label/80gb-disk/

Cannot Reproduce

Details

Created November 24, 2020 at 8:35 PM
Updated June 16, 2021 at 2:44 PM
Resolved June 16, 2021 at 2:44 PM