failures in ovirt-4.2 for 2 projects on iscsi disk space

Description

Both ovirt-engine and ovirt-ansible-vm failed in the ovirt-4.2 change queue tester:
http://jenkins.ovirt.org/job/ovirt-4.2_change-queue-tester/3487/
https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-4.2_change-queue-tester/3495/

Changes reported were:
https://gerrit.ovirt.org/#/c/95436/1
https://github.com/oVirt/ovirt-ansible-vm-infra/commit/d17bb187d66df41ac7fa848ffdadcd5efb349322

Dominik was failing locally as well, so it's not infra.
No OST merge could have caused it.

We could not see any regression that could have caused it either, but the tests suddenly started passing:
https://jenkins.ovirt.org/view/Change%20queue%20jobs/job/ovirt-4.2_change-queue-tester/3496/

First patch on ovirt-engine that passed was: https://gerrit.ovirt.org/#/c/95447/

This seems to be a code regression, only we do not know where, and it may have been quietly fixed.

Activity

Dafna Ron November 30, 2018 at 4:38 PM

I added a patch to SkipTest, which we should merge, since everyone is saying it's a storage issue but the storage team is not cooperating in the debugging effort.
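
For readers unfamiliar with the mechanism, here is a minimal sketch of what skipping a test via SkipTest can look like. It is not the actual patch: the flag name and the skip message are hypothetical, and it uses the standard library's `unittest.SkipTest` rather than whatever OST imports.

```python
import unittest

# Hypothetical switch, for illustration only; the real patch may skip unconditionally.
SKIP_SNAPSHOT_WITH_MEMORY = True


def check_snapshot_with_memory():
    """Stand-in for the failing test; the real test body is not reproduced here."""
    if SKIP_SNAPSHOT_WITH_MEMORY:
        # Test runners report a raised SkipTest as "skipped" instead of "failed".
        raise unittest.SkipTest(
            "Low disk space on Storage Domain iscsi; skipped until root cause is found"
        )
    # ... the actual snapshot steps would run here ...
```

Raising the exception at the top of the test keeps the rest of the suite running while the storage problem is investigated.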

Dafna Ron November 30, 2018 at 3:50 PM

Dev needs to take it from here.
We have failures on several projects along with vdsm and ovirt-engine.

Former user November 28, 2018 at 4:02 PM

Do you have up-to-date info on who is handling this now, and whether any help is needed from my side?

Former user November 27, 2018 at 5:42 PM

This is not an infra issue as the test fails with an API error:

[Cannot run VM. Low disk space on Storage Domain iscsi.]

As iSCSI is a block storage domain using LVM, this means that there are not enough free extents to provision new disks. The failing test seems to be "check_snapshot_with_memory", which likely tries to create a volume to save/restore a snapshot. If this didn't happen before, then some new test started saving more data on disk, and the possible solutions are:

  • increase iSCSI SD size (not a good way as we are limited in space on /dev/shm)

  • delete the new functionality

  • optimize the tests in such a way that we don't exceed the disk space (e.g. clean up after previous tests or create a VM with less disk/RAM to produce smaller snapshots)
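
The trade-off behind the last option can be sketched as a rough capacity check. This is only an illustration under stated assumptions: the `BlockDomain` class, the `memory_snapshot_fits` helper, the reserve value and all sizes are hypothetical, and the memory volume is assumed to be roughly as large as the VM's RAM.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class BlockDomain:
    """Hypothetical summary of a block (LVM-backed) storage domain."""
    total_bytes: int
    used_bytes: int
    # Assumed margin below which new allocations are refused (illustrative value).
    reserve_bytes: int = 5 * GIB

    @property
    def free_bytes(self) -> int:
        return self.total_bytes - self.used_bytes


def memory_snapshot_fits(domain: BlockDomain, vm_ram_bytes: int,
                         overhead_bytes: int = 0) -> bool:
    """Return True if saving the VM's memory still leaves the reserve untouched."""
    needed = vm_ram_bytes + overhead_bytes
    return domain.free_bytes - needed >= domain.reserve_bytes


# Made-up numbers: a 20 GiB domain with 14 GiB already used by earlier tests.
domain = BlockDomain(total_bytes=20 * GIB, used_bytes=14 * GIB)
big_vm_ok = memory_snapshot_fits(domain, vm_ram_bytes=2 * GIB)
small_vm_ok = memory_snapshot_fits(domain, vm_ram_bytes=GIB // 2)
```

With these numbers the 2 GiB VM does not fit while the 512 MiB VM does, which is why a smaller VM or cleanup after earlier tests can avoid the error without growing the domain.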

I'll check the mailing lists to confirm what the exact change was that caused it.

Fixed

Details

Created November 16, 2018 at 2:02 PM
Updated August 29, 2019 at 2:12 PM
Resolved December 11, 2018 at 12:59 PM