build-artifacts job stuck on s390x - cannot run OST

Description

Since Friday, the build-artifacts job on s390x has been getting stuck again, so we cannot run OST.
This is not a new issue; we hit problems on s390x every few weeks.

I posted this patch to disable this job:
https://gerrit.ovirt.org/c/97851

We can enable the job when we have a reliable build slave again.

Here are some failed jobs:
- http://jenkins.ovirt.org/job/standard-manual-runner/757/
- http://jenkins.ovirt.org/job/standard-manual-runner/758/
- http://jenkins.ovirt.org/job/standard-manual-runner/759/
- http://jenkins.ovirt.org/job/standard-manual-runner/762/
- http://jenkins.ovirt.org/job/standard-manual-runner/763/
- http://jenkins.ovirt.org/job/standard-manual-runner/764/

Nir

Activity

Evgheni Dereveanchin
October 31, 2019, 4:34 PM

Unfortunately the current s390x VM is slow again due to an overloaded host. I’m working on getting dedicated VMs attached to compensate for that.

Eyal Edri
October 31, 2019, 12:04 PM

We’ll soon have new s390x slaves connected to jenkins without any sudo restriction, so any weird issues we’ve had with the slave should be fixed.

Nir Soffer
September 28, 2019, 11:48 PM

On Sun, Sep 29, 2019 at 2:35 AM Nir Soffer <nsoffer@redhat.com> wrote:

> On Sun, Sep 29, 2019 at 1:13 AM Nir Soffer <nsoffer@redhat.com> wrote:
>
>> Since Friday, the build-artifacts job on s390x has been getting stuck
>> again, so we cannot run OST.
>> This is not a new issue; we hit problems on s390x every few weeks.
>>
>> I posted this patch to disable this job:
>> https://gerrit.ovirt.org/c/97851
>>
>
> The patch does not work: it still runs the s390x job, and it runs it
> incorrectly.
> Maybe the issue is not in the slave but in the vdsm automation scripts?
>
> We have this yaml:
>
> stages:
>   - build-artifacts:
>       substages:
>         - build-py27:
>             archs:
>               - ppc64le
>               - x86_64
>         - build-py37:
>             distributions:
>               - fc30
>
> And we get these jobs:
> - build-artifacts.build-py27.el7.ppc64le
> - build-artifacts.build-py27.el7.x86_64
> - build-artifacts.build-py27.fc29.x86_64
> - build-artifacts.build-py37.fc30.x86_64
> - build-artifacts.fc29.s390x
>
> The last job - the s390x one - looks wrong: we should have only the
> build-py27 and build-py37 jobs, using:
>
> - automation/build-artifacts.build-py27.sh
> - automation/build-artifacts.build-py37.sh
>
> But both scripts are symlinks to the same file:
> lrwxrwxrwx. 1 nsoffer nsoffer  18 Sep 29 00:55 automation/build-artifacts.build-py27.sh -> build-artifacts.sh
> lrwxrwxrwx. 1 nsoffer nsoffer  18 Sep 29 00:55 automation/build-artifacts.build-py37.sh -> build-artifacts.sh
> -rwxrwxr-x. 1 nsoffer nsoffer 346 Sep 17 02:54 automation/build-artifacts.sh
>
> Is it possible that the CI finds build-artifacts.sh and runs it even when
> no substage is specified?
>
> I'll try to rename this script to avoid this.
>

Hopefully fixed by:
https://gerrit.ovirt.org/c/103655/
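
For reference, a minimal sketch of what such a rename could look like (the
"common" name below is an assumption for illustration, not necessarily what
the patch does):

# Rename the stage-level fallback script so the CI can no longer pick it
# up when no substage matches:
git mv automation/build-artifacts.sh automation/build-artifacts.common.sh

# Re-point the per-substage symlinks at the renamed script:
ln -sfn build-artifacts.common.sh automation/build-artifacts.build-py27.sh
ln -sfn build-artifacts.common.sh automation/build-artifacts.build-py37.sh
git add automation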

>
>> We can enable the job when we have a reliable build slave again.
>>
>> Here are some failed jobs:
>> - http://jenkins.ovirt.org/job/standard-manual-runner/757/
>> - http://jenkins.ovirt.org/job/standard-manual-runner/758/
>> - http://jenkins.ovirt.org/job/standard-manual-runner/759/
>> - http://jenkins.ovirt.org/job/standard-manual-runner/762/
>> - http://jenkins.ovirt.org/job/standard-manual-runner/763/
>> - http://jenkins.ovirt.org/job/standard-manual-runner/764/
>>
>> Nir
>>
>

Nir Soffer
September 28, 2019, 11:37 PM

On Sun, Sep 29, 2019 at 1:13 AM Nir Soffer <nsoffer@redhat.com> wrote:

> Since Friday, the build-artifacts job on s390x has been getting stuck
> again, so we cannot run OST.
> This is not a new issue; we hit problems on s390x every few weeks.
>
> I posted this patch to disable this job:
> https://gerrit.ovirt.org/c/97851
>

The patch does not work: it still runs the s390x job, and it runs it
incorrectly.
Maybe the issue is not in the slave but in the vdsm automation scripts?

We have this yaml:

stages:
  - build-artifacts:
      substages:
        - build-py27:
            archs:
              - ppc64le
              - x86_64
        - build-py37:
            distributions:
              - fc30

And we get these jobs:

- build-artifacts.build-py27.el7.ppc64le
- build-artifacts.build-py27.el7.x86_64
- build-artifacts.build-py27.fc29.x86_64
- build-artifacts.build-py37.fc30.x86_64
- build-artifacts.fc29.s390x

The last job - the s390x one - looks wrong: we should have only the
build-py27 and build-py37 jobs, using:

- automation/build-artifacts.build-py27.sh
- automation/build-artifacts.build-py37.sh

But both scripts are symlinks to the same file:
lrwxrwxrwx. 1 nsoffer nsoffer  18 Sep 29 00:55 automation/build-artifacts.build-py27.sh -> build-artifacts.sh
lrwxrwxrwx. 1 nsoffer nsoffer  18 Sep 29 00:55 automation/build-artifacts.build-py37.sh -> build-artifacts.sh
-rwxrwxr-x. 1 nsoffer nsoffer 346 Sep 17 02:54 automation/build-artifacts.sh
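
The layout above amounts to the following (a reconstruction from the ls
output, not commands taken from the repo history):

# Both per-substage entry points are relative symlinks to the single
# stage-level script inside automation/:
ln -s build-artifacts.sh automation/build-artifacts.build-py27.sh
ln -s build-artifacts.sh automation/build-artifacts.build-py37.sh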

Is it possible that the CI finds build-artifacts.sh and runs it even when no
substage is specified?
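
If so, the lookup would behave roughly like this (a sketch of the suspected
behavior, not the actual CI code):

# Suspected STDCI script resolution for a stage/substage pair (assumed
# names; "default" stands for the no-substage case):
stage=build-artifacts
substage=default
if [ -e "automation/${stage}.${substage}.sh" ]; then
    script="automation/${stage}.${substage}.sh"
elif [ -e "automation/${stage}.sh" ]; then
    # Falling back to the stage-level script would explain the extra
    # default job on every available arch, including s390x.
    script="automation/${stage}.sh"
fi
echo "would run: ${script}"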

I'll try to rename this script to avoid this.

> We can enable the job when we have a reliable build slave again.
>
> Here are some failed jobs:
> - http://jenkins.ovirt.org/job/standard-manual-runner/757/
> - http://jenkins.ovirt.org/job/standard-manual-runner/758/
> - http://jenkins.ovirt.org/job/standard-manual-runner/759/
> - http://jenkins.ovirt.org/job/standard-manual-runner/762/
> - http://jenkins.ovirt.org/job/standard-manual-runner/763/
> - http://jenkins.ovirt.org/job/standard-manual-runner/764/
>
> Nir
>

Fixed

Assignee

Evgheni Dereveanchin

Reporter

Nir Soffer

Blocked By

None

Priority

Medium