Updates to the Ansible suite (add proper logging)

Description

Adding @infra-support to open a ticket.

1. From the console logs the error seems to be:

02:50:07 TASK [ovirt-deploy : Import template from glance]
******************************
02:50:12 An exception occurred during task execution. To see the full
traceback, use -vvv. The error was: AttributeError: 'TemplatesModule'
object has no attribute 'wait_for_import'
02:50:12 fatal: [lago-ansible-suite-master-engine]: FAILED! =>
{"changed": false, "failed": true, "msg": "'TemplatesModule' object
has no attribute 'wait_for_import'"}
02:50:12 to retry, use: --limit
@/home/jenkins/workspace/ovirt-system-tests_ansible-suite-master/ovirt-system-tests/ansible-suite-master/engine.retry

Maybe something changed recently in the Ansible roles?

2. There is a bug in the suite: when the Ansible deploy fails,
the logs are not collected. That is why you see no logs (the logs you
see are from right after 'initialize_engine', not from after the Ansible
failure).

3. I recommend adding '-vvv' to the ansible command. The logs will be more
verbose, but failures will be easier to debug next time.
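
Points 2 and 3 could be combined in the suite's runner along these lines. This is only a sketch in Python, not the suite's actual code: `build_playbook_cmd`, `run_and_always_collect`, and the `collect_logs` callback are hypothetical names, and the playbook and inventory names are placeholders. The only Ansible-specific parts are the `ansible-playbook` options `-i` and `-vvv`.

```python
import subprocess
import sys
from typing import Callable, List


def build_playbook_cmd(playbook: str, inventory: str, verbosity: int = 3) -> List[str]:
    """Build an ansible-playbook command line with -v repeated `verbosity` times."""
    cmd = ["ansible-playbook", "-i", inventory, playbook]
    if verbosity > 0:
        cmd.append("-" + "v" * verbosity)
    return cmd


def run_and_always_collect(cmd: List[str], collect_logs: Callable[[], None]) -> int:
    """Run cmd and call collect_logs() afterwards, even if the command fails
    or cannot be started at all."""
    try:
        # check=False: a non-zero exit code is returned, not raised.
        return subprocess.run(cmd, check=False).returncode
    finally:
        # Runs on success, on failure, and on exceptions such as a missing binary.
        collect_logs()


if __name__ == "__main__":
    # Stand-in for a failing deploy so the sketch runs without Ansible installed.
    failing_cmd = [sys.executable, "-c", "raise SystemExit(2)"]
    rc = run_and_always_collect(failing_cmd, lambda: print("collecting logs"))
    print("exit code:", rc)
    print(" ".join(build_playbook_cmd("engine.yml", "hosts")))
```

Putting the collection step in `finally` means a failed deploy still leaves artifacts behind, which is the behaviour missing in point 2.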

Thanks,

Nadav.

On Thu, Aug 24, 2017 at 9:49 AM, Martin Perina <mperina@redhat.com> wrote:
> Hi,
>
> The Ansible test suite has been failing for a few days; Lago says that it's failing on
> engine initialization. I've been looking at the logs, but I was unable to
> find any error. But there's something strange in the WildFly logs on the engine:
>
> 1. boot.log looks good, I see that WildFly 11 is starting
> 2. server.log also looks good, I see engine was deployed, but it seems to me
> that it's not complete (like WildFly was killed during engine deployment)
> 3. engine.log is completely empty - that's very strange, but it's probably
> related to my suspicion above
>
> In the Lago logs I can see that the ovirt-engine service is being started, but there is no
> info about it being killed or stopped.
>
> Any idea what could have happened?
>
> Thanks a lot
>
> Martin
>
>
> On Thu, Aug 24, 2017 at 4:50 AM, <jenkins@jenkins.phx.ovirt.org> wrote:
>>
>> Project:
>> http://jenkins.ovirt.org/job/ovirt-system-tests_ansible-suite-master/
>> Build:
>> http://jenkins.ovirt.org/job/ovirt-system-tests_ansible-suite-master/7/
>> Build Number: 7
>> Build Status: Still Failing
>> Triggered By: Started by timer
>>
>> -------------------------------------
>> Changes Since Last Success:
>> -------------------------------------
>> Changes for Build #5
>> [Eyal Shenitzky] basic-suite-master: add cold_storage_migration test
>>
>> [Barak Korren] Adapting OST manual job to macro changes
>>
>> [Daniel Belenky] Add support for secrets and credentials to STD CI
>>
>>
>> Changes for Build #6
>> No changes
>>
>> Changes for Build #7
>> [Your Name] Use a long name with special chars as a network name
>>
>> [Barak Korren] Fix small bug in slave repo setup
>>
>> [Daniel Belenky] mock_runner fix empty mrmap bug
>>
>>
>>
>>
>> -----------------
>> Failed Tests:
>> -----------------
>> All tests passed
>
>
>
> _______________________________________________
> Infra mailing list
> Infra@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/infra
>

Activity

Eyal Edri September 17, 2017 at 8:36 AM

The suite was fixed. The patch for collecting logs is still open; I rebased it.

Ondra Machacek August 30, 2017 at 9:55 AM

Patch which adds log collection: https://gerrit.ovirt.org/#/c/81199/1
Patch which fixes the job: https://gerrit.ovirt.org/#/c/81107/

Martin Perina August 24, 2017 at 7:57 AM

On Thu, Aug 24, 2017 at 9:01 AM, Nadav Goldin <ngoldin@redhat.com> wrote:

> Adding @infra-support to open a ticket.
>
> 1. From the console logs the error seems to be:
>
> 02:50:07 TASK [ovirt-deploy : Import template from glance]
> ******************************
> 02:50:12 An exception occurred during task execution. To see the full
> traceback, use -vvv. The error was: AttributeError: 'TemplatesModule'
> object has no attribute 'wait_for_import'
> 02:50:12 fatal: [lago-ansible-suite-master-engine]: FAILED! =>
> {"changed": false, "failed": true, "msg": "'TemplatesModule' object
> has no attribute 'wait_for_import'"}
> 02:50:12 to retry, use: --limit
> @/home/jenkins/workspace/ovirt-system-tests_ansible-suite-master/ovirt-system-tests/ansible-suite-master/engine.retry
>
> Maybe something changed recently in the Ansible roles?
>

Ahh, somehow it didn't occur to me that the console might contain more than
the artifacts.
I will post a fix for that.

>
> 2. There is a bug in the suite: when the Ansible deploy fails,
> the logs are not collected. That is why you see no logs (the logs you
> see are from right after 'initialize_engine', not from after the Ansible
> failure).
>

Hmm, I will look at how the other suites collect logs, see what the
difference is, and post a fix for that.

>
> 3. I recommend adding '-vvv' to the ansible command. The logs will be more
> verbose, but failures will be easier to debug next time.

>
> Thanks,
>
> Nadav.
>

Thanks a lot!

Martin


Nadav Goldin August 24, 2017 at 7:38 AM

Barak Korren August 24, 2017 at 7:33 AM

On 24 August 2017 at 10:01, Nadav Goldin <ngoldin@redhat.com> wrote:
>
> 3. I recommend adding '-vvv' to the ansible command. The logs will be more
> verbose, but failures will be easier to debug next time.
>
>
We might also want to save the Ansible logs somewhere other than STDOUT
for easier grepping...
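
One way to do this, sketched in Python under assumptions: Ansible writes its log to the file named by the `ANSIBLE_LOG_PATH` environment variable (the `log_path` setting), so the runner can set it before invoking the playbook and search the file afterwards. The helper names `env_with_ansible_log` and `grep_log` and the example log contents are hypothetical, not part of the suite.

```python
import os
import re
import tempfile
from typing import Dict, List, Optional


def env_with_ansible_log(log_path: str,
                         base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with ANSIBLE_LOG_PATH pointing at log_path."""
    env = dict(os.environ if base_env is None else base_env)
    env["ANSIBLE_LOG_PATH"] = log_path
    return env


def grep_log(log_path: str, pattern: str) -> List[str]:
    """Return the lines of the log file that match the regular expression."""
    regex = re.compile(pattern)
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if regex.search(line)]


if __name__ == "__main__":
    # Made-up log contents, written to a temporary file for demonstration only.
    with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
        tmp.write("ok: [engine]\nfatal: [engine]: FAILED! => {}\n")
    env = env_with_ansible_log(tmp.name)
    print(env["ANSIBLE_LOG_PATH"])
    for line in grep_log(tmp.name, r"FAILED!"):
        print(line)
    os.remove(tmp.name)
```

The returned environment would be passed as `env=` to whatever call launches `ansible-playbook`, and the log file could then be archived with the other job artifacts.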


Barak Korren
RHV DevOps team , RHCE, RHCi
Red Hat EMEA
redhat.com | TRIED. TESTED. TRUSTED. | redhat.com/trusted

Fixed

Details

Created August 24, 2017 at 7:03 AM
Updated October 1, 2017 at 10:57 AM
Resolved September 24, 2017 at 11:21 AM