Engine upgrade job disables Puppet on slaves

Description

I found many out-of-sync slaves in Foreman; further investigation revealed that the Puppet agent is disabled on them.

The journal indicates that it's the upgrade job doing this:

Oct 19 14:24:51 vm0079.workers-phx.ovirt.org sudo[18213]: jenkins : TTY=unknown ; PWD=/home/jenkins/workspace/ovirt-engine_master_upgrade-from-4.0_el7_created ; USER=root ; COMMAND=/bin/puppet agent --disable
Oct 19 14:24:54 vm0079.workers-phx.ovirt.org puppet-agent[18215]: Disabling Puppet.

The job seems to re-enable Puppet after it runs, but when several consecutive jobs are queued on a node, Puppet is effectively turned off for hours.
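For illustration, the disable/re-enable pattern described above can be sketched as a context manager. This is only a sketch under assumptions, not the job's actual code: the helper name `puppet_disabled` and the injectable `run` parameter are hypothetical, and the only commands used are `puppet agent --disable` (as seen in the journal) and its counterpart `puppet agent --enable`.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def puppet_disabled(reason, run=subprocess.check_call):
    """Disable the Puppet agent for the duration of the block, then re-enable it.

    `run` is a hypothetical injection point so the sketch can be exercised
    without actually calling sudo or puppet.
    """
    # Same command the journal shows, plus an optional reason message.
    run(["sudo", "puppet", "agent", "--disable", reason])
    try:
        yield
    finally:
        # Runs even if the job body raises, so the agent is not left disabled.
        run(["sudo", "puppet", "agent", "--enable"])
```

Even with a guaranteed re-enable, back-to-back jobs on the same node leave only a very short window between one job's enable and the next job's disable, which matches the hours-long gaps seen in Foreman.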

Do we really need this? Can we run engine upgrade jobs inside mock so that they do not affect the node itself?

Activity

Former user November 3, 2016 at 1:30 PM

Logged ovirt-808 to track moving the upgrade jobs to Lago; that should fix the issue. In general, disabling Puppet during a job run is not critical and is only noticeable when there are many builds, so that dozens of nodes go stale in Foreman. I think we can close this case.

Eyal Edri November 3, 2016 at 1:03 PM

Any action items here?

Barak Korren October 25, 2016 at 7:29 AM

We need this because engine-setup does not know how to recover from a locked yum. Effectively, this means that if Puppet runs while engine-setup is running, engine-setup will fail.
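To illustrate the lock contention, here is a minimal sketch of polling for the yum lock before starting a run. It rests on assumptions: the helper name `wait_for_lock_release` is hypothetical, and the default path `/var/run/yum.pid` is yum's usual lock file on EL7. It is not a replacement for disabling Puppet, because Puppet could still take the lock after the check passes and while engine-setup is running.

```python
import os
import time


def wait_for_lock_release(lock_path="/var/run/yum.pid", timeout=300.0, poll=1.0):
    """Wait until the lock file disappears.

    Returns True once the lock file is gone, or False if it is still
    present when `timeout` seconds have elapsed.
    """
    deadline = time.monotonic() + timeout
    while os.path.exists(lock_path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True
```

The check only covers the moment before the run starts, which is why the job disables the agent for the whole run instead.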

Engine upgrade cannot run inside mock because it needs access to things such as systemd, which are not typically available inside mock.

We've discussed this in the past; we should have a ticket somewhere for migrating the upgrade jobs to Lago.

Fixed

Details

Created October 19, 2016 at 3:50 PM
Updated January 20, 2017 at 11:58 AM
Resolved November 3, 2016 at 1:31 PM