Install test oVirt Engine instance in PHX

Description

The current oVirt instance in PHX runs critical workloads, which delays upgrades. We can set up another instance to run less important loads (e.g. Jenkins slaves). This instance can be updated more often, including to beta and snapshot versions of oVirt, since potential downtime will not have a negative impact on CI.

The Engine can run as a VM in the existing oVirt instance and manage a couple of hypervisors that host Jenkins slaves.

Activity

Former user November 14, 2017 at 5:31 PM
Edited

Here's the configuration on the Node-NG side to enable Master Tested updates and VDSM hooks:
1) define the repo, including the hook and node-ng-image-update packages

cat /etc/yum.repos.d/ovirt-master-tested.repo
# imgbased: set-enabled
[ovirt-master-tested]
enabled=1
name=oVirt master tested
includepkgs=ovirt-node-ng-image-update ovirt-node-ng-image ovirt-engine-appliance vdsm-hook-macspoof vdsm-hook-nestedvt
baseurl=http://resources.ovirt.org/repos/ovirt/tested/master/rpm/el$releasever/
skip_if_unavailable=1
gpgcheck=0

2) install the hooks on nodes

yum install vdsm-hook-macspoof vdsm-hook-nestedvt

3) on the Engine, put the host into maintenance, then in the host properties enable Nested Virtualization in the Kernel section
4) reboot the node and activate it
5) repeat for other nodes

Former user November 14, 2017 at 5:27 PM

The staging environment was rebuilt as Master after upgrade failures caused Gluster corruption. I looked up a users-list thread on enabling vdsm-hook-nestedvt on Node-NG and added the hook to the hosts. Several VMs were provisioned and added to Staging Jenkins, but they are ready to receive real workloads. I will now focus on automating updates of the environment, since Master seems more stable than 4.1, at least on the Gluster side.

Former user October 31, 2017 at 1:34 PM

Logged a ticket to implement automatic upgrading of this new instance. One of the Gluster volumes is still constantly inconsistent, so we can't really upgrade nodes at this point (they can't be put into maintenance due to unsynced entries), but Engine upgrades work fine.
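
A check like the one below could gate node upgrades on the heal state of a volume. This is only a sketch: it assumes the `gluster` CLI is available on the host and that `gluster volume heal <volume> info` reports one `Number of entries: N` line per brick. The function names and the volume name in the usage note are hypothetical.

```python
import re
import subprocess

# Matches the per-brick counter printed by `gluster volume heal <vol> info`.
# Non-numeric values (e.g. for an unreachable brick) are deliberately not matched.
ENTRY_RE = re.compile(r"^Number of entries:\s*(\d+)\s*$", re.MULTILINE)


def pending_heal_counts(heal_info_output):
    """Return the list of pending-heal counts, one per brick that reported a number."""
    return [int(n) for n in ENTRY_RE.findall(heal_info_output)]


def volume_is_clean(volume):
    """True only if at least one brick reported and every reported count is zero."""
    result = subprocess.run(
        ["gluster", "volume", "heal", volume, "info"],
        check=True,
        capture_output=True,
        text=True,
    )
    counts = pending_heal_counts(result.stdout)
    return bool(counts) and all(c == 0 for c in counts)
```

For example, `volume_is_clean("engine")` (with `engine` standing in for the real volume name) could be called before moving a host to maintenance. An empty result is treated as "not clean" so that a parsing failure never looks like a healthy volume.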

Former user October 25, 2017 at 12:27 PM

The Engine was updated to Master and performs just fine. Upgrading one of the nodes, however, seems to have caused a Gluster inconsistency on the hosted engine storage domain. This will probably need manual troubleshooting to fix, which calls the feasibility of an auto-updated environment into question.

Former user October 13, 2017 at 8:20 PM

I've enabled libgfapi and this now shows much better results, comparable to the test I did on the host directly:

VirtIO:
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
GLUSTER         30G   955  98 88297  13 57460   8  2113  97 159071  10 492.6  38
Latency               12248us     267ms    1397ms   42337us     177ms     132ms

1.97,1.97,GLUSTER,1,1507908458,30G,,955,98,88297,13,57460,8,2113,97,159071,10,492.6,38,,,,,,,,,,,,,,,,,,12248us,267ms,1397ms,42337us,177ms,132ms,,,,,,

VirtIO-SCSI:
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
GLUSTER         30G  1110  98 87035  13 59531   9  2178  96 187680  12 479.4  36
Latency               17250us   45601us    1207ms   36813us     316ms   79365us

1.97,1.97,GLUSTER,1,1507919652,30G,,1110,98,87035,13,59531,9,2178,96,187680,12,479.4,36,,,,,,,,,,,,,,,,,,17250us,45601us,1207ms,36813us,316ms,79365us,,,,,,

VirtIO-SCSI on a newly provisioned device:
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
GLUSTER         30G   572  99 74478  13 72239  13  1445  81 247749  16 445.3  33
Latency               31925us     143ms     673ms     311ms   74059us   71430us

1.97,1.97,GLUSTER,1,1507922894,30G,,572,99,74478,13,72239,13,1445,81,247749,16,445.3,33,,,,,,,,,,,,,,,,,,31925us,143ms,673ms,311ms,74059us,71430us,,,,,,

The I/O lockup issue is also gone in this mode, which is very nice. Now I just need to figure out how to enable the NestedVT VDSM hook on Node NG, and then we should be able to run actual Jenkins builders on this environment.
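
The three bonnie++ CSV lines above can be compared side by side with a few lines of Python. This is a minimal sketch: the column positions are taken from how the CSV lines line up with the tables in this comment (bonnie++ 1.97), the trailing latency fields are dropped for brevity, and the dictionary keys and function name are made up for illustration.

```python
# CSV lines copied from the runs above, truncated after the seeks %CP column.
RUNS = {
    "VirtIO": "1.97,1.97,GLUSTER,1,1507908458,30G,,955,98,88297,13,57460,8,2113,97,159071,10,492.6,38",
    "VirtIO-SCSI": "1.97,1.97,GLUSTER,1,1507919652,30G,,1110,98,87035,13,59531,9,2178,96,187680,12,479.4,36",
    "VirtIO-SCSI, new device": "1.97,1.97,GLUSTER,1,1507922894,30G,,572,99,74478,13,72239,13,1445,81,247749,16,445.3,33",
}


def parse_bonnie_csv(line):
    """Pick the block-level throughput columns out of one bonnie++ CSV line."""
    cols = line.strip().split(",")
    return {
        "machine": cols[2],
        "size": cols[5],
        "seq_out_block_kps": int(cols[9]),   # Sequential Output, Block (K/sec)
        "rewrite_kps": int(cols[11]),        # Sequential Output, Rewrite (K/sec)
        "seq_in_block_kps": int(cols[15]),   # Sequential Input, Block (K/sec)
        "seeks_per_sec": float(cols[17]),    # Random Seeks (/sec)
    }


for label, line in RUNS.items():
    r = parse_bonnie_csv(line)
    print(
        f"{label:24s} write={r['seq_out_block_kps']:>7d} K/s "
        f"rewrite={r['rewrite_kps']:>7d} K/s "
        f"read={r['seq_in_block_kps']:>7d} K/s "
        f"seeks={r['seeks_per_sec']:.1f}/s"
    )
```

Keeping the raw CSV lines in the ticket makes this kind of comparison repeatable when later runs are added.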

Fixed

Created February 8, 2017 at 4:47 PM
Updated December 4, 2017 at 3:44 PM
Resolved November 14, 2017 at 6:28 PM