Gerrit patches can reach CQ out-of-order

Description

It seems that, even though all the jobs involved were designed to preserve patch ordering, patches can sometimes trigger Jenkins jobs in the wrong order, which in turn puts the patches into the CQ in the wrong order. This prevents the CQ from accurately detecting which patches cause issues.

Here is an example of such an occurrence. The following patches were merged in the order specified below:

  1. https://gerrit.ovirt.org/c/83917

  2. https://gerrit.ovirt.org/c/83918

  3. https://gerrit.ovirt.org/c/83919

  4. https://gerrit.ovirt.org/c/83920

The standard-enqueue runs that put them in the CQ, however, seem to have run in a different order:

  1. http://jenkins.ovirt.org/job/standard-enqueue/6712/ (83917)

  2. http://jenkins.ovirt.org/job/standard-enqueue/6713/ (83919)

  3. http://jenkins.ovirt.org/job/standard-enqueue/6714/ (83920)

  4. http://jenkins.ovirt.org/job/standard-enqueue/6715/ (83918)

This resulted in the CQ 'add' commands running in the wrong order:

  1. http://jenkins.ovirt.org/job/ovirt-master_change-queue/12200/ (add 83917)

  2. http://jenkins.ovirt.org/job/ovirt-master_change-queue/12201/ (add 83919)

  3. http://jenkins.ovirt.org/job/ovirt-master_change-queue/12202/ (add 83920)

  4. http://jenkins.ovirt.org/job/ovirt-master_change-queue/12203/ (add 83918)

We need to come up with a way to make the system properly preserve order.
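One possible direction, sketched here only as an illustration: hold back a change that arrives early until every change merged before it has arrived, then release them in merge order. The class name `MergeOrderBuffer` and its `arrive` method are hypothetical and not part of the existing CI code, and the sketch assumes the expected merge order is known up front, which may not be true of the real system.

```python
from typing import Iterable, List, Set


class MergeOrderBuffer:
    """Release changes in merge order, whatever order they arrive in.

    Hypothetical helper for illustration only. It assumes the expected
    merge order is known in advance.
    """

    def __init__(self, merge_order: Iterable[int]) -> None:
        self._expected: List[int] = list(merge_order)
        self._next = 0  # index of the next change allowed through
        self._held: Set[int] = set()  # changes that arrived too early

    def arrive(self, change: int) -> List[int]:
        """Record an arriving change and return the changes now safe to add."""
        if change not in self._expected[self._next:]:
            raise ValueError(f"unexpected or already released change: {change}")
        self._held.add(change)
        released: List[int] = []
        # Release the longest run of consecutive changes, in merge order.
        while (self._next < len(self._expected)
               and self._expected[self._next] in self._held):
            ready = self._expected[self._next]
            self._held.remove(ready)
            released.append(ready)
            self._next += 1
        return released


# The arrival order observed in this ticket: 83918 shows up last.
buf = MergeOrderBuffer([83917, 83918, 83919, 83920])
for change in [83917, 83919, 83920, 83918]:
    print(change, "->", buf.arrive(change))
```

With the arrival order from this ticket, 83919 and 83920 would be held until 83918 arrives, and all three would then be released together in merge order. The trade-off is that one missing change stalls everything merged after it, so a real implementation would likely need a timeout or some other escape hatch.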

Activity

Piotr Kliczewski May 22, 2018 at 7:16 AM

Yes, I am guilty of this one. I often do that, and as far as I know other people do it too. If that is the issue I can trigger merges one by one, but we would have a usability issue :|.

Barak Korren May 22, 2018 at 6:15 AM

"I think 1 hour should suffice, but it depends on how loaded the CQ is, Barak Korren any thoughts?"

There is no need to wait one hour, and this has nothing to do with the delay between merging 'vdsm-jsonrpc-java' and 'ovirt-engine'. The issue discussed in this ticket is triggered when reviewing a series of patches and merging all of them together by clicking the "submit with parents" button on a child patch in Gerrit. As long as patches are merged individually, by clicking the "merge" button for each one, the issue shouldn't arise.

This issue should not be confused with the need to do dependency handling so that equivalent versions of engine and 'vdsm-jsonrpc-java' are tested together. That is tracked in OVIRT-1662.

Eyal Edri May 21, 2018 at 2:13 PM

I think we've seen it so far only on vdsm-jsonrpc-java, but maybe I'm not aware of all the incidents.
Also, we should distinguish between the old way we published in 4.1 (copy from job) and the new flow in 4.2 with CQ, which is a different issue than before.

I think 1 hour should suffice, but it depends on how loaded the CQ is, any thoughts?

Piotr Kliczewski May 21, 2018 at 2:07 PM

eedri@redhat.com I do not think it is related only to vdsm-jsonrpc-java, and in my opinion it can happen to anyone. In general I leave a couple of hours between merging the jsonrpc version update patch and the engine patch. Do you have a specific time in mind for how long I should wait?

Eyal Edri May 21, 2018 at 1:58 PM

Since this issue has only happened with vdsm-jsonrpc-java AFAIK, due to its tight integration with engine, how often would you say such an issue happens? Can we allow for a delay between version bump patches of the project to try to avoid it until we implement a permanent solution?

Details

Created November 13, 2017 at 9:44 AM
Updated June 24, 2018 at 7:24 AM