Single project tests in change-queue-tester

Description

Please change the change-queue-tester jobs to test a single project at a
time (multiple patches at once are fine, as long as they come from the
same project).

Right now, if multiple patches are merged between one change-queue-tester
execution and the next, all of them are tested in the next run, even if
the changes are in different projects.
So, supposing projects A and B are completely fine, a failure in project C
can still prevent changes in A and B from going through the pipeline and
getting released.
This causes major headaches, at least for me: I have to go over the HEAD of
every project that is not being published and run "ci re-merge please"
again and again until I'm lucky enough that nobody else is merging patches
or that all the tested patches pass at once.

I understand the need to reduce the number of executions of the job, since
it takes an hour to execute, but right now it's costing days of execution
to get a patch to land in the tested repo.
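
As a rough illustration of the request above, here is a minimal sketch of splitting the pending changes into one batch per project, so that each tester run would cover a single project. The `Change` type, the project names and the change IDs are all hypothetical; this is not the actual change-queue code.

```python
from collections import defaultdict
from typing import NamedTuple


class Change(NamedTuple):
    project: str    # hypothetical project name, e.g. "A"
    change_id: str  # hypothetical change identifier


def batches_by_project(pending):
    """Split pending changes into one batch per project, keeping merge order."""
    batches = defaultdict(list)
    for change in pending:
        batches[change.project].append(change)
    return dict(batches)


# Made-up queue content: changes merged since the last tester run.
pending = [
    Change("A", "a1"),
    Change("C", "c1"),
    Change("B", "b1"),
    Change("A", "a2"),
]
batches = batches_by_project(pending)
for project, batch in batches.items():
    print(project, [change.change_id for change in batch])
```

With batches like these, a failure in the batch for project C would not hold back the batches for A and B. The cost is more tester runs, which is the trade-off mentioned above.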

SANDRO BONAZZOLA

ASSOCIATE MANAGER, SOFTWARE ENGINEERING, EMEA ENG VIRTUALIZATION R&D


Activity

Eyal Edri January 1, 2019 at 12:20 PM

It seems there shouldn't be an inherent issue with the CQ logic. It might be that some projects that are not necessarily installed via OST don't get built for a while and so don't end up in tested.
The way to solve it is to report on a specific project once it happens, so we can debug and then decide:
1. If it's a bug in the system, fix it
2. Add a test to install that pkg in OST so CQ will fail if it's not in tested
3. Decide on another action, but we need a real use case to work on.

Please open an issue next time you see a project is missing from tested and we'll make sure to debug and see what is the best way to solve it.

Dafna Ron March 6, 2018 at 1:48 PM

We monitor and report an issue once it's identified.
We do not, however, report every change that failed because of the original root cause.

I do try to either re-trigger or re-merge changes that failed CQ once the root cause is fixed, but if an issue lasts for a day or two it is difficult to make sure we do not miss anything.

Maybe we need to add a warning message after an issue is fixed that lists all changes that need to be re-triggered or re-merged?
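
A minimal sketch of how such a list could be produced, assuming we had a record of every CQ failure with a timestamp and knew when the issue started and when it was fixed. The `Failure` record, the function name and the sample data are hypothetical and only illustrate the idea.

```python
from datetime import datetime
from typing import NamedTuple


class Failure(NamedTuple):
    change_id: str       # hypothetical change identifier
    failed_at: datetime  # when the change failed in CQ


def changes_to_retrigger(failures, issue_start, issue_fixed):
    """List each change that failed while the issue was open, oldest first."""
    to_retrigger = []
    for failure in sorted(failures, key=lambda f: f.failed_at):
        in_window = issue_start <= failure.failed_at <= issue_fixed
        if in_window and failure.change_id not in to_retrigger:
            to_retrigger.append(failure.change_id)
    return to_retrigger


# Made-up failure log: only "a1" failed while the issue was open.
failures = [
    Failure("c1", datetime(2000, 1, 1, 10, 0)),
    Failure("a1", datetime(2000, 1, 2, 9, 0)),
    Failure("a1", datetime(2000, 1, 2, 11, 0)),
    Failure("b1", datetime(2000, 1, 4, 9, 0)),
]
issue_start = datetime(2000, 1, 2, 0, 0)
issue_fixed = datetime(2000, 1, 3, 0, 0)
print(changes_to_retrigger(failures, issue_start, issue_fixed))
```

This only selects by time window; it does not check whether a change failed because of that root cause or for some other reason, so the list would still need a human to review it.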

Barak Korren March 6, 2018 at 12:55 PM

How will a package maintainer know that a package didn't land in tested?
It took me 2 months to discover we have packages not getting into tested, and I noticed just by mistake.
And I'm one of those maintainers that cares a lot about packages being published.

/infra-owner is supposed to tell you. This reporting is not fully automated yet because of false positives. Specifically WRT build failures you already get a comment in Gerrit.

Barak Korren March 6, 2018 at 12:51 PM

I'm not sure that the bisection really works then, since we still have packages released 2 months ago that have not yet landed in tested. I don't recall seeing any job triggered automatically after a change-queue-tester failure to run bisection tests.

It works. 'change-queue-tester' itself gets re-triggered by 'change-queue' with a different set of patches after failure. Only failures that were narrowed down to a single patch are reported to infra-list.
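
To make the idea concrete, here is a simplified sketch of narrowing a failed batch down to single patches by re-testing smaller sets. It is not the actual 'change-queue' code: `passes` is a hypothetical callable standing in for one 'change-queue-tester' run, and the sketch assumes a broken patch also fails when tested on its own.

```python
def find_failing_changes(changes, passes):
    """Narrow a failing batch down to the single changes that fail.

    `passes` is a hypothetical stand-in for one tester run: it takes a
    list of changes and returns True if that run succeeds.
    """
    if not changes or passes(changes):
        return []                 # nothing to test, or the whole set is fine
    if len(changes) == 1:
        return list(changes)      # narrowed down to a single patch
    middle = len(changes) // 2
    # Re-test each half separately and collect the culprits from both.
    return (find_failing_changes(changes[:middle], passes)
            + find_failing_changes(changes[middle:], passes))


# Made-up example: "c1" is the only broken change in the batch.
broken = {"c1"}


def passes(batch):
    return not any(change in broken for change in batch)


print(find_failing_changes(["a1", "b1", "c1", "a2"], passes))
```

Each recursive call corresponds to one more tester run, so with an hour per run even a small batch can take several hours to narrow down.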

You can see the state of the queue and the bisection status in a graphical display on the status page of the 'change-queue' job.

If you see something that wasn't moved to tested it can be because:

  1. the HEAD itself that was tested is really broken

  2. some build job failed because of an infra issue (e.g. the s390x/fcraw issue)

  3. OST itself was broken at the time of the test

"ci re-merge please" is for handling the last 2 reasons. Essentially, the infra owner is supposed to be monitoring the CQ reports, detecting these cases and issuing "ci re-merge please" as needed.

Sandro Bonazzola March 6, 2018 at 12:36 PM

Maybe we need to use always the official release repo as fallback to tested?

I don't think so. Tested repo should not require official release repo.
Official release should be eventually generated from tested repo, not the other way around.

In case a specific PKG failed to build and its maintainer didn't make sure to fix it and redeploy, it makes sense to get missing pkgs from the latest official release, as we do in OST.

How will a package maintainer know that a package didn't land in tested?
It took me 2 months to discover we have packages not getting into tested, and I noticed just by mistake.
And I'm one of those maintainers that cares a lot about packages being published.

Once the maintainer of the project builds a new image and it passes, it will be in tested.

That's not always true. I had several packages that required me to issue "ci re-merge please" multiple times before getting to tested repo without any code change on the project itself.

Cannot Reproduce

Details

Created March 6, 2018 at 10:44 AM
Updated August 29, 2019 at 2:12 PM
Resolved January 1, 2019 at 12:20 PM