Configure jenkins.ovirt.org to use the G1 garbage collector

Description

I noticed the Jenkins process's CPU consumption going over 100% and stalling
the web handlers. Neither I/O wait nor memory is the problem.

I suspect heavy GC activity and GC pressure, given the 12GB heap and a
fairly large number of users and requests.

We can configure GC logging to confirm that GC pauses are really the cause,
and move to the G1 garbage collector.
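Once GC logging is on, one way to check the suspicion is to total the pause times in the log. The following is a minimal sketch, assuming Java 8 style log lines in which each pause ends with ", <seconds> secs]"; the function name and the sample lines are made up for illustration and are not taken from jenkins.ovirt.org.

```python
import re

# Assumption: Java 8 style GC log lines that end a pause with ", <seconds> secs]".
PAUSE_RE = re.compile(r", (\d+\.\d+) secs\]")


def summarize_pauses(lines):
    """Return (count, total_seconds, max_seconds) for the pauses found in lines."""
    pauses = [float(m.group(1)) for line in lines for m in PAUSE_RE.finditer(line)]
    if not pauses:
        return 0, 0.0, 0.0
    return len(pauses), sum(pauses), max(pauses)


# Made-up sample lines, for illustration only.
sample = [
    "1.250: [GC (Allocation Failure)  65536K->8192K(251392K), 0.0125000 secs]",
    "9.500: [Full GC (Ergonomics)  240000K->120000K(251392K), 2.5000000 secs]",
]
count, total, longest = summarize_pauses(sample)
print(f"{count} pauses, {total:.3f}s total, longest {longest:.3f}s")
```

If the total pause time is a large share of the wall-clock time covered by the log, GC is a likely cause of the stalls. If it is small, the CPU load probably comes from somewhere else.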

See this post from CloudBees on the move to the G1 collector:
https://www.cloudbees.com/blog/joining-big-leagues-tuning-jenkins-gc-responsiveness-and-stability

Activity

Former user February 20, 2018 at 12:09 AM

patch applied to jenkins.ovirt.org

Former user February 19, 2018 at 4:13 PM

The change has been working for quite some time on Staging without any issues. At the same time, we've seen several issues where Jenkins was slow, so we will enable this in prod. Patch sent for review.

Roy Golan November 16, 2017 at 2:27 PM

+1 on this.

Noted 3 things:

  • -XX:G1SummarizeRSetStatsPeriod=1 - this setting is considered costly, according to the documentation. I think it should be dropped.

  • for diagnostics on prod and test, we had better add some GC logging. Add `-verbose:gc -Xloggc:path/to/gc.log` so we will have a clue of how well the GC is doing. BTW, it's worth adding that to prod before the switch to G1 so we can see the difference.

  • you added StringDeduplication, which is not a bad thing and probably has an effect on systems like Jenkins. I wonder how effective it would be on ovirt-engine as well. The spec for string dedup says there is a chance to reduce heap usage by about 10%.
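The first two points can be sketched as a small transformation of the option string. This is only an illustration: `apply_review` is a hypothetical helper, the options are copied from the earlier comment in this ticket, and `path/to/gc.log` is the placeholder from the comment above rather than a real location.

```python
# Proposed options, copied from the earlier comment in this ticket.
PROPOSED = (
    "-server -Djava.awt.headless=true -XX:+AlwaysPreTouch -XX:+UseG1GC "
    "-XX:+ExplicitGCInvokesConcurrent -XX:+ParallelRefProcEnabled "
    "-XX:+UseStringDeduplication -XX:+UnlockDiagnosticVMOptions "
    "-XX:G1SummarizeRSetStatsPeriod=1 -Xmx16G -Xms8G -XX:MaxPermSize=3G"
)


def apply_review(options, gc_log_path):
    """Drop the costly RSet statistics flag and append basic GC logging flags."""
    kept = [
        opt for opt in options.split()
        if not opt.startswith("-XX:G1SummarizeRSetStatsPeriod")
    ]
    return kept + ["-verbose:gc", f"-Xloggc:{gc_log_path}"]


# "path/to/gc.log" is a placeholder, not the path used on the real server.
revised = apply_review(PROPOSED, "path/to/gc.log")
print(" ".join(revised))
```

The same two logging flags could be appended to the current prod options first, so that a baseline log exists before the collector is switched.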

Former user November 16, 2017 at 2:03 PM

Nice article, we should definitely try to implement some of these optimizations.
Current Java options used by our Jenkins:
-Djava.awt.headless=true -Xmx12G -Xms4G -XX:MaxPermSize=3G

With the optimizations from the article, they should look more or less like this:

-server -Djava.awt.headless=true -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+ExplicitGCInvokesConcurrent -XX:+ParallelRefProcEnabled -XX:+UseStringDeduplication -XX:+UnlockDiagnosticVMOptions -XX:G1SummarizeRSetStatsPeriod=1 -Xmx16G -Xms8G -XX:MaxPermSize=3G

Will test this on Staging and submit a patch to apply on Prod
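To make the change easier to review, the difference between the two option sets can be listed explicitly. A minimal sketch, with both strings copied from this comment and `diff_options` being a hypothetical helper name:

```python
# Both option strings are copied from the comment above.
CURRENT = "-Djava.awt.headless=true -Xmx12G -Xms4G -XX:MaxPermSize=3G"
PROPOSED = (
    "-server -Djava.awt.headless=true -XX:+AlwaysPreTouch -XX:+UseG1GC "
    "-XX:+ExplicitGCInvokesConcurrent -XX:+ParallelRefProcEnabled "
    "-XX:+UseStringDeduplication -XX:+UnlockDiagnosticVMOptions "
    "-XX:G1SummarizeRSetStatsPeriod=1 -Xmx16G -Xms8G -XX:MaxPermSize=3G"
)


def diff_options(current, proposed):
    """Return (added, removed) options, keeping the order in which they appear."""
    cur, new = current.split(), proposed.split()
    added = [opt for opt in new if opt not in cur]
    removed = [opt for opt in cur if opt not in new]
    return added, removed


added, removed = diff_options(CURRENT, PROPOSED)
print("added:  ", " ".join(added))
print("removed:", " ".join(removed))
```

Changed values such as the heap sizes show up once in each list, which makes it clear that the heap settings change along with the collector.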

Done

Details

Created November 12, 2017 at 5:00 PM
Updated February 28, 2018 at 3:33 PM
Resolved February 20, 2018 at 12:09 AM