Project: oVirt - virtualization made easy
Issue: OVIRT-1870

kubevirt_kubevirt_standard-check-pr jobs often get stuck

    Details

      Description

      kubevirt_kubevirt_standard-check-pr jobs often get stuck, waiting indefinitely for a connection to the cluster that is never established. Examples:

      http://jenkins.ovirt.org/job/kubevirt_kubevirt_standard-check-pr/376/console
      http://jenkins.ovirt.org/job/kubevirt_kubevirt_standard-check-pr/377/console
      http://jenkins.ovirt.org/job/kubevirt_kubevirt_standard-check-pr/378/console
      http://jenkins.ovirt.org/job/kubevirt_kubevirt_standard-check-pr/380/console

      The following output repeats indefinitely:

      17:03:32 [check-patch.el7.x86_64] ++ awk '/virt-controller/ && /true/'
      17:03:32 [check-patch.el7.x86_64] ++ kubectl get pods -n kube-system '-ocustom-columns=status:status.containerStatuses[*].ready,metadata:metadata.name' --no-headers
      17:03:32 [check-patch.el7.x86_64] ++ wc -l
      17:03:32 [check-patch.el7.x86_64] ++ cluster/kubectl.sh get pods -n kube-system '-ocustom-columns=status:status.containerStatuses[*].ready,metadata:metadata.name' --no-headers
      17:03:44 [check-patch.el7.x86_64] Unable to connect to the server: dial tcp 192.168.121.111:6443: getsockopt: no route to host
      17:03:44 [check-patch.el7.x86_64] + '[' 0 -lt 1 ']'
      17:03:44 [check-patch.el7.x86_64] + echo 'Waiting for KubeVirt virt-controller container to become ready ...'
      17:03:44 [check-patch.el7.x86_64] Waiting for KubeVirt virt-controller container to become ready ...
      17:03:44 [check-patch.el7.x86_64] + kubectl get pods -n kube-system '-ocustom-columns=status:status.containerStatuses[*].ready,metadata:metadata.name' --no-headers
      17:03:44 [check-patch.el7.x86_64] + awk '/virt-controller/ && /true/'
      17:03:44 [check-patch.el7.x86_64] + cluster/kubectl.sh get pods -n kube-system '-ocustom-columns=status:status.containerStatuses[*].ready,metadata:metadata.name' --no-headers
      17:03:44 [check-patch.el7.x86_64] + wc -l
      17:03:56 [check-patch.el7.x86_64] Unable to connect to the server: dial tcp 192.168.121.111:6443: getsockopt: no route to host
      17:03:56 [check-patch.el7.x86_64] 0
      17:03:56 [check-patch.el7.x86_64] + sleep 10

      Timeouts need to be implemented: a stuck job occupies a bare-metal system for days or even weeks until someone kills it manually.
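
      One possible shape for such a timeout is a bounded polling loop around the readiness check seen in the log above. This is only a sketch: the helper names wait_until and virt_controller_ready, and the 600-second and 10-second values in the usage comment, are assumptions for illustration and are not taken from the KubeVirt scripts. Only the kubectl invocation and the awk pattern come from the log.

```shell
# Hypothetical helper: run a check repeatedly until it succeeds or a
# deadline passes. Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS CMD [ARGS...]
wait_until() {
    wu_timeout=$1
    wu_interval=$2
    shift 2
    wu_deadline=$(( $(date +%s) + wu_timeout ))
    until "$@"; do
        # Give up once the deadline has passed instead of looping forever.
        if [ "$(date +%s)" -ge "$wu_deadline" ]; then
            echo "Timed out after ${wu_timeout}s waiting for: $*" >&2
            return 1
        fi
        sleep "$wu_interval"
    done
}

# Readiness check modelled on the pipeline in the log: succeeds only if at
# least one virt-controller pod reports ready. A kubectl connection error
# produces no matching line, so the check simply fails and is retried.
virt_controller_ready() {
    cluster/kubectl.sh get pods -n kube-system \
        '-ocustom-columns=status:status.containerStatuses[*].ready,metadata:metadata.name' \
        --no-headers 2>/dev/null |
        awk '/virt-controller/ && /true/ { found = 1 } END { exit !found }'
}

# Example call (assumed values): fail the job after 10 minutes.
#   wait_until 600 10 virt_controller_ready || exit 1
```

      With a loop like this, an unreachable cluster would fail the job after a bounded time rather than holding the host. A job-level limit enforced by the CI system would still be a useful second line of defence, since it also covers hangs outside this loop.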

              People

              • Assignee:
                Barak Korren
                Reporter:
                Evgheni Dereveanchin