OST jobs fail with "address already in use"

Description

Evgheni,
Were there any recent changes to the Lago slaves?

On Fri, Oct 20, 2017 at 11:05 AM, Piotr Kliczewski <piotr.kliczewski@gmail.com> wrote:

> I ran OST manually twice, and both runs failed with the issue below.
> Can someone take a look?
>
> Thanks,
> Piotr
>
> 2017-10-20 07:59:12,485::log_utils.py::__exit__::607::ovirtlago.prefix::DEBUG::
>   File "/usr/lib/python2.7/site-packages/lago/log_utils.py", line 636, in wrapper
>     return func(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/ovirtlago/reposetup.py", line 111, in wrapper
>     with utils.repo_server_context(args[0]):
>   File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
>     return self.gen.next()
>   File "/usr/lib/python2.7/site-packages/ovirtlago/utils.py", line 100, in repo_server_context
>     root_dir=prefix.paths.internal_repo(),
>   File "/usr/lib/python2.7/site-packages/ovirtlago/utils.py", line 76, in _create_http_server
>     generate_request_handler(root_dir),
>   File "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__
>     self.server_bind()
>   File "/usr/lib64/python2.7/BaseHTTPServer.py", line 108, in server_bind
>     SocketServer.TCPServer.server_bind(self)
>   File "/usr/lib64/python2.7/SocketServer.py", line 430, in server_bind
>     self.socket.bind(self.server_address)
>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>     return getattr(self._sock,name)(*args)
>
> 2017-10-20 07:59:12,485::cmd.py::do_run::365::root::ERROR::Error occured, aborting
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 362, in do_run
>     self.cli_plugins[args.ovirtverb].do_run(args)
>   File "/usr/lib/python2.7/site-packages/lago/plugins/cli.py", line 184, in do_run
>     self._do_run(**vars(args))
>   File "/usr/lib/python2.7/site-packages/lago/utils.py", line 501, in wrapper
>     return func(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/lago/utils.py", line 512, in wrapper
>     return func(*args, prefix=prefix, **kwargs)
>   File "/usr/lib/python2.7/site-packages/ovirtlago/cmd.py", line 166, in do_deploy
>     prefix.deploy()
>   File "/usr/lib/python2.7/site-packages/lago/log_utils.py", line 636, in wrapper
>     return func(*args, **kwargs)
>   File "/usr/lib/python2.7/site-packages/ovirtlago/reposetup.py", line 111, in wrapper
>     with utils.repo_server_context(args[0]):
>   File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
>     return self.gen.next()
>   File "/usr/lib/python2.7/site-packages/ovirtlago/utils.py", line 100, in repo_server_context
>     root_dir=prefix.paths.internal_repo(),
>   File "/usr/lib/python2.7/site-packages/ovirtlago/utils.py", line 76, in _create_http_server
>     generate_request_handler(root_dir),
>   File "/usr/lib64/python2.7/SocketServer.py", line 419, in __init__
>     self.server_bind()
>   File "/usr/lib64/python2.7/BaseHTTPServer.py", line 108, in server_bind
>     SocketServer.TCPServer.server_bind(self)
>   File "/usr/lib64/python2.7/SocketServer.py", line 430, in server_bind
>     self.socket.bind(self.server_address)
>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>     return getattr(self._sock,name)(*args)
> error: [Errno 98] Address already in use

Eyal Edri

MANAGER

RHV DevOps

EMEA VIRTUALIZATION R&D

phone: +972-9-7692018
irc: eedri (on #tlv #rhev-dev #rhev-integ)
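
For context, `[Errno 98] Address already in use` at the end of the traceback is what the operating system reports when `bind()` is called on an address and port that another socket already holds, possibly a server left over from an earlier run on the same host. Below is a minimal Python 3 sketch that reproduces the condition (the traceback itself comes from Python 2.7). The function name `bind_twice` is hypothetical and is not part of Lago or OST.

```python
import errno
import socket


def bind_twice(host="127.0.0.1"):
    """Bind two sockets to the same port and return the errno of the second bind.

    Returns None if the second bind unexpectedly succeeds.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the OS for any free port, so the sketch never
        # collides with a real service.
        first.bind((host, 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # Same address and port while `first` is still listening.
            second.bind((host, port))
        except OSError as exc:
            return exc.errno
        return None
    finally:
        second.close()
        first.close()


if __name__ == "__main__":
    code = bind_twice()
    print(code == errno.EADDRINUSE, errno.errorcode.get(code))
```

On Linux `errno.EADDRINUSE` is 98, which matches the number in the log.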

Activity

Former user February 28, 2018 at 2:24 PM

It was done in https://gerrit.ovirt.org/c/87164/

I'll close the ticket.

Eyal Edri February 28, 2018 at 2:13 PM

Any update?

Former user February 1, 2018 at 7:02 AM

We'll probably add a cleanup function to the global setup to ensure that Lago can run smoothly. I'll talk with offline to see what needs to be done.
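
As an illustration of the kind of pre-run check such a cleanup step might include, here is a minimal Python 3 sketch that tests whether a port can be bound and waits briefly for it to be released. This is an assumption about one possible approach, not what the eventual change implemented; `port_is_free` and `wait_until_free` are hypothetical names and are not part of Lago, OST, or mock_runner.

```python
import socket
import time


def port_is_free(host, port):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: something else still holds the port.
            return False
        return True


def wait_until_free(host, port, timeout=5.0, interval=0.1):
    """Poll until the port can be bound or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if port_is_free(host, port):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A setup step could call `wait_until_free(host, port)` before starting its server and fail early with a clear message if it returns False. A check like this is inherently racy, since another process can take the port between the probe and the real bind, so it helps with diagnosis but does not replace stopping the leftover process.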

Eyal Edri January 31, 2018 at 2:43 PM

So what is the alternative fix?

Former user January 31, 2018 at 2:20 PM

We've abandoned the fix on the mock_runner side for now.

Eyal Edri January 31, 2018 at 2:16 PM

What was the latest fix for this issue? Did we abandon the fix in mock_runner? Is there a plan to fix it on the Lago side?

Dafna Ron January 31, 2018 at 12:59 PM

We had 2 failures today on network in use.
In the latest one, I can see that the build that ran before it failed on a libvirt issue.

Here is the failed build: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5173/
Here is the build that failed before: http://jenkins.ovirt.org/computer/ovirt-srv17.phx.ovirt.org/builds
This is the host: http://jenkins.ovirt.org/computer/ovirt-srv17.phx.ovirt.org/

Eyal Edri October 31, 2017 at 2:41 PM

I'm not sure there is anything else to do here other than solving it on the Lago side, and we have a ticket there.
Feel free to close if we have found the source (the networking suite) and taught the maintainer how to use lago serve.

Fixed

Details

Created October 20, 2017 at 8:28 AM
Updated February 28, 2018 at 3:33 PM
Resolved February 28, 2018 at 2:25 PM