Status.ovirt.org doesn't report real status

Description

Some services are not reachable but it is not shown on status page.

Activity

Marc Dequènes (Duck)
July 5, 2017, 3:23 AM

Do you mean there are broken services right now? Apart from glance.ovirt.org, which only shows a default Apache page (and I don't know what should be tested to see if that service works), all the other services seem fine.

AFAIK Cachet does not run any tests by itself; some other tool has to report the status. So either you do it manually or Nagios is allowed to push status changes. The fact is that Nagios is in PHX IIRC, which means that if PHX is in the dark it cannot update anything.

It's nice to have monitoring inside your infra to see the stats and resource usage and so on, but if we want a real user view, we need to check from outside (not where another service is hosted), at least the main user-facing services.
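
As a rough illustration of what such an outside check could push, here is a minimal sketch, not the actual oVirt setup. It assumes the Cachet v1 API (`PUT /api/v1/components/{id}` with an `X-Cachet-Token` header, statuses 1 to 4). The thresholds, component ID, token and function names are all hypothetical. It only builds the request, so it can be tried without touching a real status page.

```python
import json
import urllib.request

# Cachet component status codes.
OPERATIONAL = 1
PERFORMANCE_ISSUES = 2
PARTIAL_OUTAGE = 3
MAJOR_OUTAGE = 4


def status_from_probe(http_code, elapsed_s, slow_after_s=5.0):
    """Map one external HTTP probe result to a Cachet status code.

    http_code is None when the probe could not connect at all.
    The mapping and the 5 second threshold are assumptions for illustration.
    """
    if http_code is None or http_code >= 500:
        return MAJOR_OUTAGE
    if http_code >= 400:
        return PARTIAL_OUTAGE
    if elapsed_s > slow_after_s:
        return PERFORMANCE_ISSUES
    return OPERATIONAL


def build_update(base_url, component_id, token, status):
    """Build (but do not send) the request that updates one component."""
    body = json.dumps({"status": status}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/components/{component_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json", "X-Cachet-Token": token},
    )


# Hypothetical usage: the probe could not connect, so report a major outage.
req = build_update("https://status.ovirt.org", 3, "secret-token",
                   status_from_probe(None, 0.0))
```

Sending it would be a matter of calling `urllib.request.urlopen(req)` from a host that is not in PHX, so the update still goes out when PHX itself is unreachable.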

Eyal Edri
July 9, 2017, 9:24 AM

No, the services are up AFAIK. I opened this during the last outage: the status page didn't show any update for the services that were down, which was most of them.
So we need to check why.

Marc Dequènes (Duck)
July 10, 2017, 5:16 AM

IIRC the last outage had PHX all in the dark, so Nagios probably could not update it. It only works if it is a service outage, not a site-wide one.

So unless we have another service outside updating the status, it will sometimes not reflect reality.

What we can do is ask Evgheni to check the Nagios update when he is back from PTO, just to be sure.

Eyal Edri
July 10, 2017, 9:46 AM

Icinga should have updated it, and it was up, since it runs from outside of PHX.

Marc Dequènes (Duck)
July 10, 2017, 12:13 PM

Yes, s/Nagios/icinga/ it's the same sh^Wstuff.

I thought it was inside, my bad, sorry.

Assignee

Evgheni Dereveanchin

Reporter

Eyal Edri

Blocked By

None

Priority

High