Platform status operation
Learn how to use Platform status to monitor the health of the Reltio platform.
The Reltio Platform Status page is built on the Atlassian Statuspage (statuspage.io) service, so it is completely decoupled from Reltio's infrastructure. For more information, see Statuspage resources.
- cloud level
- infrastructure components
- platform service components
- performance check
- health check
A set of automated alerts is defined for each layer of each service. The SRE and NOC teams monitor these alerts 24/7 and respond through a follow-the-sun on-call process.
If an incident occurs, an alert or notification is published and the teams triage the issue. If the incident can't be resolved automatically, the appropriate team investigates and resolves it.
The NOC frequently updates the incident report log of an ongoing incident until it is resolved. After the incident is detected, a Root Cause Analysis (RCA) is published to drive the uncovered root cause and any related issues to completion.
- View platform status: Check the overall current status of all Reltio environments in all clouds over the previous 3 months. For more details, see View platform status.
- Monitor cloud service health: Monitor the health of cloud services and components from the current day back through the preceding months. For more details, see Monitor cloud service health.
- View historical uptime metrics: View historical uptime metrics for a specified 3-month period. For details, see View historical uptime metrics.
- Subscribe to platform status alerts: Subscribe to automated email alerts to regularly monitor the health of cloud services in your Reltio Data Cloud. For details, see Subscribe to platform status alerts.
The topics in this section provide step-by-step instructions for these tasks.
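Because the Status page is based on Statuspage, it may also be possible to check service health from a script. The following is a minimal sketch, not a documented Reltio integration. It assumes the page exposes the standard public Statuspage v2 summary endpoint (`/api/v2/summary.json`). The base URL is a placeholder, and the function names and sample data are hypothetical; confirm the actual address and availability of the endpoint before relying on it.

```python
import json
from urllib.request import urlopen

# Assumption: placeholder address. Replace it with the real status page URL.
BASE_URL = "https://status.example.com"


def fetch_summary(base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """Download the public Statuspage summary document (assumed to be enabled)."""
    with urlopen(f"{base_url}/api/v2/summary.json", timeout=timeout) as resp:
        return json.load(resp)


def overall_status(summary: dict) -> str:
    """Return the human-readable overall status, or 'unknown' if it is missing."""
    return summary.get("status", {}).get("description", "unknown")


def degraded_components(summary: dict) -> list:
    """Return (name, status) pairs for every component that is not operational."""
    return [
        (component["name"], component["status"])
        for component in summary.get("components", [])
        if component.get("status") != "operational"
    ]


# Illustrative payload in the shape of a Statuspage summary; not real Reltio data.
SAMPLE_SUMMARY = {
    "status": {"indicator": "minor", "description": "Partially Degraded Service"},
    "components": [
        {"name": "Example service A", "status": "operational"},
        {"name": "Example service B", "status": "partial_outage"},
    ],
}

if __name__ == "__main__":
    # Swap SAMPLE_SUMMARY for fetch_summary() to query a live status page.
    print(overall_status(SAMPLE_SUMMARY))
    for name, status in degraded_components(SAMPLE_SUMMARY):
        print(f"{name}: {status}")
```

Parsing is kept separate from the network call so that the filtering logic can be checked against a saved payload without contacting the status page.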