Getting started with Alerts
Last updated
Alerts are critical enablers in the proactive journey of IT support teams. They allow teams to detect issues and help them prioritize their efforts to improve the digital employee experience (DEX).
Nexthink Alerts notifies you about issues that require swift action by filtering the noise so you can identify situations that require actual user intervention. Use alerts to identify situations where something has unexpectedly changed or occurred.
Detect issues that Nexthink identifies from cross-organization statistics and that impact your environment, referred to as Cloud insights. Learn about binary reliability and performance, and detect anomalies such as abnormal CPU usage. Identify impacted binary versions and find the recommended version.
For example, the system triggers an alert when more than 50 devices use a certain binary configuration (binary version on a given operating system version) that consumes more memory than other configurations across many organizations.
Refer to the Alerts overview and Understanding cloud insights pages for more information.
Proactively monitor issues that impact multiple devices or incidents of sudden degradation.
Detect whether a certain number of devices or users experienced an issue.
For example, the system triggers the alert when more than 20 devices had a boot time of over 60 seconds during the past 24 hours.
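The boot-time example above can be sketched as a simple count-based check. This is only an illustration of the logic, not Nexthink's implementation: the function name `should_trigger`, its parameters, and the input mapping of device names to boot times are all hypothetical.

```python
from typing import Mapping


def should_trigger(boot_times_s: Mapping[str, float],
                   boot_limit_s: float = 60.0,
                   device_limit: int = 20) -> bool:
    """Return True when more than `device_limit` devices booted slower than `boot_limit_s`.

    `boot_times_s` maps a device name to its boot time in seconds, as
    observed during the evaluation window (assumed here: the past 24 hours).
    """
    slow_devices = [name for name, seconds in boot_times_s.items()
                    if seconds > boot_limit_s]
    return len(slow_devices) > device_limit
```

Both comparisons are strict, mirroring the wording "more than 20 devices" and "over 60 seconds".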
Monitor the values of any metric across multiple devices. Detect whether an aggregated metric value has breached the defined threshold or shifts by a specific percentage.
For example, the system sends an alert when the number of crashes of any binary increases by 100% in relation to a predefined norm, like the average of a metric value over the last 7 days.
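As a rough sketch of the arithmetic behind this example, the percentage change can be computed against the average of the last 7 daily values. The helper names `percent_change` and `crash_alert` are hypothetical, and the actual calculation used by the product may differ.

```python
from statistics import mean
from typing import Sequence


def percent_change(current: float, history: Sequence[float]) -> float:
    """Percentage change of `current` relative to the average of `history`."""
    baseline = mean(history)
    if baseline == 0:
        raise ValueError("baseline is zero; percentage change is undefined")
    return (current - baseline) / baseline * 100.0


def crash_alert(current: float, history: Sequence[float],
                threshold_pct: float = 100.0) -> bool:
    """Return True when the metric increased by at least `threshold_pct` percent."""
    return percent_change(current, history) >= threshold_pct
```

With a 7-day history of 10 crashes per day, a current value of 20 is a 100% increase and meets the threshold, while 19 does not.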
Refer to the Detecting issues impacting multiple devices page for more information.
Monitor issues on a single device or for a specific user. Send separate notifications for each device or user.
For example, the system triggers an alert for each device that had at least 2 system crashes during the last 24 hours and creates a ticket in the ITSM software on behalf of the user.
Nexthink limits the total number of objects that trigger the same alert to 500, avoiding alert flooding and keeping the required relevancy of individual alerts.
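The effect of the 500-object limit can be illustrated with a small sketch. How Nexthink selects which objects are kept when more than 500 match is not stated here, so the alphabetical ordering below is purely an assumption for the example, as are the names `devices_to_alert` and `MAX_OBJECTS_PER_ALERT`.

```python
from typing import Dict, List

# Limit stated in the text above; the constant name is hypothetical.
MAX_OBJECTS_PER_ALERT = 500


def devices_to_alert(crash_counts: Dict[str, int],
                     min_crashes: int = 2,
                     cap: int = MAX_OBJECTS_PER_ALERT) -> List[str]:
    """Return the devices that would each get an alert, capped at `cap`.

    Devices are sorted by name only to make the example deterministic;
    the real selection order is not documented in this page.
    """
    matching = sorted(name for name, crashes in crash_counts.items()
                      if crashes >= min_crashes)
    return matching[:cap]
```

If 600 devices match the condition, only 500 of them produce an alert, which is why a data export is the better fit for large result sets.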
Refer to the Detecting issues impacting a single device or user page for more information.
Do not use alerts for reporting purposes that do not require immediate assistance or action. For example, to report all devices with low disk space, Data Exporter capabilities are more suitable.
Use data exporters to report on a large number of objects that meet specific condition criteria that you can express with an NQL query, or if you expect that the system might trigger more than 500 alerts at the same time.
Additionally, use the data export scheduling option to export data on a regular basis. Refer to the Data Exporter page for more information.
An alert is a special type of event triggered when specific conditions are met for the performance metrics of different features of your IT infrastructure, for example, system crashes, load times, or failed connections. The system sends alerts in the form of an email or a webhook notification informing your IT teams about issues occurring within your organization. Triggered alerts are visualized in the timeline on the Alerts overview page.
A monitor is a component of the Alerts and Diagnostics module that you can configure to evaluate metrics against defined conditions and trigger alerts to identify specific issues. With monitors, the Nexthink platform offers anomaly detection capabilities for IT environments and allows you to notify users accordingly.
Refer to the Managing Alerts page for more information about monitors, monitor types, and how to create them.
Nexthink alerts detect critical issues based on the following detection modes:
Metric threshold: triggers an alert when the value of one or more metrics reaches a user-defined threshold.
Metric change: triggers an alert when the value of the metric changes relative to the reference baseline, which is the average of the metric values retrieved over the past 7 days. This option is available only for built-in monitors.
Metric seasonal change: compares the value of the metric with the expected average value of the last 7 days at the same time of day. The monitor triggers an alert when the value falls outside the expected range, calculated using standard deviations. This option is available only for built-in monitors.
Global detection: triggers an alert when a specified number of devices use a particular binary version or binary configuration that performs worse than other versions or configurations across organizations using Nexthink. You can adjust the threshold for this alert within your organization. This option is available only for system monitors.
Refer to the Customizing built-in monitors documentation page for more information about detection types.
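To make the seasonal change mode more concrete, here is a minimal sketch of a standard-deviation range check. It assumes the expected range is the mean of the same-hour values from the past 7 days plus or minus a number of standard deviations. The multiplier `num_std` and the function name are assumptions, not documented product settings.

```python
from statistics import mean, stdev
from typing import Sequence


def outside_expected_range(current: float,
                           same_hour_history: Sequence[float],
                           num_std: float = 3.0) -> bool:
    """Return True when `current` falls outside mean +/- num_std * stdev.

    `same_hour_history` holds the metric values observed at the same time
    of day on each of the previous days (at least two values are required).
    """
    expected = mean(same_hour_history)
    spread = stdev(same_hour_history)
    lower = expected - num_std * spread
    upper = expected + num_std * spread
    return current < lower or current > upper
```

Comparing against the same time of day avoids flagging normal daily patterns, such as a load peak every morning, as anomalies.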
Each NQL query-based monitor evaluates the metric(s) at regular intervals, according to the schedule defined in the specific monitor. During each evaluation, it determines whether to trigger a new alert, close the open alert, or keep the alert open.
The alert is triggered when the condition criteria defined in the monitor are breached during a scheduled evaluation. Once the alert is triggered, it remains in the Open state until the metric values stabilize and the alert is closed during one of the subsequent evaluations.
The system closes the alert when any of the monitored metrics no longer breaches the conditions defined.
If the monitor tracks metric threshold, the system closes the alert when any of the monitored metrics no longer breaches the threshold.
If the monitor tracks metric change, the system closes the alert when that metric value drops back to the baseline.
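The evaluation outcomes described above (trigger, keep open, close) can be summarized as a small decision function. This is a simplified sketch that reduces each evaluation to a single "breached" flag. The `Action` enum and `evaluate` function are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    TRIGGER = "trigger"      # open a new alert
    KEEP_OPEN = "keep_open"  # alert stays in the Open state
    CLOSE = "close"          # close the open alert
    NONE = "none"            # nothing open, nothing breached


def evaluate(is_open: bool, breached: bool) -> Action:
    """Decide what one scheduled evaluation does to the alert."""
    if breached and not is_open:
        return Action.TRIGGER
    if breached and is_open:
        return Action.KEEP_OPEN
    if not breached and is_open:
        return Action.CLOSE
    return Action.NONE
```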
If the monitor query does not return any data during evaluation, the alert closes automatically according to the following rules:
For alerts that track aggregated metrics across multiple devices, the alert will close if there have been 3 consecutive days of no data returned.
For alerts triggered for a single device or user, the alert will close if the monitor query continuously returns no data during the period specified in the during past parameter of the query.
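The two no-data rules differ only in the length of the waiting period, which the following sketch makes explicit. It assumes the system tracks when the query last returned data. The function and constant names are hypothetical, and the 24-hour window for the single-device case is just an example of a `during past` value.

```python
from datetime import datetime, timedelta

# Multi-device alerts: 3 consecutive days without data (from the rule above).
MULTI_DEVICE_NO_DATA_WINDOW = timedelta(days=3)


def should_close_for_no_data(last_data_at: datetime,
                             now: datetime,
                             window: timedelta) -> bool:
    """Return True when no data has been returned for at least `window`."""
    return now - last_data_at >= window


# Example: a single-device monitor whose query uses a 24-hour time frame.
single_device_window = timedelta(hours=24)
now = datetime(2024, 1, 10, 12, 0)
closes = should_close_for_no_data(datetime(2024, 1, 9, 11, 0), now,
                                  single_device_window)
```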
If you have configured the notifications for your alert, the system sends them only when the alert is triggered and when the alert is closed. If the alert was triggered during any previous evaluation and already has the Open status, the system will not send the notification if the metric still meets the detection criteria in the current evaluation.
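The notification behavior can be illustrated by replaying a series of evaluation results and recording when a notification would go out. This is a sketch of the rule as described, with hypothetical names. It models each evaluation as a boolean that says whether the detection criteria were met.

```python
from typing import Iterable, List


def notifications(breach_results: Iterable[bool]) -> List[str]:
    """Return the notifications sent over a sequence of evaluations.

    A notification is recorded only when the alert changes state:
    once when it is triggered and once when it is closed.
    """
    sent: List[str] = []
    is_open = False
    for breached in breach_results:
        if breached and not is_open:
            sent.append("triggered")
            is_open = True
        elif not breached and is_open:
            sent.append("closed")
            is_open = False
        # Otherwise the state is unchanged and nothing is sent.
    return sent
```

For example, the sequence `[False, True, True, True, False]` yields two notifications, even though the criteria were met in three consecutive evaluations.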
To enable proper permissions for Alerts, as an administrator, do the following:
Select Administration from the main menu.
Click on Role from the navigation panel.
Click on the New Role button to create a new role or edit an existing role by hovering over it and clicking on the edit icon to change the role configuration.
In the Permissions section, scroll down to the Alerts section to enable the appropriate permissions.
Refer to the Roles page for a detailed description of the possible options.
The table below showcases what users with full and limited view domain access can do, assuming the necessary permissions are enabled.
| Permission | Full access | Limited access |
|---|---|---|
| Manage all alerts | | |
| View all alert dashboards | | |
Users with full access to view domain and the necessary permissions can:
Manage all alerts.
View all alert dashboards.