Getting started with Alerts

Alerts help IT support teams work proactively. They allow teams to detect issues and prioritize their efforts to improve the digital employee experience (DEX).

Before you begin

As an administrator, you should enable permissions to ensure the correct configuration and monitoring of Nexthink Alerts:

  1. Select Administration > Roles from the main navigation panel.

  2. Create a New Role or edit an existing role by hovering over it.

  3. In the Permissions section, scroll down to the Alerts section to enable appropriate permissions for the role.

Refer to the Roles documentation for a detailed description of Permissions, View domain options and Data privacy granularity settings.


What are monitors and alerts?

A monitor is a rule that you configure to regularly evaluate metrics, such as system crashes, load times, or failed connections, against defined conditions.

  • Monitors trigger an alert when the defined conditions are met.

  • Monitors can be custom-created or built-in (system monitors or installed from Nexthink Library).

An alert is therefore the result of a monitor detecting issues or anomalies in your IT environment.

  • Triggered alerts are visible in the timeline on the Alerts overview.

  • If configured, triggered alerts send email or webhook notifications to communicate issues within your organization.

  • An alert stays open until metric values stabilize and a subsequent evaluation closes it.

Explore the Alerts FAQ to learn more about alert opening and closing logic.

In addition, refer to the Managing Alerts documentation to learn about setting up and customizing monitors.

How do alerts detect issues and help me diagnose them?

Alerts notify you about issues that require swift action and actual user intervention—situations where something changed or occurred unexpectedly.

Issue detection in Alerts lets you adjust the detection granularity and determine the affected users and devices, from a single disruption to widespread incidents:

Detect global issues impacting your environment

With Alerts cloud insights, detect issues that impact your environment, which Nexthink identifies from cross-organization statistics:

  • Learn about binary reliability and performance, and detect anomalies such as abnormal CPU usage.

  • Quickly identify impacted binary versions and find the recommended version.

Refer to the Alerts overview to learn how to monitor and use alerts for diagnostic purposes.

Detect frequent issues across devices

Monitor the values of any metric across multiple devices to detect whether an aggregated metric value has breached a defined threshold or shifted by a specific percentage.

For example, the system triggers an alert when the number of crashes of any binary increases by 100% in relation to a predefined norm, like the average of a metric value over the last 7 days.
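The percentage comparison in this example can be sketched in a few lines of Python. This is an illustrative sketch only, not Nexthink's implementation: the function names, the sample data, and the choice to treat "increases by 100%" as "greater than or equal to 100%" are all assumptions.

```python
from statistics import mean


def percent_change(current: float, baseline: float) -> float:
    """Relative change of current against baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline * 100.0


def should_trigger(current: float, history: list[float],
                   threshold_pct: float = 100.0) -> bool:
    """Compare the current value with the average of the history window."""
    baseline = mean(history)
    # Assumption: the alert fires when the increase reaches the threshold.
    return percent_change(current, baseline) >= threshold_pct


# Hypothetical daily crash counts for one binary over the last 7 days.
daily_crashes_last_7_days = [4, 6, 5, 5, 4, 6, 5]  # average is 5

print(should_trigger(10, daily_crashes_last_7_days))  # True (100% increase)
print(should_trigger(9, daily_crashes_last_7_days))   # False (80% increase)
```

With a baseline of 5 crashes per day, 10 crashes is a 100% increase and meets the threshold, while 9 crashes is an 80% increase and does not.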

Refer to the Detecting issues impacting multiple devices documentation for more information.

Detect a specific device or user with issues

Monitor issues on a single device or for a specific user to subsequently trigger alerts if applicable. Send separate notifications for each device or user.

For example, the system triggers an alert for each device that had at least 2 system crashes within the last 24 hours and creates a ticket in the ITSM software on behalf of the user.
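As a rough illustration of per-device detection, the sketch below counts crash events per device inside a 24-hour window and lists every device that reaches the threshold. The event format, device names, and function name are hypothetical, and the ticket creation step is left out because it depends on your ITSM integration.

```python
from collections import Counter
from datetime import datetime, timedelta


def devices_to_alert(crash_events, now, min_crashes=2,
                     window=timedelta(hours=24)):
    """Return devices with at least min_crashes crashes inside the window."""
    cutoff = now - window
    counts = Counter(
        device for device, timestamp in crash_events
        if cutoff <= timestamp <= now
    )
    return sorted(device for device, n in counts.items() if n >= min_crashes)


# Hypothetical (device, crash time) events.
now = datetime(2024, 1, 2, 12, 0)
events = [
    ("laptop-01", datetime(2024, 1, 2, 9, 0)),
    ("laptop-01", datetime(2024, 1, 1, 20, 0)),
    ("laptop-02", datetime(2024, 1, 2, 11, 0)),
    ("laptop-03", datetime(2023, 12, 30, 8, 0)),  # outside the window
    ("laptop-03", datetime(2024, 1, 2, 10, 0)),
]

# One alert per qualifying device.
for device in devices_to_alert(events, now):
    print(f"alert: {device}")  # alert: laptop-01
```

Only laptop-01 has two crashes inside the window, so it is the only device that would receive its own alert and notification.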

Refer to the Detecting issues impacting a single device or user documentation for more information.

Detect the number of devices or users with issues

Detect whether a certain number of devices or users experienced an issue.

For example, the system triggers an alert when more than 20 devices had a boot time of over 60 seconds within the past 24 hours.
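The count-based condition can be sketched as follows. This is a minimal illustration with hypothetical names and data; it assumes the boot times have already been limited to the past 24 hours.

```python
def count_slow_boots(boot_seconds, slow_after=60.0):
    """Count devices whose boot time exceeds slow_after seconds."""
    return sum(1 for seconds in boot_seconds.values() if seconds > slow_after)


def breaches_device_count(boot_seconds, max_devices=20, slow_after=60.0):
    """True when more than max_devices devices booted slowly."""
    return count_slow_boots(boot_seconds, slow_after) > max_devices


# Hypothetical boot times (seconds) per device for the past 24 hours.
boot_seconds = {f"slow-{i}": 75.0 for i in range(21)}
boot_seconds.update({f"fast-{i}": 30.0 for i in range(100)})

print(count_slow_boots(boot_seconds))       # 21
print(breaches_device_count(boot_seconds))  # True
```

Unlike per-device detection, this condition produces a single alert for the whole group once the device count crosses the threshold.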

Explore the Alerts FAQ to learn how to investigate devices associated with an existing alert, using NQL queries.

How does the system trigger and close an alert?

Nexthink monitors trigger alerts by regularly evaluating metrics against defined conditions.

This continuous evaluation can be scheduled for regular intervals or configured for real-time monitoring to detect issues instantly, indicating how long a threshold is breached.

Regardless of the trigger method, the monitor determines whether to open a new alert, keep the current alert Open, or close it.

An alert stays open until metric values stabilize and a subsequent evaluation closes it.
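The open, keep open, and close decision described above can be pictured as a small state machine. The sketch below is a simplification under stated assumptions: the names are hypothetical, and it closes an alert as soon as one evaluation no longer meets the condition, whereas the actual closing logic described in the Alerts FAQ may differ.

```python
from enum import Enum


class Action(Enum):
    OPEN = "open new alert"
    KEEP_OPEN = "keep alert open"
    CLOSE = "close alert"
    NONE = "no alert"


def next_action(alert_is_open: bool, condition_met: bool) -> Action:
    """Decide what one evaluation does, given the current alert state."""
    if condition_met:
        return Action.KEEP_OPEN if alert_is_open else Action.OPEN
    return Action.CLOSE if alert_is_open else Action.NONE


# Simulate four consecutive evaluations of one monitor.
alert_is_open = False
for condition_met in [False, True, True, False]:
    action = next_action(alert_is_open, condition_met)
    print(action.value)
    alert_is_open = action in (Action.OPEN, Action.KEEP_OPEN)
```

The simulation prints "no alert", "open new alert", "keep alert open", and "close alert", in that order.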

Explore the Alerts FAQ to learn more about alert opening and closing logic.

Are there monitors I can use out of the box?

When you open Nexthink Alerts and Diagnostic for the first time, the system activates built-in monitors by default to track your IT environment for the most common issues.

The available built-in system monitors are listed below. You can tailor these monitors to your preferences. Refer to the Customizing built-in monitors documentation for more information.

Binary performance
  • Binary connection establishment time increase: Keeps track of the average connection establishment time per binary in the last hour, for each binary present in the environment.

  • Binary crashes - High percentage of devices impacted: Keeps track of the percentage of devices with execution crashes per binary in the last 24 hours, for each binary present in the environment.

  • Binary crashes increase: Keeps track of the number of crashes per binary in the last 24 hours, for each binary present in the environment.

  • Binary failed connection ratio increase: Keeps track of the percentage of failed connections per binary in the last hour, for each binary present in the environment.

  • Binary freezes - High percentage of devices with freezes: Keeps track of the percentage of devices with freezes per binary in the last hour, for each binary present in the environment.

  • Binary memory - Average memory usage increase: Keeps track of the average memory usage per binary in the last six hours, for each binary present in the environment.

Binary - global anomalies
  • Binary CPU usage - global anomaly: Detects anomalies in CPU usage across versions or configurations of binaries, based on anonymized data from all companies using Nexthink.

  • Binary crashes - global anomaly: Detects anomalies in crash frequency across versions or configurations of binaries, based on anonymized data from all companies using Nexthink.

  • Binary freezes - global anomaly: Detects anomalies in freeze frequency across versions or configurations of binaries, based on anonymized data from all companies using Nexthink.

  • Binary memory usage - global anomaly: Detects anomalies in memory usage across versions or configurations of binaries, based on anonymized data from all companies using Nexthink.

Binary - lagging performance
  • Binary CPU usage - lagging performance: Detects when the CPU usage of a specific binary in your organization is higher than that of other companies using the same binary. Nexthink benchmarks CPU usage with anonymized data from all companies using Nexthink.

  • Binary crashes - lagging performance: Detects when the crash frequency of a specific binary in your organization is higher than that of other companies using the same binary. Nexthink benchmarks binary crashes with anonymized data from all companies using Nexthink.

  • Binary freezes - lagging performance: Detects when the freeze frequency of a specific binary in your organization is higher than that of other companies using the same binary. Nexthink benchmarks binary freezes with anonymized data from all companies using Nexthink.

  • Binary memory usage - lagging performance: Detects when the memory usage of a specific binary in your organization is higher than that of other companies using the same binary. Nexthink benchmarks memory usage with anonymized data from all companies using Nexthink.

Device performance decline
  • Boot duration increase: Keeps track of the average device boot duration.

  • Logon duration increase: Keeps track of the average device logon duration.

  • System crashes increase: Keeps track of the number of devices with system crashes per OS platform in the last day, for each OS platform present in the environment.

Device performance and connectivity poor ratings

Detect issues based on the poor thresholds that Nexthink Administrators can configure for endpoint-related performance metrics of the Digital Employee Experience score. Refer to Ratings management for more information.

  • Boot speed: Detects devices with poor Boot duration ratings.

  • Logon speed: Detects devices with poor Login time ratings.

  • CPU usage: Detects devices with frequent (30% of the time) poor CPU usage ratings.

  • CPU interrupt usage: Detects devices with frequent (30% of the time) poor CPU interrupt ratings.

  • Disk queue length: Detects devices with frequent (30% of the time) poor Disk queue length ratings.

  • System free space: Detects devices with poor System drive free space ratings.

  • WiFi strength: Detects devices with frequent (30% of the time) poor WiFi signal strength ratings.

  • GPU 1 / GPU 2 usage: Detects devices with frequent (30% of the time) poor GPU 1 or GPU 2 usage ratings.

  • Virtual session lag: Detects devices with poor average session network latency.

  • WiFi upload speed: Detects devices with frequent (30% of the time) poor WiFi transmission rate ratings.

  • WiFi download speed: Detects devices with frequent (30% of the time) poor WiFi receive rate ratings.
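To make the "frequent (30% of the time)" wording in the list above concrete, the sketch below checks what share of a device's rating samples are poor. It is an illustration only: the rating labels, the sampling, the function name, and the inclusive comparison at exactly 30% are assumptions, not a description of how Nexthink computes these ratings.

```python
def is_frequently_poor(ratings: list[str], min_percent: int = 30) -> bool:
    """True when at least min_percent of the rating samples are 'poor'."""
    if not ratings:
        return False
    poor_count = sum(1 for rating in ratings if rating == "poor")
    # Integer comparison avoids floating-point rounding at the boundary.
    return poor_count * 100 >= min_percent * len(ratings)


# Hypothetical CPU usage rating samples for two devices.
device_a = ["good"] * 7 + ["poor"] * 3  # 30% poor
device_b = ["good"] * 8 + ["poor"] * 2  # 20% poor

print(is_frequently_poor(device_a))  # True
print(is_frequently_poor(device_b))  # False
```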

Web applications
  • Web applications errors increase: Keeps track of the increase in the number of pages with errors per web application in the last hour, for each web application defined in the Applications module.

  • Web applications slow page loads increase: Keeps track of the average page load time per web application in the last hour, for each web application defined in the Applications module.

  • Web applications slow transactions increase: Keeps track of the average transaction duration per web application in the last hour, for each web application defined in the Applications module.

  • Web applications - resource errors increase: Keeps track of the number of resource errors during the past 12 hours, for each web application configured in your environment.

  • Web applications - percentage of incomplete transactions increase: Keeps track of the percentage of incomplete transactions during the past 12 hours, for each transaction configured as part of the web application configuration in your environment.

  • Web applications - percentage of frustrating transactions per transaction increase: Keeps track of the percentage of frustrating transactions during the past 12 hours, for each transaction configured as part of the web application configuration in your environment.

  • Web applications - percentage of frustrating page loads per key page increase: Keeps track of the percentage of frustrating page loads during the past 12 hours, for each key page and web application configured in your environment.

Additionally, you can install and customize monitors from the Nexthink Library.

Library monitors for virtual desktop infrastructure—VDI

Nexthink Library offers built-in VDI-specific monitors to track real-time performance metrics and user experience in virtual desktop infrastructures.

Built-in Library VDI monitors let you address use cases such as the following:

  • Detect network congestion causing latency spikes in specific office locations.

  • Identify CPU bottlenecks affecting virtual desktops.

  • Prevent overloading of a desktop pool to maintain optimal user experience.

  • Identify network instability affecting session continuity.

