Configuration guide: Slow PC troubleshooting

Introduction

To get started with this workflow, please ensure all related content is installed and configured appropriately. This page provides guidance on which content is included and how to configure it.

Please keep in mind this is just a guide and represents suggested configurations. You are free to customize and edit content as you see fit based on your specific environment.

Dependencies

To use this workflow, install the necessary content into your Nexthink Infinity tenant.

Pre-requisites

This library pack contains content from the following expansion products.

Content and dependency

Configuration

Step 1) Install library pack content

Go to the Nexthink Library and install all required content.

Step 2) Configure ITSM API connector credentials

Connector credentials must be configured to enable API calls. See detailed information at https://nexthink.gitbook.io/opd/integrations/outbound-connectors/connector-credentials. Each Service/API thinklet has a dropdown field for credentials that must be filled in. When the workflow is installed or copied from the Library, this field is blank because credentials are set up locally in each environment and are not included in the Library.

ServiceNow actions can be created using the built-in ServiceNow connector. Select the required action and the connector credentials from the dropdown lists; the available parameters change in line with the action chosen.

Step 3) Configure remote action(s)

Please note: To be used in a workflow, the following remote actions must be configured with a manual trigger. This can be combined with other execution triggers if the remote action is also used outside of a workflow.

Twelve remote actions are used by and included with this workflow. For the workflow to function correctly, some of these remote actions are triggered with input parameters that differ from the defaults (for example, the remote action “Disk cleanup” is triggered with a “Deep Clean”). Custom campaigns created for this workflow may launch before a remote action is triggered, to request permission to run it, or after a remote action has run, depending on its outputs. As a result, it may be necessary to change the default behaviour of certain remote actions by allowing custom input parameters. This prevents the campaigns built into the remote action from launching and causing confusion.

Step 5) Configure campaigns

This workflow contains eight Engage campaigns. During the guided process of troubleshooting the device, these campaigns appear to give advice or when an intervention on the device is required.

These campaigns should be modified before use to ensure that they match corporate communication guidelines. Navigate to the manage campaigns administration page to review and edit your campaigns.

For each installed campaign, make sure to:

  • Customize the sender name and image.

  • Review and adjust questions.

  • Publish the campaign when you are ready to use it.

Step 6) Schedule the workflow

Manual triggering

This workflow is not designed to be scheduled on a regular basis; it is meant to be executed manually on devices that could benefit from it. It is principally designed to automate the normal steps that an L1 support agent would perform when a slow-running device is reported, and it could be added as a diagnostic step in that process.

Proactive detection

If you want to run this workflow proactively on devices that are suspected to be running slow, keep in mind that identifying target devices can be subjective: many factors are involved, and they may differ from customer to customer. Because perceived slowness is likely to be due to a combination of factors, we offer several metrics that report on different aspects of a device’s performance. Some metrics that could be considered are:

  • CPU usage

  • CPU queue length (if the CPU queue length exceeds double the number of available logical processors for an extended period of time, this can indicate an issue with the device)

  • CPU interrupt usage (an extended duration of CPU interrupt usage can indicate high workload)

  • Disk queue length (not in itself an indicator of slowness, but useful in conjunction with other metrics)

  • Memory swap rate to disk (frequent spikes can be the result of insufficient or overused memory)

  • Memory swap size (continued usage can be the result of insufficient or overused memory)

  • Available memory ratio
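The CPU queue length rule in the list above can be read as a simple ratio check. The following is a minimal Python sketch for illustration only; the function names and sample values are hypothetical and are not part of the Nexthink product or this library pack.

```python
def cpu_queue_ratio(avg_queue_length: float, logical_processors: int) -> float:
    """Return the average CPU queue length divided by twice the number
    of logical processors. A value of 1.0 means the queue length equals
    double the processor count."""
    if logical_processors <= 0:
        raise ValueError("logical_processors must be a positive integer")
    return avg_queue_length / (logical_processors * 2)


def queue_exceeds_guideline(avg_queue_length: float, logical_processors: int) -> bool:
    """True when the queue length exceeds double the logical processor count."""
    return cpu_queue_ratio(avg_queue_length, logical_processors) > 1.0


# Hypothetical device with 8 logical processors.
print(queue_exceeds_guideline(12, 8))  # queue below 16
print(queue_exceeds_guideline(20, 8))  # queue above 16
```

The check only becomes meaningful when the average is taken over an extended period, as noted in the list; a short spike above the guideline is not necessarily a problem.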

A dashboard containing KPIs and tables with sample queries is provided with this pack. All queries provided have thresholds that should be adjusted and query logic (permutations of and/or operators) that should be tested against devices in your landscape before use.

devices during past 24h
| include device_performance.events
| compute cpu_queue_ratio = cpu_queue_length.avg()/(number_of_logical_processors.avg()*2), cpu_usage_ = normalized_cpu_usage.avg(), high_interrupt = duration_with_high_cpu_interrupt_usage.sum(), medium_interrupt = duration_with_medium_cpu_interrupt_usage.sum(), mem_swap_rate = memory_swap_rate.avg(), mem_swap_size = memory_swap_size.avg(), mem_ratio = used_memory.avg() / installed_memory.avg(), disk_queue_length_ = disk_queue_length.avg(), disk_space_ratio = system_drive_free_space.avg() / system_drive_capacity.avg()
| where cpu_queue_ratio >= 1 or cpu_usage_ > 80 or high_interrupt > 1min or medium_interrupt > 5min or (mem_swap_size > 5GB and mem_swap_rate > 1MB) or mem_ratio > 0.85 or disk_queue_length_ >= 1 or disk_space_ratio >= 0.90
| include execution.events
| compute response_time = connection_establishment_time.avg()
| where response_time >= 50ms
| sort cpu_queue_ratio desc
| sort disk_queue_length_ desc
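
When adjusting thresholds, it can help to see which individual condition causes a device to match. The following Python sketch mirrors the and/or logic of the sample query so that it can be tried against exported metric values. It is an illustration only: the class, field names, and units (seconds, bytes, milliseconds, binary GB and MB) are assumptions and are not part of the Nexthink product.

```python
from dataclasses import dataclass

GB = 1024 ** 3  # assumption: binary gigabytes
MB = 1024 ** 2  # assumption: binary megabytes


@dataclass
class DeviceMetrics:
    """Hypothetical container for the values computed by the sample query."""
    cpu_queue_ratio: float
    cpu_usage: float            # percent
    high_interrupt_s: float     # seconds
    medium_interrupt_s: float   # seconds
    mem_swap_size: float        # bytes
    mem_swap_rate: float        # bytes, compared against the query's 1MB
    mem_ratio: float
    disk_queue_length: float
    disk_space_ratio: float
    response_time_ms: float


def triggered_conditions(m: DeviceMetrics) -> list:
    """Return the names of the first where clause conditions that match."""
    checks = {
        "cpu_queue_ratio": m.cpu_queue_ratio >= 1,
        "cpu_usage": m.cpu_usage > 80,
        "high_interrupt": m.high_interrupt_s > 60,
        "medium_interrupt": m.medium_interrupt_s > 5 * 60,
        "memory_swap": m.mem_swap_size > 5 * GB and m.mem_swap_rate > 1 * MB,
        "mem_ratio": m.mem_ratio > 0.85,
        "disk_queue_length": m.disk_queue_length >= 1,
        "disk_space_ratio": m.disk_space_ratio >= 0.90,
    }
    return [name for name, hit in checks.items() if hit]


def is_candidate(m: DeviceMetrics) -> bool:
    """A device matches when any condition triggers AND the response time
    meets the second where clause (>= 50 ms)."""
    return bool(triggered_conditions(m)) and m.response_time_ms >= 50


# Hypothetical sample: only CPU usage is above its threshold.
busy = DeviceMetrics(0.4, 92.0, 10, 30, 1 * GB, 0.2 * MB, 0.60, 0.3, 0.40, 80)
print(triggered_conditions(busy), is_candidate(busy))
```

Because the first where clause joins its conditions with "or", a single aggressive threshold can dominate the result set; listing the triggered conditions per device makes this easier to spot before the query is used for targeting.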

Usage guide

Your content is now configured and ready to be used. For usage overview and recommendations, you can visit the usage guide:

Usage guide: Slow PC troubleshooting
