Example of self-healing scenario in Finder (classic)

Nexthink Finder is a Windows-only desktop application whose functionality is now available within the Nexthink web interface. Nexthink can now be used directly from a browser and most functions no longer require an additional desktop application.

For their daily operation, the devices of a public company must continuously run an enterprise application called MustRun. This application is not very stable, however, and it crashes intermittently. When mustrun.exe crashes, it is unable to shut down properly. As a result, the application usually leaves one of its log files in an inconsistent state. The corruption of this log file prevents MustRun from starting up again, until the file is deleted.

As soon as an employee realizes that MustRun is no longer running on a device, they must restart the application by deleting the corrupted log file and then relaunching mustrun.exe. This procedure is inconvenient for an employee and has a negative impact on their productivity. In addition, inexperienced employees who are not yet familiar with the problem are very likely to require support from the help desk in order to solve it.

In this example, learn how to leverage Nexthink Remote Actions to automatically detect the crash of mustrun.exe, delete the log file, and restart the application. Automating the process results in increased productivity of the employee and a reduction in the number of reported incidents.

Creating the remote action

To remediate the problem of the MustRun application with Remote Actions, create and schedule a remote action. The remote action will include:

- An investigation that selects those devices on which MustRun has recently crashed.
- A PowerShell script that deletes the corrupted log file and restarts the application on the selected devices.

Defining the target investigation

Start by creating an investigation that detects the crashes of the executable file mustrun.exe and returns the devices on which they happened. Set the time frame of the investigation to span the last 30 minutes. Combined with appropriate scheduling of the remote action, this time frame ensures that no reported application crash is missed. This is a very conservative choice: because Collector reports application crash events quickly, a time frame of around 10 minutes should be equally valid.

Since Nexthink detects application crashes, the investigation in the remote action in this example is already able to return the specific devices on which the problem has occurred. On the other hand, in cases where Nexthink does not retrieve the information needed to detect the issue by default, for example a change in the value of a registry key, check the faulty condition in the script of the remote action. When the script checks the occurrence of a problem, the associated investigation must target all potentially impacted devices.

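Therefore, the problem detection mechanism of a remote action is either:

- Investigation-based: the investigation itself returns only the devices on which the problem occurred.
- Script-based: the investigation targets all potentially impacted devices, and the script checks for the faulty condition.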
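
As an illustration of the script-based approach, the following sketch checks a registry value inside the script and reports whether the faulty condition is present. The registry path, value name, and expected value are hypothetical placeholders, and the function name Test-FaultyRegistryValue is invented for this example. The remediation itself and the reporting of the result to the Engine are left out.

```powershell
# Hypothetical registry location and expected value; adapt them to your own case.
$keyPath   = 'HKLM:\SOFTWARE\MustRun'
$valueName = 'LogMode'
$expected  = 'Safe'

function Test-FaultyRegistryValue {
    param(
        [string] $Path,
        [string] $Name,
        [string] $Expected
    )
    # In this sketch, a missing key or value is treated as faulty.
    $item = Get-ItemProperty -Path $Path -Name $Name -ErrorAction SilentlyContinue
    if ($null -eq $item) { return $true }
    return ($item.$Name -ne $Expected)
}

if (Test-FaultyRegistryValue -Path $keyPath -Name $valueName -Expected $expected) {
    # Remediation steps would go here.
    Write-Output 'Faulty condition detected'
} else {
    Write-Output 'Nothing to do'
}
```

Because the check runs on every targeted device, the script should exit quickly when nothing needs to be fixed.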

Scheduling the remote action

After saving the investigation, create a remote action that periodically evaluates the investigation and runs the remediation script on the selected devices.

To properly schedule the remote action, configure the two periods:

Evaluation period

The time interval between two evaluations of the associated investigation. In our example, this value indicates how often the remote action checks for MustRun crashes. To avoid missing any application crash event, keep this period shorter than or equal to the time frame of the associated investigation. The shorter the evaluation period, the more responsive the remote action, but the more load it puts on the system. To detect application crashes, an evaluation period of 10 minutes should be responsive enough. For critical applications, select an evaluation period as short as 1 minute.

Triggering period

The time interval between two consecutive triggerings of the remote action. For a remote action that detects issues by means of an investigation on events, such as application crashes, set the triggering period equal to the time frame of the investigation (30 minutes in the example). This ensures that the execution of the script is not triggered more than once for the same event.
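
The two scheduling rules can be expressed as a small consistency check. This is only an illustrative sketch: the function name Test-RemoteActionSchedule and its parameters are hypothetical and not part of any Nexthink API. The periods themselves are configured in the editor of remote actions, not through a script.

```powershell
function Test-RemoteActionSchedule {
    param(
        [int] $TimeFrameMinutes,    # time frame of the investigation
        [int] $EvaluationMinutes,   # evaluation period
        [int] $TriggeringMinutes    # triggering period
    )
    $issues = @()
    if ($EvaluationMinutes -gt $TimeFrameMinutes) {
        $issues += 'Evaluation period exceeds the time frame: crash events may be missed.'
    }
    if ($TriggeringMinutes -lt $TimeFrameMinutes) {
        $issues += 'Triggering period is shorter than the time frame: the script may run more than once for the same event.'
    }
    # The leading comma keeps the result an array even when it is empty.
    return ,$issues
}

# Values from the example: 30-minute time frame, 10-minute evaluation, 30-minute triggering.
$issues = Test-RemoteActionSchedule -TimeFrameMinutes 30 -EvaluationMinutes 10 -TriggeringMinutes 30
```

With the values from the example, the check reports no issues. Raising the evaluation period above 30 minutes would produce the first warning.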

To associate the previously created investigation with the remote action, drag and drop the investigation onto the appropriate area of the editor of remote actions.

Adding the PowerShell script

Open your favorite text editor and type in the remediation script. Remember to encode the text of the script in UTF-8 with BOM when saving the file.
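
If your editor does not offer an explicit UTF-8 with BOM option, one way to write the file and verify its encoding from PowerShell is sketched below. The file name and the script text are placeholders chosen for this example.

```powershell
# Placeholder path and content for illustration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'restart-mustrun.ps1'
$text = '# remediation script goes here'

# UTF8Encoding($true) emits the byte order mark (EF BB BF) at the start of the file.
$utf8Bom = New-Object System.Text.UTF8Encoding($true)
[System.IO.File]::WriteAllText($path, $text, $utf8Bom)

# Verify that the first three bytes are the UTF-8 BOM.
$bytes  = [System.IO.File]::ReadAllBytes($path)
$hasBom = $bytes.Length -ge 3 -and $bytes[0] -eq 0xEF -and $bytes[1] -eq 0xBB -and $bytes[2] -eq 0xBF
```

The variable $hasBom is true when the file starts with the expected three bytes.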

The script does the following:

- Adds the Nexthink dynamic library that deals with remote actions (nxtremoteactions.dll) by means of the Add-Type cmdlet.
- Initializes the result of the script to the empty string.
- Initializes two variables with:
  - The path of the executable mustrun.exe.
  - The path of the corrupted log file to delete.
- Tries to remove the log file with the Remove-Item cmdlet and sets the result accordingly.
- Restarts the MustRun application with the Start-Process cmdlet.
- Sends the result to the Engine with the WriteOutputString function of the NXT object, which is imported from the remote actions library.

  Add-Type -Path "$env:NEXTHINK\RemoteActions\nxtremoteactions.dll"

  [string] $result = ""

  # The paths to the MustRun application and its log file
  $mrexe = "$env:ProgramFiles\MustRun\mustrun.exe"
  $logfile = "$env:ProgramFiles\MustRun\log.txt"

  # Delete the log file if it is present
  try {
    Remove-Item $logfile -ErrorAction Stop
    $result = "The corrupted log file was deleted"
  } catch {
    $result = "The log file does not exist"
  }

  # Restart the application
  Start-Process -FilePath $mrexe

  [NXT]::WriteOutputString("Result", $result)
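
The catch block in the script reports every failure as a missing file, although Remove-Item can also fail for other reasons, such as a locked file or insufficient permissions. A possible variant that separates these cases is sketched below. The function name Remove-CorruptedLog and the wording of the third result string are assumptions made for this example, not part of the Nexthink library.

```powershell
function Remove-CorruptedLog {
    param([string] $LogFile)

    # Nothing to delete: report it without raising an error.
    if (-not (Test-Path -LiteralPath $LogFile)) {
        return 'The log file does not exist'
    }
    try {
        Remove-Item -LiteralPath $LogFile -Force -ErrorAction Stop
        return 'The corrupted log file was deleted'
    } catch {
        # Report the actual reason, for example a locked file or access denied.
        return "The log file could not be deleted: $($_.Exception.Message)"
    }
}
```

In the script, the try and catch section would then reduce to $result = Remove-CorruptedLog -LogFile $logfile, and the Result output would tell a failed deletion apart from a file that was never there.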

For security reasons, Nexthink recommends that you sign your scripts. Use unsigned scripts only for testing purposes in pre-production environments.
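
As a rough sketch of signing with the built-in PowerShell cmdlets, the function below assumes that a code-signing certificate is already installed in the current user's certificate store on a Windows machine. The function name Add-ScriptSignature and the script path in the usage note are placeholders. Which certificates the devices must trust depends on your own configuration and is not covered here.

```powershell
function Add-ScriptSignature {
    param([string] $ScriptPath)

    # Pick the first code-signing certificate from the current user's store (Windows only).
    $cert = Get-ChildItem -Path Cert:\CurrentUser\My -CodeSigningCert | Select-Object -First 1

    if ($null -eq $cert) {
        Write-Warning 'No code-signing certificate found; the script was not signed.'
        return
    }

    Set-AuthenticodeSignature -FilePath $ScriptPath -Certificate $cert | Out-Null

    # Return the signature status so that the caller can check the outcome.
    (Get-AuthenticodeSignature -FilePath $ScriptPath).Status
}
```

A call such as Add-ScriptSignature -ScriptPath 'C:\Scripts\restart-mustrun.ps1' returns the signature status of the file after signing.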

In the editor of the remote actions, click Import... to link the script to the remote action. Finder interprets the source of the script and lists the Result output under the Outputs section.

Adapt the previous script to your own use cases to profit from the self-healing capabilities of Remote Actions.
