Troubleshooting application connectivity
Problem
Reporting and acting on network-related issues requires accurate data collection, visualization, and interpretation. Without reliable network performance indicators, fixing connection issues becomes a guessing game, which ultimately leads to poor employee experience and wasted resources.
Solution
Application Connectivity troubleshooting follows a set of investigation principles. Application Connectivity helps you to:
Identify the root causes behind network-related issues or exclude possible root causes.
Effectively troubleshoot network-related issues with targeted solutions.
Stop the “blame game” by enabling a fact-based discussion between involved teams.
Control device data privacy and ensure compliance within your organization.
To achieve this, the Application Connectivity framework relies on connections data with connection metrics, destination information, and data privacy compliance at an application-device level.
The NQL queries on this page are examples of how to use connections data to investigate network-related issues. Similar queries are supported by the different query-based features available in the Nexthink web interface.
You can also use Network view for connections data visualization, filtering and drill-downs (transport protocols, devices, binaries, destinations, etc.).
Prerequisites
Connection events are only available for devices with Collectors that report 'Infinity only'.
The minimum Collector version is 2023.10.
Connections data
The connections data used by Network view and the NQL queries on this page include:
Connection events
Connection metrics
Destination decorations
Connection event aggregations
Jump to the Network troubleshooting with Connections Data section to learn about the use of Application Connectivity queries.
Connection events
A connection event represents an outgoing TCP connection (established by a device) or outgoing UDP packets. Each connection event provides the following information:
Start time, end time, and duration of the event’s bucket
The source of the connection event
The destination of the event
The transport protocol and IP version
Metrics about the connection
Connection events are sampled events, meaning Nexthink reports connection events in buckets of 15 minutes and 1 day.
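As an illustration of bucketing, the following Python sketch floors an event timestamp to the start of its 15-minute or 1-day bucket. The function name bucket_start and the alignment of buckets to UTC midnight are assumptions made for this example; this page does not specify how Nexthink aligns its buckets.

```python
from datetime import datetime, timedelta, timezone

BUCKET_15_MIN = timedelta(minutes=15)
BUCKET_1_DAY = timedelta(days=1)

# Assumption: buckets are aligned to the Unix epoch in UTC.
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts: datetime, size: timedelta) -> datetime:
    """Return the start of the bucket of the given size that contains ts."""
    elapsed = ts - _EPOCH
    # timedelta // timedelta yields the number of whole buckets since the epoch.
    return _EPOCH + (elapsed // size) * size


if __name__ == "__main__":
    ts = datetime(2024, 3, 5, 10, 37, 42, tzinfo=timezone.utc)
    print(bucket_start(ts, BUCKET_15_MIN))
    print(bucket_start(ts, BUCKET_1_DAY))
```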
The namespace connection of the NQL data model contains one main table:
The connection.events table contains events for outgoing TCP connections and UDP packets.
The following two tables in the connection namespace are deprecated and will be removed in the future:
The connection.tcp_events table contains events for outgoing TCP connections.
The connection.udp_events table contains events for outgoing UDP packets.
Some metrics, like the number of failed connections or the connection establishment time, are only available for TCP connection events.
Refer to the NQL data model documentation for more information.
Connection events association
Connection events are linked to the following objects:
The device that establishes the connection.
The binary that uses the connection.
The user of the process that runs the binary.
Optionally, the desktop application configured for this binary.
Optionally, the network application matching the configured destinations.
Connection destination decoration
Nexthink decorates connection events data with additional destination information.
The IP address and the network port define the destination of a connection. The decoration adds a destination type, the subnet address, and the following optional information:
Domain name of the destination.
Owner of the destination.
Country and location of the destination (GeoIP information).
Datacenter region name provided by the owner of the destination.
The destination subnet address equals the IP address with the last 8 bits set to zero.
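The subnet rule above can be sketched in Python with the standard ipaddress module. The function name destination_subnet is hypothetical, and applying the same 8-bit mask to IPv6 addresses is an assumption; the rule is stated here only in general terms.

```python
import ipaddress


def destination_subnet(ip: str) -> str:
    """Return the IP address with its last 8 bits set to zero."""
    addr = ipaddress.ip_address(ip)
    # Clear the lowest 8 bits of the integer form of the address.
    masked = int(addr) & ~0xFF
    # Rebuild an address of the same family (IPv4Address or IPv6Address).
    return str(type(addr)(masked))


if __name__ == "__main__":
    print(destination_subnet("192.168.10.57"))
```

For an IPv4 address, this is equivalent to taking the network address of the enclosing /24 range.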
Nexthink uses GeoIP data and published IP address ranges to enrich the destination information of connections. See the table below.
Information about the destination owner, country, and data center region is not always available, or is only partially available. In that case, the corresponding fields are NULL.
Connection event metrics
Connection events provide the following metrics:
The connection round-trip time (RTT): The average round-trip time for all established TCP connections. The round-trip time is measured between sending the SYN message and receiving the SYN-ACK message from the remote party during TCP connection establishment (the 3-way handshake). This metric is only available for TCP events with at least one established connection.
Incoming and outgoing traffic in bytes: Data received (TCP only) and sent (TCP and UDP) by the application during the event.
The failed connection ratio: The ratio of all failed TCP connections over all attempted TCP connections, that is, all established and failed TCP connections.
Number of connections per status in the event.
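To make the metric definitions above concrete, here is a minimal Python sketch that derives the average RTT and the failed connection ratio for one event bucket. The BucketMetrics type, the summarize function, and the input shape (one RTT sample in milliseconds per established connection) are assumptions for illustration, not the way Nexthink computes or stores these values.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class BucketMetrics:
    rtt_avg_ms: Optional[float]    # None when no connection was established
    failed_ratio: Optional[float]  # None when no connection was attempted


def summarize(rtts_ms: Sequence[float], failed: int) -> BucketMetrics:
    """Summarize one bucket from per-connection RTT samples and a failure count."""
    established = len(rtts_ms)
    attempted = established + failed
    # RTT is only defined when at least one connection was established.
    rtt_avg = sum(rtts_ms) / established if established else None
    # Ratio of failed connections over all attempted (established + failed).
    ratio = failed / attempted if attempted else None
    return BucketMetrics(rtt_avg_ms=rtt_avg, failed_ratio=ratio)


if __name__ == "__main__":
    print(summarize([20.0, 40.0], failed=2))
```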
Connection events aggregation
Nexthink aggregates connections into buckets of 15 minutes and 1 day.
Network troubleshooting with Connections data
Use connections data to troubleshoot network-related issues. To find the root cause of a network issue or exclude possible root causes, you must identify the relevant population (devices, apps, destinations) affected by network issues and when the impacted connection metric (failed connection ratio, establishment time, traffic) changed.
You can apply the same troubleshooting principle with Network view. Network view allows users to visually identify the relevant population by selecting a connection metric and filtering by the device, application, and destination dimensions.
Identifying the relevant population
Focus on the three dimensions of the connections data:
Device Dimension: Which devices are impacted, and how many?
Determine which impacted devices share the same characteristics and location.
Application Dimension: Which desktop applications are impacted, and how many?
Destination Dimension: Which destinations are impacted, and where are they located?
Device Dimension
Connection events are linked to the device object that created the network connection. This allows you to investigate the connections data of a single device (by devices.name) or a group of devices, for example by devices.entity, GeoIP-based location, or other custom organizational unit classifications.
Refer to the Product configuration documentation for more information.
To group devices by GeoIP-based location, use the location context of the connection event, for example:
The location context is where the device was at the time of the event. It requires an activated geolocation feature and works best when the Collector traffic is routed to the Internet directly and not through a VPN.
Alternatively, you can use the organizational context, for example:
Application Dimension
Connection events are linked to the binary object that initiated the connection.
Additionally, the connection event is linked to a desktop application if the binary is part of the application definition.
Destination Dimension
The destination is a structured field of the connection event. For example:
Note that it is impossible to summarize by IP address because the cardinality of IP addresses is too high. Instead, you can configure a Network Application based on IP address, IP subnet, network port, or domain name. Afterward, you can filter connection events using the Network Application name, for example:
Investigating TCP Connections
The two main metrics to gain visibility on the quality of TCP connections are:
The connections round-trip-time (RTT): The connections RTT is available for all TCP events with established connections and can be accessed through tcp_events.establishment_time.avg. Connections RTT is a good indicator for slow connections.
The failed connections ratio: The number of failed connections over the number of new connections (established and failed). The failed connections ratio can be accessed through failed_connection_ratio.avg.
Always evaluate the failed connections ratio and its fluctuation together with the number of failed connections or the number of attempted connections. Consider the following example: if the ratio of failed connections is 100%, but the number of attempted connections equals 1, the event is not worth investigating further.
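This guidance can be sketched as a simple filter that only flags a failed connections ratio when enough connections were attempted. The function name worth_investigating and both threshold defaults are hypothetical values chosen for illustration; tune them to your environment.

```python
def worth_investigating(
    failed: int,
    established: int,
    min_attempts: int = 20,        # hypothetical minimum sample size
    ratio_threshold: float = 0.2,  # hypothetical ratio above which to flag
) -> bool:
    """Flag a failed connections ratio only when it rests on enough attempts."""
    attempted = failed + established
    if attempted < min_attempts:
        # Too few attempts: even a 100% ratio is not meaningful.
        return False
    return failed / attempted >= ratio_threshold


if __name__ == "__main__":
    # 100% failed, but only one attempt: ignored.
    print(worth_investigating(failed=1, established=0))
    # 30% failed over 100 attempts: flagged.
    print(worth_investigating(failed=30, established=70))
```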
Example: Investigation of VPN connectivity issues
Find an example below of a live dashboard to investigate VPN connectivity issues. Notice that the application dimension is fixed to the VPN binaries.
Find below the NQL queries from the example of investigating VPN connectivity issues:
Investigating UDP Traffic
Because of the connectionless nature of UDP, investigating UDP network traffic is limited compared to TCP network traffic. Your main tool is to look for changes and differences in the amount of outgoing UDP traffic, for example, comparing the average traffic per device for one application.
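As a rough sketch of the comparison described above, the following Python function computes the average outgoing UDP traffic per device for one application from a list of events. The tuple layout (device name, binary name, outgoing bytes) and the sample values are invented for this example and do not reflect the NQL data model.

```python
from collections import defaultdict
from typing import Iterable, Tuple

# Hypothetical event shape: (device name, binary name, outgoing bytes)
UdpEvent = Tuple[str, str, int]


def avg_outgoing_bytes_per_device(events: Iterable[UdpEvent], binary: str) -> float:
    """Average outgoing UDP bytes per device for one binary."""
    totals = defaultdict(int)
    for device, event_binary, outgoing_bytes in events:
        if event_binary == binary:
            totals[device] += outgoing_bytes
    # Average over the devices that produced traffic for this binary.
    return sum(totals.values()) / len(totals) if totals else 0.0


if __name__ == "__main__":
    last_week = [("pc-1", "app.exe", 4000), ("pc-2", "app.exe", 6000)]
    today = [("pc-1", "app.exe", 500), ("pc-2", "app.exe", 1500)]
    print(avg_outgoing_bytes_per_device(last_week, "app.exe"))
    print(avg_outgoing_bytes_per_device(today, "app.exe"))
```

A marked drop or rise between two periods, or between two groups of devices, is the kind of difference worth investigating.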
You can apply the same troubleshooting approach to Network view for connections data visualization, filtering and drill-downs (including transport protocols).
Application Connectivity in Nexthink Infinity
Find below the related documentation of some of the Nexthink features compatible with Application Connectivity’s connections data and queries:
Investigations using the NQL editor
Connections Timeline in the Device View
System Monitors for Alerts:
Binary connection establishment time increase
Binary failed connection ratio increase
Network view enabled for Network and Desktop Applications, Investigations, and Device view.
You can use the Application Connectivity queries included on this page for all NQL-based features in the Nexthink web interface.
Overseeing data privacy
Refer to the Configuring Collector level anonymization and Roles documentation to anonymize, filter, and control the privacy of connections data.
Implementation Aspects
The following implementation aspects impact connection events: