
Event: What can Splunk do for you?


Event Details

Splunk .conf2017 was one of the biggest events of the year, with thousands gathering in Washington D.C. to experience the latest Splunk has to offer. One of JDS' senior consultants and Splunk experts, Michael Clayfield, delivered two exceptional presentations highlighting specific Splunk capabilities and how JDS can work with businesses to make them happen.

We don't want our Australian clients to miss out on hearing these exciting presentations, which is why we are pleased to invite you to our .conf17 recap event in Melbourne. You'll get to hear both presentations, and will also have a chance to chat with account executives and discuss Splunk solutions for your business.

The presentations will cover:

  • Using Active Robot Monitoring with Splunk to Improve Application Performance
  • Running Splunk within Docker

When: Thursday 23 November, 5-8pm
Where: Splunk Melbourne Office, Level 16, North Tower, 525 Collins Street

Monitor Dell Foglight Topology Churn with Splunk

Topology churn is one issue that can cause serious Foglight performance degradation. It is the result of topology objects being constantly modified or recreated as new versions, typically due to bad configurations or poorly written custom agents. We can view the overall churn by browsing the Alarms dashboard's All System Changes view (see figure 1).

Figure 1: (Foglight Management Server) All system changes

While the dashboard above gives you an indication of churn, it does not tell you what is causing it. That information is only available if you generate a Foglight Management Server (FMS) Support Bundle and examine the Diagnostics Snapshot data (see figure 2), and even then it is a fixed snapshot of changes over the past week. The column that denotes churn is Num Recent Versions.

Figure 2: Churn from the diagnostic snapshot

There is a better approach. If we capture the topology type changes every 30 minutes and feed each snapshot to Splunk, we can start graphing the data and spotting trends. Being able to spot trends means we can understand when churn usually occurs and focus our efforts on reducing it.

To provide an example, I run an FMS and a Splunk lab in Docker containers (see figure 3).

Figure 3: Foglight Management Server and Splunk running in Docker containers
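
If you want to reproduce a similar lab, the commands below are a minimal sketch. The Splunk settings follow the official splunk/splunk image; the Foglight image name is a placeholder, since there is no official public FMS image, so you would build or tag your own from the FMS installer.

  # Standalone Splunk instance (official splunk/splunk image)
  docker run -d --name splunk \
    -p 8000:8000 -p 8089:8089 \
    -e SPLUNK_START_ARGS="--accept-license" \
    -e SPLUNK_PASSWORD="changeme123" \
    splunk/splunk:latest

  # Foglight Management Server - "my-registry/foglight-fms:5.9" is a
  # placeholder for an image built from your own FMS installer
  docker run -d --name fms \
    -p 8080:8080 -p 8443:8443 \
    my-registry/foglight-fms:5.9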

Next, I wrote a Foglight Groovy script that extracts the number of changes observed for a topology type over a 30-minute period. Figure 4 shows the script in action.

Figure 4: Groovy script to extract churn for the last 30 minutes
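
The original script isn't reproduced here, but the sketch below shows the general shape of such a Groovy script. The server["TopologyService"] binding is the usual way Foglight scripts reach the topology API, but the getAllTypes()/getObjectsOfType() calls and the last-updated filter are assumptions and the exact method names vary between Foglight versions, so treat this as pseudocode to adapt rather than a drop-in script.

  // Hypothetical sketch: count topology objects changed in the last 30 minutes,
  // grouped by topology type. Method names are assumptions - verify them
  // against your Foglight version's scripting API.
  def topSvc = server["TopologyService"]          // assumed service lookup
  def cutoff = System.currentTimeMillis() - (30 * 60 * 1000)
  def out = new StringBuilder()

  topSvc.getAllTypes().each { type ->             // assumed: enumerate topology types
      def recent = topSvc.getObjectsOfType(type).findAll {
          it.lastUpdated >= cutoff                // assumed: last-change timestamp on the object
      }
      if (!recent.isEmpty()) {
          out.append("${type.getName()},${recent.size()}\n")
      }
  }
  return out.toString()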

The script above can then be executed by Splunk every 30 minutes, with the results stored and analysed. Instead of calling the fglcmd.sh script directly, I wrote a wrapper called run.sh (see figure 5).

Figure 5: Configuring Splunk to run the script every 30 minutes to collect churn metrics
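
Neither the wrapper nor the Splunk configuration is fully visible in the figure, so here is a minimal sketch of both. The hostname, credentials, paths, index, and sourcetype are placeholders, and the fglcmd options shown should be checked against your Foglight version; the scripted-input stanza itself uses standard Splunk inputs.conf settings.

  #!/bin/bash
  # run.sh - wrapper that calls fglcmd on the FMS to execute the churn script
  # and prints the result to stdout so Splunk can index it.
  # Host, port, credentials, and paths below are placeholders.
  FGLCMD_HOME=/opt/foglight/bin
  "$FGLCMD_HOME/fglcmd.sh" -srv fms.example.com -port 8080 \
    -usr foglight -pwd 'changeme' \
    -cmd script:run -f /opt/scripts/topology_churn.groovy

The wrapper is then registered as a scripted input in a small Splunk app (called foglight_churn here), running every 30 minutes (1,800 seconds):

  # inputs.conf inside the foglight_churn app
  [script://./bin/run.sh]
  interval = 1800
  sourcetype = foglight:churn
  index = foglight
  disabled = 0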

Once the data is stored in Splunk, we can analyse it and create dashboards that highlight the topology types causing churn in near real-time. Figure 6 below shows such an example. Compared with what you see in figure 1, you get far more intelligence to work with when trying to reduce Foglight topology churn.

Figure 6: Splunk dashboard showing churn
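
As a rough idea of the search behind such a panel, assuming the script output is indexed as comma-separated type,count pairs under the placeholder index and sourcetype used above:

  index=foglight sourcetype=foglight:churn
  | rex field=_raw "^(?<topology_type>[^,]+),(?<changes>\d+)$"
  | timechart span=30m sum(changes) by topology_type

A timechart like this shows which topology types spike and when, which is exactly the trend information the stock All System Changes view cannot give you.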

Posted by JDS Admin in Splunk, Tech Tips