Tag: Splunk

Using Splunk and Active Robot Monitoring to resolve website issues

Recently, one of JDS’ clients reached out for assistance, as they were experiencing inconsistent website performance. They had just moved to a new platform, and were receiving alerts about unexpectedly slow response times, as well as intermittent logon errors. They were concerned that, were the reports accurate, this would have an adverse impact on customer retention, and potentially reduce their ability to attract new customers. When manual verification couldn’t reproduce the issues, they called in one of JDS’ sleuths to try to locate and fix the problem—if one existed at all.

The Plot Thickens

The client’s existing active robot monitoring solution using the HPE Business Process Monitor (BPM) suite showed that there were sporadic difficulties in loading pages on the new platform and in logging in, but the client was unable to replicate the issue manually. If there was an issue, where exactly did it lie?

Commencing the Investigation

The client had deployed Splunk and it was ingesting logs from the application in question—but its features were not being utilised to investigate the issue.

JDS consultant Danesen Narayanen entered the fray and was able to use Splunk to analyse the data received. He could therefore immediately understand the issue the client was experiencing. He confirmed that the existing monitoring solution was reporting the problem accurately, and that the issue had not been affecting the client’s website prior to the re-platform

Using the data collected by HPE BPM as a starting point, Danesen was able to drill down and compare what was happening with the current system on the new platform to what had been happening on the old one. He quickly made several discoveries:

1. There appeared to be some kind of server error.

Since the re-platform, there had been a spike in a particular server error. Our JDS consultant reviewed data from the previous year, to see whether the error had happened before. He noted that there had previously been similar issues, and validated them against BPM to determine that the past errors had not had a pronounced effect on BPM—the spike in server errors seemed to be a symptom, rather than a cause.

Database deadlocks were spiking.
Database deadlocks were spiking
It was apparent that the error had happened before

2. There seemed to be an issue with user-end response time.

Next, our consultant used Splunk to look at the response time by IP addresses over time, to see if there was a particular location being affected—was the problem at server end, or user end? He identified one particular IP address which had a very high response time. What’s more, this was a public IP address, rather than one internal to the client. It seemed like there was a end-user problem—but what was the IP address that was causing BPM to report an issue?

Daily response time for all IPs (left axis), and for the abnormal IP (right axis). All times are in seconds.
Daily response time for all IPs (left axis), and for the abnormal IP (right axis). All times are in seconds.

Tracking Down the Mystery IP Address

At this point our consultant called for the assistance of another JDS staff member, to track down who owned the problematic IP address. As it turned out, the IP address was owned by the client, and was being used by a security tool running vulnerability checks on the website. After the re-platform, the tool had gone rogue: rather than running for half an hour after the re-platform, it continued to open a number of new web sessions throughout the day for several days.

The Resolution

Now that the culprit had been identified, the team were quickly able to log in to the security tool to turn it off, and the problem disappeared. Performance and availability times returned to what they should be, BPM was no longer reporting issues, and the client’s website was running smoothly once more. Thanks to the combination of Splunk’s power, HPE's active monitoring tools, and JDS’ analytical and diagnostic experience, resolution was achieved in under a day.

JDS is now a CAUDIT Splunk Provider

Splunk Enterprise provides universities with a fast, easy and resilient way to collect, analyse and secure the streams of machine data generated by their IT systems and infrastructure.  JDS, as one of Australia’s leading Splunk experts, has a tradition of excellence in ensuring higher education institutions have solutions that maximise the performance and availability of campus-critical IT systems and infrastructure.

The CAUDIT Splunk offering provides Council of Australian University Directors of Information Technology (CAUDIT) Member Universities with the opportunity to buy on-premise Splunk Enterprise on a discounted, 3-year basis.  In acknowledgement of JDS’ expertise and dedication to client solutions, Splunk Inc. has elevated JDS to a provider of this sector-specific offering, meaning we are now better placed than ever to help the higher education sector reach their data collection and analysis goals.

What does this mean for organisations?

Not-for-profit higher education institutions that are members of CAUDIT can now use JDS to access discounted prices for on-premises deployments of Splunk Enterprise.  JDS are able to leverage their expertise in Splunk and customised solutions built on the platform, in combination with their insight into the higher education sector, to ensure that organisations have the Splunk solution that meet their specific needs.

Secure organisational applications and data, gain visibility over service performance, and ensure your organisation has the information to inform better decision-making.  JDS and Splunk are here to help.

You can learn more about JDS’ custom Splunk solutions here:  JDS Splunkbase Apps

How operational health builds business revenue

Splunk IT Service Intelligence (ITSI) is a next-generation monitoring and analytics solution that provides new levels of visibility into the health and key performance indicators of IT services. Use powerful visualizations and advanced analytics to highlight anomalies, accelerate investigations and pinpoint the root causes that impact service levels critical to the business.Join this session to learn how to:

  • Translate operational data into business impact.
  • Provide cross-silo visibility into the health of services by integrating data across the enterprise.
  • Visually map services and KPIs to discover new insights.
  • Transform machine data into actionable intelligence.

What is Service Intelligence?
IT operations is responsible for running and managing the services that businesses depend on. Service intelligence improves service quality, helps IT make well-informed decisions and supports business priorities.

Why Splunk IT Service Intelligence?
Splunk ITSI transforms machine data into actionable intelligence. It provides cross-silo visibility into the health of services by integrating data across the enterprise, visually mapping services and KPIs to discover new insights, and translating operational data into business impact. With timely, correlated information on services that impact the business, Splunk ITSI unifies data silos, reduces time to resolution, improves service operations and enables service intelligence.

ensure IT works with

Splunk: Using Regex to Simplify Your Data

Splunk is an extremely powerful tool for extracting information from machine data, but machine data is often structured in a way that makes sense to a particular application or process while appearing as a garbled mess to the rest of us. Splunk allows you to cater for this and retrieve meaningful information using regular expressions (regex).

You can write your own regex to retrieve information from machine data, but it’s important to understand that Splunk does this behind the scenes anyway, so rather than writing your own regex, let Splunk do the heavy lifting for you.

The Splunk field extractor is a WYSIWYG regex editor.

Splunk 1

When working with something like Netscaler log files, it’s not uncommon to see the same information represented in different ways within the same log file. For example, userIDs and source IP addresses appear in two different locations on different lines.

  • Source 192.168.001.001:10426 – Destination 10.75.001.001:2598 – username:domainname x987654:int – applicationName Standard Desktop – IE10
  • Context [email protected] – SessionId: 123456- desktop.biz User x987654: Group ABCDEFG

As you can see, the log file contains the following userID and IP address, but in different formats.

  • userID = x987654
  • IP address = 192.168.001.001

The question then arises, how can we combine these into a single field within Splunk?
The answer is: regex.

You could write these regex expressions yourself, but be warned, although Splunk adheres to the pcre (php) implementation of regex, in practice there are subtle differences (such as no formal use of look forward or look back).

So how can you combine two different regex strings to build a single field in Splunk? The easiest way to let Splunk build the regex for you in the field extractor.

Splunk 2

If you work through the wizard using the Regular Expression option and select the user value in your log file (from in front of “undername:domainname”) you’ll reach the save screen.
(Be sure to use the validate step in the wizard as this will allow you to eliminate any false positives and automatically refines your regex to ensure it works accurately)
Stop when you get to the save screen.
Don’t click Finish.

Splunk 3

Copy the regular expression from this screen and save it in Notepad.
Then use the small back arrow (highlighted in red) to move back through the wizard to the Select Sample screen again.
Now go through the same process, but selecting the userID from in front of “User.”
When you get back to the save screen, you’ll notice Splunk has built another regex for this use case.

Splunk 4

Take a copy of this regex and put it in Notepad.
Those with an eagle eye will notice Splunk has inadvertently included a trailing space with this capture (underlined above). We’ll get rid of this when we merge these into a single regex using the following logic.
[First regex]|[Second regex](?PCapture regex)

Splunk 5

Essentially, all we’ve done is to join two Splunk created regex strings together using the pipe ‘|’
Copy the new joined regex expression and again click the back arrow to return to the first screen

Splunk 6

But this time, click on I prefer to write the regular expression myself.

Splunk 7

Then paste your regex and be sure to click on the Preview button to confirm the results are what you’re after.

Splunk 8

And you’re done… click Save and then Finish and you can now search on a field that combines multiple regex to ensure your field is correctly populated.