Installing HP Diagnostics


Installing HP Diagnostics in a Performance Testing environment is usually straightforward. You will probably only use it during a short performance testing phase, so you don't have to worry about the long-term management issues of a system that will be used for years at a time. Personally, I budget about half a day for an install (not including custom configuration) in a non-Production environment, as long as I know that I have administrator/root access to all the servers, and there are no firewalls between any of the servers.

Installing HP Diagnostics when it is intended for Production Monitoring is a much more complicated exercise, and requires a much greater investment of time. Read on for my tips...


Basically, an installation of HP Diagnostics will be broken down something like this:

  1. Collect system information
  2. Organise Diagnostics Server infrastructure
  3. Install in Test environment
  4. Custom .points file development (+ developer training)
  5. Determine Diagnostics overhead
  6. Install in Production environment
  7. Production baselining + alerts setup
  8. User training
  9. Ensure processes are in place for ongoing operation

Collect System Information

The first step when planning an installation of Diagnostics is to get an idea of what the system looks like. If you are lucky, there will be some architecture documents that tell you everything you need to know, but usually you need to get some technical people to draw you some diagrams on the whiteboard.

The things you will want to know are:

  • The names of all the servers, and what they do (e.g. database server, web server, etc.)
  • The software that is running on each server, and the software version (e.g. SQL Server 2008, IIS 7.5). This should include the operating system version, whether the operating system is 32-bit or 64-bit, and the JVM and .NET runtime versions.
  • How many CPUs there are on each server.
  • Whether there are any firewalls between the servers, or between the servers and the corporate network.
  • How the servers communicate with each other, and what protocols they use (e.g. the Application server retrieves pricing data from the mainframe using WebSphere MQ, the web server verifies credit card numbers by calling the web service on the bank's servers over HTTP)
  • Whether you are planning on integrating Diagnostics with BAC, TransactionVision, LoadRunner, or Performance Center.
  • What is the name of the time server (NTP) and mail server (SMTP)?
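Some of the per-server facts can be gathered with a short script run on each host. Here is a minimal Python sketch (it assumes a standard `java` binary on the PATH; everything else uses the standard library):

```python
import os
import platform
import socket
import subprocess

def inventory():
    """Collect the basic host facts needed for Diagnostics planning."""
    info = {
        "hostname": socket.gethostname(),
        "os": f"{platform.system()} {platform.release()}",
        "arch": platform.machine(),   # e.g. x86_64 indicates a 64-bit OS
        "cpus": os.cpu_count(),
    }
    try:
        # Most JVMs print their version banner to stderr.
        out = subprocess.run(["java", "-version"],
                             capture_output=True, text=True)
        info["jvm"] = out.stderr.splitlines()[0] if out.stderr else "unknown"
    except FileNotFoundError:
        info["jvm"] = "not installed"
    return info

for key, value in inventory().items():
    print(f"{key:10} {value}")
```

Run it on each candidate server and paste the output into your planning spreadsheet; it is no substitute for the whiteboard session, but it keeps the version numbers honest.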

Once you have all the system information, you will need to determine which servers you will install the Probe/Agent on. Some companies install probes on every server that runs .NET or Java code. Some choose to install on a single server, and get indicative performance measurements from that server.

Collectors are available for SAP NetWeaver, Oracle 10g database, WebSphere MQ, MS SQL Server, and CICS, so you will need to identify if any of these components are active in your environment.

It is absolutely critical that you check all of the software versions against the Product Availability Matrix for Diagnostics. You might find that your operating system, or the version of a software component is not supported by your version of Diagnostics. Some software versions are supported by HP, but do not have the right functionality for some Diagnostics features to work - a good example would be JVM versions before 1.4.2 that do not support some of the memory analysis features of Diagnostics, or early Linux kernel versions that do not support collection of CPU time metrics by Diagnostics.

Licensing cost is based on the number of CPUs that are in the servers with Probes installed on them. Note that licensing is not free in non-Production environments (as Diagnostics can be used during performance testing, or to monitor Production).
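Because the licence cost scales with the CPUs in probed servers, it is worth totalling them up early. A trivial sketch (the server names and CPU counts below are invented):

```python
# Hypothetical inventory: CPU count for each server that will run a Probe.
probe_servers = {
    "app01": 4,
    "app02": 4,
    "web01": 2,
}

licensed_cpus = sum(probe_servers.values())
print(f"CPUs requiring a licence: {licensed_cpus}")  # 10 for this inventory
```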

Organise Diagnostics Server infrastructure

Once you know which servers will have Agents installed, and which servers will have metrics collected by the Diagnostics Collector, you will need to determine the sizing and location of your Diagnostics Server (in Commander mode), your Diagnostics Collector, and any Diagnostics Servers in Mediator mode.

Other factors will be what your Diagnostics Server has to interface with (BAC, LoadRunner, Performance Center), and whether you want high availability.

For 99% of Diagnostics installations, you will probably have a single server running a Diagnostics Server (in Commander mode) and a Diagnostics Collector. For a large-scale Diagnostics implementation (more than 40 Agents), you would start to run additional instances of the Diagnostics Server (in Mediator mode) on the same physical server (to utilise the hardware more efficiently). You would only have separate physical servers for Diagnostics Servers in Mediator mode if you had a really large Diagnostics installation, or you needed to have separate servers in separate areas of the network (for security reasons, or to limit the amount of network traffic between the monitored system and the Commanding server).

Note that BAC can only communicate with a single Diagnostics Server (in Commander mode), so if you have a separate Diagnostics Server for each monitored application, this might be a justification for separate servers.

Previously, rather than order a separate server for Diagnostics, I have installed it on the application's (WebSphere) "management" server, which had the advantage of not requiring me to provision extra hardware, and not requiring me to get too many extra firewall ports opened.

The hardware requirements for Diagnostics are quite modest. I have run a large (~$600K) installation on a single 4-CPU server, with only 10% CPU utilisation.

Platform  Item       Up to 50 Java Probes  Up to 100 Java Probes  Up to 200 Java Probes
Windows   CPU        2x 2.4 GHz            2x 2.8 GHz             2x 3.4 GHz
Windows   Memory     4 GB                  4 GB                   4 GB
Solaris   CPU        2x UltraSPARC III     2x UltraSPARC IV       2x UltraSPARC IV
Solaris   Memory     4 GB                  4 GB                   4 GB
Linux     CPU        2x 2.0 GHz            2x 2.4 GHz             2x 2.8 GHz
Linux     Memory     2 GB                  4 GB                   4 GB
HP-UX     CPU        2x 650 MHz PA-RISC    2x 699 MHz PA-RISC     2x 750 MHz PA-RISC
HP-UX     Memory     2 GB                  4 GB                   4 GB
All       Heap Size  512 MB                750 MB                 1,280 MB
All       Disk       4 GB per probe        4 GB per probe         4 GB per probe

The firewall ports required for Diagnostics are quite simple.

From                                To                     Port/Protocol  Description
Desktop PC (corporate network)      Diagnostics Server     2006/HTTP      User interface
Desktop PC (corporate network)      Profiler on app server 35000-350xx    One port for each JVM on the app server
LoadRunner Controller/Perf. Center  Diagnostics Server     2006/HTTP      LoadRunner/Performance Center integration
BAC                                 Diagnostics Server     2006/HTTP      BAC integration
Diagnostics Server                  Agent on app server    35000-350xx    One port for each JVM on the app server
Agent on app server                 Diagnostics Server     2612/TCP       Probe registration
Agent on app server                 Diagnostics Server     2006/HTTP      Probe sends collected data
Diagnostics Server                  NTP server             NTP            Time synchronisation, if not using system time
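Before raising firewall change requests, it is worth testing which paths are already open with a simple TCP connect check. A sketch (the host name below is a placeholder; substitute your own servers and run it from each "From" host):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failure, refusal, and timeout
        return False

# Placeholder endpoints -- replace with your real server names.
required = [
    ("diag-server.example.com", 2006),  # UI and probe HTTP traffic
    ("diag-server.example.com", 2612),  # probe registration
]

for host, port in required:
    print(f"{host}:{port} {'open' if port_open(host, port) else 'BLOCKED'}")
```

Note that this only proves a TCP connection can be made; a transparent proxy between the hosts can still break the HTTP traffic.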

Network sizing is quite difficult, as the amount of data sent between the Agents and the Diagnostics Server will vary depending on:

  • The number of transactions per hour processed by the system being monitored
  • The number of points being applied to the application being monitored
  • Whether sampling is enabled, and what percentage of requests are being sampled
  • Depth trimming
  • Latency trimming
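Even so, a back-of-envelope calculation gives an order-of-magnitude estimate. The figures below are invented; substitute measurements from your own system:

```python
def agent_bandwidth_mbps(requests_per_hour, bytes_per_request):
    """Average Agent-to-Server bandwidth in megabits per second."""
    bits_per_hour = requests_per_hour * bytes_per_request * 8
    return bits_per_hour / 3600 / 1_000_000

# Hypothetical: 50,000 server requests/hour, ~2 KB of call-tree data each.
print(f"{agent_bandwidth_mbps(50_000, 2_048):.2f} Mbit/s")
```

If the estimate comes out at a meaningful fraction of a WAN link, that is an argument for placing a Mediator close to the monitored servers.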

Install in Test environment

Hopefully the organisation you are installing Diagnostics at is mature enough that they like to ensure that something works (and doesn't break anything) before installing it in Production.

I have never seen Diagnostics break application functionality but, on some occasions, I have seen it cause poor performance (see section on Determining Diagnostics Overhead).

If a company chooses to install Diagnostics directly into the Production environment, they might choose a lower-risk deployment model and only install it on a single server to begin with. Be aware, however, that developing custom instrumentation usually requires a repeated change-restart-test cycle as the points file is developed, which is disruptive on Production servers.

Develop Custom Instrumentation

Diagnostics comes with default instrumentation (points) for common classes and methods (like Struts and JDBC), which will tell you a lot about where your application is spending its time. But if you don't use one of the frameworks that already have defined points, or if you want to see how much time is spent in your business logic, then you will have to create some of your own points.

Here is an example point.

; ------------- extends HttpServlet ---------------------
; (See HttpCorrelation point for ignore documentation)
; In addition, ignore class we know we are not interested in.
class      = javax.servlet.http.HttpServlet
method     = !(service)
signature  = !.*
ignore_cl  = javax.servlet.http.HttpServlet,,,,,,,
ignore_tree = org.apache.jasper.runtime.HttpJspBase
deep_mode  = hard
layer      = Mediator
active     = true

As you can probably guess from looking at it, you need a developer-level understanding of the application that you want to create the custom instrumentation for. The best way to create your custom instrumentation (and code snippets) is to work side by side with someone from the development team. It is still critical that you have good Java/.NET knowledge: if you don't understand packages, classes/interfaces, method overloading, and inheritance, then you should read a book or two before you start working with this tool.

Both the Java and .NET versions of Diagnostics have tools that will show you all the classes and methods used by your application ("Reflector.exe" for .NET applications, and the "Capture Class Map" option in the Agent/Profiler for Java-based applications), but having a huge list of methods does not help someone with no knowledge of the application.

I like to first decompose the application into logical layers (in addition to the existing layers), and then pick methods that are the entry points to those logical layers.
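For example, if the application had an order-processing layer, its entry-point methods could be instrumented with a point like the one below. The class, method, and layer names are invented for illustration; the keywords follow the same format as the default points shown above:

```
[OrderProcessing]
; Hypothetical entry points to the order-processing layer
class     = com.mycompany.orders.OrderService
method    = !(placeOrder|cancelOrder|getOrderStatus)
signature = !.*
layer     = Business/Orders
active    = true
```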

As you can probably guess, developing custom instrumentation can turn into a huge time-sucking black hole, as it is possible to tweak the settings forever.

It is really important that you actually test your custom instrumentation: does it give an appropriate level of visibility into where time is being spent? One common issue is the MVC problem. Imagine that you have a URL that looks like /controller.aspx?action=placeOrder, where "action" could be anything from "logout" to "placeOrder".

As the same server request is made in each case, Diagnostics will (by default) group all calls to /controller.aspx together, even though they do completely different things, and should be reported on separately. You will frequently see this when you find that your avg/min/max call tree instances show different methods being called, even though it is the same server request. This behaviour can be changed, but requires you to write a custom point to do it.

As points are tightly coupled to the source code, they will need to be maintained as the code changes. Refactoring exercises, where lots of method names change, are highly likely to break at least some of your points. It is important that you leave someone from the development team with enough knowledge to be able to maintain the points files themselves.

It is best that points files (and other Diagnostics configuration files) are stored in the same version control system as the source code for your application, as they are so tightly coupled to the application's code. It is bad practice to manually deploy points files or make direct changes to them (as you end up with different configurations on different application servers).

Determine Diagnostics overhead

One of the first questions that a customer asks during the sales cycle is "how much overhead does Diagnostics have?"

The answer will really depend on your level of instrumentation (don't create a point for every method in your application) and the sampling/trimming settings for your Agents.

Personally, I have seen Diagnostics installations with lots of custom instrumentation that had an overhead that was too small to measure with LoadRunner, and I have also seen (default) instrumentation levels that made an application completely unusable under load.

I really like to measure the performance overhead of Diagnostics by running a load test (in a Test environment) with it enabled and with it disabled, and then comparing the results.
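The comparison itself is simple arithmetic. A sketch (the response-time samples below are made up; in practice you would use your load test tool's transaction times):

```python
def overhead_pct(baseline, with_diag):
    """Percentage increase in mean response time with Diagnostics enabled."""
    base = sum(baseline) / len(baseline)
    diag = sum(with_diag) / len(with_diag)
    return 100.0 * (diag - base) / base

# Hypothetical response times (seconds) from two otherwise identical load tests.
baseline  = [0.41, 0.44, 0.40, 0.43, 0.42]
with_diag = [0.43, 0.46, 0.42, 0.45, 0.44]
print(f"Diagnostics overhead: {overhead_pct(baseline, with_diag):.1f}%")  # ~4.8%
```

Compare percentiles as well as means; instrumentation overhead often shows up first in the tail of the response-time distribution.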

If you are installing directly into Production, you really need a good quality monitoring tool (like RUM) that will allow you to see any differences between a server with an active Diagnostics Agent, and one without.

Another common question is "can Diagnostics be left on all the time, or is it designed to be used only when there is a problem?" Yes, this tool is designed to be left on all the time, rather than turned on in times of crisis.

Install in the Production environment

When you install in Production, you might do a couple of things differently to how you installed in Test.

  • Enable HTTPS for Agent communications, and for the user interface.
  • Ensure that the account the Diagnostics Server runs under has the minimum necessary privileges, and will never expire.
  • Reset all the passwords from their default values.
  • Integrate with LDAP or BAC for user authentication.
  • Connect to an SMTP server to send email alerts.

Production Baselining and Alerts

If you are using Diagnostics to generate alerts, you will want to set up realistic thresholds. While you can use the thresholds established when you ran the load test to determine the overhead of Diagnostics, it is still a good idea to let the tool run for a while in Production, so you can get typical Production values.
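One simple way to turn those observed Production values into a threshold is to take a high percentile of the measured response times and add a margin. A sketch (the samples and the 1.5x margin are invented; pick values that suit your application):

```python
def percentile(samples, p):
    """Nearest-rank percentile of a list of samples."""
    s = sorted(samples)
    k = max(0, min(len(s) - 1, round(p / 100 * len(s)) - 1))
    return s[k]

# Hypothetical production response times (seconds) collected over a week.
latencies = [0.8, 0.9, 1.1, 1.0, 4.2, 0.7, 1.2, 0.9, 1.3, 1.0]

# A common starting point: alert at some margin above the 95th percentile.
threshold = percentile(latencies, 95) * 1.5
print(f"Alert threshold: {threshold:.2f}s")
```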

User training

A monitoring tool is useless without users. Diagnostics is a technical tool that the development team may pick up quickly, but this is not necessarily the case with support staff.

End-user training is a critical part of ensuring that a company gets enough benefit out of a tool that they will bother to invest time and money to maintain it.

It is good to show support staff how to diagnose a basic performance problem - i.e. start with slow server requests, check that system monitors are not showing high CPU, check time spent in outbound calls, and then drill down on specific call tree instances for key server requests.

This is a good time to make sure that everyone being trained has a user account for the Diagnostics Server, and can customise their view to be meaningful (make sure they hide all the items they are not using, like CICS).

Ensure processes are in place for long-term operation

A lot of people think that once the install is complete and Diagnostics is running in Production, their job is over and they can go home.

It is important to give some thought to the long-term operation of the system: ongoing ownership, users, maintenance, and support.

  • Ensure that the Diagnostics Server will automatically restart when the server it is running on is restarted. On Windows, the service should be set to start automatically, and on Linux/Unix it should be added to the init.d script.
  • Points files will "rot" over time, as the application's code is updated. Make sure there is a developer who is responsible for maintaining the points file.
  • Make sure the team that uses the tool knows who to call for support.
  • Make sure that Diagnostics is listed in the document that specifies all the software used by the system. In a year or two, someone will need to remember to do an upgrade before HP stops supporting the installed version.
  • Make sure that "monitoring with Diagnostics" is listed in the Non-Functional Requirements document. Hopefully this will mean that it will be on the Test Manager's list of things to test when a change is made to the application.
  • Ensuring that there is a requirement for Diagnostics also means that (hopefully) there will be budget available if it breaks and needs support, or if it needs to be upgraded.
  • Make sure that it is being backed up regularly.
  • Someone will have to know to install the Java daylight savings patches on the JVM used by the Diagnostics Server whenever the dates for daylight savings change.
  • Set up alerting for the server that the Diagnostics Server runs on (e.g. SiteScope alerts or an equivalent). A good example would be a disk space monitor for the partition that the Diagnostics Server writes to.
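If you have no monitoring tool available for that last point, even a cron-scheduled script is better than nothing. A minimal sketch (the path and 10% threshold are placeholders; point it at the Diagnostics Server's data partition):

```python
import shutil

def disk_free_pct(path):
    """Percentage of free space on the partition containing `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total

ARCHIVE = "/"      # placeholder: the partition the Diagnostics Server writes to
THRESHOLD = 10.0   # alert when less than 10% free

free = disk_free_pct(ARCHIVE)
if free < THRESHOLD:
    print(f"ALERT: only {free:.1f}% free on {ARCHIVE}")
else:
    print(f"OK: {free:.1f}% free on {ARCHIVE}")
```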

Obviously there is a lot more to installing Diagnostics than what is written here (the Install Guide for version 8.0 is more than 700 pages long). It is a good idea to do HP's training course on Diagnostics if you can. Reading the manual is very helpful. And there are some undocumented features/behaviours that you will only know about if you read the comments in some of the configuration files. Good luck!

Tech tips from JDS