Don’t Roll Out an Application Performance Problem


If you’re planning to implement a new application that will have remote users accessing application servers via WAN links, you could be setting yourself up for a support disaster if adequate pre-deployment analysis and preparation aren’t performed.

It Depends on the App

Application performance in a network environment can vary significantly with bandwidth availability and network path latency. How sensitive an application is to each depends on factors such as data payload sizes, app turn counts (the number of request/response round trips per transaction), and TCP effects, on top of client and server processing times. All of these are related to application design. If the application is new, or is an existing app being moved, client-server interactions and server processing times for typical user transactions may not have been benchmarked, and without the right tools the metrics needed to calculate application performance in various network environments aren’t available.
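To make these factors concrete, here is a minimal sketch of a common first-order response time estimate. It is not the NIPA model, which is proprietary and not described in this article. It assumes response time is simply the sum of client time, server time, payload transfer time at the available bandwidth, and one network round trip per app turn, and it ignores TCP effects such as slow start and packet loss. The class, function, and sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """Hypothetical per-transaction metrics, e.g. taken from a packet capture."""
    payload_bytes: int    # total bytes moved across the network
    app_turns: int        # request/response round trips
    client_secs: float    # client processing time
    server_secs: float    # server processing time


def estimate_response_time(tx: Transaction, bandwidth_bps: float, rtt_secs: float) -> dict:
    """First-order estimate; ignores TCP slow start, loss, and queuing."""
    transport = tx.payload_bytes * 8 / bandwidth_bps   # bandwidth (transfer) delay
    turns = tx.app_turns * rtt_secs                    # latency delay from app turns
    total = tx.client_secs + tx.server_secs + transport + turns
    return {
        "client": tx.client_secs,
        "server": tx.server_secs,
        "transport": transport,
        "app_turns": turns,
        "total": total,
    }


# Illustrative numbers only: 500 KB payload, 40 turns, 1.5 Mbps link, 80 ms round trip
tx = Transaction(payload_bytes=500_000, app_turns=40, client_secs=0.3, server_secs=0.5)
rt = estimate_response_time(tx, bandwidth_bps=1_500_000, rtt_secs=0.080)
print({name: round(secs, 2) for name, secs in rt.items()})
```

In this sketch, doubling the bandwidth only shrinks the transport term. The app turns term depends on latency alone, which is why a chatty application can see little benefit from a bandwidth upgrade.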

You can test a pilot of a new application from a set of representative user locations – but if performance is lacking at some locations you won’t be able to determine why – or what to do about it – without some good performance metrics data. The same goes for WAN emulators used without supporting analysis tools.

And the Network

And what about network loading? The new application traffic is going to be added to current usage levels, which need to be carefully measured so the total loading can be accounted for. And to make matters more difficult, network data rates from a single active application user can vary significantly depending on that application’s sensitivity to bandwidth availability and latency factors between the data center and the remote location – so you can’t just take a sniffer capture from a given test environment and multiply the average data rate by some number of users. You also have to account for aggregate loading on the data center access links from all the remote sites.

So basically – knowing how the application is going to perform in your network infrastructure, and what the total network impacts are going to be, can be a couple of very large question marks.

The answer to both of these questions – and to avoiding a problematic implementation that can suck up vast amounts of unplanned time and troubleshooting costs – is…

The Network Impact & Performance Assessment

A Network Impact & Performance Assessment (‘NIPA’) is a pre-deployment analysis and modeling exercise that provides projections of the network impacts a new or migrated application will have on the network infrastructure, as well as the application response time performance at each remote user location. The goal is to decrease risk, minimize costs, and help ensure successful implementations and optimal performance by making sure the network is ready for the application, and the application will perform well across the network – *before* the application is deployed or moved and potentially becomes a support and customer relations nightmare.

APA Report

During the NIPA process the target application is analyzed using the same processes and tools as those used in application troubleshooting scenarios (Application Performance Analysis) to obtain the metrics needed for modeling network impacts and response times; this data also reveals any potential response time performance issue with the application caused by sensitivity to bandwidth or latency factors, as well as any inherent client or server processing time problems that have nothing to do with the network.

BW Analysis - DataRate - Large Data Center WAN Link - 1Gbps

A thorough analysis of network routing, latency, and bandwidth usage / availability is also performed. The application and network metrics are combined with concurrent user counts for each user location in a proprietary modeling environment to produce the network impact and response time projections.

NIPA Process Diagram 11-10-2012

The NIPA modeling projections show the expected total (current + new) traffic loading on all of the WAN links to remote user locations, and the aggregate loading on Data Center links. Where needed, recommendations are made for ‘right-sized’ upgrades: enough bandwidth to avoid adverse effects on the new as well as existing applications and network services, without wasting money on more bandwidth than is needed. This matters especially for applications where additional bandwidth beyond an optimal ‘sweet spot’ provides quickly diminishing improvements in performance due to other inherent design factors.
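The loading arithmetic behind such a projection can be sketched roughly as follows. This is an illustration, not the proprietary model. It assumes new traffic scales linearly with concurrent users at a per-user rate already measured for that site’s bandwidth and latency, and it assumes a 30% spare-capacity target. The site names, rates, and catalogue of link sizes are hypothetical.

```python
from typing import Optional

# Hypothetical catalogue of purchasable link sizes, in Mbps
LINK_SIZES_MBPS = [1.5, 3, 5, 10, 20, 45, 100, 155, 1000]
HEADROOM = 0.30  # assumed spare-capacity target


def projected_load_mbps(current_peak_mbps: float, users: int, per_user_mbps: float) -> float:
    """Current peak plus new application traffic (assumes linear scaling with users)."""
    return current_peak_mbps + users * per_user_mbps


def right_size_mbps(load_mbps: float, headroom: float = HEADROOM) -> Optional[float]:
    """Smallest catalogue link that carries the load plus the spare fraction, or None."""
    needed = load_mbps * (1 + headroom)
    return next((size for size in LINK_SIZES_MBPS if size >= needed), None)


# site: (link capacity, current peak, concurrent users, per-user rate), made-up values
sites = {
    "site_a": (10.0, 4.0, 50, 0.04),
    "site_b": (3.0, 2.4, 30, 0.05),
}

aggregate_new = 0.0
for name, (capacity, current, users, rate) in sites.items():
    load = projected_load_mbps(current, users, rate)
    aggregate_new += users * rate
    if load * (1 + HEADROOM) <= capacity:
        verdict = "ok"
    else:
        verdict = f"upgrade to {right_size_mbps(load)} Mbps"
    print(f"{name}: projected {load:.1f} of {capacity:.1f} Mbps -> {verdict}")

# New traffic from every remote site also lands on the data center access links
print(f"aggregate new load at data center: {aggregate_new:.1f} Mbps")
```

A real assessment would use a per-user rate measured for each site’s conditions rather than a single constant, since that rate varies with bandwidth and latency.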

The NIPA model’s response time projections include a breakdown of how the response times are distributed between client, network transport and app-turns latency, and server processing delays. The cause of any excessive response times expected at remote user locations is clearly evident so the proper remediation actions can be taken.

Example: Major SAP Suite Roll-out to Alaska

Last year I completed a series of network impact & performance assessments to support a ~$500M deployment of SAP ERP & financials, Maximo materials management, Business Information (BI), and several equipment inventory/maintenance applications to over 15,000 users across multiple locations around the globe for a major client. Several of these assessments resulted in WAN link upgrade recommendations to accommodate what typically amounted to 6-10 Mbps of additional network traffic per region.

The Alaska part of the project, as an example, revealed a WAN link to one remote site with insufficient capacity to support the imminent roll-out. It also revealed that a certain WAN link failure scenario between Anchorage and the North Slope would swamp the redundant path link with ~10 Mbps more traffic than it could accommodate (after the new traffic was introduced), which could have resulted in significant slow-downs and session timeouts for a very large number of mission-critical applications and users.

The likelihood of this actually happening was significantly increased by the barren landscape in this region and the fact that winter storms can prevent problem resolution for days. An upgrade of the fail-over link was ordered, and an alternate route with sufficient bandwidth was provisioned for the remote site.

All of these implementations were successful. Most of the applications were reasonably network efficient; the exception was a mobile device-based inventory application for which ~50% of its high response times was going to be incurred by a high app-turns count (this app also suffered from high server processing times). The NIPA modeling outputs identified this, and user expectations were set accordingly – before the roll-outs – while the application is being reworked to perform better across the network.

Alaska Network Impacts

Network Impact Graph: One site in Alaska had insufficient bandwidth

The Incoming Traffic (to user location) network impact graph above shows the current (Blue) plus the new (Green) and peak (Red) loading. Notice that one site’s link – which was already pretty busy – wasn’t going to handle the new loading; a recommendation was made to right-size this link to handle the new load plus ~30% spare overhead.

Alaska RTs

Alaska RTs (2)

These Response Time graphs / tables show the Hi-RT Inventory app tasks (Top) and the RTs on the other apps (Bottom).

The images above reflect the projected response times – and the time components of those RTs – for a set of tasks on several applications in the Alaska suite rollout. The first image includes some Inventory tasks and a Hi-RT reporting task, whose excessive RTs tend to mask the other tasks. In the second image these offending tasks have been hidden to allow viewing the rest of the set with better resolution. Most of these applications are going to exhibit 1-4 second response times, except for one task that is suffering from high app-turns effects (the Purple part of the graph bar) that will push RTs to ~8 seconds.

On the right-hand side of the response time tables, note the percentages of network transport (bandwidth) delay (% RT = Ntwk Delay) and app turns (latency) delay (% RT = App Turns) for the 8 second task – the contribution of app turns delay (76.6%) is reflected here as well. A good rule of thumb is that the percentage of response time attributable to network effects (transport + latency delays) should be 30% or less; greater values warrant further investigation.
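The 30% rule of thumb is easy to apply once a response time has been broken into its components. Here is a minimal sketch, with hypothetical function names and made-up numbers (not the values from the tables above):

```python
def network_share(transport_secs: float, app_turns_secs: float, total_secs: float) -> float:
    """Fraction of response time attributable to the network (transport + app turns)."""
    if total_secs <= 0:
        raise ValueError("total response time must be positive")
    return (transport_secs + app_turns_secs) / total_secs


def needs_investigation(share: float, threshold: float = 0.30) -> bool:
    """Rule of thumb: a network share above 30% warrants a closer look."""
    return share > threshold


# Hypothetical task: 8.0 s total, 0.5 s transport delay, 6.0 s app turns delay
share = network_share(0.5, 6.0, 8.0)
print(f"network share: {share:.0%}, investigate: {needs_investigation(share)}")
```

The threshold is a parameter here because the 30% figure is a rule of thumb rather than a hard limit.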

Don’t let this happen at your company

Again, the idea is to avoid introducing any new applications that are going to create additional performance problems.

I’ve run across a number of applications that simply could not and never would perform well in a networked environment, some of which were or needed to be abandoned and replaced or totally redesigned. One of my clients specifies that a NIPA be done before each and every new application implementation – because a number of years ago (before my time) they had to basically throw away a $17M software investment and start over with a different solution. Ouch. And since they’ve started enforcing the NIPA process, the only applications I’ve come across in their environment that needed troubleshooting for performance reasons had slipped through the project implementation process without a NIPA. The benefits and ROI are very evident to this company.

The network and application/server performance analysis required for a NIPA is conducted by a fresh, unbiased set of eyes coupled with a great deal of experience in this arena and the right tools for the job – and there are often additional side benefits from this analysis in terms of identifying and rectifying performance issues that are affecting existing applications.

PacketIQ exists to help you make sure your application roll-outs are successful – click the link below for more information, and be sure to download the NIPA brochure – it’s a great summary of the what, how, and why’s for discussion within your groups:

PacketIQ Network Impact & Performance Assessments

A final note and friendly warning: performing the correct analysis, with the right tools, to produce accurate model-based projections of network impact & application response times is not something you can or should expect your existing network or application support teams to do – this is a unique skill set which requires a fair amount of expertise and experience, and some fairly involved software. PacketIQ associates have been delivering accurate and value-proven NIPA assessments for right at 20 years – if you want accurate results and the best assurance of a successful roll-out, give us a call or email.

James H. Baxter
PacketIQ Inc.
