by George Lawton

How to Improve Application Performance and Reduce Latency

Sep 10, 2012 | 7 mins
Cloud Computing | Enterprise Architecture | Internet

Web developers can no longer look at network latency and application performance as mutually exclusive concerns. Fortunately, there are several ways that developers can "hide" data transmission and computation so that user experience doesn't suffer.

Developers have traditionally treated network latency and application performance as two separate phenomena. Modern Web development, however, needs to consider them together as applications and networking infrastructure grow more complex.

“Latency is clearly the biggest factor in network constraints of page loading on the Web,” says Guy Podjarny, chief product architect at Akamai. This shows up both in measurements of real users and in synthetic measurements, he adds, especially when compared to the effect of changes in download and upload speeds. “Unless you start with an especially slow connection, even doubling the speed will not make much difference. But with growth in latency, load times increase linearly.”
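
Podjarny's point can be illustrated with a rough back-of-the-envelope model. The sketch below assumes load time is simply the number of sequential round trips multiplied by the round-trip time, plus the time to transfer the bytes. The page size, round-trip count and link speeds are hypothetical, and the model ignores TCP slow start and parallel connections.

```python
def page_load_seconds(round_trips, rtt_ms, page_bytes, bandwidth_mbps):
    """Estimate load time as latency cost plus transfer cost (simplified model)."""
    latency_cost = round_trips * rtt_ms / 1000.0
    transfer_cost = page_bytes * 8 / (bandwidth_mbps * 1_000_000)
    return latency_cost + transfer_cost


# Hypothetical page: 1 MB fetched over 40 sequential round trips.
baseline = page_load_seconds(40, rtt_ms=100, page_bytes=1_000_000, bandwidth_mbps=5)
faster_link = page_load_seconds(40, rtt_ms=100, page_bytes=1_000_000, bandwidth_mbps=10)
slower_rtt = page_load_seconds(40, rtt_ms=200, page_bytes=1_000_000, bandwidth_mbps=5)

# Doubling bandwidth saves 0.8 s; doubling latency adds 4.0 s.
print(f"baseline {baseline:.1f}s, 2x bandwidth {faster_link:.1f}s, 2x latency {slower_rtt:.1f}s")
```

In this model the latency term scales with the round-trip time, while the bandwidth term is already a small share of the total, which is the pattern Podjarny describes.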

Latency Is a Big Deal for Users

Latency measures the delay between an action and a response. Over the Internet, individual delays at the application or packet level can accumulate, as each element in the chain of communication is linked to others. Asynchronous development approaches, such as Ajax and asynchronous script loading, can help to reduce the impact of these delays by separating the program logic from the need for constant network connectivity.

Latency can often be hidden from users through multi-tasking techniques. This lets them continue with their work while transmission and computation take place in the background. The difference that latency-sensitive software design makes can be dramatic, Podjarny says: start times that are four times as fast and load times twice as fast, plus better resilience due to fewer intermittent failures.

Major companies see significant usage and sales benefits from shaving off even fractions of a second of latency. For example, the Microsoft search engine Bing found that a two-second slowdown in page load performance decreased revenues per user by 4.3 percent, Podjarny notes. (His personal blog points out 16 more Web performance optimization statistics that demonstrate the importance of reducing latency.)

Developers also need to think about the unintended consequences of feature creep and address the possibility that new features may in fact subtly push users away. For example, when Google offered to let users increase the number of search results per screen from 10 to 30, the average page load time increased from 400 ms to 900 ms. The number of searches initiated per user dropped by 25 percent as a result, even though these users voluntarily chose to see the more voluminous search results.

How Will App Affect the WAN?

With latency such an important issue, software vendors are spending much more time today considering how their designs will impact the wide area network (WAN) than they did in the past, says Damon Ennis, VP of product management at Silver Peak. “The most important design pattern a software vendor should consider is to reduce the overall ‘chattiness’ of the application in order to minimize the number of round trips required for each operation, such as File > Open, Close or Edit.”
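
The cost of chattiness is easy to see in a toy simulation. The sketch below does not send real traffic; it uses a hypothetical SimulatedLink class that only counts round trips, and the five steps behind a File > Open operation are invented for illustration.

```python
class SimulatedLink:
    """Counts round trips instead of sending real traffic (illustrative only)."""

    def __init__(self, rtt_ms):
        self.rtt_ms = rtt_ms
        self.round_trips = 0

    def request(self, operations):
        # One request can carry any number of operations in a single round trip.
        self.round_trips += 1
        return [f"done:{op}" for op in operations]

    def elapsed_ms(self):
        return self.round_trips * self.rtt_ms


def open_file_chatty(link, steps):
    # One round trip per step.
    return [link.request([step])[0] for step in steps]


def open_file_batched(link, steps):
    # All steps travel together in one round trip.
    return link.request(steps)


# Hypothetical steps behind a single File > Open operation.
STEPS = ["lookup", "check_permissions", "lock", "read_metadata", "read_data"]

chatty, batched = SimulatedLink(rtt_ms=80), SimulatedLink(rtt_ms=80)
chatty_result = open_file_chatty(chatty, STEPS)
batched_result = open_file_batched(batched, STEPS)
print(chatty.elapsed_ms(), batched.elapsed_ms())  # 400 80
```

Both versions return the same results; the only difference is how many times the application has to wait for the WAN.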

To that end, software vendors are making efforts to reduce application chattiness, Ennis points out. Take Microsoft, for example. The version of the Common Internet File System (Server Message Block 2.0, or CIFS/SMB2) in Windows 7 and Server 2008 performs an “order of magnitude” better than it previously did, he says.

These improvements add value to Microsoft Outlook and Exchange, Web browsers and many other applications, Ennis adds. The same is true of the local area network, but LAN latency is low as it is, so “the gap, and therefore the opportunity, is not as large,” he says.

While these SMB2 improvements promise to reduce application latency, Ennis believes that organizations will still benefit from WAN optimization techniques, since “a well-designed WAN optimizer works generically and thus benefits any and all applications.”

Optimizing Application Front End Also Reduces Chatter

Front-end optimization, on the other hand, looks at a wide range of design practices that can make Web pages perform faster in a high-latency environment. One prominent technique is to reduce the number of requests by making the page less chatty. Spriting combines several small images into a single image file, while data URIs let developers embed binary data such as images directly into text resources such as CSS and HTML pages.
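
As a minimal sketch of the data URI approach, the snippet below base64-encodes a few bytes and wraps them in a CSS rule. The to_data_uri helper and the .icon selector are hypothetical, and the eight-byte PNG signature stands in for a real image file.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Stand-in for a real image: the 8-byte signature that starts every PNG file.
icon_bytes = b"\x89PNG\r\n\x1a\n"

uri = to_data_uri(icon_bytes, "image/png")
# The image now travels inside the stylesheet instead of needing its own request.
css_rule = f".icon {{ background-image: url({uri}); }}"
print(css_rule)
```

The trade-off is size: base64 encoding makes the embedded data roughly a third larger, so the technique suits small images best.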

Podjarny also advocates combining multiple JavaScript files into one, and doing the same with multiple CSS files. This reduces the number of back-and-forth requests required for separate files. It matters because not all resources on a Web page are equal. The browser applies its own logic regarding when and how to download a given resource. In some cases, scripts block subsequent downloading of other resources on the page. In a high-latency environment, it takes longer to reach these subsequent resources.
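
A build step for combining files can be very small. The sketch below is one possible approach rather than a production bundler: the combine_scripts function and the file names are hypothetical, and real tools also minify and handle dependencies.

```python
def combine_scripts(sources):
    """Join script sources into one payload; `sources` maps file name to text."""
    parts = []
    for name, text in sources.items():
        # The comment keeps the origin visible; the trailing ";" guards against
        # a file that ends without a statement terminator.
        parts.append(f"/* {name} */\n{text.strip()}\n;")
    return "\n".join(parts)


# Hypothetical page scripts that would otherwise need one request each.
scripts = {
    "menu.js": "var menu = {}",
    "cart.js": "var cart = {}",
    "search.js": "var search = {}",
}

bundle = combine_scripts(scripts)
requests_before, requests_after = len(scripts), 1
print(f"{requests_before} requests reduced to {requests_after}")
```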

Other techniques are more complicated. Asynchronous JavaScript, for example, lets the browser load first-, second- and third-party scripts asynchronously. Because those scripts are fetched in parallel with the rest of the page, the Web page may load before all the JavaScript does. In a high-latency environment, this one change can have a dramatic effect on how long a user stares at a blank Web page.
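
The effect can be mimicked outside the browser. The following sketch uses Python's asyncio as a stand-in for the browser's loader: short sleeps play the role of network latency, and the script names and delays are made up. It shows the ordering only, not real browser behavior.

```python
import asyncio


async def load_script(name, delay_s, events):
    await asyncio.sleep(delay_s)  # stands in for network latency
    events.append(f"script:{name}")


async def render_page(scripts, events):
    # Start every script download without waiting for any of them to finish.
    tasks = [asyncio.create_task(load_script(n, d, events)) for n, d in scripts]
    events.append("page rendered")  # the page appears while scripts are in flight
    await asyncio.gather(*tasks)
    return events


# Hypothetical third-party scripts with different response times.
events = asyncio.run(render_page([("analytics", 0.05), ("widgets", 0.01)], []))
print(events[0])  # the page is rendered before any script arrives
```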

There’s also potential for those scripts to become a bigger point of failure. For example, the recent Facebook API outage affected many sites that embed Facebook components, because browsers waiting on those scripts blocked the subsequent resources on the page.

“Those pages failed to load, even though the Facebook API was only a small part of it,” Podjarny says. “Those individual scripts became a single point of failure. With high latency and high packet loss, each resource might be prone to such service delays or outages.”
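
One common defense is to give every third-party dependency a deadline and a fallback. The sketch below applies that idea on the server side with asyncio.wait_for; the fetch_widget and build_page functions, the 50 ms deadline and the markup are all hypothetical.

```python
import asyncio


async def fetch_widget(delay_s):
    # Simulated third-party call; delay_s models how long the provider takes.
    await asyncio.sleep(delay_s)
    return "<div>social widget</div>"


async def load_with_deadline(fetch, deadline_s, fallback=""):
    """Return the fetched content, or the fallback if the deadline passes."""
    try:
        return await asyncio.wait_for(fetch, timeout=deadline_s)
    except asyncio.TimeoutError:
        return fallback


async def build_page(widget_delay_s):
    widget = await load_with_deadline(fetch_widget(widget_delay_s), deadline_s=0.05)
    return "<h1>Article</h1>" + widget


healthy = asyncio.run(build_page(0.0))  # provider answers immediately
outage = asyncio.run(build_page(1.0))   # provider hangs; page ships without the widget
print(healthy)
print(outage)
```

The page loses a component during the outage, but the component no longer decides whether the page loads at all.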

Vendors are creating tools to help stay on track with best practices for mobile development. For example, the Akamai Aqua Mobile Accelerator provides a toolbox to manipulate wireless and Internet protocols to address a variety of mobile network properties. One such technique is using larger TCP windows. This allows more information to be sent in each transmission, thus reducing the number of acknowledgements required in the conversation.
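
The arithmetic behind larger TCP windows is straightforward: a sender can have at most one window of unacknowledged data in flight per round trip, so throughput is capped at the window size divided by the round-trip time. The sketch below applies that bound to hypothetical window sizes and a hypothetical 200 ms mobile link, ignoring slow start, congestion control and packet loss.

```python
def max_throughput_mbps(window_bytes, rtt_ms):
    """Upper bound on throughput: one full window per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000.0) / 1_000_000


def round_trips_needed(payload_bytes, window_bytes):
    """Windows (and therefore round trips) needed to move the payload."""
    return -(-payload_bytes // window_bytes)  # ceiling division


RTT_MS = 200  # hypothetical mobile round-trip time
for window in (65_536, 262_144):  # 64 KiB and 256 KiB, chosen for illustration
    print(
        f"{window // 1024} KiB window: "
        f"{max_throughput_mbps(window, RTT_MS):.2f} Mbps cap, "
        f"{round_trips_needed(1_000_000, window)} round trips for 1 MB"
    )
```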

HTML5 WebSockets Can Help Bypass HTTP

Most Web applications today are built on HTTP. Because HTTP is a request-response protocol, most developers create kludges such as Comet and long polling to mimic a constant connection. However, Peter Lubbers, senior director of technical communication at Kaazing, believes that many Web applications can be significantly accelerated using the new WebSockets protocol for constant connections.

WebSockets, introduced with HTML5, reduce the packet overhead, thus creating a more efficient communications link. Research by Lubbers has indicated that WebSockets can reduce the latency from 100 ms to only 50 ms for a typical Web application.

In addition, WebSockets can reduce the data overhead in communicating the same information to multiple clients. In one demo stock market application for 1,000 clients, WebSockets reduced communication bandwidth to only 0.015 Mbps, compared to 6.6 Mbps using HTTP.
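
The gap comes from per-message overhead. The sketch below reproduces figures of this size under stated assumptions: roughly 871 bytes of HTTP request and response headers per poll, a 2-byte WebSocket frame header per message, one message per client per second, and megabits counted as 1,024 x 1,024 bits. These inputs are assumptions chosen because they land on the quoted numbers; the article itself does not state them.

```python
def overhead_mbps(clients, bytes_per_message, messages_per_second=1):
    """Protocol overhead in Mbps, counting a megabit as 1024 * 1024 bits."""
    bits_per_second = clients * bytes_per_message * 8 * messages_per_second
    return bits_per_second / (1024 * 1024)


HTTP_HEADER_BYTES = 871  # assumed header size per polling request/response
WS_FRAME_BYTES = 2       # assumed framing overhead per WebSocket message

http_overhead = overhead_mbps(1000, HTTP_HEADER_BYTES)
ws_overhead = overhead_mbps(1000, WS_FRAME_BYTES)
print(f"HTTP polling: {http_overhead:.1f} Mbps, WebSockets: {ws_overhead:.3f} Mbps")
```

None of this overhead carries application data, which is why the difference grows with the number of clients rather than with the size of the payload.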

Test Early, Test Often to Reduce Latency

Developers can test applications during the development process to see how different design choices will affect performance in a high-latency environment. One way is to use Mobitest, a Web performance management tool that Akamai recently made open source.

After an application has been deployed, it is also possible to capture real user measurements with JavaScript code. Newer browsers can provide the best performance data through direct support for the Navigation Timing API, while older browsers need special scripts to gather data about load times.
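
On the collecting side, the timestamps reported by the browser still have to be turned into metrics. The sketch below assumes each beacon arrives as a dictionary keyed by Navigation Timing attribute names (navigationStart, responseStart, domContentLoadedEventEnd, loadEventEnd); the summarize_timing helper and the sample values are hypothetical.

```python
from statistics import median


def summarize_timing(timing):
    """Convert absolute millisecond timestamps into durations from navigation start."""
    start = timing["navigationStart"]
    return {
        "time_to_first_byte_ms": timing["responseStart"] - start,
        "dom_ready_ms": timing["domContentLoadedEventEnd"] - start,
        "page_load_ms": timing["loadEventEnd"] - start,
    }


# Hypothetical beacons from three real users.
beacons = [
    {"navigationStart": 1000, "responseStart": 1180,
     "domContentLoadedEventEnd": 2100, "loadEventEnd": 3400},
    {"navigationStart": 5000, "responseStart": 5420,
     "domContentLoadedEventEnd": 6900, "loadEventEnd": 9100},
    {"navigationStart": 2000, "responseStart": 2250,
     "domContentLoadedEventEnd": 3300, "loadEventEnd": 4800},
]

summaries = [summarize_timing(b) for b in beacons]
median_load_ms = median(s["page_load_ms"] for s in summaries)
print(f"median page load: {median_load_ms} ms")
```

Reporting a median rather than a mean keeps one very slow session from dominating the result.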

It is important to test applications after they have been deployed, Podjarny explains. “In pretty much all cases, the real user measurement numbers are two to four times higher than synthetic measurements. Usually synthetic measures are biased towards clean, fault-free, and usually faster environments. That is sadly not the real world, especially in mobile.”

George Lawton is a California-based freelance technology writer who has been covering computers and communications for 20 years.