Tuesday, May 11, 2010

CloudFucius Wants: An Optimized Cloud

Although networks have continued to improve over time, application traffic has grown at a rapid rate in recent years.  Bandwidth-efficient client-server applications have been replaced with bandwidth-demanding web applications.  Where previous-generation client-server transactions involved tens of kilobytes of data, rich web-based portal applications can transfer hundreds of kilobytes per transaction, and with the explosion of social media and video, megabytes per transaction are not uncommon.  Files attached to email and accessed across remote file shares have also increased in size.  Even data replication environments with dedicated high-speed links have encountered bandwidth challenges due to increases in the amount of data requiring replication.  Our bandwidth-hungry society, with people now watching videos right from their mobile devices, can have both a financial and a technical impact on the cloud infrastructures needed to deliver that content.

Attempts to apply compression at the network level have been underwhelming.  Routers have touted compression capabilities for years, yet very few organizations enable them, since doing so usually entails an all-on or all-off mode and adds overhead, both in the additional load placed on the routers and in the latency introduced while the router compresses each packet.  A key factor in compressing traffic is how the data is presented to the compression routine.  All compression routines achieve greater levels of compression when dealing with homogeneous data; when presented with heterogeneous data, such as a collection of packets from multiple different protocols, compression ratios fall dramatically.
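To make that concrete, here is a quick, hypothetical illustration (not from the original post) using Python's zlib: the same kind of application data compresses far better as one homogeneous stream than when unrelated traffic is interleaved with it.

```python
import os
import zlib

# Homogeneous data: one "protocol" repeating structurally similar records.
homogeneous = b"".join(
    b"GET /portal/page?id=%d HTTP/1.1\r\n" % i for i in range(2000)
)

# Heterogeneous data: the same requests interleaved with pseudo-random
# payloads standing in for unrelated protocols sharing the link.
pieces = []
for i in range(2000):
    pieces.append(b"GET /portal/page?id=%d HTTP/1.1\r\n" % i)
    pieces.append(os.urandom(32))
heterogeneous = b"".join(pieces)

for name, data in (("homogeneous", homogeneous), ("heterogeneous", heterogeneous)):
    ratio = len(data) / len(zlib.compress(data, 6))
    print(f"{name}: {len(data)} bytes, roughly {ratio:.1f}:1 compression")
```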

The primary problem with packet-based compression is that it mixes multiple data types together when compressing.  These systems usually buffer packets destined for a remote network, compress them either one at a time or as a group, and then send them; the process is reversed on the other end.  Packet-based compression systems have other problems as well.  When compressing packets, they must choose between writing small packets to the network or performing additional work to aggregate and encapsulate multiple packets.  Neither option produces optimal results: writing small packets to the network increases TCP/IP header overhead, while aggregating and encapsulating packets adds encapsulation headers to the stream.

[Figure: Packet Compressor]

Instead, you might want to investigate a WAN optimization solution that operates at the session layer.  Working at that layer allows it to apply compression across a completely homogeneous data set while still addressing all application types, which yields higher compression ratios than comparable packet-based systems.

[Figure: Session Compressor]

Operating at the session layer also eliminates packet-boundary and re-packetization problems.  The compressor can easily find matches in data streams that at Layer 3 may be many bytes apart but at Layer 5 are contiguous.  System throughput increases as well, since compressing at the session layer eliminates the encapsulation stage.
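As a rough sketch of that difference (my own toy comparison, assuming 1400-byte packets and zlib as the codec), compressing each packet independently versus compressing the reassembled session stream shows how much a shared compression context helps:

```python
import zlib

# A session of repetitive application data, reassembled from many packets.
stream = b"<row><user>alice</user><status>active</status></row>\n" * 5000
MTU = 1400
packets = [stream[i:i + MTU] for i in range(0, len(stream), MTU)]

# Packet-based: each packet gets its own compression context.
per_packet = sum(len(zlib.compress(p, 6)) for p in packets)

# Session-based: one compression context across the whole byte stream.
comp = zlib.compressobj(6)
session = len(comp.compress(stream)) + len(comp.flush())

print(f"original:            {len(stream):7d} bytes")
print(f"per-packet compress: {per_packet:7d} bytes")
print(f"session compress:    {session:7d} bytes")
```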

Achieving a high compression ratio is only part of the performance puzzle.  To actually improve performance, the compressor must increase network throughput, which means it has to process data at greater than line speed; otherwise, as network speeds increase, the compressor itself becomes the bottleneck and the available bandwidth goes unused.  For optimal performance you want to apply the best compression ratio for the bandwidth available.  You are paying for the bandwidth, and a half-empty pipe can be costly.
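A quick back-of-the-envelope check (illustrative numbers only, not from the post) shows why compressor speed matters as much as ratio:

```python
# Effective WAN throughput is capped by whichever is lower: the link rate
# times the compression ratio, or the rate at which the compressor itself
# can process incoming data.
def effective_throughput_mbps(link_mbps, compression_ratio, compressor_mbps):
    return min(compressor_mbps, link_mbps * compression_ratio)

# A 100 Mbps link with 4:1 compression needs a compressor that can sustain
# 400 Mbps of input; a 250 Mbps compressor leaves the pipe partly idle.
print(effective_throughput_mbps(100, 4.0, 1000))  # 400.0 -> link fully used
print(effective_throughput_mbps(100, 4.0, 250))   # 250.0 -> compressor is the bottleneck
```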

In cloud deployments, Selective Data Deduplication (SDD) can have a significant impact.  SDD is designed to identify and remove repetitive data patterns on the WAN.  As data flows through the WANOp appliances, they record the byte patterns and build synchronized dictionaries.  Should an identical pattern of bytes traverse the WAN a second time, the WANOp device near the sender replaces the byte pattern with a reference to its copy in the dictionary.  This can have huge benefits, particularly when deploying virtual machine images or moving applications from the local data center to cloud peering points.  Even though virtual machine images can be quite large (in the tens of gigabytes), they are often made up of a significant amount of redundant data, like the underlying OS, which makes them optimal candidates for SDD processing.
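The following is a deliberately simplified sketch of the deduplication idea, not F5's actual SDD implementation; it assumes fixed-size chunks and SHA-256 fingerprints, with both "appliances" keeping a synchronized chunk dictionary so that repeated chunks cross the WAN as short references.

```python
import hashlib
import os

CHUNK = 4096  # fixed-size chunks for simplicity; real systems vary the boundaries

def encode(data, dictionary):
    """Sender side: emit ('ref', digest) for chunks seen before, ('raw', chunk) otherwise."""
    tokens = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        digest = hashlib.sha256(chunk).digest()
        if digest in dictionary:
            tokens.append(("ref", digest))      # only a 32-byte reference crosses the WAN
        else:
            dictionary[digest] = chunk
            tokens.append(("raw", chunk))
    return tokens

def decode(tokens, dictionary):
    """Receiver side: rebuild the stream from raw chunks plus dictionary references."""
    parts = []
    for kind, value in tokens:
        if kind == "raw":
            dictionary[hashlib.sha256(value).digest()] = value
            parts.append(value)
        else:
            parts.append(dictionary[value])
    return b"".join(parts)

# Two "VM images" that share most of their bytes (the same base OS layer).
base_os = os.urandom(2 * 1024 * 1024)
image_a = base_os + b"app-A config\n" * 100
image_b = base_os + b"app-B config\n" * 100

sender, receiver = {}, {}
for name, image in (("image_a", image_a), ("image_b", image_b)):
    tokens = encode(image, sender)
    assert decode(tokens, receiver) == image
    on_wire = sum(len(v) if k == "raw" else 32 for k, v in tokens)
    print(f"{name}: {len(image)} bytes -> {on_wire} bytes on the WAN")
```

The second image transfers almost entirely as references, which is the same effect SDD aims for when a mostly redundant VM image is pushed to a cloud peering point.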

After SDD has removed previously transferred byte patterns, you can apply a second class of data reduction routines called Symmetric Adaptive Compression (SAC).  While SDD is optimized to enhance repeat-transfer performance, SAC is designed to improve first-transfer performance through advanced encoding techniques and dictionaries optimized for very small repetitive patterns.  SAC constantly adapts to changing network conditions and application requirements mid-stream.  During periods of high congestion, SAC increases compression levels to reduce congestion and network queuing delay; during periods of low congestion, it reduces compression levels to minimize compression-induced latency.  By examining every packet and adjusting the codec based on the flow, SAC ensures that the optimal compression strategy is applied and lets network administrators deploy compression without fear of degrading application performance.
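Here is a hedged sketch of that adaptive idea, with made-up thresholds and zlib standing in for the codec rather than the real SAC algorithm: a heavier compression level is chosen when congestion signals (queue depth, RTT inflation) say the link is backed up, and a lighter one when it is not.

```python
import zlib

def pick_level(queue_depth_pkts, rtt_ms, baseline_rtt_ms):
    """Map simple congestion signals to a zlib level (1 = fast, 9 = tight)."""
    congested = queue_depth_pkts > 50 or rtt_ms > 1.5 * baseline_rtt_ms
    if congested:
        return 9          # heavy compression: shrink the queue at CPU cost
    if queue_depth_pkts > 10:
        return 6          # moderate congestion: balanced setting
    return 1              # idle link: minimize compression-induced latency

def send(payload, queue_depth_pkts, rtt_ms, baseline_rtt_ms=20.0):
    level = pick_level(queue_depth_pkts, rtt_ms, baseline_rtt_ms)
    return level, zlib.compress(payload, level)

payload = b'{"user": "alice", "action": "upload", "bytes": 1048576}' * 200
for depth, rtt in ((2, 21.0), (25, 28.0), (120, 95.0)):
    level, wire = send(payload, depth, rtt)
    print(f"queue={depth:3d} rtt={rtt:5.1f}ms -> level {level}, {len(wire)} bytes on the wire")
```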

Like SDD, SAC benefits cloud deployments as well.  The elasticity of the cloud requires that the infrastructure be just as dynamic.  By responding to the ever-changing network conditions of both the cloud and the end user, SAC can make certain that you are using your bandwidth efficiently and quickly delivering the needed content to any user around the globe.

And one from Confucius: Life is really simple, but we insist on making it complicated.

ps

The CloudFucius Series: Intro, 1, 2, 3, 4

Technorati Tags: F5, Replication, Web Optimization, Data Deduplication, Pete Silva, technology, application delivery, intercloud, cloud infrastructure 2.0

twitter: @psilvas
