
CACHE SERVERS: Improving Content Delivery

VoicenData Bureau

The enormous success of the Web as a source of information and a platform for e-commerce has not come without challenges. One ever-increasing problem is the highly variable and often frustrating length of time it takes to access a web site and download pages, prompting cynics to claim that "WWW" stands for "World Wide Wait". While this presents serious challenges to marketer and consumer alike, it also presents opportunities.


Carriers and service

providers are making huge investments to increase Internet

bandwidth. However, by itself, additional bandwidth cannot

address network latency or accelerate slow origin servers. Enter caching. This technique addresses the

challenges of the Web by moving content closer to users who need

it. Caching has immediate benefits not only for the end user, but also for Internet service providers and content providers.

And, in a future where every business is an e-business, it can

give any site a major competitive advantage.

This paper looks at the

case for Web caching, providing an overview of caching

technology and the implementation requirements. It also

describes various deployment options, identifies ideal cache

locations and considers the strengths of caching appliances.

Additional resources on caching are provided at the end of the

paper.


 

Why Implement Web Caching?


Meeting the Internet challenge


The explosive growth of

the Web has severely stressed Internet capacity and performance.

Carriers and ISPs have responded with massive investments to

expand capacity, from the Internet backbone to the "last

mile" into businesses and homes.

However, both the number

of Web users (Figure 1) and the amount of Web content

accessed are predicted to accelerate dramatically in the years

ahead. The projected value of the rapidly expanding Internet

economy is enormous. The growing importance of the Web as a

stimulant for economic growth means that it must become a more

reliable and predictable place to do business.

Figures 1 and 2: Growth, both in the number of Web users and the amount of content those users are accessing, is predicted to accelerate after the turn of the century. This will place a tremendous load on the Internet and potentially impact users' quality of service experience.


Currently, the Web is

fundamentally inefficient. Every user seeking to view specific

content must obtain it directly from the server that is the

point of origin for that content. This is the equivalent of

having everyone fly to Hollywood to see the latest movie. There

is no distribution mechanism designed into the Web that is

analogous to the system of movie theaters that offer first-run

films in every viewer’s hometown.

 

Since it is not possible

to have dedicated, point-to-point bandwidth allocated to users,

congestion is inevitable. Problems contributing to user

frustration include:


  • Slow

    connection speeds

  • Unpredictable

    performance

  • Limitations

    in available bandwidth

  • Overwhelmed

    web sites


The capacity of the

Internet is constantly being built out to handle the growing

load. For the foreseeable future, this build-out will continue

to lag behind demand. In any case, simply increasing bandwidth

by building up the network with bigger pipes cannot address all

the Quality of Service (QoS) issues involved. For purposes of this discussion, QoS means a high-quality user experience, measured in low latency and fast download times.

Adding bandwidth may improve throughput, but it does not reduce latency. In addition, adding bandwidth at one point may simply move a bottleneck to another location.

Figure 3: The amount of bandwidth required for trips across the backbone is significantly greater in a non-cached network. With caching configured, a large portion of the requests can be fulfilled using only local pipes.

Caching makes more bandwidth available by using existing pipes more efficiently, not only improving QoS for the user, but also giving service providers substantial savings and additional room to grow (for details, see "Who benefits from caching?" below).

What is caching?

Caching is a technology

that is already familiar in other applications. Many hardware

devices cache frequently used instructions and data in order to

speed up processing tasks. For example, data that is frequently

used by a computer’s Central Processing Unit (CPU) is stored

in very fast memory, sometimes right on the CPU chip, thereby

reducing the need for the CPU to read data from a slower disk

drive. In addition, Web browsers are designed to cache a limited

amount of content on a user’s PC. That is why selecting

"back" or "previous page" on a browser

toolbar typically results in near-instantaneous retrieval.

With true Web caching, the same concept is applied more widely, using a server or specialized appliance. Web content is placed close to users in the form of a network cache, reducing the number of routing/switching hops required to retrieve content from a remote site. In other words, viewers are not required to travel to Hollywood to see a movie; rather, movies are sent to local theaters where people can access them–or better yet, the viewers themselves determine which movies are made available locally.

There are two kinds of Web

caching models. In the "edge-services" model,

businesses subscribe to a third-party service vendor to have

their content cached. This has serious disadvantages for some of

the parties:

  • The ISP does

    not own or control the infrastructure.

  • The most

    frequently used sites are not necessarily the ones cached,

    which is a disappointment to users.

In the "open"

model, supported by Intel caching appliances, service providers

install their own caching equipment and are able to offer

caching as an value-added service to their customers. Advantages

include:

  • The ISP

    invests in its own destiny, and not that of a third party.

  • Additional

    revenue can be realized directly by the service provider.

  • The system automatically caches the sites that users access most.

Who benefits from caching?

End users, user

enterprises, service providers and content providers all stand

to benefit substantially from caching implementation.

The ultimate beneficiaries

are end users, the people who drive the Internet economy.

Caching provides distinct benefits for end users in the form of

an enhanced Internet experience and better perceived value for

their monthly service fees.

Caching also has benefits

for enterprises. By providing a local cache for Web content,

companies can monitor how much Internet bandwidth is required to

satisfy employee needs. They can also initiate access policies

to limit employee use of the Web to corporate activities.

For ISPs, caching has

several important advantages:

  • It reduces

    Internet bandwidth usage by eliminating redundant requests

    for popular documents.

  • Leased-line expenses are reduced or postponed. Benchmarks have shown that if a cache successfully serves a modest percentage of user requests, the amount of outbound bandwidth required can be reduced by up to 30-40 percent. That can mean significant cost savings, or the ability to add more users with the current network. To access a bandwidth calculator for individual computation, please see http://www.intel.com/network/products/cache_1500.htm

  • Caching also provides better QoS, leading directly to higher customer satisfaction and reduced customer turnover. Less spending is needed for acquiring new customers.

  • A caching

    solution provides the basis for new value-added Web hosting

    services that boost ISP profitability.

Content providers benefit

from higher site availability and a better user experience with

fewer, shorter delays. This creates increased customer

satisfaction, giving cached sites a competitive advantage over

those that are not cached.

Studies indicate that a

delay of only five to eight seconds is enough to frustrate the

average user into retrying or leaving a site. Caching helps

prevent this. And, from an overall commercial viewpoint, users

can visit more sites, do more shopping and purchase more

products if content can be delivered and downloaded faster.

Overview of Caching Technology

With and without a solution in place

Without a caching solution in place, requests for content from a browser and the content delivered from the origin server must repeatedly take the same long-distance trip–from the requesting computer to the computer that has the content, and back (Figure 3). The following steps are typical:

  • The Web

    browser sends a request for a Uniform Resource Locator (URL)

    that refers to a specific Web document on a particular

    server on the Internet.

  • This request

    is routed through the normal TCP/IP network transport.

  • Content requested from the server (also known as an HTTP server) may be a static HTML page with links to one or more additional files, including graphics. The content may also be a dynamically created page that is generated from a search engine, a database query or a Web application.

  • The HTTP

    server returns the requested content to the Web browser one

    file at a time. Even a dynamically created page often has

    static components that are combined with the dynamic content

    to create the final document.

  • If there is

    no cache server in place, the next user who requests the

    same document–even if that user is in the next cubicle–must

    send a request across the Internet to the originating Web

    server and receive the content by return trip.
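The cost of this uncached flow can be modeled in a few lines. This is an illustrative sketch with hypothetical names, not a real network client: it simply counts how many long-distance round trips the origin server must handle when no cache is present.

```python
class OriginServer:
    """Stands in for a remote HTTP server at the content's point of origin."""

    def __init__(self, documents):
        self.documents = documents
        self.requests_served = 0  # each request is one long-distance round trip

    def get(self, url):
        self.requests_served += 1
        return self.documents[url]

origin = OriginServer({"/index.html": "<html>...</html>"})

# Two users in adjacent cubicles request the same page; without a
# cache server, both requests cross the Internet to the origin.
origin.get("/index.html")
origin.get("/index.html")
print(origin.requests_served)  # -> 2
```

With a cache in the path, the second request would never reach the origin at all, which is the scenario described next.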

When caching is used, the

process is far more efficient because frequently accessed

content does not have to repeatedly make the long trip from the

origin server (Figure 3).

  • The requested

    document may be stored on a cache server inside the user’s

    corporate LAN, at the user’s ISP, or at some other Network

    Access Point (NAP) or Point of Presence (PoP) located closer

    to the user than the majority of Web servers.

  • If the requested document is stored on the cache server, then the server will check to make sure the content is current (fresh). To ensure that a user does not receive a stale object, freshness parameters are pre-set by content providers and others, and servers are normally configured with default algorithms.

  • If the

    content is fresh according to these parameters, then the

    transaction is considered a cache "hit," and the

    request can be immediately fulfilled from the cache server.

  • If the

    content needs to be refreshed, the cache server will

    retrieve updated files from the Internet and send them to

    the user, also keeping fresh copies for itself.

  • The more

    frequently a cache can serve user requests, the higher the

    hit rate and the better the performance enjoyed by users.
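The hit/refresh logic in the steps above can be sketched as follows. This is a simplified model under assumed names; real cache servers combine origin-supplied freshness parameters with configurable default algorithms, and the 300-second default here is purely illustrative.

```python
class CacheEntry:
    def __init__(self, body, fetched_at, max_age):
        self.body = body
        self.fetched_at = fetched_at  # when this copy was stored
        self.max_age = max_age        # freshness window, in seconds

def serve(cache, url, now, fetch_from_origin, default_max_age=300):
    """Return (body, status): a fresh hit is served from cache,
    a miss or stale entry triggers a refresh from the origin."""
    entry = cache.get(url)
    if entry is not None and now - entry.fetched_at < entry.max_age:
        return entry.body, "hit"      # fresh: fulfill immediately
    body = fetch_from_origin(url)     # stale or absent: go upstream
    cache[url] = CacheEntry(body, now, default_max_age)
    return body, "miss"

cache = {}
_, status = serve(cache, "/a", now=0, fetch_from_origin=lambda u: "doc")
print(status)  # -> "miss" (first request must reach the origin)
_, status = serve(cache, "/a", now=100, fetch_from_origin=lambda u: "doc")
print(status)  # -> "hit" (within the freshness window)
_, status = serve(cache, "/a", now=500, fetch_from_origin=lambda u: "doc")
print(status)  # -> "miss" (entry went stale and was refreshed)
```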

Figure 4: Layer 4 switches can route requests for cacheable data (HTTP, NNTP, etc.) to the cache server while sending other requests to the Internet.

Similar processes are involved for FTP file transfers, with an FTP server handling each request for a file submitted by the FTP client application. Delays and bottlenecks can be an even bigger problem with FTP because a typical FTP file is larger than a typical HTML file. Streaming audio and video are additional examples of Internet applications that can benefit from caching content. Internet latency problems can cause jittery video and delayed or distorted audio. Better use of bandwidth can be a solution for these problems.

Reducing bandwidth usage

Along with giving users an

improved experience, caching reduces the upstream bandwidth an

ISP has to provide to fulfill user content requirements. A cache

only passes user requests on to the Internet if it isn’t able

to service them. The greater the number of user requests that

can be fulfilled from a cache, the less bandwidth is used to

reach distant origin servers. This traffic reduction means

significant savings for a service provider, since an estimated

one-third of an ISP’s operating costs are recurring

telecommunications charges.

It is true that freshness

updates must be performed, so there would still be traffic from

the ISP out to the Internet even if all requested content were

to be found in the cache server. But by using caching, bandwidth

utilization can be greatly reduced. Caching is even beneficial

when retrieving dynamic documents, because these pages do have

some static elements that can be served from a cache.

Depending on the

distribution of traffic and the scalability of the cache, up to

40 percent (Source: Patricia Seybold Group, 1999) of user HTTP

requests can be taken off the network and fulfilled from the

cache server. This makes networks far more efficient, enabling

better service to be offered at a lower cost. Caching as much

Web content as possible within the ISP while using modest

amounts of upstream bandwidth is a way to give users what they

demand without creating a "black hole" for bandwidth

investment on the part of the service provider.
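The bandwidth arithmetic behind that claim is straightforward. The figures below are hypothetical, chosen only to illustrate the 40 percent case cited above:

```python
# Hypothetical ISP numbers, for illustration only.
total_demand_mbps = 100.0  # bandwidth users collectively request
hit_rate = 0.40            # fraction of HTTP requests served from cache

upstream_mbps = total_demand_mbps * (1 - hit_rate)  # still fetched upstream
saved_mbps = total_demand_mbps - upstream_mbps      # kept off leased lines

print(upstream_mbps)  # -> 60.0
print(saved_mbps)     # -> 40.0
```

The 40 Mbps kept off the upstream links is capacity the service provider can either stop leasing or use to serve additional subscribers.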

Deployment Models

There are several approaches, or models, for implementing a cache architecture. Which model is chosen depends on where the cache is implemented, the primary purpose of the cache and the nature of the traffic.

Forward proxy

Forward proxy caching is defined by its reactive nature. With a forward proxy cache configuration, user requests go through the cache on the way to the destination Web server. If the cache contains the requested document, it serves it directly. If it does not have the desired content, the server acts as a proxy, fetching the content from the Web server on the user’s behalf.
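A forward proxy cache can be sketched in a few lines (hypothetical class and method names; a production proxy would also apply the freshness checks described earlier):

```python
class ForwardProxyCache:
    """Sits between users and the Web: serves known documents
    directly and fetches unknown ones on the user's behalf."""

    def __init__(self, fetch_from_origin):
        self.store = {}
        self.fetch_from_origin = fetch_from_origin

    def get(self, url):
        if url in self.store:               # hit: serve directly
            return self.store[url]
        body = self.fetch_from_origin(url)  # miss: act as a proxy
        self.store[url] = body              # keep a copy for the next user
        return body

trips = []  # records each trip to the origin server
proxy = ForwardProxyCache(lambda url: trips.append(url) or "page")

proxy.get("/index.html")  # first request goes upstream
proxy.get("/index.html")  # second is served from the cache
print(len(trips))         # -> 1
```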

Reverse proxy

Figure 5: Cache servers may be placed at an ISP PoP to serve requests locally, at an aggregation point on the edge of the Internet to reduce bandwidth requirements, or in front of a Web farm to reduce load on content servers.

A cache can also be configured as a fast Web server to accelerate slower, traditional Web servers. Documents stored in cache are served at high speed, while documents not in cache–usually dynamic content and other short-term objects–are requested when necessary from the origin Web servers. This model is frequently used to optimize the performance of a Web server site. The caching system sits in front of one or more Web servers, intercepting requests and acting as a proxy.

Cache servers can be

deployed throughout a network to create a distributed network of

sites for hosted content, a model that is sometimes referred to

as site replication. In addition to performance benefits for the

user and content provider, reverse proxy caching also has

benefits for the ISP. Those benefits include the ability to

enable load balancing, to offer peak-demand availability

insurance and to provide dynamic mirroring for high

availability.

Transparent caching

Forward proxy caches can

be further configured as either transparent or non-transparent.

A transparent cache sits in the network flow and functions

invisibly to a browser. The benefits of caching are

automatically delivered to clients without anyone having to

reconfigure browsers. For ISPs and enterprise backbone

operations, a transparent configuration is often preferred

because it minimizes the total administrative and support

burden. Individual users and small businesses without IT staff

also appreciate the absence of configuration requirements.

The most popular

implementation is to use a Layer 4 capable switch to interface

cache servers to the Internet (Figure 4). These switches can

inspect network traffic and make decisions above the IP level.

For example, the switch can direct HTTP (or other) traffic to

the cache and send the rest of the traffic directly to the

Internet. The switch can also send requests to specific nodes in

a cache server cluster, a capability that can be used for load

balancing purposes. Using a pair of switches with multiple cache

servers allows for redundancy and failover protection.
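The routing decision such a switch makes can be sketched as below. The port numbers follow the usual IANA assignments, and the hash-based node selection stands in for whatever load-balancing policy a real Layer 4 switch applies:

```python
CACHEABLE_PORTS = {80: "HTTP", 119: "NNTP", 21: "FTP"}

def route(dest_port, client_ip, cache_nodes):
    """Send cacheable protocols to a cache node; everything else
    goes directly to the Internet."""
    if dest_port in CACHEABLE_PORTS and cache_nodes:
        # a simple deterministic hash spreads clients across the cluster
        index = sum(int(part) for part in client_ip.split(".")) % len(cache_nodes)
        return cache_nodes[index]
    return "internet"

nodes = ["cache-1", "cache-2"]
print(route(80, "10.0.0.7", nodes))   # HTTP -> one of the cache nodes
print(route(443, "10.0.0.7", nodes))  # -> internet (not in the cacheable set)
```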

Cache locations

To identify ideal cache

deployment points, there are three types of location

characteristics to keep in mind:

Choke point:

Traffic convergence points or choke points are locations where a large majority of network traffic passes and would, therefore, be visible to a cache server. This allows the cache to handle more requests and store more content than if it were located somewhere easily bypassed.

High traffic load:

Any area characterized by high traffic conditions allows higher

cache utilization. The more cache hits, the greater the

benefits.

Economic potential:

Points where users will benefit from high cache hit rates while

also reducing upstream bandwidth requirements will provide both

QoS benefits and positive economics for the access provider.

These characteristics are typically found at major Internet switching locations, dial-in aggregation points, or corporate gateways (Figure 5). Uses include standard PoP and dial-up access, NAPs and exchanges, Web hosting, "last mile" acceleration, satellite-based cache feeding and more. Caching is even employed as an economical means of updating information for online news services.

Cache hierarchies

In the event that a

requested document is not stored in cache (a cache

"miss"), the cache server usually must forward the

request to a distant origin server. However, if the cache server

were able to check with another nearby cache instead, the

process could be much faster. This is the idea behind cache

hierarchies.

It is possible to create

relatively small regional caches–for example, a server or

cluster handling a department or limited geographical area–and

link them to larger parent caches that define larger groups or

areas. If a regional cache does not have a requested document,

it can forward the request to the parent cache. This will still

provide faster service than contacting the origin server.

Multiple level hierarchies can be configured, giving cache

servers a sequence of larger and larger caches to query if the

first attempt misses.
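The parent-lookup chain can be sketched as follows (hypothetical names; real hierarchies typically coordinate through a protocol such as ICP rather than direct calls):

```python
class CacheNode:
    """One level of a cache hierarchy (regional -> parent -> ...)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.store = {}

    def get(self, url, fetch_from_origin):
        if url in self.store:                 # hit at this level
            return self.store[url], self.name
        if self.parent is not None:           # miss: ask the parent cache
            body, source = self.parent.get(url, fetch_from_origin)
        else:                                 # top of the hierarchy:
            body, source = fetch_from_origin(url), "origin"
        self.store[url] = body                # keep a local copy on the way back
        return body, source

parent = CacheNode("parent")
regional = CacheNode("regional", parent=parent)

body, source = regional.get("/news", lambda url: "doc")
print(source)  # -> "origin" (first request walks the whole chain)
body, source = regional.get("/news", lambda url: "doc")
print(source)  # -> "regional" (now served locally)
```

Note that the miss also populates every cache on the path, so nearby users at either level benefit from the first user's request.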

By combining capabilities

such as site replication and a linked hierarchical caching

structure, a highly efficient distributed network can be created

for Web hosting over a wide geographical area.

Advantages of Using a Cache Appliance

While this paper is intended to provide information on Web caching in a generic context whenever possible, Intel products are used in the section below in order to provide a meaningful level of detail. Unless otherwise indicated, the appliance functionality and attributes described below are applicable to the caching appliances offered by Intel.

Cost-effectiveness

By definition, an appliance (sometimes referred to as a "thin server") is a device that provides a limited number of dedicated functions, and is therefore able to deliver those functions more cost-effectively than a multi-purpose device.

This does not mean that

appliances are not robust solutions. In fact, by specializing in

one particular area, they often provide a richer feature set,

superior stability and broader flexibility in terms of

deployment and configuration.

As an example, the Intel NetStructure Cache Appliance’s integrated hardware and software design has been specifically engineered to provide robust, carrier-class caching.

Capabilities include:

  • Speed (the

    ability to handle thousands of simultaneous user

    connections)

  • Scalability

    (nodes can easily be added as needed to a cache cluster)

  • Fault

    tolerance (contributing to network stability)

  • Secure

    single-point administration (many nodes can be configured at

    once)

Ease of installation and use

As a fully integrated

"solution in a box" comprising all of the necessary

hardware and software, an appliance is very easy to install and

configure. The Intel NetStructure Cache Appliance has an automated setup wizard and intuitive configuration software that make setup easy.

This is a significant part

of the cost savings provided by appliances, because it takes

minimal time to incorporate the device into the network and does

not require the expertise and expense of a systems

administrator.

Further savings are

provided by the relatively compact size of most appliances.

The Intel NetStructure

Cache Appliance, for instance, comes in a low profile, rack

mountable design. This provides an easy way to increase network

capacity in the same limited space that an infrastructure owner

or operator already has available.

Flexibility

Since it is designed for a

single, specialized purpose, an appliance typically offers a

high degree of deployment flexibility. This appliance is

no exception.

It can be used in a

variety of deployment models, alone or with other enterprise

software, including other caching products. Here are some of the

ways it can be implemented:

  • Forward proxy

  • Reverse proxy

  • Transparent

    caching

  • Part of an

    HTTP cache hierarchy

  • ICP sibling:

    The Intel NetStructure Cache Appliance can send ICP

    queries to neighboring caches as part of an ICP cache

    hierarchy

  • NNTP news

    cache: The Intel appliance caches frequently accessed

    news articles and can also receive news feeds for designated

    news groups

In addition, the Intel

NetStructure Cache Appliance offers broad support for

content and interoperability protocols:

  • HTTP versions

    0.9 through 1.1

  • FTP

  • NNTP

  • ICP (to help

    implement cache hierarchies)

  • SSL

    encryption

  • WCCP

  • WPAD

Key Requirements

The most

important requirement of a caching solution is the ability

to provide optimized performance. There are two sides to

cache performance:

Operational capacity: This

is addressed by the architecture and the implementation of

the cache server. Along with raw cache capacity,

architectural issues include how the server makes use of

multiple threads of execution, and how well it performs

load balancing in a multiple cache server deployment.

Responsiveness to user

requests: This is determined by the various techniques the

cache server uses to maximize hit rate, including the

structure of hierarchies (see "Cache hierarchies" above) and content optimization. Cache

hit rate is a function of many factors, including the

cache size and the load on the cache.

A cache server can be tuned to improve capacity and responsiveness in many ways. Potential areas of optimization include:

  • Processing queues for the various objects that make up a document
  • Determining whether or not a requested object is cached
  • Delivering the requested object to the browser if it is not cached
  • Total throughput based on how fast incoming requests are handled

Performance depends on how

well these possibilities were understood and used by those

who built the cache server and engineered the software.

Scalability for the Web is

another key requirement that a cache must address. The

effectiveness of caching improves as the traffic served by

the cache increases–the bigger the challenge, the more

valuable the solution. To support very large caches, cache

server clustering or load balancing is necessary, and the

caching solution must support these capabilities.

Cache server support is

also required for a variety of protocols. Network caching

can be applied to content delivered over HTTP, NNTP, FTP

and others. All are characterized by having some

proportion of static content.

Manageability is mandatory for any caching solution. Cache management includes the ability to easily install and maintain cache servers, and to access the wealth of usage and traffic data the servers can provide. It is often necessary to manage arrays of cache servers, which may be distributed over great distances, from a single point of control.

Browser-based management

interfaces are increasingly common as the standard way to

manage distributed systems. The interface should provide

functionality for configuring cache servers, administering

security, setting filters, loading the cache, controlling

the cache system and gathering information from logs.

A caching solution should

be designed to provide high reliability and availability.

Although the duplicative nature of caching has a measure

of fault tolerance built in, the solution must feature

high quality software and a highly reliable platform if it

is to be considered an integral part of the network

infrastructure.

Configuration approaches

such as failover and clustering also contribute to

reliability and availability.

Finally, hardware and

software must be well integrated to achieve the

efficiencies that are at the heart of caching performance.

Performance

As previously noted, performance depends on capacity, including how well the server makes use of multiple threads of execution, and the ability to respond quickly to user requests. The Intel cache is designed for high-performance operation across a broad range of load conditions.

It aggressively

implements multi-threading–breaking down large transactions

into small, efficient tasks. A threaded event scheduler allows

the Intel NetStructure Cache Appliance to handle

thousands of simultaneous connections and maximize CPU usage.

The appliance is able to respond to multiple requests

simultaneously and efficiently even under peak loads.

The appliance’s

Inktomi Object Store is a custom-designed Web object database

that has been fully optimized for caching. It uses raw disk

input/output to achieve optimized storage and retrieval of

content objects, resulting in much higher speeds than

conventional file systems. In order to provide fast access for

the most frequently requested objects, a RAM cache is

maintained so that hot objects can be read from high-speed

memory instead of from the disk. In addition, all objects are

indexed according to their URL and associated headers. This

means the Intel NetStructure Cache Appliance can store,

retrieve and serve not only web pages, but also parts of web

pages, providing optimized bandwidth savings.
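The "hot object" idea can be illustrated with a small least-recently-used (LRU) memory cache in front of a slower store. This is a generic sketch, not the appliance's actual object store or eviction policy:

```python
from collections import OrderedDict

class RamCache:
    """Keeps the hottest objects in fast memory; colder objects
    fall back to a slower disk read."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk
        self.objects = OrderedDict()  # url -> object, oldest first

    def get(self, url):
        if url in self.objects:
            self.objects.move_to_end(url)     # mark as recently used
            return self.objects[url]          # served from fast memory
        obj = self.read_from_disk(url)        # cold object: disk I/O
        self.objects[url] = obj
        if len(self.objects) > self.capacity:
            self.objects.popitem(last=False)  # evict least recently used
        return obj

disk_reads = []
ram = RamCache(capacity=2,
               read_from_disk=lambda u: disk_reads.append(u) or u.upper())

ram.get("/a")
ram.get("/b")      # both cold: two disk reads
ram.get("/a")      # hot: served from memory, no disk read
ram.get("/c")      # cold read; evicts /b, the least recently used
print(disk_reads)  # -> ["/a", "/b", "/c"]
```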

Centralized administration

This cache helps minimize

the cost of administering, maintaining and operating a large

cache system. It offers several centralized management

alternatives to suit the needs of a wide range of environments:

Browser-based interface:

The manager User Interface (UI) offers password-protected,

single point administration for the entire Intel cache cluster.

Command line interface:

A command line interface lets the administrator configure the

system’s network addresses, and control, configure and monitor

the cache.

SNMP management:

This cache supports two Management Information Bases (MIBs) for

management through SNMP facilities. MIB-2 is a well-known

standard MIB. A proprietary Intel Cache MIB provides more

specific node and cluster information.

Performance reporting:

Performance statistics are available at a glance from the manager UI or the command line interface. Characteristics that can be managed include: log file formats; site or content blacklist filtering; anonymization; never-cache, pin-in-cache and revalidate-after rules; storage of multiple versions of cached objects for user-defined or browser-defined differences in content; domain and host-name expansion; and content routing.

Scalability and reliability

In recognition of the

mission-critical nature of caching, the Intel NetStructure Cache

Appliance is designed to provide a highly reliable and

available cache service. And, since it is designed to implement

caches at the highest levels of network traffic, including

Network Access Points and on the backbone, it is easy to scale.

This cache achieves a high degree of scalability through three mechanisms:

  • Symmetric

    Multi-Processing (SMP)

  • Clustering

  • Cache

    hierarchies

Symmetric multi-processing provides the in-box performance to accommodate growth, and clustering provides scalability across several machines by spreading the workload. The Intel NetStructure Cache Appliance implements its own cache hierarchy configuration, which is used in conjunction with ICP to communicate with other caches.

Clustering technology is

supported by the Intel NetStructure Cache Appliance, combining

the resources of several machines to increase capacity. As new

nodes are added to the cluster they build on existing nodes to

provide additional disk and processing resources. Clustering

also offers failover protection–node failures can be

automatically detected, and traffic is then redistributed to

active nodes.

In today’s complex Web environment, end-to-end performance and response time are the product of many factors, over which few web sites, service providers or users have control. Service providers still need to deliver an optimal user experience, measured in low latency and fast download times.

Various caching approaches

are available, and they can be implemented in a variety of ways

depending on the specific caching requirements. When correctly

placed and configured, caches can significantly improve the user

experience and QoS, while saving service providers significant

costs of providing upstream bandwidth.

Another plus is the added

revenue that caching can bring to service providers by giving

them opportunities to offer service-level guarantees and peak-demand availability insurance.

The Intel NetStructure 1500 Caching Appliance, featuring the Inktomi Traffic Server Engine caching software, is a carrier-class product capable of delivering fresh content to a large number of users from a large number of Web servers.

It is ideal for

enterprises that need to better manage the use of network

resources, provide superior information distribution to

employees, and reduce the administrative burden through

transparent proxy and caching capabilities. Even more

importantly, it gives service providers a superior approach to

managing growth in backend connectivity–growth that could otherwise expand almost without limit.

Courtesy: Intel Corp.
