
CACHE SERVERS: Improving Content Delivery

VoicenData Bureau

The enormous success of the Web as a source of information and a platform for e-commerce has not come without challenges. One ever-increasing problem is the highly variable and often frustrating length of time it takes to access a web site and download pages, prompting cynics to claim that "WWW" stands for "World Wide Wait". While this presents serious challenges to marketer and consumer alike, it also presents opportunities.


Carriers and service

providers are making huge investments to increase Internet

bandwidth. However, by itself, additional bandwidth cannot

address network latency or accelerate slow origin servers. Enter caching. This technique addresses the

challenges of the Web by moving content closer to users who need

it. Caching has immediate benefits not only for the end user, but also for Internet service providers and content providers.

And, in a future where every business is an e-business, it can

give any site a major competitive advantage.

This paper looks at the

case for Web caching, providing an overview of caching

technology and the implementation requirements. It also

describes various deployment options, identifies ideal cache

locations and considers the strengths of caching appliances.

Additional resources on caching are provided at the end of the

paper.


 

Why Implement Web Caching?


Meeting the Internet challenge


The explosive growth of

the Web has severely stressed Internet capacity and performance.

Carriers and ISPs have responded with massive investments to

expand capacity, from the Internet backbone to the "last

mile" into businesses and homes.

However, both the number

of Web users (Figure 1) and the amount of Web content

accessed are predicted to accelerate dramatically in the years

ahead. The projected value of the rapidly expanding Internet

economy is enormous. The growing importance of the Web as a

stimulant for economic growth means that it must become a more

reliable and predictable place to do business.

Figures 1 and 2: Growth, both in the number of Web users and the amount of content those users are accessing, is predicted to accelerate after the turn of the century. This will place a tremendous load on the Internet and potentially impact users' quality of service experience.


Currently, the Web is

fundamentally inefficient. Every user seeking to view specific

content must obtain it directly from the server that is the

point of origin for that content. This is the equivalent of

having everyone fly to Hollywood to see the latest movie. There

is no distribution mechanism designed into the Web that is

analogous to the system of movie theaters that offer first-run

films in every viewer’s hometown.

 

Since it is not possible

to have dedicated, point-to-point bandwidth allocated to users,

congestion is inevitable. Problems contributing to user

frustration include:


  • Slow

    connection speeds

  • Unpredictable

    performance

  • Limitations

    in available bandwidth

  • Overwhelmed

    web sites


The capacity of the

Internet is constantly being built out to handle the growing

load. For the foreseeable future, this build-out will continue

to lag behind demand. In any case, simply increasing bandwidth

by building up the network with bigger pipes cannot address all

the Quality of Service (QoS) issues involved. For purposes of this discussion, QoS means a high-quality user experience, measured in low latency and fast download times.

Adding bandwidth may improve throughput, but it does not reduce latency. In addition, adding bandwidth at one point may simply move a bottleneck to another location.

Figure 3: The amount of bandwidth required for trips across the backbone is significantly greater in a non-cached network. With caching configured, a large portion of the requests can be fulfilled using only local pipes.

Caching makes more bandwidth available by using existing pipes more efficiently, not only improving QoS for the user, but also giving service providers substantial savings and additional room to grow (for details, see "Who benefits from caching?" below).

What is caching?

Caching is a technology

that is already familiar in other applications. Many hardware

devices cache frequently used instructions and data in order to

speed up processing tasks. For example, data that is frequently

used by a computer’s Central Processing Unit (CPU) is stored

in very fast memory, sometimes right on the CPU chip, thereby

reducing the need for the CPU to read data from a slower disk

drive. In addition, Web browsers are designed to cache a limited

amount of content on a user’s PC. That is why selecting

"back" or "previous page" on a browser

toolbar typically results in near-instantaneous retrieval.

With true Web caching, the same concept is applied more widely, using a server or specialized appliance. Web content is placed close to users in the form of a network cache, reducing the number of routing/switching hops required to retrieve content from a remote site. In other words, viewers are not required to travel to Hollywood to see a movie; rather, movies are sent to local theaters where people can access them–or better yet, the viewers themselves determine which movies are made available locally.

There are two kinds of Web

caching models. In the "edge-services" model,

businesses subscribe to a third-party service vendor to have

their content cached. This has serious disadvantages for some of

the parties:

  • The ISP does

    not own or control the infrastructure.

  • The most

    frequently used sites are not necessarily the ones cached,

    which is a disappointment to users.

In the "open"

model, supported by Intel caching appliances, service providers

install their own caching equipment and are able to offer

caching as an value-added service to their customers. Advantages

include:

  • The ISP

    invests in its own destiny, and not that of a third party.

  • Additional

    revenue can be realized directly by the service provider.

  • The system automatically caches the sites that users access most.

Who benefits from caching?

End users, user

enterprises, service providers and content providers all stand

to benefit substantially from caching implementation.

The ultimate beneficiaries

are end users, the people who drive the Internet economy.

Caching provides distinct benefits for end users in the form of

an enhanced Internet experience and better perceived value for

their monthly service fees.

Caching also has benefits

for enterprises. By providing a local cache for Web content,

companies can monitor how much Internet bandwidth is required to

satisfy employee needs. They can also initiate access policies

to limit employee use of the Web to corporate activities.

For ISPs, caching has

several important advantages:

  • It reduces

    Internet bandwidth usage by eliminating redundant requests

    for popular documents.

  • Leased-line expenses are reduced or postponed. Benchmarks have shown that if a cache successfully serves a modest percentage of user requests, the amount of outbound bandwidth required can be reduced by up to 30-40 percent. That can mean significant cost savings, or the ability to add more users with the current network. To access a bandwidth calculator for individual computation, please see http://www.intel.com/network/products/cache_1500.htm

  • Caching also provides better QoS, leading directly to higher customer satisfaction and reduced customer turnover. Less spending is needed for acquiring new customers.

  • A caching

    solution provides the basis for new value-added Web hosting

    services that boost ISP profitability.

Content providers benefit

from higher site availability and a better user experience with

fewer, shorter delays. This creates increased customer

satisfaction, giving cached sites a competitive advantage over

those that are not cached.

Studies indicate that a

delay of only five to eight seconds is enough to frustrate the

average user into retrying or leaving a site. Caching helps

prevent this. And, from an overall commercial viewpoint, users

can visit more sites, do more shopping and purchase more

products if content can be delivered and downloaded faster.

Overview of Caching Technology

With and without a solution in place

Without a caching solution in place, requests for content from a browser and the content delivered from the origin server must repeatedly take the same long-distance trip–from the requesting computer to the computer that has the content, and back (Figure 3). The following steps are typical:

  • The Web

    browser sends a request for a Uniform Resource Locator (URL)

    that refers to a specific Web document on a particular

    server on the Internet.

  • This request

    is routed through the normal TCP/IP network transport.

  • Content requested from the server (also known as an HTTP server) may be a static HTML page with links to one or more additional files, including graphics. The content may also be a dynamically created page that is generated from a search engine, a database query or a Web application.

  • The HTTP

    server returns the requested content to the Web browser one

    file at a time. Even a dynamically created page often has

    static components that are combined with the dynamic content

    to create the final document.

  • If there is

    no cache server in place, the next user who requests the

    same document–even if that user is in the next cubicle–must

    send a request across the Internet to the originating Web

    server and receive the content by return trip.
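The cost of this uncached flow can be modeled in a few lines. This is an illustrative sketch with hypothetical names, not a real network client: it simply counts how many long-distance round trips the origin server must handle when no cache is present.

```python
class OriginServer:
    """Stands in for a remote HTTP server at the content's point of origin."""

    def __init__(self, documents):
        self.documents = documents
        self.requests_served = 0  # each request is one long-distance round trip

    def get(self, url):
        self.requests_served += 1
        return self.documents[url]

origin = OriginServer({"/index.html": "<html>...</html>"})

# Two users in adjacent cubicles request the same page; without a
# cache server, both requests cross the Internet to the origin.
origin.get("/index.html")
origin.get("/index.html")
print(origin.requests_served)  # -> 2
```

With a cache in the path, the second request would never reach the origin at all, which is the scenario described next.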

When caching is used, the

process is far more efficient because frequently accessed

content does not have to repeatedly make the long trip from the

origin server (Figure 3).

  • The requested

    document may be stored on a cache server inside the user’s

    corporate LAN, at the user’s ISP, or at some other Network

    Access Point (NAP) or Point of Presence (PoP) located closer

    to the user than the majority of Web servers.

  • If the requested document is stored on the cache server, then the server will check to make sure the content is current (fresh). To ensure that a user does not receive a stale object, freshness parameters are pre-set by content providers and others, and servers are normally configured with default algorithms.

  • If the

    content is fresh according to these parameters, then the

    transaction is considered a cache "hit," and the

    request can be immediately fulfilled from the cache server.

  • If the

    content needs to be refreshed, the cache server will

    retrieve updated files from the Internet and send them to

    the user, also keeping fresh copies for itself.

  • The more

    frequently a cache can serve user requests, the higher the

    hit rate and the better the performance enjoyed by users.
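The hit/refresh logic in the steps above can be sketched as follows. This is a simplified model under assumed names; real cache servers combine origin-supplied freshness parameters with configurable default algorithms, and the 300-second default here is purely illustrative.

```python
class CacheEntry:
    def __init__(self, body, fetched_at, max_age):
        self.body = body
        self.fetched_at = fetched_at  # when this copy was stored
        self.max_age = max_age        # freshness window, in seconds

def serve(cache, url, now, fetch_from_origin, default_max_age=300):
    """Return (body, status): a fresh hit is served from cache,
    a miss or stale entry triggers a refresh from the origin."""
    entry = cache.get(url)
    if entry is not None and now - entry.fetched_at < entry.max_age:
        return entry.body, "hit"      # fresh: fulfill immediately
    body = fetch_from_origin(url)     # stale or absent: go upstream
    cache[url] = CacheEntry(body, now, default_max_age)
    return body, "miss"

cache = {}
_, status = serve(cache, "/a", now=0, fetch_from_origin=lambda u: "doc")
print(status)  # -> "miss" (first request must reach the origin)
_, status = serve(cache, "/a", now=100, fetch_from_origin=lambda u: "doc")
print(status)  # -> "hit" (within the freshness window)
_, status = serve(cache, "/a", now=500, fetch_from_origin=lambda u: "doc")
print(status)  # -> "miss" (entry went stale and was refreshed)
```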

Figure 4: Layer 4 switches can route requests for cacheable data (HTTP, NNTP, etc.) to the cache server while sending other requests to the Internet.

Similar processes are involved for FTP file transfers, with an FTP server handling each request for a file submitted by the FTP client application. Delays and bottlenecks can be an even bigger problem with FTP because a typical FTP file is larger than a typical HTML file. Streaming audio and video are additional examples of Internet applications that can benefit from caching content. Internet latency problems can cause jittery video and delayed or distorted audio. Better use of bandwidth can be a solution for these problems.

Reducing bandwidth usage

Along with giving users an

improved experience, caching reduces the upstream bandwidth an

ISP has to provide to fulfill user content requirements. A cache

only passes user requests on to the Internet if it isn’t able

to service them. The greater the number of user requests that

can be fulfilled from a cache, the less bandwidth is used to

reach distant origin servers. This traffic reduction means

significant savings for a service provider, since an estimated

one-third of an ISP’s operating costs are recurring

telecommunications charges.

It is true that freshness

updates must be performed, so there would still be traffic from

the ISP out to the Internet even if all requested content were

to be found in the cache server. But by using caching, bandwidth

utilization can be greatly reduced. Caching is even beneficial

when retrieving dynamic documents, because these pages do have

some static elements that can be served from a cache.

Depending on the

distribution of traffic and the scalability of the cache, up to

40 percent (Source: Patricia Seybold Group, 1999) of user HTTP

requests can be taken off the network and fulfilled from the

cache server. This makes networks far more efficient, enabling

better service to be offered at a lower cost. Caching as much

Web content as possible within the ISP while using modest

amounts of upstream bandwidth is a way to give users what they

demand without creating a "black hole" for bandwidth

investment on the part of the service provider.
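The bandwidth arithmetic behind that claim is straightforward. The figures below are hypothetical, chosen only to illustrate the 40 percent case cited above:

```python
# Hypothetical ISP numbers, for illustration only.
total_demand_mbps = 100.0  # bandwidth users collectively request
hit_rate = 0.40            # fraction of HTTP requests served from cache

upstream_mbps = total_demand_mbps * (1 - hit_rate)  # still fetched upstream
saved_mbps = total_demand_mbps - upstream_mbps      # kept off leased lines

print(upstream_mbps)  # -> 60.0
print(saved_mbps)     # -> 40.0
```

The 40 Mbps kept off the upstream links is capacity the service provider can either stop leasing or use to serve additional subscribers.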

Deployment Models

There are several approaches, or models, for implementing a cache architecture. Which model is chosen depends on where the cache is implemented, the primary purpose of the cache and the nature of the traffic.

Forward proxy

Forward proxy caching is defined by its reactive nature. With a forward proxy cache configuration, user requests go through the cache on the way to the destination Web server. If the cache contains the requested document, it serves it directly. If it does not have the desired content, the server acts as a proxy, fetching the content from the Web server on the user’s behalf.
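A forward proxy cache can be sketched in a few lines (hypothetical class and method names; a production proxy would also apply the freshness checks described earlier):

```python
class ForwardProxyCache:
    """Sits between users and the Web: serves known documents
    directly and fetches unknown ones on the user's behalf."""

    def __init__(self, fetch_from_origin):
        self.store = {}
        self.fetch_from_origin = fetch_from_origin

    def get(self, url):
        if url in self.store:               # hit: serve directly
            return self.store[url]
        body = self.fetch_from_origin(url)  # miss: act as a proxy
        self.store[url] = body              # keep a copy for the next user
        return body

trips = []  # records each trip to the origin server
proxy = ForwardProxyCache(lambda url: trips.append(url) or "page")

proxy.get("/index.html")  # first request goes upstream
proxy.get("/index.html")  # second is served from the cache
print(len(trips))         # -> 1
```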

Reverse proxy

Figure 5: Cache servers may be placed at an ISP PoP to serve requests locally, at an aggregation point on the edge of the Internet to reduce bandwidth requirements, or in front of a Web farm to reduce load on content servers.

A cache can also be configured as a fast Web server to accelerate slower, traditional Web servers. Documents stored in cache are served at high speed, while documents not in cache–usually dynamic content and other short-term objects–are requested when necessary from the origin Web servers. This model is frequently used to optimize the performance of a Web server site. The caching system sits in front of one or more Web servers, intercepting requests and acting as a proxy.

Cache servers can be

deployed throughout a network to create a distributed network of

sites for hosted content, a model that is sometimes referred to

as site replication. In addition to performance benefits for the

user and content provider, reverse proxy caching also has

benefits for the ISP. Those benefits include the ability to

enable load balancing, to offer peak-demand availability

insurance and to provide dynamic mirroring for high

availability.

Transparent caching

Forward proxy caches can

be further configured as either transparent or non-transparent.

A transparent cache sits in the network flow and functions

invisibly to a browser. The benefits of caching are

automatically delivered to clients without anyone having to

reconfigure browsers. For ISPs and enterprise backbone

operations, a transparent configuration is often preferred

because it minimizes the total administrative and support

burden. Individual users and small businesses without IT staff

also appreciate the absence of configuration requirements.

The most popular

implementation is to use a Layer 4 capable switch to interface

cache servers to the Internet (Figure 4). These switches can

inspect network traffic and make decisions above the IP level.

For example, the switch can direct HTTP (or other) traffic to

the cache and send the rest of the traffic directly to the

Internet. The switch can also send requests to specific nodes in

a cache server cluster, a capability that can be used for load

balancing purposes. Using a pair of switches with multiple cache

servers allows for redundancy and failover protection.
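The routing decision such a switch makes can be sketched as below. The port numbers follow the usual IANA assignments, and the hash-based node selection stands in for whatever load-balancing policy a real Layer 4 switch applies:

```python
CACHEABLE_PORTS = {80: "HTTP", 119: "NNTP", 21: "FTP"}

def route(dest_port, client_ip, cache_nodes):
    """Send cacheable protocols to a cache node; everything else
    goes directly to the Internet."""
    if dest_port in CACHEABLE_PORTS and cache_nodes:
        # a simple deterministic hash spreads clients across the cluster
        index = sum(int(part) for part in client_ip.split(".")) % len(cache_nodes)
        return cache_nodes[index]
    return "internet"

nodes = ["cache-1", "cache-2"]
print(route(80, "10.0.0.7", nodes))   # HTTP -> one of the cache nodes
print(route(443, "10.0.0.7", nodes))  # -> internet (not in the cacheable set)
```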

Cache locations

To identify ideal cache

deployment points, there are three types of location

characteristics to keep in mind:

Choke point:

Traffic convergence points or choke points are locations where a large majority of network traffic passes and would, therefore, be visible to a cache server. This allows the cache to handle more requests and store more content than if it were located somewhere easily bypassed.

High traffic load:

Any area characterized by high traffic conditions allows higher

cache utilization. The more cache hits, the greater the

benefits.

Economic potential:

Points where users will benefit from high cache hit rates while

also reducing upstream bandwidth requirements will provide both

QoS benefits and positive economics for the access provider.

These characteristics are typically found at major Internet switching locations, dial-in aggregation points, or corporate gateways (Figure 5). Uses include standard PoP and dial-up access, NAPs and exchanges, Web hosting, "last mile" acceleration, satellite-based cache feeding and more. Caching is even employed as an economical means of updating information for online news services.

Cache hierarchies

In the event that a

requested document is not stored in cache (a cache

"miss"), the cache server usually must forward the

request to a distant origin server. However, if the cache server

were able to check with another nearby cache instead, the

process could be much faster. This is the idea behind cache

hierarchies.

It is possible to create

relatively small regional caches–for example, a server or

cluster handling a department or limited geographical area–and

link them to larger parent caches that define larger groups or

areas. If a regional cache does not have a requested document,

it can forward the request to the parent cache. This will still

provide faster service than contacting the origin server.

Multiple level hierarchies can be configured, giving cache

servers a sequence of larger and larger caches to query if the

first attempt misses.
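The parent-lookup chain can be sketched as follows (hypothetical names; real hierarchies typically coordinate through a protocol such as ICP rather than direct calls):

```python
class CacheNode:
    """One level of a cache hierarchy (regional -> parent -> ...)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.store = {}

    def get(self, url, fetch_from_origin):
        if url in self.store:                 # hit at this level
            return self.store[url], self.name
        if self.parent is not None:           # miss: ask the parent cache
            body, source = self.parent.get(url, fetch_from_origin)
        else:                                 # top of the hierarchy:
            body, source = fetch_from_origin(url), "origin"
        self.store[url] = body                # keep a local copy on the way back
        return body, source

parent = CacheNode("parent")
regional = CacheNode("regional", parent=parent)

body, source = regional.get("/news", lambda url: "doc")
print(source)  # -> "origin" (first request walks the whole chain)
body, source = regional.get("/news", lambda url: "doc")
print(source)  # -> "regional" (now served locally)
```

Note that the miss also populates every cache on the path, so nearby users at either level benefit from the first user's request.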

By combining capabilities

such as site replication and a linked hierarchical caching

structure, a highly efficient distributed network can be created

for Web hosting over a wide geographical area.

Advantages of Using a Cache Appliance

While this paper is intended to provide information on Web caching in a generic context whenever possible, Intel products are used in the section below in order to provide a meaningful level of detail. Unless otherwise indicated, the appliance functionality and attributes described below are applicable to the caching appliances offered by Intel.

Cost-effectiveness

By definition, an appliance (sometimes referred to as a "thin server") is a device that provides a limited number of dedicated functions, and is therefore able to deliver those functions more cost-effectively than a multi-purpose device.

This does not mean that

appliances are not robust solutions. In fact, by specializing in

one particular area, they often provide a richer feature set,

superior stability and broader flexibility in terms of

deployment and configuration.

As an example, the Intel NetStructure Cache Appliance’s integrated hardware and software design has been specifically engineered to provide robust, carrier-class caching.

Capabilities include:

  • Speed (the

    ability to handle thousands of simultaneous user

    connections)

  • Scalability

    (nodes can easily be added as needed to a cache cluster)

  • Fault

    tolerance (contributing to network stability)

  • Secure

    single-point administration (many nodes can be configured at

    once)

Ease of installation and use

As a fully integrated

"solution in a box" comprising all of the necessary

hardware and software, an appliance is very easy to install and

configure. The Intel NetStructure Cache Appliance has an automated setup wizard and intuitive configuration software that make setup easy.

This is a significant part

of the cost savings provided by appliances, because it takes

minimal time to incorporate the device into the network and does

not require the expertise and expense of a systems

administrator.

Further savings are

provided by the relatively compact size of most appliances.

The Intel NetStructure

Cache Appliance, for instance, comes in a low profile, rack

mountable design. This provides an easy way to increase network

capacity in the same limited space that an infrastructure owner

or operator already has available.

Flexibility

Since it is designed for a

single, specialized purpose, an appliance typically offers a

high degree of deployment flexibility. This appliance is

no exception.

It can be used in a

variety of deployment models, alone or with other enterprise

software, including other caching products. Here are some of the

ways it can be implemented:

  • Forward proxy

  • Reverse proxy

  • Transparent

    caching

  • Part of an

    HTTP cache hierarchy

  • ICP sibling:

    The Intel NetStructure Cache Appliance can send ICP

    queries to neighboring caches as part of an ICP cache

    hierarchy

  • NNTP news

    cache: The Intel appliance caches frequently accessed

    news articles and can also receive news feeds for designated

    news groups

In addition, the Intel

NetStructure Cache Appliance offers broad support for

content and interoperability protocols:

  • HTTP versions

    0.9 through 1.1

  • FTP

  • NNTP

  • ICP (to help

    implement cache hierarchies)

  • SSL

    encryption

  • WCCP

  • WPAD

Key Requirements

The most

important requirement of a caching solution is the ability

to provide optimized performance. There are two sides to

cache performance:

Operational capacity: This

is addressed by the architecture and the implementation of

the cache server. Along with raw cache capacity,

architectural issues include how the server makes use of

multiple threads of execution, and how well it performs

load balancing in a multiple cache server deployment.

Responsiveness to user

requests: This is determined by the various techniques the

cache server uses to maximize hit rate, including the

structure of hierarchies (see "Cache hierarchies" above) and content optimization. Cache

hit rate is a function of many factors, including the

cache size and the load on the cache.

A cache server can be tuned to improve capacity and responsiveness in many ways. Potential areas of optimization include:

  • Processing queues for the various objects that make up a document
  • Determining whether or not a requested object is cached
  • Delivering the requested object to the browser if it is not cached
  • Total throughput based on how fast incoming requests are handled

Performance depends on how

well these possibilities were understood and used by those

who built the cache server and engineered the software.

Scalability for the Web is

another key requirement that a cache must address. The

effectiveness of caching improves as the traffic served by

the cache increases–the bigger the challenge, the more

valuable the solution. To support very large caches, cache

server clustering or load balancing is necessary, and the

caching solution must support these capabilities.

Cache server support is

also required for a variety of protocols. Network caching

can be applied to content delivered over HTTP, NNTP, FTP

and others. All are characterized by having some

proportion of static content.

Manageability is mandatory for any caching solution. Cache management includes the ability to easily install and maintain cache servers, and to access the wealth of usage and traffic data the servers can provide. It is often necessary to manage arrays of cache servers, which may be distributed over great distances, from a single point of control.

Browser-based management

interfaces are increasingly common as the standard way to

manage distributed systems. The interface should provide

functionality for configuring cache servers, administering

security, setting filters, loading the cache, controlling

the cache system and gathering information from logs.

A caching solution should

be designed to provide high reliability and availability.

Although the duplicative nature of caching has a measure

of fault tolerance built in, the solution must feature

high quality software and a highly reliable platform if it

is to be considered an integral part of the network

infrastructure.

Configuration approaches

such as failover and clustering also contribute to

reliability and availability.

Finally, hardware and

software must be well integrated to achieve the

efficiencies that are at the heart of caching performance.

Performance

As previously noted, performance depends on capacity, including how well the server makes use of multiple threads of execution, and the ability to respond quickly to user requests. The Intel cache is designed for high-performance operation across a broad range of load conditions.

It aggressively

implements multi-threading–breaking down large transactions

into small, efficient tasks. A threaded event scheduler allows

the Intel NetStructure Cache Appliance to handle

thousands of simultaneous connections and maximize CPU usage.

The appliance is able to respond to multiple requests

simultaneously and efficiently even under peak loads.

The appliance’s

Inktomi Object Store is a custom-designed Web object database

that has been fully optimized for caching. It uses raw disk

input/output to achieve optimized storage and retrieval of

content objects, resulting in much higher speeds than

conventional file systems. In order to provide fast access for

the most frequently requested objects, a RAM cache is

maintained so that hot objects can be read from high-speed

memory instead of from the disk. In addition, all objects are

indexed according to their URL and associated headers. This

means the Intel NetStructure Cache Appliance can store,

retrieve and serve not only web pages, but also parts of web

pages, providing optimized bandwidth savings.
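The "hot object" idea can be illustrated with a small least-recently-used (LRU) memory cache in front of a slower store. This is a generic sketch, not the appliance's actual object store or eviction policy:

```python
from collections import OrderedDict

class RamCache:
    """Keeps the hottest objects in fast memory; colder objects
    fall back to a slower disk read."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk
        self.objects = OrderedDict()  # url -> object, oldest first

    def get(self, url):
        if url in self.objects:
            self.objects.move_to_end(url)     # mark as recently used
            return self.objects[url]          # served from fast memory
        obj = self.read_from_disk(url)        # cold object: disk I/O
        self.objects[url] = obj
        if len(self.objects) > self.capacity:
            self.objects.popitem(last=False)  # evict least recently used
        return obj

disk_reads = []
ram = RamCache(capacity=2,
               read_from_disk=lambda u: disk_reads.append(u) or u.upper())

ram.get("/a")
ram.get("/b")      # both cold: two disk reads
ram.get("/a")      # hot: served from memory, no disk read
ram.get("/c")      # cold read; evicts /b, the least recently used
print(disk_reads)  # -> ["/a", "/b", "/c"]
```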

Centralized administration

This cache helps minimize

the cost of administering, maintaining and operating a large

cache system. It offers several centralized management

alternatives to suit the needs of a wide range of environments:

Browser-based interface:

The manager User Interface (UI) offers password-protected,

single point administration for the entire Intel cache cluster.

Command line interface:

A command line interface lets the administrator configure the

system’s network addresses, and control, configure and monitor

the cache.

SNMP management:

This cache supports two Management Information Bases (MIBs) for

management through SNMP facilities. MIB-2 is a well-known

standard MIB. A proprietary Intel Cache MIB provides more

specific node and cluster information.

Performance reporting:

Performance statistics are available at a glance from the manager UI or the command line interface. Characteristics that can be managed include: log file formats; site or content blacklist filtering; anonymization; never-cache, pin-in-cache and revalidate-after rules; storage of multiple versions of cached objects for user-defined or browser-defined differences in content; domain and host-name expansion; and content routing.

Scalability and reliability

In recognition of the

mission-critical nature of caching, the Intel NetStructure Cache

Appliance is designed to provide a highly reliable and

available cache service. And, since it is designed to implement

caches at the highest levels of network traffic, including

Network Access Points and on the backbone, it is easy to scale.

This cache achieves a high degree of scalability through three mechanisms:

  • Symmetric

    Multi-Processing (SMP)

  • Clustering

  • Cache

    hierarchies

Symmetric multi-processing provides the in-box performance to accommodate growth, and clustering provides scalability across several machines by spreading the workload. The Intel NetStructure Cache Appliance implements its own cache hierarchy configuration, which is used in conjunction with ICP to communicate with other caches.

Clustering technology is

supported by the Intel NetStructure Cache Appliance, combining

the resources of several machines to increase capacity. As new

nodes are added to the cluster they build on existing nodes to

provide additional disk and processing resources. Clustering

also offers failover protection–node failures can be

automatically detected, and traffic is then redistributed to

active nodes.

In today’s complex Web environment, end-to-end performance and response time are the product of many factors, over which few web sites, service providers or users have control. Service providers still need to deliver an optimal user experience, measured in low latency and fast download times.

Various caching approaches

are available, and they can be implemented in a variety of ways

depending on the specific caching requirements. When correctly

placed and configured, caches can significantly improve the user

experience and QoS, while saving service providers significant

costs of providing upstream bandwidth.

Another plus is the added

revenue that caching can bring to service providers by giving

them opportunities to offer service-level guarantees and peak-demand availability insurance.

The Intel NetStructure 1500 Caching Appliance, featuring the Inktomi Traffic Server Engine caching software, is a carrier-class product capable of delivering fresh content to a large number of users from a large number of Web servers.

It is ideal for

enterprises that need to better manage the use of network

resources, provide superior information distribution to

employees, and reduce the administrative burden through

transparent proxy and caching capabilities. Even more

importantly, it gives service providers a superior approach to

managing growth in backend connectivity–growth that could otherwise expand almost without limit.

Courtesy: Intel Corp.
