A Better Approach

VoicenData Bureau

In the past five years, the telecom industry has seen an explosion in file-based
storage. This growth has come from an increasing subscriber base and the launch
of database services.


The growth in file-based storage has fueled the need for better management
approaches, including the use of migration tools to drive better placement of
data. Alongside this growth, data custodians and legal and compliance
professionals need to retrieve and move data more quickly. However, the lack of
content-aware file services has made content-based search and migration of
corporate data challenging.

At the simplest level, file services can be viewed as the storage infrastructure
serving unstructured or file-based data. A more comprehensive view also includes
the advanced functions performed on file data, such as virtualization,
archiving, and migration, that help manage unstructured file data.

However, there is a growing need for file services to leverage content services
that enable content-based indexing, classification, and search for file-based
data. Business, legal, and regulatory demands are driving the need for
integration of file and content services. File-based storage infrastructure must
become more cognizant of the relevance of the data it stores, based upon a
topic, keyword, customer, or custodian, and enable content-triggered migrations
in support of legal holds and retention policies. Moreover, telecom operators
need to perform federated searches based upon content across systems and



Situation Overview

The past five years have seen an explosion in file-based storage, accompanying
large investments in block storage infrastructure. Traditionally, block-based
storage is suited to highly random I/O environments such as structured databases
and applications supporting transactions. A long-standing architecture for
mission-critical applications, block-based storage is fast and efficient and
provides high levels of reliability and availability, with features such as
provisioning, virtualization, replication, and migration. File-based storage
makes use of standard network protocols such as CIFS and NFS over IP. When an
application sends a request to a file-based storage system, the system presents
a file. File-based storage services provide functions such as file organization,
sharing, virtualization, replication, migration, and archiving.

A challenge with both file-based and block-based storage is that they lack any
awareness of the content within the data they store. Block storage deals in raw
blocks of bytes. File storage deals at the file system level. Neither approach
understands the value of the data to the organization. Firms increasingly need
storage systems to be more knowledgeable about the content of the data. This
knowledge can result in more intelligent, content-centric policies for
migration, search, preservation, retention, and disposition. It is no longer
enough for storage systems to provide only the right levels of performance,
reliability, and availability for an application. Storage systems also need to
provide content services such as indexing, classification, and search.
File-level services such as migration can benefit from a full-content indexed
repository and call upon it for policy-based management and migration, federated
search, legal holds, retention, classification, and storage tiering based on
content-triggered rules.
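
To make the idea of a full-content index concrete, the following is a minimal
sketch of an inverted index that maps each word to the files containing it. It
is an illustration only, not any vendor's implementation: the file paths, file
contents, and the function names tokenize, build_index, and search are all
assumptions invented for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(files):
    """Map each word to the set of file paths whose content contains it."""
    index = defaultdict(set)
    for path, content in files.items():
        for word in tokenize(content):
            index[word].add(path)
    return index


def search(index, query):
    """Return the paths that contain every word in the query (AND semantics)."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical file contents, invented purely for illustration.
FILES = {
    "/share/billing/q1.txt": "Invoice dispute raised by customer Acme",
    "/share/hr/policy.txt": "Leave policy for all employees",
    "/share/legal/acme.txt": "Acme contract under legal review",
}

INDEX = build_index(FILES)
print(sorted(search(INDEX, "acme")))
```

A file-level service such as migration or retention could then act on the set
of paths returned by search, rather than on file names or ages alone.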

The telecom industry has taken some steps towards becoming more content-aware.
There are software solutions that allow data to be classified and indexed. There
are hardware or appliance solutions that have built-in classification but lack
native indexing and search functions. These tools are components of the desired
result, but they require manual integration and management of disparate
technologies.


File and Content Services

There are several ways in which telecom operators, as well as enterprises, can
address file and content services:

Software-based solution: Applying third-party software to deliver content
services allows data to be classified, indexed, and searched. Leveraging a
third-party application gives a firm the flexibility to select the best-of-breed
solution for its specific environment and application workload. For example,
specialized content services solutions are tuned for indexing and search of
specific file types such as audio files.

This is a workable solution that comes with a few challenges. Having the content
services delivered separately from the storage platform requires manual
integration and intervention on the part of the technical team to manage the
storage resources and enable features such as content-based archiving or



Any action that must be taken as a result of content services relies on the
hardware or other third-party software to execute it. The result is a disconnect
between the information driving the action and the action itself.

Hardware-based solution: Another approach is to leverage a file mover API native
to the file-based storage system. The API provides a window into the
organization of files within the file-based storage system, but requires
integration with a third-party application for policy controls and data
movement. The third-party application provides the policy engine for scheduling
and moving data between different tiers of storage based on the policy.
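
The division of labor described here, where the storage system exposes files
and a separate application decides what to move, can be sketched roughly as
follows. This is a simplified illustration under stated assumptions: FileInfo,
Policy, plan_moves, and run_policy are hypothetical names, and the move callable
stands in for whatever file mover API a given storage system actually provides.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class FileInfo:
    """Metadata the storage system is assumed to expose for each file."""
    path: str
    tier: str
    days_since_access: int


@dataclass
class Policy:
    """A metadata-only rule: move idle files from one tier to another."""
    source_tier: str
    target_tier: str
    min_idle_days: int


def plan_moves(files: List[FileInfo], policy: Policy) -> List[FileInfo]:
    """Select the files on the source tier that have been idle long enough."""
    return [
        f for f in files
        if f.tier == policy.source_tier
        and f.days_since_access >= policy.min_idle_days
    ]


def run_policy(files: List[FileInfo], policy: Policy,
               move: Callable[[str, str], None]) -> List[str]:
    """Ask the (hypothetical) file mover to relocate each selected file."""
    moved = []
    for f in plan_moves(files, policy):
        move(f.path, policy.target_tier)  # stand-in for the storage system's API
        f.tier = policy.target_tier
        moved.append(f.path)
    return moved
```

Note that this policy sees only metadata such as tier and idle time. It has no
view of what the files contain, which is the limitation the integrated approach
aims to remove.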

Integrated solution: A third approach to integrating file and content services
is to allow file-level services to leverage content services in the storage
infrastructure. Integrating a file migration service with the search results of
a content service allows content-triggered migration of data between
higher-performance file tiers and active archive storage tiers.
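
As a rough sketch of what content-triggered migration could look like, the
example below uses a keyword match to decide which files move to an archive tier
and are placed on legal hold. It assumes a simple substring search in place of a
real content service, and the names StoredFile, content_search, and
apply_legal_hold are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoredFile:
    """A file with its content, current tier, and legal-hold flag."""
    path: str
    content: str
    tier: str = "primary"
    on_hold: bool = False


def content_search(files: List[StoredFile], keyword: str) -> List[StoredFile]:
    """Return the files whose content mentions the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [f for f in files if needle in f.content.lower()]


def apply_legal_hold(files: List[StoredFile], keyword: str,
                     archive_tier: str = "archive") -> List[str]:
    """Migrate every matching file to the archive tier and flag it as held."""
    affected = []
    for f in content_search(files, keyword):
        f.tier = archive_tier
        f.on_hold = True
        affected.append(f.path)
    return affected
```

Because the search result feeds the migration directly, the information driving
the action and the action itself stay in one place.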


Additionally, centralized federated search across different storage tiers, via a
centralized content service, satisfies business requirements for legal and
regulatory investigations. However, not all vendors have integrated file and
content services. Critical questions in determining the scope of integration of
file and content services include:

Performance: Is migration directly from the file tier to a content tier
supported without requiring processing by third-party data movement software?

Cost: Is an additionally priced third-party application required to provide the
policy engine for triggering migration, or is migration natively



Maintenance: If a third-party application is required, does it support the
storage vendor APIs?

Content awareness: Can a migration policy be triggered based on both content or
keyword search results and metadata attributes?

Chain of custody: If a third-party application is required, what is its impact
during storage migrations, and are third-party chain of custody certifications
required?


Transparency: Is the migration from one tier to another transparent to the
client and expected access paths? Is the back-up application migration-aware, or
is a recall of the migrated files necessary during routine back-up


Find-ability: How easy is it to find content based on keywords across different
file storage tiers, including an active archive tier? Are multiple searches
required by application, or is a federated search available?
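
For readers weighing the find-ability question, a federated search can be
pictured as one query fanned out to every tier, with the hits merged into a
single list. The sketch below is a simplified illustration: make_tier_search and
federated_search are hypothetical names, and each tier's search is modeled as a
plain substring match rather than a real index.

```python
def make_tier_search(documents):
    """Build a search function for one tier from a {path: text} mapping."""
    def search(keyword):
        needle = keyword.lower()
        return sorted(
            path for path, text in documents.items()
            if needle in text.lower()
        )
    return search


def federated_search(tiers, keyword):
    """Run one query against every tier and merge hits as (tier, path) pairs."""
    hits = []
    for tier_name, search in tiers.items():
        for path in search(keyword):
            hits.append((tier_name, path))
    return hits
```

With this shape, a caller issues a single query and learns which tier holds each
match, instead of searching the file tier and the active archive tier
separately.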

Akhilesh Shukla