
DISASTER RECOVERY: Telecom: Down, when most needed

VoicenData Bureau

The Mumbai deluge of July 26 exposed the soft underbelly of India's financial capital. True, at 940 mm it was among the heaviest single-day rainfalls ever recorded anywhere in the world (prompting comparisons with Cherrapunji), but the way most basic services like power, the telephone network, and bank ATMs floundered showed how unprepared India is in the face of a disaster, natural or otherwise.


As the city gradually limps back to normalcy amid much mutual finger-pointing, Dataquest decided to look at how technology fared on that black Tuesday, and whether it could really have eased the problems to some extent. Our investigations reveal one interesting result: in most cases the problems were man-made, or manual, in nature; technology, by and large, proved to be a savior. However, as service providers analyze their 7/26 (July 26) learnings, the biggest lesson seems to be: have a proper planning procedure in place to better harness technology during similar disasters in the future.

Drowned Networks



The sudden failure of telephone networks was perhaps the worst thing to have happened. There seemed to be a common problem cutting across all service providers: most of the cell sites, especially in the suburban areas that received more rainfall, had no power supply. Though generators were available, in very few cases were there provisions to stock enough diesel. This lack of foresight came to light as many suburbs like Andheri, Marol, and Vakola had no power for days on end, and consequently telephone connections in these areas suffered even after the 26th.

Where did the station go? High tide at Dombivli station


Tata Teleservices, in particular, seems to have suffered the most. Admits Sandeep Mathur, "Since most of our cell sites run on Reliance Energy power supply, which was down for a week in the Andheri region, TTSL cell sites were down as we had generator capacity for only four hours, after which it was difficult to even procure diesel." What made matters worse was that TTSL's equipment at the Marol site, which serves many clients in the Andheri area, was completely submerged on both the 26th and the 27th. Replacement equipment could be brought in only after the water level receded a bit on the 27th. However, on the 28th it rained again, and the switching equipment in Marol was affected once more.

While Tata's phone services were badly hit, Reliance Infocomm fared better. The general consensus was that the Reliance network worked best of all. SP Shukla of Reliance Infocomm, however, does not attribute this to a CDMA advantage. Rather, he feels that having more cell sites in more areas helped. He adds that the extensive network of optic fiber cables also proved to be a boon: rainwater had no impact on these. Reliance also seemed to have better provisioning of generators, and an adequate amount of diesel to run them.

This proves that it was not a question of a CDMA or a GSM service provider offering better service that night. It boiled down to the number of cell sites, and whether proper redundancy planning had been done for power and diesel generators. Claims Jayant Khosla, CEO, Bharti Televentures, "The rains had minimal impact on Airtel transformers and base stations, and even in power-problem areas generators kept the sites fully functional. Apart from that, our current battery backup is a minimum of 12 hours, and diesel generators were deployed at cell sites at strategic locations." Amongst other GSM providers, Orange customers suffered as the operator had power backup of only two hours at its cell sites. Similarly, there was not enough power backup at BPL and MTNL cell sites.


Sinking Banks



After phones, the other pressing problem Mumbaikars faced was non-functional ATMs. ATMs of SBI, HDFC, ICICI and UTI, all banks with large ATM networks, were not working. In fact, in the case of SBI, a notice was hung on one of its ATMs in Kolkata on the 27th saying that since the ATM switch in Belapur, Navi Mumbai was under water, the ATM could not function properly. Ramanathan of SBI, however, dismisses this as "sensationalism of the media" and claims that operations were fully in place, thanks to migration to the DR site in Chennai. "On the 26th night, when we realized the gravity of the situation, we switched to the Chennai DR for the ATM network, as it is critical and caters to 5,400 ATMs across the country. We also have 10 other bank customers sharing this network and could not afford any downtime. We decided that, more than any other services, our foreign operations and ATM services had to be operational and, therefore, they were made available through our DR site at Chennai. Later, we were able to switch back to our DC at Belapur for our ATM network."

The few instances, he claims, where ATMs in suburban Mumbai were down were due to equipment at the ATM premises themselves, where there was physical disruption such as water-logging, cable cuts and equipment damage. He also claims there was never any cash crunch and no problem with cash replenishment. "The DR site in Chennai is a mirror site and all the log files were switched; however, we continued to run our core banking apps from the Belapur DC only," he added. MTNL in Navi Mumbai was flooded and communication was disrupted on the 27th, Ramanathan admitted, which perhaps explains why there was disruption in other parts of the country.

Other banks too claim that it was not their ATM networks that were down; rather, it was the physical ATM centers in severely water-logged areas that were non-operational. VK Ramani of UTI Bank also says that the suburban ATMs that went down did so because of cable cuts and water-logging. "In our case the ATMs went down for a brief period of four hours because there was no power in our Chembur datacenter. We considered migrating to our secondary site at Bangalore, but realized that no DR of a bank can give 100% service all the time. And since the ATMs could not be down even for a brief period, we decided to continue on the primary site only and that way had 100% service available. Since there was no power from Reliance Energy, we managed our own power supply, and when we felt that we might have to restock on fuel, the drums were 'rolled over' to the premises as there was no transportation."


Ramani's argument that DR cannot run 100% of a bank's services is echoed by Srinivas of Sanovi, a Mumbai-based DR service provider that counts HDFC among its customers. "A remote site cannot offer a guarantee, as the issue here is to ensure consistent formats between the primary and the secondary sites. There is data loss during failovers. Only when the failover is of the organized type, which can happen with the help of purpose-designed software, is it smooth and 100% of the data guaranteed; otherwise a secondary site is prepared to work for durations ranging from seven days to months."
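To see what that distinction means in practice, here is a minimal sketch, purely illustrative and not Sanovi's or any bank's actual software, of asynchronous log shipping between a primary datacenter and a mirror site (the site names and record format are hypothetical):

```python
# Illustrative sketch only: asynchronous log shipping between a primary
# site and a mirror site, showing why an unplanned failover loses data
# while a planned ("organized") failover does not. All names are hypothetical.

class Site:
    def __init__(self, name):
        self.name = name
        self.log = []          # transaction log records applied at this site

class ReplicatedPair:
    def __init__(self):
        self.primary = Site("Primary-DC")
        self.mirror = Site("Mirror-DR")
        self.pending = []      # written on primary, not yet shipped to mirror

    def write(self, record):
        self.primary.log.append(record)
        self.pending.append(record)        # shipped later, asynchronously

    def ship_some(self, n):
        shipped, self.pending = self.pending[:n], self.pending[n:]
        self.mirror.log.extend(shipped)

    def planned_failover(self):
        # Organized failover: quiesce writes, drain the backlog, then promote.
        self.ship_some(len(self.pending))
        return self.mirror, 0              # zero records lost

    def unplanned_failover(self):
        # Disaster failover: primary is gone, promote with whatever arrived.
        lost = len(self.pending)
        self.pending = []
        return self.mirror, lost

def demo(method_name):
    pair = ReplicatedPair()
    for i in range(10):
        pair.write(f"txn-{i}")
    pair.ship_some(6)                      # replication lagging by 4 records
    site, lost = getattr(pair, method_name)()
    print(f"{method_name}: promoted {site.name}, records lost: {lost}")

demo("planned_failover")    # records lost: 0
demo("unplanned_failover")  # records lost: 4
```

The "organized" failover Srinivas describes amounts to draining that replication backlog before promoting the mirror; a flood that takes the primary out mid-stream gives no such chance, and whatever was still in transit is lost.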

HDFC Bank was a typical example of an unplanned failover. It had to roll back the DR switchover for its ATM network. Though the datacenter was operational, the functioning of the end ATM centers was disrupted at various locations owing to data loss during the failover. The bank took a dual approach, running its critical apps from the mirror site while continuing to run non-critical operations from the primary site. The learning for most banks seems to be that when there is a communication failure one has to depend on automation, as minimal manual involvement is advisable. Without coordination, though, the system can behave like a split brain: the failover software surveys the services and starts treating the secondary site as the primary, even though the original primary may still be serving.
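The kind of automation being alluded to can be sketched, again only as an illustration under assumed names rather than any bank's real DR software, as a monitor that promotes the secondary site when the primary stops answering heartbeats, with a witness check to avoid promoting while the primary is in fact still alive:

```python
# Illustrative sketch only (hypothetical): an automated failover decision
# that promotes the secondary when the primary stops responding, guarded
# against split-brain by requiring several consecutive missed heartbeats
# plus confirmation from an independent witness node.

MISSED_HEARTBEATS_REQUIRED = 3

def should_promote_secondary(heartbeat_history, witness_sees_primary):
    """heartbeat_history: recent primary heartbeat results, True = alive."""
    recent = heartbeat_history[-MISSED_HEARTBEATS_REQUIRED:]
    primary_looks_dead = (
        len(recent) == MISSED_HEARTBEATS_REQUIRED and not any(recent)
    )
    # If the witness can still reach the primary, we are probably looking
    # at a network partition; promoting now would create a split brain.
    return primary_looks_dead and not witness_sees_primary

# Example: the monitor misses three heartbeats in a row.
history = [True, True, False, False, False]
print(should_promote_secondary(history, witness_sees_primary=False))  # True
print(should_promote_secondary(history, witness_sees_primary=True))   # False
```

With a witness or quorum in the loop, failover can proceed without waiting for people to be reachable, while reducing the chance of two sites both acting as the primary.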


Datacenters Live Up to Expectations



Mumbai also houses a number of important datacenters hosting crucial financial services and other organizations. A look back at how Reliance, the Tatas and the NSE coped with their datacenters could be a useful future reference for service providers encountering similar disasters. The biggest advantage for Reliance, argues Sunil Gupta, is that "the basic design of the datacenter is owned by us. Hence, quality of data control is in our control; it runs in a commercial building, leased out, but owned independently by us." Another advantage was a totally independent power feed for the building from two separate grids, giving redundancy across two feeders. And even if power from the grid went down, they had enormous capacity to generate their own power, he added.

Moreover, the Reliance datacenter buildings have a Faraday cage, meaning they are guarded against lightning strikes and safe from electromagnetic radiation, and they are designed to withstand an earthquake measuring 8.3 on the Richter scale. Therefore, the ten-plus banks hosting core banking with Reliance, along with other enterprises like HLL, Godrej and Videocon, had no operations disrupted. There was also a mirror site in Bangalore that ensured all data was replicated there for customers who opted for a secondary DR site.

Business continuity was not an issue for Satish Naralkar of the NSE, as it had an extremely robust DR plan in place: "It was so used to mock drills that, when tested in a live scenario, it more than lived up to expectations." Power was not a problem, as the NSE had supply from both Tata and Reliance and also a strong fuel backup in place to last 72 hours. The main datacenter equipment was housed on the 6th floor, but some equipment was in the basement and had to be protected. Pumps were used to clear the water, and barriers were also put in place. The NSE had prepared backup plans in case the adversity continued beyond 72 hours.


Though SEBI had declared only the 27th as a holiday, the market was ready to trade only days later. And as people were stuck in their offices, they wanted their systems to be available so they could trade. The NSE mirror site is in Chennai. The primary site, despite being in Bandra Kurla Complex, which was in the high-alert affected region, was well protected. Satellite connectivity stayed on. Only a few LAN connections were down, as they ran on leased lines from MTNL and the MTNL exchange in the area was down, Naralkar informs.

For the Tatas, just like their voice services, the disaster provided valuable learnings. The power had to be shut off as there was a danger of short circuits in the water, with most of the equipment located in the basement. Emergency power from generators was used, but as there was capacity in place for only four hours, they ran out of fuel. Tata Power delivered efficiently by putting in a second ring. The only DC affected was the one at Technopolis Park, Andheri (E). The Vashi and Prabhadevi (VSNL premises) datacenters were unaffected, as the equipment there is located on higher floors. By late evening on the 26th, Mathur informs, power at the DC was restored with the help of the second ring.

Rajneesh De and Minu Sirsalewala
