The Mumbai deluge of July 26 exposed the soft underbelly of India's
financial capital. True, at 940 mm it was the highest rainfall ever recorded
in a single day anywhere in the world (forcing geography books to replace
Cherrapunji with Mumbai), but the way most basic services, like power, the
telephone network, and bank ATMs, floundered showed how unprepared India is
in the face of a disaster, natural or otherwise.
As the city gradually limps back to normalcy amidst much mutual
finger-pointing, Dataquest decided to look at how technology fared on that
black Tuesday, and whether it could really have mitigated the problems. Our
investigations reveal one interesting result: in most cases the problems
were man-made, or manual, in nature; technology, more or less, proved to be
a savior. However, as service providers analyze their 7/26 (July 26)
learnings, the biggest lesson seems to be: have a proper planning procedure
in place to better harness technology during similar disasters in the
future.
Drowned Networks
The sudden failure of telephone networks was perhaps the worst thing to have
happened. A common problem cut across all service providers: most of the
cell sites, especially in the suburban areas that received more rainfall,
had no power supply. Though generators were available, in very few cases was
there provision to stock enough diesel. This lack of foresight came to light
as many suburbs like Andheri, Marol, and Vakola had no power for days on
end, and consequently telephone connections in these areas suffered even
after the 26th.
Tata Teleservices, in particular, seems to have suffered the most. Admits
Sandeep Mathur, "Since most of our cell sites run on Reliance Energy power
supply, which was down for a week in the Andheri region, TTSL cell sites
were down as we had generator capacity for only four hours, after which it
was difficult to even procure diesel." What made matters worse was that
TTSL's equipment at the Marol site, which serves many clients in the Andheri
area, was completely submerged on both the 26th and the 27th. Replacement
equipment could be brought in only after the water level receded a bit on
the 27th. However, it rained again on the 28th, and the switching equipment
at Marol was affected once more.
While Tata's phone services were badly hit, Reliance Infocomm fared better;
the general consensus was that the Reliance network worked best of all. SP
Shukla of Reliance Infocomm, however, does not attribute this to a CDMA
advantage. Rather, he feels that having more cell sites in more areas
helped. He adds that the extensive network of optic fiber cables also proved
a boon: rainwater had no impact on them. Reliance also seemed to have better
provision for generators, and an adequate stock of diesel to run them.
This proves that it was not a question of a CDMA or a GSM service provider
offering better service that night. It boiled down to the number of cell
sites, and whether proper redundancy planning had been done for power and
diesel generators. Claims Jayant Khosla, CEO, Bharti Televentures, "The
rains had minimal impact on Airtel transformers and base stations, and even
in power problem areas generators kept the sites fully functional. Apart
from that, our current battery backup is a minimum of 12 hours, and diesel
generators were deployed at cell sites at strategic locations." Among other
GSM providers, Orange customers suffered as their cell sites had power
backup of only two hours. Similarly, there was not enough power backup at
BPL and MTNL cell sites.
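The arithmetic behind such provisioning is simple, which makes the lapses
all the more striking. A minimal sketch of how long a cell site can ride out
a grid outage (all figures here are hypothetical, not from any operator
quoted above):

```python
# Illustrative sketch: how long a cell site survives a grid outage.
# All numbers below are hypothetical examples, not operator data.

def site_autonomy_hours(battery_hours, diesel_litres, burn_litres_per_hour):
    """Battery runtime plus generator runtime on the diesel stocked on site."""
    return battery_hours + diesel_litres / burn_litres_per_hour

# A site with 12 h of battery and 200 L of diesel for a 10 L/h generator:
print(site_autonomy_hours(12, 200, 10))  # -> 32.0 hours

# A site with only 2 h of battery and 40 L stocked fails within a shift:
print(site_autonomy_hours(2, 40, 10))  # -> 6.0 hours
```

The point the operators learned the hard way is that autonomy is dominated
by the diesel term once batteries drain, and diesel could not be procured or
transported during the flood.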
Sinking Banks
After phones, another vexing problem Mumbaikars faced was non-functional
ATMs. ATMs of SBI, HDFC, ICICI, and UTI, all banks with large ATM networks,
were not working. In fact, in the case of SBI, a notice was hung on one of
its ATMs in Kolkata on the 27th saying that since the ATM switch in Belapur,
Navi Mumbai was under water, it could not function properly. Ramanathan of
SBI, however, dismisses this as "sensationalism of media" and claims that
their operations were fully in place, thanks to migration to the DR site in
Chennai. "On the 26th night, when we realized the gravity of the situation,
we switched to the Chennai DR for the ATM networks, as they are critical and
cater to 5,400 ATMs across the country. We also have 10 other bank customers
sharing this network and could not afford any downtime. We decided that,
more than any other services, our foreign operations and ATM services had to
be operational and, therefore, they were made available through our DR site
at Chennai. Later, we were able to switch back to our DC at Belapur for our
ATM networks."
The few instances where ATMs in suburban Mumbai were down, he claims, were
due to problems at the ATM premises themselves: physical disruption such as
waterlogging, cable cuts, and equipment damage. He also claims there was
never any cash crunch and no problem with cash replenishment. "The DR site
in Chennai is a mirror site and all the log files were switched; however, we
continued to run our core banking apps from the Belapur DC only," he added.
MTNL in Navi Mumbai was flooded, so communication was disrupted on the 27th,
Ramanathan admitted, perhaps explaining why there was disruption in other
parts of the country.
Other banks too claim that it was not their ATM networks that were down, but
rather the physical ATM centers in severely waterlogged areas that were
non-operational. VK Ramani of UTI Bank likewise informs that the ATMs in the
suburbs that went down did so due to cable cuts and waterlogging. "In our
case the ATMs went down for a brief period of four hours because there was
no power in our Chembur datacenter. We considered migrating to our secondary
site at Bangalore, but realized that no DR of a bank can give 100% service
all the time. And since the ATMs would be down only for a brief period, we
decided to continue on the primary site and that way had 100% service
available. Since there was no power from Reliance Energy, we managed our own
power supply, and when we felt we might have to restock fuel, the drums were
'rolled over' to the premises as there was no transportation."
Ramani's argument that DR cannot run 100% of a bank's services is echoed by
Srinivas of Sanovi, a Mumbai-based DR service provider that counts HDFC
among its customers. "A remote site cannot offer a guarantee, as the issue
here is to ensure consistent formats between the primary and the secondary
sites. There is data loss during failovers. Only when the failover is of the
organized type, which can happen with the help of designed software, is it
smooth and 100% of the data guaranteed; otherwise a secondary site is
prepared to work for durations ranging from 7 days to months."
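Why an abrupt failover loses data while an "organized" one does not can be
seen in a toy model of asynchronous replication: transactions committed at
the primary but not yet shipped to the secondary vanish if the primary dies,
whereas an orchestrated failover first drains the replication queue. A
minimal sketch, with all class and method names invented for illustration
(no bank's actual system):

```python
# Toy model of async replication between a primary site and a DR secondary.
# Purely illustrative; names do not correspond to any real product.

class ReplicatedStore:
    def __init__(self):
        self.primary = []    # transactions committed at the primary site
        self.secondary = []  # copies applied at the DR site
        self.queue = []      # committed but not yet shipped to the DR site

    def commit(self, txn):
        self.primary.append(txn)
        self.queue.append(txn)   # replication happens asynchronously, later

    def replicate_once(self):
        if self.queue:
            self.secondary.append(self.queue.pop(0))

    def abrupt_failover(self):
        # Disaster strikes: the in-flight queue is lost with the primary.
        lost = len(self.queue)
        self.queue.clear()
        return lost

    def orchestrated_failover(self):
        # "Organized" failover: drain the queue before switching sites.
        while self.queue:
            self.replicate_once()
        return 0

store = ReplicatedStore()
for i in range(5):
    store.commit(f"txn-{i}")
store.replicate_once()
store.replicate_once()
print(store.abrupt_failover())   # -> 3 (transactions lost with the primary)
```

The gap between the two failover modes is exactly the replication lag at the
moment of disaster, which is why Srinivas stresses software-driven,
organized failovers.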
HDFC Bank was a typical example: it had to revoke its DR for its ATM
network. Though the datacenter was operational, the functioning of end ATM
centers was disrupted at various locations owing to data loss during
failover. The bank took a dual approach, running its critical apps from the
mirror site while continuing to run non-critical operations from the primary
site. The learning for most banks seems to be that when there is a
communication failure one has to depend on automation, as minimal human
involvement is advisable. Otherwise the system can behave like a split
brain: the software sees all the services and starts treating the secondary
site as the primary.
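The split-brain risk described here is conventionally countered by requiring
a quorum before either site declares itself primary: a site cut off from the
majority of witness nodes must stay passive. A minimal sketch of that rule,
assuming a simple majority vote among monitor nodes (all names hypothetical;
real DR software adds heartbeats, fencing, and more):

```python
# Minimal quorum check to avoid split-brain between two sites.
# Illustrative only; not the mechanism of any product named in the article.

def may_act_as_primary(votes_seen, total_voters):
    """A site may serve as primary only with a strict majority of voters."""
    return votes_seen > total_voters // 2

# With 3 monitor nodes, a site seeing 2 of them may take over, while a
# partitioned site seeing only 1 must stay passive; since two disjoint
# majorities are impossible, both sites can never act as primary at once.
print(may_act_as_primary(2, 3))  # -> True
print(may_act_as_primary(1, 3))  # -> False
```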
Datacenters Live Up to Expectations
Mumbai also houses a number of important datacenter hosting sites for
crucial financial services and other organizations. A look back at how
Reliance, the Tatas, and the NSE coped with their datacenters could be a
useful future reference for service providers encountering similar
disasters. The biggest advantage for Reliance, argues Sunil Gupta, is that
"the basic design of the datacenter is owned by us. Hence, quality of data
control is in our control; it runs in a commercial building, leased out, but
owned independently by us." Another advantage was a totally independent
power feed to the building from two separate grids, with redundancy from two
feeders. And even if power from the grid went down, they had enough capacity
to generate their own power, he added.
Moreover, the Reliance datacenter buildings have Faraday cages, meaning they
are guarded against lightning strikes, are safe from electromagnetic
radiation, and are designed to absorb an earthquake measuring 8.3 on the
Richter scale. Therefore, the ten-plus banks hosting core banking with
Reliance, as well as other enterprises like HLL, Godrej, and Videocon, had
no operations disrupted. A mirror site in Bangalore ensured that all data
was replicated there for customers who opted for a secondary DR site.
Business continuity was not an issue for Satish Naralkar of NSE, as they had
an extremely robust DR plan in place: "It was so used to mock drills that,
when tested in a live scenario, it more than lived up to expectations."
Power was not a problem, as NSE had supply from both Tata and Reliance and
also fuel stocks to last 72 hours. The main datacenter equipment was housed
on the 6th floor, but some equipment was in the basement and had to be
protected; pumps were used to expel the water, and barriers were also put in
place. NSE had prepared backup plans in case the adversity continued beyond
72 hours.
Though SEBI had declared only the 27th a holiday, the market was ready to
trade only days later. And as people were stuck in their offices, they
wanted their systems to be available so they could trade. The NSE mirror
site is in Chennai. The primary site, despite being in the Bandra Kurla
Complex, an area in the high-alert affected region, was well protected, and
its satellite connectivity stayed on. Only a few LAN connections were down,
as they ran on leased lines from MTNL, whose exchange in the area was down,
Naralkar informs.
For the Tatas, just like their voice services, the disaster provided
valuable learnings. Power had to be shut off as there was a danger of short
circuits in the water, with most of the equipment located in the basement.
Emergency power on generators was used, but since there was capacity for
only four hours, they ran out of fuel. Tata Power delivered efficiently by
putting in a second ring. The only DC affected was the one at Technopolis
Park, Andheri (E). The Vashi and Prabhadevi (VSNL premises) datacenters were
unaffected, as their equipment is located higher up in the buildings. By
late evening on the 26th, Mathur informs, power at the DC was restored with
the help of the second ring.
Rajneesh De and Minu Sirsalewala