T&M: Stop Paralysis with Protocol Analysis



The features and capacities embodied in the modern 3G mobile network make
for a complex system with many modes, nodes, elements, interfaces, and
protocols. Consequently, problems, when they arise, can have multiple
sources, in hardware or in software. As mobile Internet connectivity
becomes common, the challenge of maintaining uninterrupted data transactions
will require newer and more powerful monitoring solutions and procedures.

Call Set-up Failure
In order to obtain packet data services, the mobile performs registration
with the serving wireless network on the A1 interface and then with the packet
network on the A10/A11 interface. The mobile sends an origination message to the
base station (BS) that includes the packet data service option. This results in
assignment of the traffic channel, establishment of an A10 connection,
establishment of the link layer (PPP) and, where the terminal uses Mobile IP,
Mobile IP registration with the serving packet network. User data
traffic then passes over the A10 connection encapsulated within generic routing
encapsulation (GRE) frames.

The packet control function (PCF) periodically re-registers with the selected
packet data serving node (PDSN) by sending the A11-Registration Request message
before the A10 connection lifetime expires.

A successful call set-up scenario is illustrated in the diagram above. This
standard message sequence chart outlines a series of steps, summarized in items
1 to 12 below. Note that this explanation bypasses the radio
reception/transmission activities of the base transceiver system (BTS) and
concentrates instead on the protocol functions, which begin with the origination
dialogue between the mobile and the BSC.

  1. To register for packet data services, the mobile sends an origination
     message over the access channel to the BSS.
  2. The BS acknowledges the receipt of the origination message, returning a
     base station Ack order to the mobile.
  3. The BS constructs a CM service request message and sends the message to
     the MSC.
  4. The mobile switching center (MSC) sends an assignment request message to
     the BSS requesting assignment of radio resources. No terrestrial circuit
     between the MSC and the BS is assigned to the packet data call.
  5. The BS and the mobile perform radio resource set-up procedures. The packet
     control function (PCF) recognizes that no A10 connection associated with
     this mobile is available and selects a PDSN for this data call.
  6. The PCF sends an A11 registration request to the selected PDSN.
  7. The A11 registration request is validated and the PDSN accepts the
     connection by returning an A11 registration reply message. Both the PDSN
     and the PCF create a binding record for the A10 connection.
  8. After the radio link and A10 connection are set up, the BS sends an
     assignment complete message to the MSC.
  9. The mobile and the PDSN establish the link-layer (PPP) connection and then
     perform the MIP registration procedures over the PPP connection.
 10. After completing MIP registration, the mobile can send/receive data via
     GRE framing over the A10 connection.
 11. The PCF periodically sends an A11 registration request message to
     refresh the registration for the A10 connection.
 12. For a validated A11 registration request, the PDSN returns an A11
     registration reply message. Both the PDSN and the PCF update the A10
     connection binding record.

This complex process can be a source of problems. A rigorous monitoring
scheme involving simultaneous observation of the A1 interface and the
A10/A11 interface is the best way to detect and correct errors early. Here,
a multi-interface call-trace application is especially productive, since it
can trace and group all of the procedures related to the activity of each
single subscriber in a CDMA network, even as the procedures evolve over
multiple interfaces.
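
As a rough illustration of the grouping idea, the sketch below collects decoded messages from both interfaces per subscriber and reports the first expected set-up message that is missing from each trace. The record layout (`subscriber`, `interface`, `message`), the message names and the abbreviated `EXPECTED` list are assumptions made for this example; they do not reflect any particular analyzer's export format.

```python
from collections import defaultdict

# Expected order of set-up messages, abbreviated from the call flow.
# Interface labels and message names are illustrative assumptions.
EXPECTED = [
    ("A1", "CM Service Request"),
    ("A1", "Assignment Request"),
    ("A11", "Registration Request"),
    ("A11", "Registration Reply"),
    ("A1", "Assignment Complete"),
]

def group_by_subscriber(records):
    """Collect (interface, message) pairs per subscriber, in capture order."""
    traces = defaultdict(list)
    for rec in records:
        traces[rec["subscriber"]].append((rec["interface"], rec["message"]))
    return dict(traces)

def first_missing_step(trace):
    """Return the first expected step not found in order, or None if all appear."""
    pos = 0
    for step in EXPECTED:
        try:
            pos = trace.index(step, pos) + 1
        except ValueError:
            return step
    return None

# Hypothetical decoded records: sub-B never receives an assignment request.
records = [
    {"subscriber": "sub-A", "interface": "A1", "message": "CM Service Request"},
    {"subscriber": "sub-B", "interface": "A1", "message": "CM Service Request"},
    {"subscriber": "sub-A", "interface": "A1", "message": "Assignment Request"},
    {"subscriber": "sub-A", "interface": "A11", "message": "Registration Request"},
    {"subscriber": "sub-A", "interface": "A11", "message": "Registration Reply"},
    {"subscriber": "sub-A", "interface": "A1", "message": "Assignment Complete"},
]

for subscriber, trace in group_by_subscriber(records).items():
    print(subscriber, first_missing_step(trace))
```

A real call-trace application would also have to correlate the different identifiers each interface uses for the same subscriber; this sketch assumes that correlation has already been done.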

Within the call set-up process, an error in any element or procedural step
can inhibit the remaining steps. For example, suppose that the MSC does not
respond to the CM service request message (Step 3 in the figure) sent by the
BSC/PCF over the A1 interface. This is at times caused by internal MSC
problems. If the CM service request cannot be completed, the BSC/PCF cannot
assign radio resources to the mobile station, and the connection cannot be
established. The user finds it impossible to make a data call, a service for
which the user has paid a premium.

As a second example, before a specific timer expires, the PCF periodically
sends an A11 registration request message (Step 11) to refresh the
registration for the A10 connection. For a validated A11 registration request,
the PDSN returns an A11 registration reply message (Step 12).

Here, internal problems in the PDSN can cause it to respond late or not at
all. As a result, the process of establishing or maintaining the connection
cannot continue, and the user is once again unable to make a data call. In both
cases, a protocol analyzer connected to the A1 and A10/A11 interfaces can help
track down the problem. The call trace application can identify the origin of
messages and detect failures to respond.
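
The failure-to-respond check can be sketched as a simple request/response matcher. This is only an illustration: the event tuples, the message names in `PAIRS`, and the choice of the assignment request as the expected answer to a CM service request are assumptions drawn from the call flow, and the timeout value is arbitrary.

```python
# Request -> response expected for it. Names are illustrative assumptions.
PAIRS = {
    "CM Service Request": "Assignment Request",
    "A11 Registration Request": "A11 Registration Reply",
}

def find_unanswered(events, timeout_s):
    """Flag requests that were answered late or not at all.

    events: iterable of (timestamp_s, subscriber, message), sorted by time.
    Returns a list of (subscriber, request, time_sent), ordered by time_sent.
    """
    pending = {}   # (subscriber, expected response) -> (request, time_sent)
    failures = []
    for ts, sub, msg in events:
        if msg in PAIRS:
            key = (sub, PAIRS[msg])
            if key in pending:
                # The previous request was repeated before any response arrived.
                failures.append((sub, *pending[key]))
            pending[key] = (msg, ts)
        elif (sub, msg) in pending:
            request, sent = pending.pop((sub, msg))
            if ts - sent > timeout_s:
                failures.append((sub, request, sent))   # answered too late
    # Whatever is still pending at the end of the capture was never answered.
    failures.extend((sub, req, sent) for (sub, _), (req, sent) in pending.items())
    return sorted(failures, key=lambda f: f[2])

events = [
    (0.0, "sub-A", "CM Service Request"),
    (0.2, "sub-A", "Assignment Request"),
    (2.0, "sub-B", "CM Service Request"),        # the MSC never responds
    (5.0, "sub-A", "A11 Registration Request"),
    (9.0, "sub-A", "A11 Registration Reply"),    # the PDSN responds late
]
for failure in find_unanswered(events, timeout_s=3.0):
    print(failure)
```

The network element that should have sent the missing response (the MSC on A1, the PDSN on A11) is then the first place to look.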

Inefficient Transmission
Frequently in a CDMA2000 network, the TCP user-plane packets carry a small
window size. This is a by-product of the slow-start mechanism in TCP and
also implies that end-to-end TCP connections are not stable. The more TCP
packets are lost in the network and go unacknowledged, the smaller the window
becomes, and the more TCP connections are dropped and re-established.

To analyze this problem, it is necessary to capture the TCP/IP user plane
packets flowing on the GRE tunnels on the A10 interface. Protocol filtering
allows the tool to home in on just the data of interest. By applying different
types of filtering with increasing levels of detail, it is possible to ‘drill
down’ and isolate the root cause of the shrinking TCP window size.
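
The drill-down approach might look like the following sketch, which narrows a capture first to one A10 connection and then to one TCP flow before summarizing the window values. It assumes the analyzer has already decoded the user-plane traffic into records; the field names (`gre_key`, `flow`, `seq`, `window`) are hypothetical, and counting repeated sequence numbers is only a crude stand-in for real retransmission analysis.

```python
def drill_down(packets, gre_key=None, flow=None):
    """Apply progressively narrower filters; None means 'do not filter on this'."""
    selected = list(packets)
    if gre_key is not None:
        selected = [p for p in selected if p["gre_key"] == gre_key]
    if flow is not None:
        selected = [p for p in selected if p["flow"] == flow]
    return selected

def window_summary(packets):
    """Summarize window values and repeated sequence numbers for one flow."""
    if not packets:
        return None
    windows = [p["window"] for p in packets]
    seen, repeats = set(), 0
    for p in packets:
        if p["seq"] in seen:
            repeats += 1          # same sequence number seen again
        seen.add(p["seq"])
    return {
        "first_window": windows[0],
        "last_window": windows[-1],
        "min_window": min(windows),
        "repeated_seq": repeats,
    }

# Hypothetical decoded records (one direction of each flow only).
web = ("10.0.0.1", 1025, "192.0.2.8", 80)
other = ("10.0.0.2", 1030, "192.0.2.9", 80)
packets = [
    {"gre_key": 1, "flow": web, "seq": 1000, "window": 8192},
    {"gre_key": 1, "flow": web, "seq": 2000, "window": 4096},
    {"gre_key": 2, "flow": other, "seq": 1, "window": 16384},
    {"gre_key": 1, "flow": web, "seq": 2000, "window": 2048},
]

print(window_summary(drill_down(packets, gre_key=1, flow=web)))
```

Starting with the broad filter and tightening it step by step keeps the unrelated traffic visible long enough to confirm that the problem is confined to the flow under study.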

Routing Loop Problems
‘Tunnel router loops’ are another class of CDMA2000 network problems.
The problem is caused by misconfiguration in the PDSN routers, which can be
detected by acquiring and analyzing IP traffic on the P-H interface. To
understand tunnel router loops, imagine a subscriber surfing the Web with a
laptop connected to a CDMA2000 handset. Packets addressed to go to a specific
HTTP proxy are routed from the PDSN/FA (foreign agent) to the home agent (HA)
for de-tunneling.

With certain incorrect internal routing configurations, packets destined for
port 80 (HTTP) are not de-tunneled by the HA. Instead, they are sent back
downstream toward the PDSN/FA. As a result, multiple packets travel on the same
network segment with the same packet ID, wasting precious bandwidth and never
reaching the intended destination. In addition, for each repetitive hop a packet
takes between the PDSN/FA and HA nodes, the IP time to live (TTL) field is
decremented. If the packet is stuck in a router loop, the TTL eventually
decrements to zero and the packet is discarded by the network nodes. ‘Lost’
packets must be retransmitted. This means more packet retransmission and reduced
throughput.

As in the earlier examples, the solution is to use protocol filtering to
capture IP packets on the P-H interface. By browsing through the captured data
and applying increasingly fine levels of filtering, it is possible to see the
repeating packets and resolve the problem.
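
The signature described above, the same packet ID reappearing with an ever smaller TTL, can be searched for mechanically. The sketch below is a minimal illustration that assumes the captured IP headers have been decoded into records with hypothetical field names (`src`, `dst`, `ip_id`, `ttl`); the threshold of three sightings is an arbitrary choice.

```python
from collections import defaultdict

def find_looping_packets(packets, min_sightings=3):
    """Return packets seen repeatedly with a strictly decreasing TTL.

    packets: iterable of dicts with src, dst, ip_id and ttl, in capture order.
    Returns {(src, dst, ip_id): [ttl, ttl, ...]} for suspected loops.
    """
    sightings = defaultdict(list)
    for p in packets:
        sightings[(p["src"], p["dst"], p["ip_id"])].append(p["ttl"])

    loops = {}
    for key, ttls in sightings.items():
        # Fragments of one datagram share an ID but keep the same TTL,
        # so requiring a strict decrease filters them out.
        decreasing = all(a > b for a, b in zip(ttls, ttls[1:]))
        if len(ttls) >= min_sightings and decreasing:
            loops[key] = ttls
    return loops

# Hypothetical capture: packet 4711 bounces between the PDSN/FA and the HA.
capture = [
    {"src": "10.0.0.1", "dst": "192.0.2.8", "ip_id": 4711, "ttl": 64},
    {"src": "10.0.0.1", "dst": "192.0.2.8", "ip_id": 4712, "ttl": 64},
    {"src": "10.0.0.1", "dst": "192.0.2.8", "ip_id": 4711, "ttl": 62},
    {"src": "10.0.0.1", "dst": "192.0.2.8", "ip_id": 4711, "ttl": 60},
]

print(find_looping_packets(capture))
```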

Duplication of IP Traffic
PDSN configuration problems can give rise to other types of problems in addition
to tunnel loops. One common issue is associating the PDSN’s logical IP
addresses with more than one physical medium access control (MAC) address. When
this occurs, more than one hardware card has the same IP address. All traffic
sent to that IP address goes to two different hardware entities and receives
responses from both. This effectively doubles the amount of IP traffic
associated with that single IP address on that segment. Once again,
protocol-filtering capabilities are required for effective troubleshooting. A
protocol analyzer should capture IP packets traveling to a specific IP
destination address via the P-H interface. Browsing through the data and
filtering it quickly narrows the inquiry down to the nature of the problem (the
duplicated address).
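
Once the relevant frames are captured, spotting the duplicated address amounts to checking whether one IP address answers from more than one MAC address. The sketch below assumes decoded frames with hypothetical `src_ip` and `src_mac` fields and is only meaningful for devices attached directly to the monitored segment, since traffic forwarded by a router carries the router's MAC address.

```python
from collections import defaultdict

def find_shared_ips(frames):
    """Return {ip: [mac, ...]} for source IPs seen with more than one MAC."""
    macs_by_ip = defaultdict(set)
    for frame in frames:
        macs_by_ip[frame["src_ip"]].add(frame["src_mac"])
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}

# Hypothetical capture: two cards both answer as 192.0.2.10.
frames = [
    {"src_ip": "192.0.2.10", "src_mac": "00:00:5e:00:53:01"},
    {"src_ip": "192.0.2.11", "src_mac": "00:00:5e:00:53:02"},
    {"src_ip": "192.0.2.10", "src_mac": "00:00:5e:00:53:03"},
    {"src_ip": "192.0.2.10", "src_mac": "00:00:5e:00:53:01"},
]

print(find_shared_ips(frames))
```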

Routing Problems
Sometimes internal problems can cause PDSN routers to go offline and come
back online after some time. This can happen frequently on a CDMA2000 core data
network. When a router first comes online, its routing tables are not optimized.
It takes time for the built-in open shortest path first (OSPF) routing algorithm
to learn the best way to route packets. Until the routing tables are optimized,
there will be degradation in quality of service.

By capturing IP packets on the P-H interface with a protocol analyzer and
applying filters to the OSPF routing messages, changes in the designated router
and in the neighbors of a router can easily be identified. With intelligent and
detailed filtering on OSPF messages and on the information elements within
these messages, identifying routing problems on an IP network becomes an easy
task.
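
As one possible illustration, OSPF Hello packets carry both the designated router and the sender's list of neighbors, so a filter on Hellos is enough to track both kinds of change. The sketch below assumes the Hellos have already been decoded into records; the field names (`time`, `router_id`, `dr`, `neighbors`) are hypothetical.

```python
def ospf_changes(hellos):
    """List designated-router and neighbor changes, per sending router.

    hellos: iterable of dicts with time, router_id, dr and neighbors,
    in capture order. Returns tuples (time, router_id, field, old, new).
    """
    last = {}       # router_id -> (dr, frozenset of neighbors)
    changes = []
    for h in hellos:
        rid = h["router_id"]
        neighbors = frozenset(h["neighbors"])
        if rid in last:
            prev_dr, prev_neighbors = last[rid]
            if h["dr"] != prev_dr:
                changes.append((h["time"], rid, "dr", prev_dr, h["dr"]))
            if neighbors != prev_neighbors:
                changes.append((h["time"], rid, "neighbors",
                                sorted(prev_neighbors), sorted(neighbors)))
        last[rid] = (h["dr"], neighbors)
    return changes

# Hypothetical Hellos from router 1.1.1.1: neighbor 3.3.3.3 drops out,
# then the designated router changes.
hellos = [
    {"time": 0, "router_id": "1.1.1.1", "dr": "10.0.0.3", "neighbors": ["2.2.2.2", "3.3.3.3"]},
    {"time": 10, "router_id": "1.1.1.1", "dr": "10.0.0.3", "neighbors": ["2.2.2.2"]},
    {"time": 20, "router_id": "1.1.1.1", "dr": "10.0.0.2", "neighbors": ["2.2.2.2"]},
]

for change in ospf_changes(hellos):
    print(change)
```

A burst of such changes around the time a router comes back online would be consistent with the re-convergence period described above.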

Troubleshooting activities now require an understanding of both traditional
‘telecom’ concepts related to the circuit-switched domain and new ‘datacom’
concepts related to the packet-switched domain. Protocol analysis tools can play
a bigger role than ever in keeping a network running efficiently. Features such
as multi-interface call tracing and protocol filtering will become critical for
maintenance.

Excerpted from a white paper by Enrico Zanoio and Steve Urvik (monitoring and
protocol test), Tektronix Inc.
