NetStar O&M Component Commissioning Fails Because the MCA Cannot Detect the New Wavelength

[Problem Description]: NetStar O&M component commissioning fails with the following error message:

[Figure: error message (image not available)]

[Problem Analysis]: The new wavelength can go undetected for the following reasons:
1. The physical fiber between the OTU and the M40 is misconnected.
2. The logical fiber connections are inconsistent with the physical fiber connections. The NetStar O&M component uses the logical fiber connections to determine which MCA8 board detects the optical power of the OA that reports errors. If an MCA8 board is physically connected but no logical fiber connection is established for it, the NetStar O&M component considers the board to be unconnected.
3. The wavelength flatness of the optical amplifier board is very poor. If the difference in channel power exceeds 10 dB, the low-power wavelength is lost from the MCA spectrum.
4. The board insertion loss or the fiber attenuation at the site is unacceptably high.
To locate the fault, check the optical power detected at the power detection points (namely, the OAs connected to an MCA board) upstream and downstream from the OA that reports the error. If the optical power of the missing wavelength can be detected at one of the power detection points but cannot be detected at the other power detection point, then the fault is likely located between the OA that reports the error and the detection point that fails to detect the wavelength.
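
The localization rule above can be sketched in a few lines of Python. This is only an illustration and is not part of the NetStar O&M component: the function name locate_fault_segment, the detection point labels, and the input format (detection points listed from the transmitting site to the receiving site, each flagged with whether its MCA board detects the new wavelength) are assumptions made for the example.

```python
from typing import List, Optional, Tuple

def locate_fault_segment(
    points: List[Tuple[str, bool]],
) -> Optional[Tuple[Optional[str], str]]:
    """Return the segment where the new wavelength is lost.

    `points` lists the power detection points in transmission order
    (upstream first). Each entry is (label, detected), where `detected`
    says whether the MCA board at that point sees the new wavelength.
    The result is (last point that detects it, first point that does not),
    or None if every point detects the wavelength.
    """
    previous: Optional[str] = None
    for label, detected in points:
        if not detected:
            # The fault lies between the last good point and this one.
            # `previous` is None when even the first point misses it.
            return (previous, label)
        previous = label
    return None

# Hypothetical detection results along the path from site A to site E.
path = [
    ("A:A101", True),
    ("B:A101", True),
    ("C:B103", True),
    ("C:A101", False),
    ("D:A101", False),
    ("E:B103", False),
]
print(locate_fault_segment(path))  # ('C:B103', 'C:A101')
```

In this hypothetical case the wavelength is lost between the B103 and A101 boards at site C, so the check would continue on that segment.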
As shown in the following figure, site A is the transmitting site, site E is the receiving site,
and other sites are intermediate sites. In the figure, the power detection points that are
connected to an MCA board are provided.

[Figure: network from site A to site E, showing the power detection points connected to MCA boards (image not available)]

The figure above shows a network that uses WSM9 and WSD9 boards. The troubleshooting procedure for such a network is described below by scenario. Faults on a network that uses WSMD4 boards are handled in a similar way.

Scenario 1: At site A, the MCA8 board connected to the A101 board cannot detect the
new wavelength.
– Check whether the MCA8 board detects other wavelengths. If it does not detect
other wavelengths but the U2000 displays that there is Rx and Tx optical power on
the A101 board, then check the physical and logical fiber connections between the
MCA8 and A101 boards. If the fiber connections are incorrectly established, correct
them.
– If the MCA8 board detects other wavelengths and the MCA8 board is correctly
connected to the A101 board, then set the attenuation to 5 dB for the channel
carrying the new wavelength on the WSM9 board.
– On the U2000, check whether the optical power of the new wavelength is displayed
in the MCA8 board data. If yes, then check the fiber insertion loss from the OTU
board to the M40 board. If the fiber insertion loss is greater than 1 dB, clean or
replace the fiber from the OTU board to the M40 board.
– If the optical power of the new wavelength is not displayed in the MCA8 board
data, then check the physical fiber connections between the OTU and M40 boards.
If the physical fiber connections are incorrectly established, correct them.
Scenario 2: At site B, the MCA8 board connected to the A101 board cannot detect the
new wavelength.
– Check whether the MCA8 board detects other wavelengths. If it does not detect
other wavelengths but the U2000 displays that there is Rx and Tx optical power on
the A101 board, then check the physical and logical fiber connections between the
MCA8 and A101 boards. If the fiber connections are incorrectly established, correct
them.
– If the MCA8 board detects other wavelengths, ensure that the MCA8 board is
correctly connected to the A101 board.
– Site B is an optical line amplifier (OLA) site. Therefore, determine the point for
adjusting the optical power of the new wavelength at site A. Then, set the
attenuation to 5 dB for the channel carrying the new wavelength on the WSM9
board at site A.
– Determine the fiber connections to the MCA8 board at site A and view the optical
power of the new wavelength displayed for the MCA8 board on the U2000.
– If the MCA8 board detects the optical power of the new wavelength, then check the
fiber insertion loss from the OTU board to the M40 board at site A. If the fiber
insertion loss is greater than 1 dB, then clean or replace the fiber from the OTU
board to the M40 board.

– If the MCA8 board does not detect the optical power of the new wavelength, then
handle the fault by referring to scenario 1.
Scenario 3: At site C, the MCA8 board connected to the B103 board cannot detect the
new wavelength.
– Check whether the MCA8 board connected to the B103 board detects other
wavelengths. If it does not detect other wavelengths but the U2000 displays that
there is Rx and Tx optical power on the B103 board, then check the physical and
logical fiber connections between the MCA8 and B103 boards. If the fiber
connections are incorrectly established, correct them.
– If the MCA8 board detects other wavelengths, ensure that the MCA8 board is
correctly connected to the B103 board.
– At site C, the B103 board is a receiving OA. Therefore, determine the point for
adjusting the optical power of the new wavelength at site A. Then perform the
operations for scenario 1.
Scenario 4: At site C, the MCA8 board connected to the A101 board cannot detect the
new wavelength.
– Check whether the MCA8 board detects other wavelengths. If it does not detect
other wavelengths but the U2000 displays that there is Rx and Tx optical power on
the A101 board, then check the physical and logical fiber connections between the
MCA8 and A101 boards. If the fiber connections are incorrectly established, correct
them.
– If the MCA8 board detects other wavelengths, ensure that the MCA8 board is
correctly connected to the A101 board.
– The new wavelength is transparently transmitted at site C. In this case, the A101
board at site C is a transmitting OA. To locate the faulty point, check whether the
MCA8 board connected to the B103 board at site C detects the new wavelength. If
it does not detect the new wavelength, see the methods for scenario 3.
– If the MCA8 board connected to the B103 board detects the new wavelength, then
the faulty point is located between the B103 and A101 boards at site C. In this case,
set the attenuation to 0 dB for the channel carrying the new wavelength on the
WSD9 board and to 5 dB for the channel carrying the new wavelength on the
WSM9 board. After that, check whether the MCA8 board connected to the A101
board detects the new wavelength.
– Check the physical fiber connections between the WSD9 and WSM9 boards if the
following conditions are met: (1) The MCA8 board connected to the A101 board
still does not detect the new wavelength. (2) Other wavelengths are dropped or
added from the WSD9 or WSM9 board. (3) No wavelength except the new
wavelength passes the WSD9 and WSM9 boards.
– Check the physical fiber connections between the B103 and A101 boards if the
MCA8 board connected to the A101 board still does not detect the new wavelength
and no wavelength except the new wavelength passes the WSD9 and WSM9 boards.
Scenario 5: At site D, the MCA8 board connected to the A101 board cannot detect the
new wavelength.
– Check whether the MCA8 board connected to the A101 board detects other
wavelengths. If it does not detect other wavelengths but the U2000 displays that
there is Rx and Tx optical power on the A101 board, then check the physical and
logical fiber connections between the MCA8 and A101 boards. If the fiber
connections are incorrectly established, correct them.
– If the MCA8 board detects other wavelengths, ensure that the MCA8 board is
correctly connected to the A101 board.
– Site D is an OLA site. Therefore, determine the point for adjusting the optical
power of the new wavelength at site C, which is an ROADM site.
– At site C, check whether the MCA8 board connected to the A101 board detects the
new wavelength. If it does not, see the methods for scenario 4.
Scenario 6: At site E, the MCA8 board connected to the B103 board cannot detect the
new wavelength.
– Check whether the MCA8 board connected to the B103 board detects other
wavelengths. If it does not detect other wavelengths but the U2000 displays that
there is Rx and Tx optical power on the B103 board, then check the physical and
logical fiber connections between the MCA8 and B103 boards. If the fiber
connections are incorrectly established, correct them.
– If the MCA8 board detects other wavelengths, ensure that the MCA8 board is
correctly connected to the B103 board.
– The B103 board at site E is a receiving OA. Therefore, determine the point for
adjusting the optical power of the new wavelength at site C.
– At site C, check whether the MCA8 board connected to the A101 board detects the
new wavelength. If it does not, see the methods for scenario 4.
Scenario 7: The input optical power of the receiving OTU board at site E is lower than
the lower threshold, and the board reports an OTU-LOF, LOS, or IN-PWR-LOW alarm.
– At site E, check whether the MCA8 board connected to the B103 board detects the
new wavelength. If it does not, see the methods for scenario 6.
– If the MCA8 board detects the new wavelength, set the attenuation to 0 dB for the
channel carrying the new wavelength on the WSD9 board. If the OTU board still
reports the OTU-LOF, LOS, or IN-PWR-LOW alarm, check the physical fiber
connections from the D40 board to the OTU board. If the fiber connections are
incorrectly established, correct them.

Information About the OptiX OSN 6800

The OptiX OSN 6800 takes subracks as the basic working units. The subrack of the OptiX OSN 6800 has independent power supply.

The following figure shows the structure of the subrack.

OptiX OSN 6800 subrack structure diagram


1. Indicator 2. Board area 3. Fiber cabling area
4. Fan tray assembly 5. Air filter 6. Fiber spool
7. Mounting ear    

 

 

The interface area is behind the indicator panel in the upper part of the subrack. Remove the indicator panel before you connect cables.

  • Indicators: They indicate the running status and alarm status of the subrack.
  • Board area: All service boards are in this area. A total of 21 slots are available.
  • Fiber cabling area: Fiber jumpers from the ports on the front panel of the boards are routed to the area before reaching the matched side of the cabinet. The mechanical VOA is also installed in this area.
  • Fan tray assembly: This area contains 10 fans for ventilation and heat dissipation of the subrack.
  • Air filter: It protects the subrack from outside dust in the air. It needs to be taken out and cleaned periodically.
  • Fiber spool: The fiber spool is used to coil excess fiber. Fixed fiber spools are on both sides of the subrack. Fibers whose excess length is coiled on the fiber spool on the cabinet side are routed to another subrack.
  • Mounting ears: They fix the subrack in the cabinet.
  • Interface area: This area is behind the subrack indicator panel, providing functional interfaces such as management interface, inter-subrack communication interface, alarm output and cascading interface, alarm input and output interface.

The following table lists the technical specifications of the OptiX OSN 6800 subrack.

Technical specifications of the subrack

Item                        Specification
Dimensions                  487 mm (W) × 295 mm (D) × 400 mm (H)
Weight (empty subrack) (a)  13 kg
Maximum power consumption   1200 W
Rated working current       30 A
Nominal working voltage     –48 V/–60 V DC
Working voltage range       –38.4 V to –72 V DC

a: An empty subrack has no boards installed in the board area and no fan tray assembly or air filter installed.

 

Technical specifications of the fan tray assembly

Item               Specification
Dimensions         493.7 mm (W) × 266.6 mm (D) × 56.1 mm (H)
Weight             3.6 kg
Power consumption  120 W

 

The following table lists the power consumption of the common units in the OptiX OSN 6800.

Power consumption of the common units in the OptiX OSN 6800

Unit     Name                          Maximum Power Consumption (a)  Remarks
Subrack  OTU subrack                   640 W                          Power consumption when the subrack is installed with seventeen OTU10G (LSX), one SCC, two PIUs, one AUX, and one fan tray assembly.
         OTM subrack                   520 W                          Power consumption when the subrack is installed with one M40V, one D40V, one OAU, one OBU, eight OTU10G (LSX), one SCC, two PIUs, one AUX, and one fan tray assembly.
         OLA subrack                   270 W                          Power consumption when the subrack is installed with two OBU101, two OBU103, four VA1, one SC2, one SCC, two PIUs, one AUX, and one fan tray assembly.
         OADM subrack                  380 W                          Power consumption when the subrack is installed with two OAU101, four VA1, two MR4, four OTU10G, one SC2, one SCC, two PIUs, one AUX, and one fan tray assembly.
Cabinet  OTM cabinet (40 × 10 Gbit/s)  1800 W                         Power consumption when the cabinet is installed with two OTU subracks and one OTM subrack.

a: The power consumption of the subrack and cabinet is the value for a specific configuration and is for reference only. The actual power consumption of the subrack and cabinet is calculated based on the power consumption of each module.

The power consumption values in the table are measured at an ambient temperature of 25°C. Power consumption increases during equipment startup and at high or low temperatures. Therefore, the actual power consumption of the equipment is 1.2 to 1.5 times the value in the table.
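
The arithmetic behind these figures can be checked with a short Python sketch. The reference values come from the table above; the function name planning_range and the idea of using the 1.2 to 1.5 factor as a planning range are assumptions of this example, not a documented sizing method.

```python
# Reference values from the table above (watts, measured at 25°C).
OTU_SUBRACK_W = 640
OTM_SUBRACK_W = 520

def planning_range(reference_w, low=1.2, high=1.5):
    """Return (min, max) expected power in watts for a reference value."""
    return (round(reference_w * low, 1), round(reference_w * high, 1))

# An OTM cabinet holds two OTU subracks and one OTM subrack.
cabinet_w = 2 * OTU_SUBRACK_W + OTM_SUBRACK_W
print(cabinet_w)                  # 1800
print(planning_range(cabinet_w))  # (2160.0, 2700.0)
```

The sum of the three subrack values matches the 1800 W listed for the OTM cabinet.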

What Is the MSTP Configuration Command stp port cost?

Function Description

The Multiple Spanning Tree Protocol (MSTP) applies to redundant networks. MSTP improves on STP and RSTP. It prevents packets from proliferating and circulating endlessly in a looped network. In addition, MSTP provides multiple redundant paths for VLAN data transmission to achieve load sharing. The MA5680T/MA5683T/MA5608T supports MSTP, which is compatible with STP and RSTP, and supports MSTP ring networking to meet various networking requirements.

Function

The stp port cost command sets the path cost for the current port. Run this command when you need VLAN load sharing and want to set path costs so that the traffic of different VLANs is forwarded along different physical links. After the path cost of the port is set, the topology may change and the spanning tree may be recalculated.

The undo stp port cost command is used to restore the default path cost for the current port.

NOTE:

Setting the path cost for the Ethernet port causes the recalculation of the spanning tree. Thus, it is recommended to use the default.

Format

stp port frameid/slotid/portid [ instance instance-id ] cost cost

undo stp port frameid/slotid/portid [ instance instance-id ] cost

Parameters

Parameter: frameid/slotid/portid
Description: Indicates the subrack ID, slot ID, and port ID. Enter a slash (/) between the subrack, slot, and port IDs. When you need to query the path cost for the Ethernet port, use this parameter.
Value: Please see Differences Between Shelves.

Parameter: instance instance-id
Description: Indicates the spanning tree instance ID. The ID 0 indicates the CIST instance, which cannot be set or deleted. Other instances are Multiple Spanning Tree Instances (MSTIs).
Value: Numeral type. Range: 0-16.

Parameter: cost cost
Description: Indicates the port path cost. Path cost is a port parameter that indicates the cost of the network connected to the port. The smaller this value is, the better the network connected to the port is.
Value: Numeral type. Range: 1-200000.

Modes

Global config mode

Level

Operator level

Usage Guidelines

  • By default, the path cost of the port in each spanning tree instance is that of the port with the matching rate according to the path calculation standard. The following table lists the relationship between the port rate and recommended cost.
    Port Rate    Link Type                Recommended Cost
                                          802.1d    802.1t    Legacy
    10 Mbit/s    Half-duplex              100       2000000   2000
                 Full-duplex              99        1999999   2000
                 2-port link aggregation  95        1000000   1800
                 3-port link aggregation  95        666666    1600
                 4-port link aggregation  95        500000    1400
    100 Mbit/s   Half-duplex              19        200000    200
                 Full-duplex              18        199999    200
                 2-port link aggregation  15        100000    180
                 3-port link aggregation  15        66666     160
                 4-port link aggregation  15        50000     140
    1000 Mbit/s  Full-duplex              4         20000     20
                 2-port link aggregation  3         10000     18
                 3-port link aggregation  3         6666      16
                 4-port link aggregation  3         5000      14
    10 Gbit/s    Full-duplex              2         2000      2
                 2-port link aggregation  1         1000      1
                 3-port link aggregation  1         666       1
                 4-port link aggregation  1         500       1
  • If you do not configure parameter instance instance-id, the configuration takes effect only on the Common and Internal Spanning Tree (CIST) instance.
  • If parameter instance-id is configured to 0, it indicates the path cost of the port in CIST.
  • Path cost is a parameter related to the rate of the link connected to the port. On the switching device supporting MSTP, the port in different spanning tree instances can have different path costs.
  • Port path cost determines the port role. You can configure one port with different path costs in different MSTIs, so as to enable different VLAN traffic to be forwarded along different physical links and then to realize VLAN load sharing function.
  • When the port path cost changes, MSTP recalculates the port role and transits the port state. The default path costs are different for the ports with different rates.
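
The 802.1t column in the table above follows a simple pattern that the Python sketch below reproduces: 20,000,000 divided by the aggregate link rate in Mbit/s, with the fraction dropped. This is a pattern read off the table, not a description of how the device computes its defaults, and the function name dot1t_path_cost is hypothetical. The 10 Mbit/s and 100 Mbit/s full-duplex rows (1999999 and 199999) are one lower than this result and are not modeled.

```python
def dot1t_path_cost(rate_mbps: int, ports: int = 1) -> int:
    """802.1t-style cost: 20,000,000 / aggregate rate in Mbit/s (integer part)."""
    if rate_mbps <= 0 or ports <= 0:
        raise ValueError("rate and port count must be positive")
    return 20_000_000 // (rate_mbps * ports)

# Spot checks against rows of the table.
print(dot1t_path_cost(10))         # 2000000 (10 Mbit/s, half-duplex)
print(dot1t_path_cost(100, 3))     # 66666   (100 Mbit/s, 3-port link aggregation)
print(dot1t_path_cost(1000))       # 20000   (1000 Mbit/s, full-duplex)
print(dot1t_path_cost(10_000, 4))  # 500     (10 Gbit/s, 4-port link aggregation)
```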

Example

To set the path cost for port 0/19/0 in spanning tree instance 2 to 400, do as follows:

huawei(config)#stp port
{ frameid/slotid/portid<S><Length 5-18> }:0/19/0
{ cost<K>|disable<K>|edged-port<K>|enable<K>|instance<K>|loop-protection<K>|mche
ck<K>|point-to-point<K>|port-priority<K>|root-protection<K>|transmit-limit<K> }:
instance
{ INTERGER<0-16> }:2
{ cost<K>|port-priority<K> }:cost
{ INTEGER<1-200000> }:400

  Command:
          stp port 0/19/0 instance 2 cost 400

To restore the default path cost for port 0/19/0 in spanning tree instance 2, do as follows:

huawei(config)#undo stp port
{ frameid/slotid/portid<S><Length 5-18> }:0/19/0
{ cost<K>|instance<K>|point-to-point<K>|port-priority<K>|transmit-limit<K> }:
instance
{ INTERGER<0-16> }:2
{ cost<K>|port-priority<K> }:cost

  Command:
          undo stp port 0/19/0 instance 2 cost

System Response

  • The system does not display any message after the command is executed successfully.

Warning of Slave Main Control Board Unregistration on an ATN 950B

Keywords:

ATN 950B, slave main control board, unregistered

Summary:

On an ATN 950B equipped with the AND2CXPE/AND2CXPA/AND2CXPB boards, software does not send packets in the defined order, causing communication failures between the master and slave main control boards. As a result, the slave main control board fails to be registered.

 

[Problem Description]

Application scenario:

The AND2CXPE/AND2CXPA/AND2CXPB boards function as the master and slave main control boards on an ATN 950B.

Trigger conditions:

There is a low probability that the problem occurs if the AND2CXPE/AND2CXPA/AND2CXPB boards function as the master and slave main control boards on an ATN 950B.

Symptom:

The slave main control board cannot be registered.

Identification method:

  1. Check the type of the main control boards.

Run the display version command in the user view.

 

This problem occurs only on an ATN 950B on which the AND2CXPE/AND2CXPA/AND2CXPB boards function as the master and slave main control boards.

[huawei]display  version

……

CXPB(Master) 7  : uptime is 18 days, 22 hours, 36 minutes

StartupTime   2014/07/14   16:19:41

SDRAM Memory Size   : 1024M bytes

FLASH Memory Size   : 128M bytes

CFCARD Memory Size : 498M bytes

RAMDISK Memory Size : 10M  bytes

ANDD00CXPB01 version information

PCB         Version : AND2CXPB REV C  //This warning is involved only when the value AND2CXPE, AND2CXPA, or AND2CXPB is recorded here.

 

  2. Check the board registration status.

Run the display device command in the user view.

[huawei]display  device

ATN950B’s Device status:

Slot #    Type       Online    Register      Status      Primary

– – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – – –

1         PIC        Present   Registered    Normal      NA

2         PIC        Present   Registered    Normal      NA

3         PIC        Present   Registered    Normal      NA

4         PIC        Present   Registered    Normal      NA

7         CXP        Present   NA            Normal      Master

8         CXP        Present   Unregistered  Abnormal    Slave //Indicates that the slave main control board is not registered.

9         PWR        Present   Registered    Normal      NA

10        PWR        Present   Registered    Normal      NA

11        FAN        Present   Registered    Normal      NA

 

  3. Check the cause of the board reset. Information on the master main control board shows that the slave main control board encountered an HA failure.

Run the display board-reset 8 command. The parameter 8 indicates the slot number of the unregistered board.

[huawei-diagnose]display  board-reset 8

CXP8 reset information:

— 1. DATE:2014-06-23  TIME:02:15:41  RESET Num:4

—    Reason:VRP HA Module reset slave board //Indicates that the slave main control board is reset due to an HA failure.

 

[Root Cause]

In some cases, software does not send packets in the defined order, causing communication failures between the master and slave main control boards. As a result, the slave main control board fails to be registered.

 

[Impact and Risk]

The slave main control board is not registered and fails to back up the master main control board.

 

[Measures and Solutions]

Recovery measures:

Install a patch on the ATN 950B.

Workarounds:

None

Solutions:

Install the V200R003SPH005 patch or a later patch on the ATN 950B V200R003C00SPC200 or ATN 950B V200R003C10SPC100.

Notice on Prewarning for ARP Entry Aging Failures on MxU

Keywords

ARP, gateway, aging, update, MA5612, MA5616, MA5620, MA5626

Summary

The MxU products use the IPOS protocol stack. After an MxU device runs for a long time (for example, longer than 497 days), the Address Resolution Protocol (ARP) entry that the device learned from the upper-layer device, such as a gateway, may fail to age automatically. If the upper-layer device is then replaced or cut over and does not actively send ARP request packets to the MxU device, the ARP entry for the IP address of the upper-layer device is not updated within a MAC address aging period. The ARP entry on the MxU device still contains the MAC address that the upper-layer device had before the replacement or cutover. As a result, the MxU cannot communicate with the upper-layer device, and the management and voice services fail.

Problem Description

  • Trigger Conditions

This issue occurs if the following conditions are met:

  1. The MxU model and version are within the prewarning scope.
  2. The system running time is longer than 497 days.
  3. The device learns or updates the ARP entry of the upper-layer device when the device has been running for 496 days.
  4. The MAC address of the upper-layer device is changed.
  5. The upper-layer device does not actively send ARP request packets to the MxU device to notify the MxU device of ARP entry updating.
  • Symptom

The device management or voice service fails.

After the upper-layer device is replaced or cut over, the MAC address of the upper-layer device in the ARP entry recorded on the MxU is still the original one. In addition, the ARP entry cannot automatically update within a MAC address aging period.

  • Identification Method

Perform the following operations to check whether a fault complies with the prewarning:

  1. Check whether the MAC address corresponding to the gateway IP address in the ARP entry on the MxU is the actual MAC address of the gateway. The gateway IP address is assumed to be 10.144.82.1.

MxU(config)#display arp all

{ <cr>||<K> }:

 

Command:

display arp all

IP Address      MAC Address    VLAN ID Port    ONT Type

10.144.82.1     00e0-fc64-756d 200     0/0 /0  –   Dynamic

10.144.82.91    001b-2191-b586 200     0/0 /0  –   Dynamic

10.144.83.224   4c1f-cc7d-6393 200     0/0 /0  –   Dynamic

—   3 entries found   —

If the MAC address recorded in the ARP entry is different from the actual MAC address of the gateway, this fault complies with the prewarning.

  2. Check whether the MxU model and version are within the prewarning scope.
  3. Check whether the MxU has been running for over 497 days and whether time reversal occurs on the MxU.

Perform the following operations to determine time reversal:

  1. Check and record the system running time (Uptime).

MxU(config)#display version

{ <cr>|backplane<K>|frameid/slotid<S><Length 1-15> }:

 

Command:

display version

VERSION : MA5616V800R308C02

PRODUCT : MA5616

PATCH:SPC200 SPH518 HP2118

Copyright (c) Huawei Technologies Co., Ltd. 1998-2011 All rights reserved

Uptime is 2 day(s), 5 hour(s), 42 minute(s), 2 second(s)

 

  2. Check and record the current system time T1.

MxU(config)#display time

{ <cr>|date-format<K>|dst<K>|time-stamp<K> }:

 

Command:

display time

2014-01-22 02:22:56+08:00

 

  3. Check and record the system start time T2.

MA5616(config)#diagnose

 

MA5616(diagnose)%%su

Challenge:ZCZUBOWB

Please input password:

 

MA5616(su)%%display lastwords all

 

+++++++++++++++ Display current lastwords Info: +++++++++++++

 

**********************************************************************

System Start Time            : 2013-01-14 02:07:13.250 , Week: Fri

System Start CpuTick         : 0x00000000 908c5ce3

System Last CpuTick          : 0x000029f3 c89fd402

System Total Running CpuTick : 0x000029f3 3813771f

MilliSecs Per CpuTick        : 0x00010441

System Total Running Time    : 692301.607 (s.ms)

In normal cases, (T1 – T2) equals the Uptime value. The system running time resets and starts counting again after the device has been running for 497 days. If (T1 – T2) is greater than the Uptime value, time reversal has occurred.

If the fault complies with the preceding three conditions, the fault is within the prewarning scope.
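
The time reversal check can be expressed as a short Python sketch. The readings below are hypothetical and are not taken from a real device, and the function name and the tolerance parameter are assumptions of this example. The tolerance only absorbs the delay between running the individual display commands.

```python
from datetime import datetime, timedelta

def time_reversal_occurred(
    t1: datetime,
    t2: datetime,
    uptime: timedelta,
    tolerance: timedelta = timedelta(minutes=5),
) -> bool:
    """Return True if (T1 - T2) exceeds the reported uptime."""
    return (t1 - t2) - uptime > tolerance

# Hypothetical readings.
t1 = datetime(2014, 1, 22, 2, 22, 56)  # current system time (display time)
t2 = datetime(2012, 9, 1, 8, 0, 0)     # system start time (display lastwords all)
uptime = timedelta(days=10, hours=16)  # Uptime (display version)

print(time_reversal_occurred(t1, t2, uptime))   # True
print(time_reversal_occurred(t1, t2, t1 - t2))  # False
```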

Root Cause

The ARP entry updating failure is caused by a bug in how the device software obtains the system running time. The system time reverses after the device has been running for 497 days. If the device learns or updates an ARP entry before the time reversal, the ARP entry becomes abnormal and fails to update automatically within a MAC address aging period. If the upper-layer device does not actively send an ARP request message for the ARP entry, the ARP entry is not updated.

The following section provides an example to describe the fault cause: The system running time is assumed to be Tsystem and ARP aging period is AagTime.

  • When the device has been running for 497 days (Tsystem = 497), the device learns or updates an ARP entry. The ARP entry learning time is then T1 = Tsystem = 497, and the next aging time of the ARP entry is Tage = T1 + (AagTime/2) = 497 + (AagTime/2).
  • In normal cases, the ARP entry ages when the system running time Tsystem reaches or exceeds Tage.
  • However, Tsystem reverses when the system running time exceeds 497 days. After the device runs for a further T’ days, Tsystem is T’ (0 + T’). When the next ARP entry aging period starts, Tsystem is much less than the ARP entry aging time Tage [Tage = 497 + (AagTime/2)]. As a result, the ARP entry cannot update or age.
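
The effect described above can be reproduced with a small Python simulation. It rests on an assumption that this notice does not state: the running time is kept as a 32-bit count of 10 ms ticks, which wraps after about 497.1 days and is consistent with the 497-day figure. The function name entry_ages and the aging period value are hypothetical.

```python
TICKS_PER_SECOND = 100   # assumed 10 ms tick
WRAP = 2 ** 32           # assumed 32-bit tick counter
SECONDS_PER_DAY = 86_400

wrap_days = WRAP / TICKS_PER_SECOND / SECONDS_PER_DAY
print(round(wrap_days, 1))  # 497.1

def entry_ages(now_ticks: int, learned_ticks: int, age_time_s: int) -> bool:
    """Check from the text: the entry ages once Tsystem >= Tage."""
    t_age = learned_ticks + (age_time_s // 2) * TICKS_PER_SECOND
    return now_ticks >= t_age

learned = WRAP - 1_000                             # learned 10 s before the wrap
elapsed = 30 * SECONDS_PER_DAY * TICKS_PER_SECOND  # device runs 30 more days
age_time_s = 1_200                                 # hypothetical aging period

wrapped_now = (learned + elapsed) % WRAP           # value a wrapping counter reports
print(entry_ages(wrapped_now, learned, age_time_s))        # False: entry never ages
print(entry_ages(learned + elapsed, learned, age_time_s))  # True: without the wrap
```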

Impact and Risk

The management and voice services on the MxU are affected. The broadband service is not affected.

Measures and Solutions

  • Recovery Measures

Run the reset arp dynamic command on the affected MxU to rectify the fault.

MxU(config)#reset arp dynamic

This operation may take several minutes, please wait…success

  • Workarounds

The workarounds are the same as recovery measures.

  • Preventive Measures
    • For the MA5612 (H832CCFE), MA5616, MA5621/MA5621A, MA5623A, and MA5662, upgrade the device to V800R312C00 SPH208.
    • For the MA5620/MA5626 (H822EPUB), Huawei will release V800R312C00 SPH209 on February 28, 2014 to resolve this issue.
    • For other MxU devices, Huawei will release patches to resolve this issue. For details, contact the prewarning contact persons.

Prewarning Retraction Conditions

This prewarning can be retracted if issue triggering conditions are not met.

Attachment

None

Typical OCS Networking


The OptiX OSN 8800 can interconnect with the NG SDH equipment to form a hybrid network, which results in a complete transport solution. The OptiX OSN 8800 intelligent optical switching platform (the OptiX OSN 8800 for short) functions as optical core switching (OCS) equipment. It is mainly used as key service grooming nodes at national backbones, provincial backbones, and metropolitan area network (MAN) backbones.

Figure 1 shows typical OCS networking.

Figure 1 Typical OCS networking

Precaution of the NMS Failing to Synchronize ONUs

Keywords:

ONU version information, FTP, synchronization failure

Summary:

When an NMS synchronizes ONUs from an OLT through FTP, the version information (such as the equipment ID, ONT model, ONU hardware version, ONU primary software version, and ONU secondary software version) reported by some ONUs is longer than the maximum length defined in the protocol and does not contain a string tokenizer. As a result, the NMS fails to synchronize with the ONUs, and customers cannot manage newly deployed ONUs or deploy new services.

[Problem Description]

Trigger condition:

  1. The U2000 is earlier than V100R006C02CP3207 and the OLT is the MA5680T V800R011 earlier than V800R011C00SPC107 or MA5600T V800R012 earlier than V800R012C00SPC103.
  2. The NMS synchronizes with ONUs through FTP.
  3. The character string length of the version information reported by some ONUs exceeds the maximum length defined in the protocol, and the reported character string does not contain any string tokenizer. (As defined in the protocol, the maximum length of the equipment ID or ONT model is 20 bytes and that of ONU hardware version, primary software version, and secondary software version is 16 bytes.)

When the above conditions are met, the NMS fails to synchronize with the ONUs.

Symptom:

The following describes the possible symptoms:

  1. The ONUs cannot be displayed on the NMS network topology.
  2. ONU information, such as the ONU SN (GPON), LOID (EPON), and equipment ID, cannot be displayed on the NMS interface.
  3. ONU information cannot be found on the OLT.

Identification method:

If the above symptoms occur, use the following methods to determine whether they are caused by the problem in this precaution.

If both the NMS and the OLT are in the version ranges mentioned above, proceed to the next step.

  1. On the NMS:

Synchronize with the ONUs on the NMS, and reproduce the synchronization failure. View the NMS synchronization log, U2000\server\var\logs\Develop\BmsAccess_9961\BmsAccess_*_*.log or U2000\server\var\logs\Develop\BmsAccess_1\BmsAccess_*_*.log (* indicates the date and time), and check whether the following error message is displayed:

Call Function return Error:DoUpdate(pTableDesc, pSrcTable, pOutPutTable)

If not, the problem described in this precaution has not occurred, and the synchronization failure has another cause.

If yes, the character string length exceeds the maximum length when the NMS parses the POD file. Therefore, the information cannot be written into the NMS database. The problem can be solved by using the suggested solutions in this precaution (see “Measures and Solutions”).

  2. On the OLT:

Query all ONT version information on the faulty OLT. The command output of the GPON or EPON ONT version information is as follows:

 

In the command output, if the type name, software version, or hardware version of an ONU is longer than the maximum length listed in the following table, the character string reported by that ONU does not contain a string tokenizer, which causes the problem described in this precaution. The problem can be solved by using the suggested solutions in this precaution (see “Measures and Solutions”).

GPON ONU                                      EPON ONU
Information                Maximum Length     Information             Maximum Length
Equipment-ID               20 bytes           ONT model               20 bytes
ONT Version                16 bytes           ONT hardware version    16 bytes
Main Software Version      16 bytes           ONT software version    16 bytes
Standby Software Version   16 bytes

To check the character string length, copy the ONT model or software version information into UltraEdit, and choose View > Display Ruler from the main menu.
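As an alternative to counting characters on a ruler, the comparison against the table can be sketched in a few lines. This is an illustrative sketch only: the limits come from the table above, the function name oversized_fields is hypothetical, and the byte length is computed on the assumption that the values copied from the command output are plain ASCII or UTF-8 text.

```python
# Maximum lengths in bytes, taken from the table in this precaution.
GPON_LIMITS = {
    "Equipment-ID": 20,
    "ONT Version": 16,
    "Main Software Version": 16,
    "Standby Software Version": 16,
}
EPON_LIMITS = {
    "ONT model": 20,
    "ONT hardware version": 16,
    "ONT software version": 16,
}


def oversized_fields(fields, limits):
    """Return {field name: byte length} for every field longer than its limit.

    fields maps a field name to the string copied from the command output.
    Names that do not appear in limits are ignored.
    """
    result = {}
    for name, value in fields.items():
        limit = limits.get(name)
        length = len(value.encode("utf-8"))
        if limit is not None and length > limit:
            result[name] = length
    return result
```

For example, passing a 21-character Equipment-ID together with GPON_LIMITS returns that field with a length of 21, which marks the ONU as a suspect.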

[Root Cause]

The version information reported by the ONU does not contain a string tokenizer.

The following uses the ONU software version as an example.

Different ONUs use different methods to report the ONU software version. Generally, an ONU adds a string tokenizer \0 (the null character) at the end of the reported character string. In this way, the upper-layer device can determine how many bytes the software version reported by the ONU contains.

If an ONU reports a 16-byte character string to the OLT but does not add a string tokenizer at the end of the character string, the OLT cannot determine how many bytes the character string contains. Therefore, the length of the ONU software version that the OLT reports to the NMS is uncertain and may exceed the maximum length supported by the NMS. The NMS does not tolerate this abnormal behavior: it considers that the ONU software version exceeds the maximum length and fails to parse it. Consequently, the NMS fails to synchronize with the ONU.
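The general mechanism can be illustrated with a short sketch. It does not reproduce the actual OLT or NMS code. The message layout, the sample version string, and both function names are hypothetical and are chosen only to show how a field that fills all 16 bytes without a trailing \0 appears longer to a reader that relies on the tokenizer.

```python
def read_fixed_field(buffer, offset, width):
    """Read a field bounded by its declared width, stopping at the first NUL byte if any."""
    raw = buffer[offset:offset + width]
    return raw.split(b"\x00", 1)[0]


def read_until_terminator(buffer, offset):
    """Read from offset up to the first NUL byte, ignoring the field width."""
    end = buffer.find(b"\x00", offset)
    if end == -1:
        end = len(buffer)
    return buffer[offset:end]


# Hypothetical layout: a 16-byte software version followed by other data.
version = b"V1R2C00SPC100B01"  # exactly 16 bytes, so no room is left for \0
message = version + b"OTHERDATA\x00"

print(len(read_fixed_field(message, 0, 16)))   # 16
print(len(read_until_terminator(message, 0)))  # 25
```

The second reader runs past the end of the field into the adjacent data, so the length that it returns depends on whatever bytes follow the field. This corresponds to the uncertain length described above.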

[Impact and Risk]

If the reported ONU software version is longer than the maximum length defined in the protocol and does not contain a string tokenizer, the NMS fails to synchronize with the ONU through FTP. As a result, customers cannot manage the newly deployed ONUs or deploy new services.

[Measures and Solutions]

Preventive measure:

Configure the NMS to synchronize with ONUs through SNMP instead of FTP. Note that this reduces synchronization efficiency by about 80%.

Before using this solution, contact Huawei R&D engineers for confirmation to prevent NMS performance from being affected. To synchronize with ONUs through SNMP, perform the following steps:

  1. Back up the U2000\server\nemgr\nemgr_access\dcp\platform\v100\feature\gdm\mxu_conf_dev_feature.xml file.
  2. Copy the backup file and edit it. Find <feature name="PollOptimize" support="1"> in the file as follows:

 

  3. Each <dev type> structure in the file represents an NE type and its version range. The dev type field uniquely identifies the NE type. Delete the <dev type> structures that correspond to the faulty equipment types on the live network.

Use the MA5600T and MA5680T as examples. In the dev type="41" and dev type="45" fields, 41 indicates the MA5600T, and 45 indicates the MA5680T. Delete the corresponding structures (the following two lines in the given XML file) so that the NMS synchronizes with these two types of NEs through SNMP:

<dev type="41" area="[MA5600V800R006C03B000,MA5600V800R006C03BZ);[MA5600V800R006C32B010,MA5600V800R006C33BZ);[MA5600V800R006C72B010,MA5600V800R006C72BZ);[MA5600V800R007C00,MA5600V800RZ)"/>

<dev type="45" area="[MA5600V800R006C32B010,MA5600V800R006C32BZ);[MA5600V800R006C72B010,MA5600V800R006C72BZ);[MA5600V800R007C00,MA5600V800R007CZ);[MA5680V800R007,MA5680V800R099CZ);[MA5683V800R007,MA5683V800R099CZ)"/>

The values of Dev type and the specific NE types are mapped as follows.

Dev type    Equipment Type
41          MA5600T
45          MA5680T
75          MA5603T
2331        MA5608T

 

  4. Restart the following processes: profile, BmsAccess, BmsCommon, TL1NbiDm, inTL1NbiDm, BmsTimingTask, BmsTest, BmsAtur, and BmsHGMPDm. Ignore any processes that cannot be found.
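The file edit in the steps above (deleting the <dev type> structures for the affected NE types) can be sketched with the Python standard library. This is a sketch under stated assumptions, not a supported tool: it assumes that the <dev> entries are direct children of the <feature name="PollOptimize"> element, the function name remove_dev_types is hypothetical, and SAMPLE is a shortened placeholder fragment rather than the real content of mxu_conf_dev_feature.xml.

```python
import xml.etree.ElementTree as ET


def remove_dev_types(xml_text, dev_types):
    """Return xml_text with the <dev> entries of the given types removed
    from every <feature name="PollOptimize"> element."""
    root = ET.fromstring(xml_text)
    for feature in root.iter("feature"):
        if feature.get("name") != "PollOptimize":
            continue
        # Copy the child list first because entries are removed while looping.
        for dev in list(feature.findall("dev")):
            if dev.get("type") in dev_types:
                feature.remove(dev)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, shortened fragment. The area values are placeholders.
SAMPLE = """<features>
  <feature name="PollOptimize" support="1">
    <dev type="41" area="..."/>
    <dev type="45" area="..."/>
    <dev type="75" area="..."/>
  </feature>
</features>"""

# 41 = MA5600T and 45 = MA5680T, as listed in the mapping above.
edited = remove_dev_types(SAMPLE, {"41", "45"})
print(edited)
```

ElementTree drops comments and the XML declaration by default when it parses and rewrites a document, so keep the backup file and compare the result with the original before restarting the processes.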

Solution:

A patch for the NMS is released to resolve this issue. Patch version: U2000 V100R006C02SPC303 (to be released on September 30, 2013)

A patch for the OLT is released to resolve this issue. Patch version: MA5600T V800R011C00SPC109 (to be released on September 15, 2013)

Patch version: MA5600T V800R012C00SPC103 (to be released on August 20, 2013)

The problem can be resolved by loading the NMS patch or the OLT patch.