Document fins/I0857-1


FIN #: I0857-1

SYNOPSIS: Certain Sun Fire 3800 systems with duplicate MAC addresses running on
          common subnets may experience network issues

DATE: Aug/09/02

KEYWORDS: Certain Sun Fire 3800 systems with duplicate MAC addresses running on
          common subnets may experience network issues


---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------  
                            FIELD INFORMATION NOTICE
                  (For Authorized Distribution by SunService)

           

SYNOPSIS: Certain Sun Fire 3800 systems with duplicate MAC addresses
          running on common subnets may experience network issues.
      

SunAlert:           No

TOP FIN/FCO REPORT: Yes 
 
PRODUCT_REFERENCE:  MAC Address on Sun Fire 3800 
 
PRODUCT CATEGORY:   Server / SW Admin


PRODUCTS AFFECTED:  

Systems Affected:
-----------------  
Mkt_ID   Platform   Model   Description          Serial Number
------   --------   -----   -----------          -------------
  -        S8        ALL    Sun Fire 3800              -


X-Options Affected:
-------------------
Mkt_ID   Platform   Model   Description   Serial Number
------   --------   -----   -----------   -------------
  -         -         -          -              -


PART NUMBERS AFFECTED: 

Part Number   Description                         Model
-----------   -----------                         -----
501-5465-04   ASSY SP CENTERPLANE S8                -
501-5876-01   ASSY SP CENTERPLANE Untested          -


REFERENCES:

ESC:    532580 - SC0 drops network packets. 

Manual: 805-7373-13: Sun Fire 6800/4810/4800/3800 Systems Platform 
                     Administration Manual.
        805-7372-13: Sun Fire 6800/4810/4800/3800 System Controller 
                     Command Reference Manual.

URL:    http://acts.ebay/3800mac/   [FOR INTERNAL USE ONLY]
        http://apac-scc.singapore/  
        http://cccweb.ebay.sun.com/ccc/groups/sunfire/   
        http://jfk.france/serengeti/sscc/sscc.html 
        http://sdpsweb.ebay/FIN_FCO/FIN/FINI0857-1_dir/MAC_Addr_List.sxc

     
PROBLEM DESCRIPTION:

A small number of Sun Fire 3800 systems were shipped to customers with
MAC addresses which are duplicates of addresses programmed into other
3800 systems.  If two of these systems with duplicate MAC addresses are
configured in specific ways and run on the same subnet, one or both
systems may experience various network issues.

Affected Sun Fire 3800 systems were manufactured between April 4, 2001
and August 19, 2001.  Corrective action procedures to reprogram systems
were implemented by Enterprise Services Sun Fire Control Centers in
the second half of calendar year 2001.  This FIN documents additional
information which became available during 2002 to help identify affected
systems.

An analysis of affected systems shipped from Sun has been performed.  A
list of affected systems by centerplane serial number, system serial
number and the lowest affected MAC address value is available at
(StarOffice 6.0 spreadsheet):

    http://sdpsweb.ebay/FIN_FCO/FIN/FINI0857-1_dir/MAC_Addr_List.sxc
	
The Geo breakout for affected Sun Fire 3800 systems is as follows:

                                 Overall   Americas   EMEA   APAC
                                 -------   --------   ----   ----
      Estimated Number of
      Affected Systems             411       218       127     66
      Share of Total               100%      53%       31%    16%
         
A total of 411 systems worldwide were programmed with incorrect MAC
addresses.  The actual number of systems which may exhibit the failure
is very small, based on current system configuration practices and the
low probability that customers own systems with overlapping MAC
addresses.  Fewer than twenty (20) systems have exhibited the network
performance issues to date.
      	
The following online command can help to identify affected systems:

           sc-hostname:SC> showplatform -p mac

                         MAC Address          HostID  
                         -----------------    --------
           Domain A      08:00:20:d8:a8:55    80d8a855
           Domain B      08:00:20:d8:a8:56    80d8a856
           SSC0          08:00:20:d8:a8:59    80d8a859
           SSC1          08:00:20:d8:a8:5a    80d8a85a
         
The displayed MAC addresses can then be examined to see if the value
is on the affected system list.
     
The lowest HEX value MAC address of an affected system is 0003BA0247CC.
The highest HEX value MAC address of an affected system is 080020FF9BC1.
Some, but not all, of the systems within the MAC address range noted
directly above are affected.  The affected systems will not exhibit the
network issue unless they are operated on the same subnet.
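
As a first-pass screen, the range check described above can be sketched
in Python.  This is an illustrative sketch only, not a Sun-supplied
tool: the function names are hypothetical, and the parsing assumes the
"showplatform -p mac" output layout shown above.  A MAC address that
falls inside the range is only a candidate; it must still be checked
against the affected system list.

```python
import re

# Range bounds quoted in this FIN (inclusive).
LOWEST_AFFECTED = 0x0003BA0247CC
HIGHEST_AFFECTED = 0x080020FF9BC1

# Matches colon-separated MAC addresses such as 08:00:20:d8:a8:55.
MAC_RE = re.compile(r"\b([0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){5})\b")

# Sample output copied from this FIN.
SAMPLE = """\
                         MAC Address          HostID
                         -----------------    --------
           Domain A      08:00:20:d8:a8:55    80d8a855
           Domain B      08:00:20:d8:a8:56    80d8a856
           SSC0          08:00:20:d8:a8:59    80d8a859
           SSC1          08:00:20:d8:a8:5a    80d8a85a
"""


def mac_to_int(mac):
    """Convert '08:00:20:d8:a8:55' or '080020D8A855' to an integer."""
    digits = mac.replace(":", "")
    if len(digits) != 12:
        raise ValueError("expected 12 hex digits: %r" % mac)
    return int(digits, 16)


def in_candidate_range(mac):
    """True if the address lies within the range quoted in this FIN."""
    return LOWEST_AFFECTED <= mac_to_int(mac) <= HIGHEST_AFFECTED


def parse_showplatform(text):
    """Return {label: mac} for each output line containing a MAC address."""
    result = {}
    for line in text.splitlines():
        match = MAC_RE.search(line)
        if match:
            label = line[:match.start()].strip()
            result[label] = match.group(1)
    return result


if __name__ == "__main__":
    for label, mac in parse_showplatform(SAMPLE).items():
        status = "check list" if in_candidate_range(mac) else "outside range"
        print(label, mac, status)
```

Because some, but not all, systems in the range are affected, a
"check list" result only means the spreadsheet lookup is still needed.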

A WW Operations supplier reported that a number of Sun Fire 3800
centerplanes were manufactured with incorrectly programmed MAC
addresses.  Sun Fire 3800 systems require four MAC addresses for use by
two potential system domains and two system controllers.

Instead of programming the system centerplanes with MAC addresses in
groups of four unique values, the centerplanes were incorrectly
programmed with MAC addresses that may overlap with MAC addresses
assigned to other Sun Fire 3800 centerplanes.

For example, if the numbers "1", "2", etc. represent sequential MAC
addresses:

   Centerplane A was programmed with MAC addresses "1", "2", "3", "4"
   Centerplane B was programmed with MAC addresses "2", "3", "4", "5"
   Centerplane C was programmed with MAC addresses "3", "4", "5", "6"
   
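The overlap in the centerplane example above can be reproduced with a
short Python sketch.  The helper names are hypothetical, and the small
integers stand in for sequential MAC addresses exactly as in the
example; no real addresses are used here.

```python
ADDRESSES_PER_CENTERPLANE = 4  # two domains plus two system controllers


def assigned_block(first):
    """Set of sequential address numbers programmed into one centerplane."""
    return set(range(first, first + ADDRESSES_PER_CENTERPLANE))


def shared_addresses(first_a, first_b):
    """Sorted list of address numbers that two centerplanes have in common."""
    return sorted(assigned_block(first_a) & assigned_block(first_b))


# First address number of each centerplane in the example.
CENTERPLANES = {"A": 1, "B": 2, "C": 3}

if __name__ == "__main__":
    names = sorted(CENTERPLANES)
    for i, left in enumerate(names):
        for right in names[i + 1:]:
            shared = shared_addresses(CENTERPLANES[left], CENTERPLANES[right])
            print(left, right, shared)
```

Had the centerplanes been programmed in groups of four unique values
(first addresses 1, 5, 9, and so on), shared_addresses() would return
an empty list for every pair.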
Sun Fire 3800 systems that are configured to run with only one active
system domain, and that operate the system controllers on separate
networks as recommended by Sun's best practices, will not exhibit this
network performance issue.

Network failures may occur if the following conditions exist in the
customer environment AND the customer owns several systems whose MAC
addresses overlap with other systems:

   a. Servers are operated with two domains and each domain has a
      network interface host adapter card.
   b. System controllers for the affected system are operated on the
      same network as their system domains.
   c. At least two systems with overlapping MAC addresses are on the
      same subnet.  The MAC addresses of an affected server may
      overlap with those of up to six other systems.
      
   The issue WILL NOT BE exhibited on affected systems when:
   
     a. Only one affected system is owned by a customer.
     b. The numeric values of the primary system MAC addresses
        (the MAC addresses for Domain A) on affected systems differ by
        four (HEX) or more.  This holds EVEN when the domains and
        system controllers share a common network.
       
   Example for two affected systems that will not exhibit the issue:
   
      . The following two affected systems are on a common network AND
        System #1 MAC address used by its domain A:  0003BA0247A1
        System #2 MAC address used by its domain A:  0003BA0247AA
        
        The system that has MAC addresses which overlap with System #1
        has the primary MAC address 0003BA0247A0 and is owned by
        another customer.
        
        The system that has MAC addresses which overlap with System #2
        has the primary MAC address 0003BA0247AB and is owned by
        another customer as well.
        
      . System #1 is configured by the user to use two system domains AND
        
      . Contrary to recommended best practices, the system controllers
        share the same network as the system domains.
                                             
Systems #1 and #2 will not exhibit the issue.  Why?  Even though the
systems were incorrectly programmed, the primary (domain A) system MAC
addresses differ by 9 (HEX), which satisfies the requirement that the
primary MAC addresses differ by four (HEX) or more.  None of the four
MAC addresses assigned to one system matches any of the four assigned
to the other.
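
The "differ by four or more" rule can be checked with the following
Python sketch.  It is a simplification with hypothetical function
names: it assumes each system holds four consecutive MAC addresses
starting at the Domain A (primary) address, which is what the rule
implies.  The sample showplatform output earlier in this FIN shows
system controller addresses that are not contiguous with the domain
addresses, so confirm real systems against the affected system list.

```python
ADDRESSES_PER_SYSTEM = 4  # two domains plus two system controllers


def primary_value(mac):
    """Numeric value of a primary (Domain A) MAC address string."""
    return int(mac.replace(":", ""), 16)


def blocks_overlap(primary_a, primary_b):
    """True if the two four-address blocks share at least one address."""
    distance = abs(primary_value(primary_a) - primary_value(primary_b))
    return distance < ADDRESSES_PER_SYSTEM


def overlapping_primaries(primary):
    """Primary MAC values of the other blocks that would overlap this one."""
    base = primary_value(primary)
    span = ADDRESSES_PER_SYSTEM - 1
    return [base + d for d in range(-span, span + 1) if d != 0]


if __name__ == "__main__":
    system_1 = "0003BA0247A1"
    system_2 = "0003BA0247AA"
    print("difference:", hex(primary_value(system_2) - primary_value(system_1)))
    print("System #1 vs System #2 overlap:", blocks_overlap(system_1, system_2))
    print("System #1 vs 0003BA0247A0 overlap:",
          blocks_overlap(system_1, "0003BA0247A0"))
    print("possible overlapping systems:", len(overlapping_primaries(system_1)))
```

Under the same assumption, overlapping_primaries() yields six values
(three below and three above the primary address), which corresponds
to the "up to six other systems" noted in the failure conditions.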

Very few customers run their Sun Fire 3800 systems with two domains.
Further, Sun's best practices recommend that customers connect their
system controllers to separate networks, isolated from their
production systems.  To date, fewer than twenty (20) systems have been
reported to exhibit degraded network performance due to the issue
documented in this FIN.

Sun Fire 3800 systems which are affected by this issue may experience
the following types of network issues:

	 * The system or the customer network appears to hang.
	 * Intermittent failures on the system or the network.
	 * One or more servers will stop communicating.
	 * General network performance degradation.
	 * MAC address table corruption.

In addition, the following issue may arise with software vendors:

   * Customers may not be able to obtain software licenses for systems
     with overlapping MAC addresses for selected applications.  This
     issue may occur when a software license is already issued for
     another customer whose system has a common MAC address.  Some
     software licenses, such as those created for Oracle applications,
     are linked to system MAC addresses.

A technical assessment was performed in June 2001.  No formal Stop Ship
was required to implement the corrective action within Sun
manufacturing plants.  FRU inventory was not affected by the root cause
that affected the centerplanes used in manufacturing.

Affected centerplanes returned to Sun for repair are reprogrammed
during the standard repair process.  The procedure for the FRUs erases
the existing MAC address / Host ID information to prepare the assembly
for use as a repaired FRU.  Therefore, no new repair procedures are
required to correct a repaired centerplane.

 
IMPLEMENTATION: 

         ---
        |   |   MANDATORY (Fully Proactive)
         ---    
         
  
         ---
        | X |   CONTROLLED PROACTIVE (per Sun Geo Plan) 
         --- 
         
                                
         ---
        |   |   REACTIVE (As Required)
         ---


CORRECTIVE ACTION:

The following recommendation is provided as a guideline for authorized
Enterprise Services Field Representatives who may encounter the above
mentioned problem.

This product issue should be corrected by performing the following:
      
   Reprogram the Sun Fire 3800 centerplane with new MAC addresses
   and Host ID numbers.  A procedure for this process is provided at
   URL http://acts.ebay/3800mac/

   This procedure will need to be provided by Sun staff to channel
   partners who are trained to correct the centerplane MAC
   addresses.  Internal web pages are not accessible to channel
   partners.

   At the URL location, detailed steps and photographs are
   provided.  Note that only specially trained individuals have been
   designated in each TimeZone to perform the repairs.  A special
   tool is required to enable the reprogramming of the affected
   systems.  These tools are controlled by the Service Control
   Center managers in each of the TimeZones.

   Field staff who identify an affected system are instructed to
   contact their regional Service Control Center for further
   instructions.

   The following URLs provide contact information for the three Sun
   Fire Service Control Centers.

      APAC:     http://apac-scc.singapore/ 
      Americas: http://cccweb.ebay.sun.com/ccc/groups/sunfire/ 
      EMEA:     http://jfk.france/serengeti/sscc/sscc.html

   When the customer will not allow FRUs to be reprogrammed at their
   sites due to security considerations, the local field service and
   TimeZone Service Control Center staff may choose to perform no
   corrective action.  This decision would be based on a detailed
   understanding of the risk factors and system configurations that
   are required to exhibit the failure mode.

   Local account teams will have to assess the individual customer
   exposure to the issue and customer data center operation
   practices to determine if field service corrective action is
   appropriate.
                    
Note that the above corrective action procedures have been implemented
in all three TimeZones under the supervision of the Service Control
Centers.  It is recommended that the Service Control Centers continue
to supervise the investigation of affected systems and supply the
special tools and MAC addresses to the field staff.

Due to the nature of the repairs that need to be performed,
reprogramming of centerplanes must be performed by specially trained
Sun personnel.  The special tools were manufactured by Sun
manufacturing at no cost to Enterprise Services.  These tools were
shipped to selected service engineers designated by TimeZone Control
Center staff.

Channel partners who are responsible for system maintenance may be
required to perform centerplane replacements and to reprogram
replacement centerplanes with assistance from regional Sun System
Support Engineers (SSEs).  Whenever possible, Sun staff should perform
the reprogramming instead of the channel partner staff to minimize
unrecovered costs incurred by the channel partners.


COMMENTS:

None

============================================================================

Implementation Footnote:

i)   In case of MANDATORY FINs, Enterprise Services will attempt to    
     contact all affected customers to recommend implementation of 
     the FIN. 
   
ii)  For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical    
     support teams will recommend implementation of the FIN  (to their  
     respective accounts), at the convenience of the customer. 

iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the   
     need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network 
browser as follows:
 
SunWeb Access:
-------------- 
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/

* From there, select the appropriate link to query or browse the FIN and
  FCO Homepage collections.
 
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/

* From there, select the appropriate link to browse the FIN or FCO index.

Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@sdpsweb.EBay
--------------------------------------------------------------------------


Copyright (c) 1997-2003 Sun Microsystems, Inc.