Document fins/I0664-1
FIN #: I0664-1
SYNOPSIS: Rebooting the SSP of an E10K may cause problem
DATE: Apr/09/01
KEYWORDS: Rebooting the SSP of an E10K may cause problem
---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------
FIELD INFORMATION NOTICE
(For Authorized Distribution by SunService)
SYNOPSIS: Rebooting the System Service Processor (SSP) of an E10000
system while SSP processes are active can cause all domains
to stop.
TOP FIN/FCO REPORT: Yes
PRODUCT_REFERENCE: E10000 SSP
PRODUCT CATEGORY: Server / SW Admin
PRODUCTS AFFECTED:
Mkt_ID   Platform   Model   Description              Serial Number
------   --------   -----   -----------              -------------
Systems Affected
----------------
  -      E10000     ALL     Ultra Enterprise 10000        -
X-Options Affected
------------------
  -        -         -           -                        -
PART NUMBERS AFFECTED:
Part Number   Description   Model
-----------   -----------   -----
     -             -          -
REFERENCES:
BugId: 4394892 - Potential problem with actionsysclock causing arbstops
after cbe (control board executive) connects.
4365492 - Heartbeat Failures caused 5 domains to panic.
PatchId: 110732 - SSP 3.1.1: Heartbeat Failures caused 5 domains to panic.
110733 - SSP 3.2: Heartbeat Failures caused 5 domains to panic.
110734 - SSP 3.3: Heartbeat Failures caused 5 domains to panic.
110735 - SSP 3.4: Heartbeat Failures caused 5 domains to panic.
ESC: 527270
527381
528173
528506
MANUAL: 806-1500-10: SSP 3.2 User's Guide.
806-1502-05: Sun Enterprise 10000 SSP 3.2 Installation Guide and
Release Notes.
806-2886-10: SSP 3.3 Installation Guide and Release Notes.
806-4872-10: SSP 3.4 Installation Guide and Release Notes.
806-2887-10: SSP 3.3 User's Guide
806-4870-10: SSP 3.4 User's Guide
806-2888-10: SSP 3.3 Reference Manual
806-4871-10: SSP 3.4 Reference Manual
Sun Alert: SA-24898
PROBLEM DESCRIPTION:
Rebooting the System Service Processor (SSP) of an Ultra Enterprise
10000 system while SSP processes and daemons are active can cause all
of the system domains to crash. In particular, if the SSP is rebooted
while the cb_reset process is running, all active domains may crash and
a heartbeat failure is detected. The cb_reset process initializes the
Control Board Executive (CBE) image on the primary control board.
Error messages for these domain crashes are reported in the
domain-specific message files on the SSP, in the following directory:
/var/opt/SUNWssp/adm/'domain-name'
The messages may include a hostreset message for all processors of the
E10000. In that case a hostresetdump file with a current time stamp is
also created in this directory.
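As an illustration of how the directory above might be checked after a
suspected occurrence, the following is a minimal sketch only. The function
name list_hostresetdumps and the file name pattern hostresetdump* are
assumptions made for this example; this FIN does not state the exact dump
file naming.

```shell
# Sketch: list hostresetdump files for one domain, newest first.
# $1 = domain name
# $2 = optional base directory (defaults to the path given in this FIN)
# ASSUMPTION: dump file names begin with "hostresetdump".
list_hostresetdumps() {
    dir="${2:-/var/opt/SUNWssp/adm}/$1"
    # ls -t sorts by modification time; errors (no match) are suppressed
    ls -t "$dir"/hostresetdump* 2>/dev/null
}
```

For example, "list_hostresetdumps domain-name" prints nothing if no such
file exists for that domain.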
The problem can occur with the following SSP software releases:
SSP 3.1
SSP 3.1.1
SSP 3.2
SSP 3.3
SSP 3.4
Patches that fix the problem are available for the following releases
of the SSP software:
SSP 3.1.1 110732
SSP 3.2 110733
SSP 3.3 110734
SSP 3.4 110735
IMPLEMENTATION:
     ---
    |   |   MANDATORY (Fully Pro-Active)
     ---
     ---
    | X |   CONTROLLED PRO-ACTIVE (per Sun Geo Plan)
     ---
     ---
    |   |   REACTIVE (As Required)
     ---
CORRECTIVE ACTION:
An Authorized Enterprise Field Service Representative may avoid the
problem described above by following the procedure in either Step A
or Step B below.
A. (Preferred) Install the appropriate patch for the SSP version in use:
SSP 3.1.1 110732
SSP 3.2 110733
SSP 3.3 110734
SSP 3.4 110735
The above patches upgrade the flashprom firmware on the control board
from revision 3.46 to 3.47. The patch README gives detailed
instructions for upgrading the firmware. No patch is listed for
SSP 3.1; if using SSP 3.1, upgrade to one of the above SSP software
releases and apply the corresponding patch.
NOTE: When patching SSP 3.4, read the patch release notes carefully,
in particular the instructions on disabling SSP Failover.
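The release-to-patch mapping in Step A can be expressed as a small lookup,
which may be convenient in a site checklist script. This is a sketch only;
the function name patch_for_ssp is hypothetical and is not part of the SSP
software or of the patches.

```shell
# Sketch: map an SSP release to the patch ID listed in this FIN.
# Prints the patch ID and returns 0, or returns 1 when no patch is
# listed (for example SSP 3.1, which must be upgraded first).
patch_for_ssp() {
    case "$1" in
        3.1.1) echo 110732 ;;
        3.2)   echo 110733 ;;
        3.3)   echo 110734 ;;
        3.4)   echo 110735 ;;
        *)     return 1 ;;
    esac
}
```

For example, "patch_for_ssp 3.3" prints 110734. On Solaris, the installed
patch list reported by "showrev -p" can then be searched for that ID.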
B. As a workaround, before the patches are installed, the problem may be
avoided by doing the following:
1. Don't reboot or halt the SSP until it has fully initialized, that is,
until one of the following messages appears in
/var/opt/SUNWssp/adm/messages:
Startup of SSP as MAIN complete (for SSP 3.4)
Startup of SSP complete (for SSP 3.3 or earlier)
2. Don't reboot or halt the SSP while cb_reset is running. To check:
# ps -ef | grep cb_reset
3. Follow this procedure to reboot an SSP:
* Stop the SSP processes: /etc/init.d/ssp stop
* Stop in.rarpd: /etc/init.d/nfs.server stop
* Kill all in.tftpd processes, if active
* Reboot the SSP.
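The pre-reboot checks in items 1 and 2 of Step B could be combined as in
the following sketch. The function names are hypothetical. The startup
check only looks for the message text anywhere in the file, so it does not
distinguish an entry from the current boot from one left by an earlier
boot; the time stamps should still be confirmed by hand. The sketch covers
the checks only and does not perform the stop and reboot sequence in item 3.

```shell
# Sketch: return 0 if the messages file ($1) contains either of the
# startup-complete messages quoted in this FIN.
ssp_startup_complete() {
    grep 'Startup of SSP as MAIN complete' "$1" >/dev/null 2>&1 ||
    grep 'Startup of SSP complete' "$1" >/dev/null 2>&1
}

# Sketch: return 0 if a cb_reset process appears in the process list.
# The [c] bracket keeps the grep command itself from matching.
cb_reset_running() {
    ps -ef | grep '[c]b_reset' >/dev/null 2>&1
}

# Sketch: return 0 only when both checks pass.
# $1 = optional messages file (defaults to the path given in this FIN)
ok_to_reboot() {
    ssp_startup_complete "${1:-/var/opt/SUNWssp/adm/messages}" || return 1
    if cb_reset_running; then
        return 1
    fi
    return 0
}
```

A possible use is "ok_to_reboot || echo 'Do not reboot the SSP yet'",
run before starting the procedure in item 3.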
COMMENTS:
----------------------------------------------------------------------------
Implementation Footnote:
i) In case of MANDATORY FINs, Enterprise Services will attempt to
contact all affected customers to recommend implementation of
the FIN.
ii) For CONTROLLED PRO-ACTIVE FINs, Enterprise Services mission-critical
support teams will recommend implementation of the FIN (to their
respective accounts), at the convenience of the customer.
iii) For REACTIVE FINs, Enterprise Services will implement the FIN as the
need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network
browser as follows:
SunWeb Access:
--------------
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/
* From there, select the appropriate link to query or browse the FIN and
FCO Homepage collections.
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/
* From there, select the appropriate link to browse the FIN or FCO index.
Supporting Documents:
---------------------
* Supporting documents for FIN/FCOs can be found on Edist. Edist can be
accessed internally at the following URL: http://edist.corp/.
* From there, follow the hyperlink path of "Enterprise Services
Documentation" and click on "FIN & FCO attachments", then choose the
appropriate folder, FIN or FCO. This will display supporting
directories/files for FINs or FCOs.
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@Sun.COM
---------------------------------------------------------------------------
Copyright (c) 1997-2003 Sun Microsystems, Inc.