Document fins/I0820-1
FIN #: I0820-1
SYNOPSIS: Sun Fire 15K domains may panic due to problem with Schizo 2.2 ASICs
on hsPCI I/O Boards
DATE: Apr/30/02
KEYWORDS: Sun Fire 15K domains may panic due to problem with Schizo 2.2 ASICs
on hsPCI I/O Boards
---------------------------------------------------------------------
- Sun Proprietary/Confidential: Internal Use Only -
---------------------------------------------------------------------
FIELD INFORMATION NOTICE
(For Authorized Distribution by SunService)
SYNOPSIS: Sun Fire 15K domains may panic due to problem with Schizo 2.2
ASICs on hsPCI I/O Boards.
Sun Alert: No
TOP FIN/FCO REPORT: Yes
PRODUCT_REFERENCE: Sun Fire 15K
PRODUCT CATEGORY: Server / SW Admin
PRODUCTS AFFECTED:
Systems Affected:
----------------
Mkt_ID   Platform   Model   Description    Serial Number
------   --------   -----   -----------    -------------
  -        F15K      ALL    Sun Fire 15K         -
X-Options affected:
-------------------
Mkt_ID   Platform   Model   Description                    Serial Number
------   --------   -----   -----------                    -------------
X4575A      -         -     HSPCI Assy F15000 Base + Lic         -
PART NUMBERS AFFECTED:
Part Number            Description            Model
-----------            -----------            -----
501-5397-08 or lower   ASSY HSPCI I/O Board     -
REFERENCES:
BugId:   4530368 - isp_scsi_impl_pktfree panics under system stress.
         4655317 - 4 F15K panic reproducably with isp_scsi_impl_pktfree:
                   freeing free packet.
PatchId: 112665
FCO:     A0193-1
ECO:     WO_23260
         WO_23410
PROBLEM DESCRIPTION:
Sun has identified a bug in revision 2.2 of the Schizo ASIC on the
hsPCI I/O boards that may cause a domain panic. Any Sun Fire 15K
system shipped before April 1, 2002 can experience this failure, which
appears as a device driver panic.
Sun Fire 15K servers shipped on or after April 1, 2002 are not impacted
by this bug. This bug has no impact on any Sun Fire 12K systems.
Domain panics may occur with the following Sun Fire 15K configurations:
Hardware Components:
hsPCI I/O Board with Schizo 2.2 ASICs (501-5397-08 or lower)
JNI 32-bit PCI-to-Fibre Channel HBA (FCI-1063-x)
SunSwift PCI SCSI (Fresh Choice) HBA (X1032A)
Nexus Driver:
pcisch (PCI Bus nexus driver 1.199)
NOTE: Version 1.199 of the Nexus Driver is the default version shipped
with Solaris 8. This is the version installed unless Patch
112665 has been applied.
Domain Configuration:
Domains utilizing the ISP (SCSI HBA) driver or the JNI FCI-1063-x HBA.
Only the ISP and JNI drivers have produced this failure, and only
specific configurations using them are affected. The failure manifests
itself as a panic with one of these drivers named in the panic string,
which helps to identify it.
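As a rough illustration of the nexus driver check described above, the
following Python sketch extracts the pcisch version from a line of text
in the format quoted in this notice. It assumes the driver description
is available as text (for example from a module listing such as the
output of modinfo); the function names and the columns in the sample
line are hypothetical and not part of this FIN.

```python
import re

# Pattern for the driver description as quoted in this notice:
#   pcisch (PCI Bus nexus driver 1.199)
PCISCH_RE = re.compile(r"pcisch \(PCI Bus nexus driver (\d+(?:\.\d+)*)\)")

# Default version shipped with Solaris 8, per this notice.
DEFAULT_VERSION = "1.199"


def pcisch_version(text):
    """Return the pcisch version string found in text, or None."""
    match = PCISCH_RE.search(text)
    return match.group(1) if match else None


def is_default_pcisch(text):
    """True if text reports the default (unpatched) 1.199 nexus driver."""
    return pcisch_version(text) == DEFAULT_VERSION


if __name__ == "__main__":
    # Hypothetical sample line; only the quoted description is from the FIN.
    sample = " 22 1028b0e8 15ff9 109 1 pcisch (PCI Bus nexus driver 1.199)"
    print(pcisch_version(sample), is_default_pcisch(sample))
```

A result of True only indicates that the default driver is loaded; it
does not by itself mean the domain will panic, since the failure also
depends on the hardware and HBA configuration listed above.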
When the panic has been seen with the ISP driver, the panic string is:
"isp_scsi_impl_pktfree: freeing free packet"
For the JNI driver used with the JNI FCI-1063-x HBA, the system will log
"INB_SCSI_COMPLETE interrupt with INVALID tag" errors before the domain
panics with a "panic assertion failed:" message.
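The failure signatures above can be sketched as a simple log scan. This
is a minimal Python illustration, assuming the messages appear in the
log text exactly as quoted in this notice; the function name and the
sample log lines are hypothetical, and real messages may carry
additional prefixes or detail.

```python
# Signature strings taken verbatim from this notice.
ISP_PANIC = "isp_scsi_impl_pktfree: freeing free packet"
JNI_WARNING = "INB_SCSI_COMPLETE interrupt with INVALID tag"
JNI_PANIC = "panic assertion failed:"


def failure_signature(lines):
    """Return "isp", "jni", or None for an iterable of log lines.

    "isp": the ISP panic string was found.
    "jni": the JNI error was logged and a later line contains the
           assertion panic string.
    """
    seen_jni_warning = False
    for line in lines:
        if ISP_PANIC in line:
            return "isp"
        if JNI_WARNING in line:
            seen_jni_warning = True
        elif JNI_PANIC in line and seen_jni_warning:
            return "jni"
    return None


if __name__ == "__main__":
    # Hypothetical log excerpt for illustration only.
    log = [
        "WARNING: INB_SCSI_COMPLETE interrupt with INVALID tag",
        "panic assertion failed: (details omitted)",
    ]
    print(failure_signature(log))
```

A match is only a hint that the case should be reviewed; diagnosis of
bug 4530368/4655317 is confirmed through escalation as described in
this notice.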
The root cause of this problem is a transaction ordering issue within
the I/O controller in the F15K. Simply put, the I/O controller does
not follow certain ordering rules: data from a previous read/write can
still remain in the controller while the current transaction is being
processed.
The solution is an I/O controller upgrade, bundled with a corresponding
PCI nexus driver update, to properly order the transactions and
synchronize the software and hardware.
Sun has chosen to proactively fix this bug in the field. A Mandatory
FCO (A0193-1) will be issued which will require the replacement of
hsPCI assemblies that contain Schizo 2.2 ASICs. These will be replaced
with new hsPCI assemblies containing Schizo 2.3 ASICs. All Spares/FRU
Stock will also be purged.
Here are details for the fixed versions of the hsPCI I/O Board (501-5397):
-09 = Reworked -07 to replace the Schizo 2.2 with 2.3 (per ECO WO_23260
on 03/07/02).
-10 = Never built.
-11 = New build as well as -08 reworked to replace the Schizo 2.2 with
2.3 (per ECO WO_23410 on 2/27/02; the ECO was amended from phase to
cut-in and rework on 4/3/02).
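The dash-level rules above can be expressed as a small lookup. The
following Python sketch is an illustration only: the function name and
the returned labels are hypothetical, and dash levels not mentioned in
this notice (including -10, which was never built) are reported as
"unknown" rather than guessed.

```python
import re

# hsPCI I/O Board part number with a two-digit dash level, e.g. 501-5397-08.
PART_RE = re.compile(r"^501-5397-(\d{2})$")


def hspci_status(part_number):
    """Classify an hsPCI I/O Board part number per this notice.

    Returns "affected" for -08 or lower, "fixed" for -09 and -11,
    "unknown" for any other dash level, and "not-hspci" if the
    string is not a 501-5397-NN part number.
    """
    match = PART_RE.match(part_number.strip())
    if not match:
        return "not-hspci"
    dash = int(match.group(1))
    if dash <= 8:
        return "affected"
    if dash in (9, 11):
        return "fixed"
    return "unknown"


if __name__ == "__main__":
    for pn in ("501-5397-07", "501-5397-08", "501-5397-09", "501-5397-11"):
        print(pn, hspci_status(pn))
```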
The probability of a customer experiencing this bug is very low. Sun
has created a fix for this bug and is in the process of implementing it
in the field at no cost to the customer.
Mandatory FCO A0193-1 will be released on or about May/15/2002. This
will require all down-revision hsPCI I/O assemblies (part 501-5397-08
or lower) to be replaced. Prior to the release of the FCO, customers
who match the failure signature of bug 4530368/4655317 should have
their case escalated to CPRE. CPRE will verify the diagnosis of bug
4530368/4655317 and will submit the confirmation to GEO VPs for
expedited approval of parts.
A complete solution will also require a software I/O driver point
patch, 112665. Contact CPRE for information and access to this
patch. This patch will be available via SunSolve once the FCO is
released.
IMPLEMENTATION:
         ---
        |   |   MANDATORY (Fully Proactive)
         ---
         ---
        | X |   CONTROLLED PROACTIVE (per Sun Geo Plan)
         ---
         ---
        |   |   REACTIVE (As Required)
         ---
CORRECTIVE ACTION:
The following recommendation is provided as a guideline for authorized
Enterprise Services Field Representatives who may encounter the above
mentioned problem.
If the above failure is detected, please escalate to HES CPRE and
provide the following data/information:
1. Explorer output from the System Controller.
2. What is the current urgency of your particular customer?
3. How often have they experienced this panic/errors since initially
reported?
4. Is the customer willing to replace hardware along with a possible
kernel patch upgrade, and willing to install a point patch?
For procuring replacement parts, the following is needed:
A. Shipping address and contact information for that location.
B. Quantity of hsPCI I/O boards (501-5397) required at that location.
C. A requested ship date for that location.
Please note that Sun recommends that you implement FCO A0192-1 (AXQ 6.0
ASIC based Expander boards) at the same time that you implement FCO
A0193-1 to minimize customer disruption.
COMMENTS:
None
============================================================================
Implementation Footnote:
i) In case of MANDATORY FINs, Enterprise Services will attempt to
contact all affected customers to recommend implementation of
the FIN.
ii) For CONTROLLED PROACTIVE FINs, Enterprise Services mission critical
support teams will recommend implementation of the FIN (to their
respective accounts), at the convenience of the customer.
iii) For REACTIVE FINs, Enterprise Services will implement the FIN as
the need arises.
----------------------------------------------------------------------------
All released FINs and FCOs can be accessed using your favorite network
browser as follows:
SunWeb Access:
--------------
* Access the top level URL of http://sdpsweb.ebay/FIN_FCO/
* From there, select the appropriate link to query or browse the FIN and
FCO Homepage collections.
SunSolve Online Access:
-----------------------
* Access the SunSolve Online URL at http://sunsolve.Corp/
* From there, select the appropriate link to browse the FIN or FCO
index.
Internet Access:
----------------
* Access the top level URL of https://infoserver.Sun.COM
--------------------------------------------------------------------------
General:
--------
* Send questions or comments to finfco-manager@sdpsweb.EBay
--------------------------------------------------------------------------
Copyright (c) 1997-2003 Sun Microsystems, Inc.