SRDB ID | Synopsis | Date |
48483 | Sun Fire[TM] 12K/15K: POST: IBIST Failures | 21 Nov 2002 |
Status | Issued |
Description |
- Problem Statement: POST: IBIST Failures
- Symptoms: POST reports an IBIST failure. Some examples:

#1:
    stage ibist: Interconnect BIST...
    AXQ-RMX IBIST...
    ERR: IBIST error: AXQ EX3 RMX C0 Exp 0x0aaaaaaaa Obs 0x03c345555 XOR 0x0969effff.
    FAIL EXB EX3: IBIST failure
    Primary service FRU is EXB EX3.
    Secondary service FRU is CSB C0 or the logic centerplane.

#2:
    stage ibist: Interconnect BIST...
    ERR: IBIST error: DMX C1/D0 SDI EX4/S3 Error bits = 0x1555554.
    FAIL EXB EX4: IBIST failure
SOLUTION SUMMARY:
- Troubleshooting: IBIST is the interconnect built-in self-test between two ASICs. One ASIC acts as the master, driving preset or programmable bit patterns; the other ASIC receives the patterns and echoes them back. If the echoed pattern received by the master does not match the original pattern, the test fails. In the first example above, AXQ EX3 is the master and RMX0 is the slave. AXQ EX3 expects the pattern 0x0aaaaaaaa but receives 0x03c345555. 0x0aaaaaaaa XOR 0x03c345555 = 0x0969effff, and the bits set in that result are the bits in error. Note, however, that example #1 is bug 4704614, corrected in SMS 1.2 patch 112488-10 (or higher).

- Resolution: If the IBIST failure is an AXQ<-->RMX0 error, first confirm that POST patch 112488-10 (or higher) is applied to the system. Otherwise, all IBIST failures within close proximity must be considered when deciding the appropriate FRU. If there is only a single failure, as shown above, it is logical to replace what POST suggests as the primary FRU: EX3 in this example. However, if multiple IBIST failures are present, they must be considered together. For example, suppose SDI2 on 4 expanders all report IBIST failures to a given DMX. Taken together, this would call the DMX (i.e., the centerplane) into question, as it is unlikely that multiple expanders would fail at once. Finally, improper board seating is a possible cause of IBIST failures. If a service action involving a suspect FRU was recently conducted, check seating.

- Summary of part number and patch IDs: 112488-10

- References and bug IDs: 4704614

- Additional background information: For details on what IBIST tests are available, refer to the online documentation in 'redx'.

    redx> ? ibist

Under no circumstances should IBIST be executed on a component supporting a running domain. It will crash all domains relying on that component. Furthermore, if IBIST is run manually, the component must be power cycled after completion to return the ASIC(s) to a known, clean state. Refer to bug 4743556 for an example of why.

- Meta-Data/Problem categorization:
Product/Platform: SF12K/SF15K
Category:
- Keywords: 15K, 12K, SF15K, SF12K, Sun Fire 15K, Enterprise, Server, Sun Fire 12K, post, ibist
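
The Exp/Obs/XOR arithmetic described under Troubleshooting can be checked by hand or with a few lines of Python. The sketch below is illustrative only and is not part of POST, SMS, or redx: the function name ibist_error_bits is hypothetical, and numbering bit 0 as the least significant bit is an assumption, since this document does not say how bit positions map to physical signals.

```python
def ibist_error_bits(expected: int, observed: int) -> tuple:
    """Return (xor_mask, positions) for an expected/observed pattern pair.

    positions lists the mismatched bits, counting bit 0 as the least
    significant bit (an assumption; the mapping to signals is not
    described in this document).
    """
    mask = expected ^ observed
    positions = [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]
    return mask, positions


# Exp and Obs values copied from example #1 above.
mask, positions = ibist_error_bits(0x0AAAAAAAA, 0x03C345555)
print(f"XOR mask: 0x{mask:09x}")
print(f"{len(positions)} bits in error: {positions}")
```

For example #1 the computed mask matches the XOR value that POST prints (0x0969effff). Counting the set bits gives a quick sense of whether a failure involves one or two signals or a wide swath of the interconnect, which may be useful when weighing several failures together.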
INTERNAL SUMMARY:
SUBMITTER: Scott Davenport
BUG REPORT ID: 4704614, 4743556
PATCH ID: 112488-10
APPLIES TO: Hardware/Sun Fire/15000, Hardware/Sun Fire/12000
ATTACHMENTS: