InfoDoc ID | Synopsis | Date |
49202 | Sun Fire[TM] 3800-6800: CPU/Memory Board Dynamic Reconfiguration (DR) Considerations | 8 Jan 2003 |
Status | Issued |
Description |
This document provides guidance for using dynamic reconfiguration (DR) on CPU/Memory boards in Sun Fire 3800-6800 systems. Using DR to change the configuration or to service CPU/Memory boards increases overall application uptime, because the Solaris[TM] Operating Environment (OE) and applications keep running while DR tasks are performed.
TERMS:
When describing DR operations, this document uses terms as defined by the Solaris OE DR command cfgadm.
Configure: Adding a CPU/Memory board to a running Domain.
Disconnect: Removing a CPU/Memory board from a running Domain.
ASSUMPTIONS:
It's assumed that the Domain and the Sun Fire System Controller (SSC) are running the minimum Solaris/Firmware versions required for supporting DR:
All loaded third-party device drivers must comply with the device driver specifications (see Writing Device Drivers in the References).
http://www.sun.com/io_technologies/pci/pci.cards.cat.html
CONFIGURING A CPU/MEMORY BOARD INTO A RUNNING DOMAIN:
High level overview of Procedure:
Detailed Procedure:
sunfire-sc0:SC> showboards -p prom
Component   Compatible   Version
---------   ----------   -------
SSC0        Reference    5.13.4
SB0         Yes          5.13.4
/N0/SB2     Yes          5.13.4
/N0/IB6     Yes          5.13.4
/N0/IB8     Yes          5.13.4
If the CPU/Memory board has a different Firmware level, use the flashupdate command on the SSC to change it appropriately.
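The version comparison described above can be scripted once the showboards output has been saved to a text file. The following is a minimal sketch only: the function name firmware_mismatches and the file name showboards.txt are hypothetical, and it assumes the three-column layout shown in the example, with the row marked Reference supplying the target version.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Sun tool): reads saved
# `showboards -p prom` output on stdin and prints every component whose
# firmware version differs from the version on the "Reference" row.
firmware_mismatches() {
    awk '
        # Only rows whose third column looks like a version (e.g. 5.13.4)
        # are considered; the prompt, header and dashed lines are skipped.
        NF == 3 && $3 ~ /^[0-9]+(\.[0-9]+)*$/ {
            if ($2 == "Reference") ref = $3
            else { comp[++n] = $1; ver[n] = $3 }
        }
        END {
            for (i = 1; i <= n; i++)
                if (ver[i] != ref)
                    print comp[i], ver[i], "(reference " ref ")"
        }
    '
}

# Assumed usage, with the output captured beforehand:
#   firmware_mismatches < showboards.txt
```

No output means every listed board already matches the reference level. Any board that is printed is a candidate for flashupdate. On older Solaris releases nawk may be needed in place of awk.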
#cfgadm
Ap_Id    Type          Receptacle     Occupant       Condition
N0.IB6   PCI_I/O_Boa   connected      configured     ok
N0.SB0   CPU_Board_V   connected      configured     ok
N0.SB2   CPU_Board_V   disconnected   unconfigured   unknown
c0       scsi-bus      connected      configured     unknown
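On a domain with many attachment points, the cfgadm listing can be filtered for CPU/Memory boards that are not yet configured. This is a minimal sketch: the function name is hypothetical, and it assumes the five-column layout shown above, with the Type column starting with CPU_Board.

```shell
#!/bin/sh
# Hypothetical helper: reads plain `cfgadm` output on stdin and prints the
# Ap_Id of each CPU/Memory board whose Occupant state is "unconfigured".
unconfigured_cpu_boards() {
    awk '$2 ~ /^CPU_Board/ && $4 == "unconfigured" { print $1 }'
}

# Assumed usage on the domain:
#   cfgadm | unconfigured_cpu_boards
```

For the listing above this prints N0.SB2, which is the Ap_Id passed to the configure command.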
#cfgadm -o platform=diag=default -c configure N0.SB2
Dec 9 09:19:01 domA unix: cpu 0 initialization complete - restarted
Dec 9 09:19:01 domA unix: cpu 1 initialization complete - restarted
Dec 9 09:19:01 domA unix: cpu 2 initialization complete - restarted
Dec 9 09:19:01 domA unix: cpu 3 initialization complete - restarted
DISCONNECTING A CPU/MEMORY BOARD FROM A RUNNING DOMAIN:
High level overview of Procedure:
Detailed Procedure:
#cfgadm -av | grep memory
Ap_Id            Receptacle   Occupant     Condition   Information
When             Type         Busy         Phys_Id
N0.SB0::memory   connected    configured   ok          base address 0x0, 8388608 KBytes total, 1529776 KBytes permanent
Jul 26 13:30     memory       n            /devices/ssm@0,0:N0.SB1::memory
N0.SB2::memory   connected    configured   ok          base address 0x400000000, 8388608 KBytes total
Jul 26 13:30     memory       n            /devices/ssm@0,0:N0.SB2::memory
In the example above, CPU/Memory board SB0 contains permanent memory (1529776 KBytes permanent).
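The permanent-memory check can be automated by scanning the Information column for the "KBytes permanent" text. This is a minimal sketch with a hypothetical function name. It assumes the two-lines-per-entry layout of cfgadm -av and that the permanent size immediately precedes the words "KBytes permanent", as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: reads `cfgadm -av` output on stdin and prints, for
# each memory attachment point that reports permanent memory, the board
# Ap_Id and the permanent size in KBytes.
permanent_memory_boards() {
    awk '
        $1 ~ /::memory$/ && /KBytes permanent/ {
            board = $1
            sub(/::memory$/, "", board)      # N0.SB0::memory -> N0.SB0
            for (i = 2; i <= NF - 2; i++)
                if ($(i + 1) == "KBytes" && $(i + 2) == "permanent")
                    print board, $i, "KBytes"
        }
    '
}

# Assumed usage on the domain:
#   cfgadm -av | permanent_memory_boards
```

A board printed by this filter is subject to the additional requirements for permanent memory before it can be disconnected.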
sunfire-sc0:SC> showdomain -p bootparams
diag-level = quick
verbosity-level = max
error-level = max
interleave-scope = within-board
#pbind
process id 181: 0
In this example, process id 181 is bound to CPU 0. If CPU 0 is on the CPU/Memory board that is to be disconnected, the process must first be bound to a different CPU. This can also be done with pbind.
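The binding check can be sketched as a small filter over the pbind output. The function name is hypothetical, and the list of CPU ids on the board must be supplied by the caller, because the pbind output does not say which board a CPU belongs to. The target CPU 4 in the usage comment is likewise only an illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads pbind query output ("process id <pid>: <cpu>")
# on stdin and prints each pid bound to one of the CPU ids given in $1.
# $1 is a space-separated list of CPU ids located on the board to be
# disconnected (caller-supplied assumption).
pids_bound_to() {
    awk -v cpus="$1" '
        BEGIN {
            n = split(cpus, c, " ")
            for (i = 1; i <= n; i++) on_board[c[i]] = 1
        }
        $1 == "process" && $2 == "id" {
            pid = $3
            sub(/:$/, "", pid)               # "181:" -> "181"
            if ($4 in on_board) print pid
        }
    '
}

# Assumed usage on the domain (CPU ids are illustrative only):
#   pbind -q | pids_bound_to "0 1 2 3"
#   pbind -b 4 181      # rebind pid 181 to a CPU that stays in the domain
```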
If these requirements are fulfilled, proceed to step 3. If the CPU/Memory board contains permanent memory, additional requirements must be met. The system checks these conditions automatically and aborts the DR operation if they are not met.
#ps -efc
UID    PID   PPID   CLS   PRI   STIME      TTY   TIME   CMD
root   0     0      SYS   96    02:56:47   ?     0:00   sched
root   1     0      TS    58    02:56:47   ?     0:00   /etc/init -
root   367   1      RT    140   19:23:16   ?     0:00   /opt/perf/bin/midaemon
Real Time processes can be identified by the RT tag in the CLS column. In the above example, the midaemon with PID 367 is running in the RT class.
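On a busy domain, the RT entries can be picked out of the ps listing with a one-line filter. This is a minimal sketch with a hypothetical function name. It assumes CLS is the fourth column, as in the example above, and it prints only the PID because later columns can shift when STIME is shown as a date.

```shell
#!/bin/sh
# Hypothetical helper: reads `ps -efc` output on stdin and prints the PID of
# every process whose scheduling class (CLS, fourth column) is RT.
rt_processes() {
    awk '$4 == "RT" { print $2 }'
}

# Assumed usage on the domain:
#   ps -efc | rt_processes
```

For the listing above this prints 367, the PID of midaemon.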
# modinfo | grep STMS
120 781e4000 4834 - 1 STMS (Multipath Interface Library)
# modinfo | grep scsi_vhci
121 781ea000 6a20 225 1 scsi_vhci (Sun Multiplexed SCSI vHCI)
If the requirements are checked and satisfied, initiate the disconnect operation with the cfgadm command:
#cfgadm -c disconnect N0.SB0
On disconnecting a CPU/Memory board, the following messages are logged in /var/adm/messages:
Dec 9 09:12:43 domA genunix: /ssm@0,0/memory-controller@3,400000 (mc-us36) offline
Dec 9 09:12:43 domA genunix: /ssm@0,0/memory-controller@2,400000 (mc-us35) offline
Dec 9 09:12:43 domA genunix: /ssm@0,0/memory-controller@1,400000 (mc-us39) offline
Dec 9 09:12:43 domA genunix: /ssm@0,0/memory-controller@0,400000 (mc-us34) offline
References:
Sun Fire 3800-6800 Servers Dynamic Reconfiguration Blueprint (
Sun Fire 3800-6800 Systems Dynamic Reconfiguration Users Guide (
Sun Cluster 3.0 Concepts (
Writing Device Drivers (
man pages: cfgadm, cfgadm_sbd, rcmscript
Keywords:
Sun Fire 3800-6800, Dynamic Reconfiguration, DR, best practices, permanent
INTERNAL SUMMARY: This is a living document. As features and requirements change, every attempt will be made to keep this document current. If you notice an oversight or discrepancy while using its content, contact the submitter.
Internally the following URLs are most useful:
http://pts-americas.west.sun.com/esg/msg/techinfo/platform/sun_fire/
http://systems.corp.sun.com/tools/salestools/datacenter/avail/dr/index.html
SUBMITTER: Peter Gonscherowski
BUG REPORT ID: 4618861
APPLIES TO: Hardware/Sun Fire /3800, Hardware/Sun Fire /4800, Hardware/Sun Fire /4810, Hardware/Sun Fire /6800
ATTACHMENTS: