CHAPTER 4
DR Operations and Software Components on the Domain
This chapter contains descriptions of the four general DR operations: connect, configure, disconnect, and unconfigure. For more information on how to perform these operations, see Chapter 6.
This chapter also contains information about the various software components that work together to accomplish DR operations. The components that are used during a DR operation depend entirely on the point of initiation of the DR operation. For instance, if you initiate the DR operation from the Sun Fire 15K system controller (SC), the system uses more software components to accomplish the DR operation than if you initiate the DR operation from the domain.
For more information about the software components that reside on the SC, refer to the System Management Services (SMS) 1.2 Dynamic Reconfiguration User Guide.
This section contains descriptions of the four general DR operations: connect, configure, disconnect, and unconfigure. These operations are described from the point of view of the domain, and do not contain information that is specific to the SC.
Before you perform DR operations for the first time on a domain after it has been booted, make sure the board is available to the domain. To display a list of boards that are available to the domain, use the cfgadm command with its -l option.
An error may occur if you attempt to perform DR operations on a board that:
Is not listed in the domain's ACL and is not assigned to the domain; or
Is listed in the domain's ACL, but is assigned to another domain.
In either of these cases, the board is not available to the domain. For more information about the ACL, refer to the System Management Services (SMS) 1.2 Administrator Guide.
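The availability rule above can be expressed as a small predicate. This is only an illustrative sketch: the function name and its parameters are hypothetical, and it models the two error cases as described rather than anything the SC or the domain actually runs.

```python
from typing import Optional, Set


def board_available(board: str, domain: str,
                    acl: Set[str], assigned_to: Optional[str]) -> bool:
    """Return True if `board` is available to `domain`.

    acl         -- boards listed in the domain's ACL (hypothetical input)
    assigned_to -- name of the domain the board is assigned to, or None
    """
    if assigned_to == domain:
        return True               # already assigned to this domain
    if assigned_to is not None:
        return False              # assigned to another domain, even if on the ACL
    return board in acl           # unassigned: it must at least be on the ACL


# Example: SB3 is on domain A's ACL but is currently assigned to domain B.
print(board_available("SB3", "A", {"SB3"}, "B"))
```

In this example the predicate returns False, which corresponds to the second error case listed above.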
During the connect operation, DR attempts to assign the slot to the domain if a system board is available and if it is not part of any logical domain. After the slot has been assigned, DR requests that the SC power on and test the board. After the board has been tested, DR requests the SC to connect the board electronically to the system bus, which makes the board part of the physical domain. The operating system then probes the components on the board.
To connect a system board through the domain rather than the SC, use the cfgadm(1M) command as follows:
where x represents the number (0 to 17) of the board.
The syntax of the cfgadm(1M) command to connect an I/O board is as follows:
where x represents the number (0 to 17) of the board.
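As a rough illustration of how the two connect commands could be assembled in a script, the following sketch builds the argument list for cfgadm(1M). The -c option with the connect function is standard cfgadm(1M) usage. The SB prefix for system boards matches the attachment points shown in this chapter's status listings; the IO prefix for I/O boards and the helper name are assumptions made for this sketch.

```python
from typing import List

# "SB" appears in this chapter's status listings; "IO" is an assumption.
PREFIXES = {"system": "SB", "io": "IO"}


def connect_command(board_type: str, number: int) -> List[str]:
    """Build an argv list that asks cfgadm to connect one board."""
    if board_type not in PREFIXES:
        raise ValueError(f"unknown board type: {board_type!r}")
    if not 0 <= number <= 17:
        raise ValueError("board number must be in the range 0 to 17")
    return ["cfgadm", "-c", "connect", f"{PREFIXES[board_type]}{number}"]


# Print the command line for system board 3 without running it.
print(" ".join(connect_command("system", 3)))
```

The list form can be passed to subprocess.run on the domain. Rejecting a board number outside 0 to 17 before calling cfgadm(1M) avoids issuing a command that cannot succeed.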
The states and conditions for the attachment point before a board is inserted are:
After the board is physically inserted, the states and conditions are:
After the attachment point is logically connected, the states and conditions are:
During the configure operation, DR attempts to connect the board slot if its state is disconnected. It then traverses the tree of devices that was created during the connect operation. (DR creates Solaris device tree nodes and attaches device drivers if necessary.)
The CPUs are added to the CPU list, and memory is initialized and added to the system memory pool. After the configure function has completed successfully, the CPUs and memory are ready for use.
I/O devices are not usable immediately after they are configured. You must first use the mount(1M) and ifconfig(1M) commands, as appropriate, to bring them into service.
When you configure a board into a domain using cfgadm, the board is automatically connected and configured.
To configure a CPU on a system board through the domain rather than the SC, use the cfgadm(1M) command as follows:
where x represents the board's number (0 to 17) and y represents the CPU number (0 to 3).
The syntax of the cfgadm(1M) command to configure memory is as follows:
where x represents the board number (0 to 17) for a particular board. For memory, the command applies to all the memory on the system board.
To configure all the CPUs and memory on a system board, use the following command:
The syntax of the cfgadm(1M) command to configure a bus on an I/O board is as follows:
where x represents the board number (0 to 17) and y represents the PCI number (0 to 3).
To configure all the busses on an I/O board, use the following command:
The states and conditions for a configured attachment point are:
Now the system is aware of the usable devices that reside on the board, and all devices can be mounted or configured for use.
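The configure commands described in this section address individual components through attachment points that combine a board number with a component name. The following sketch shows one way to build and validate such names. The SBx::memory form is taken from the status listings in this chapter; the cpu and pci component names, the IO board prefix, and the helper itself are assumptions for illustration only.

```python
from typing import Optional


def component_ap_id(board: int, component: str,
                    unit: Optional[int] = None) -> str:
    """Return an attachment-point name for a CPU, a memory bank, or a PCI bus."""
    if not 0 <= board <= 17:
        raise ValueError("board number must be in the range 0 to 17")
    if component == "memory":
        # Memory is addressed per board, so no unit number is accepted.
        if unit is not None:
            raise ValueError("memory applies to the whole board")
        return f"SB{board}::memory"
    if component in ("cpu", "pci"):
        if unit is None or not 0 <= unit <= 3:
            raise ValueError("unit number must be in the range 0 to 3")
        prefix = "SB" if component == "cpu" else "IO"   # "IO" is assumed
        return f"{prefix}{board}::{component}{unit}"
    raise ValueError(f"unknown component: {component!r}")


print(component_ap_id(2, "cpu", 1))
print(component_ap_id(2, "memory"))
```

The range checks mirror the limits stated above: board numbers 0 to 17, and CPU or PCI numbers 0 to 3.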
During a disconnect operation, DR attempts to perform the tasks related to the unconfigure operation, and requests that the SC program the interconnect to remove the system board from the physical domain.
A board can be in the disconnected state without being powered off. However, the board must be powered off and in the disconnected state before you can remove it from the slot.
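The relationship between the operations described in this chapter can be summarized as a small state model. The sketch below is a simplification under stated assumptions: it tracks only the receptacle and occupant states named in this chapter, ignores conditions, power, and failure cases, and the class is hypothetical rather than part of any DR software.

```python
from dataclasses import dataclass


@dataclass
class AttachmentPoint:
    receptacle: str = "disconnected"
    occupant: str = "unconfigured"

    def connect(self) -> None:
        self.receptacle = "connected"

    def configure(self) -> None:
        # Configure first connects the slot if it is still disconnected.
        if self.receptacle == "disconnected":
            self.connect()
        self.occupant = "configured"

    def unconfigure(self) -> None:
        self.occupant = "unconfigured"

    def disconnect(self) -> None:
        # Disconnect performs the unconfigure tasks before detaching the board.
        if self.occupant == "configured":
            self.unconfigure()
        self.receptacle = "disconnected"


ap = AttachmentPoint()
ap.configure()      # implies connect
print(ap)
ap.disconnect()     # implies unconfigure
print(ap)
```

The model makes the two implicit steps visible: configuring a disconnected board connects it first, and disconnecting a configured board unconfigures it first.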
The syntax of the cfgadm(1M) command to disconnect the board is as follows:
where x represents the board number (0 to 17).
Before the board is disconnected, the states and conditions are:
After the board is disconnected, the states and conditions are:
The unconfigure operation can consist of a single operation or two separate operations, depending on the presence of permanent memory. If the system board hosts permanent memory, DR first moves the memory contents from the specified board to available memory on a target board in the domain, and then unconfigures the board. See the section for more information about boards that host permanent memory.
If the reconfiguration coordination manager (RCM) is present, then DR informs the RCM about the DR operation. The RCM informs client applications, and the client applications perform preparatory tasks such as stopping the usage of devices. The clients communicate their readiness to the RCM, and the RCM communicates its readiness to DR. Depending on the responses, DR either continues, or aborts the operation and reports an error to the user.
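The exchange between DR, the RCM, and its clients can be pictured as a simple poll of every registered client. The sketch below is a conceptual model only: the function, the client callbacks, and the client names are hypothetical and do not correspond to the actual RCM interfaces.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical client callback: given a resource name, return True when the
# client has stopped using the resource and is ready for it to be removed.
ReadyCheck = Callable[[str], bool]


def coordinate(resource: str,
               clients: Dict[str, ReadyCheck]) -> Tuple[bool, List[str]]:
    """Ask every client to release `resource`; report whether DR may proceed."""
    refusals = [name for name, is_ready in clients.items()
                if not is_ready(resource)]
    return (not refusals, refusals)


clients = {
    "multipathing": lambda res: True,            # has an alternate path
    "database": lambda res: res != "SB0::memory",  # still needs this memory
}
proceed, refused = coordinate("SB0::memory", clients)
if not proceed:
    print("operation aborted; clients not ready:", ", ".join(refused))
```

In this model a single client that is not ready is enough to abort the operation, which matches the behavior described above, where DR reports an error to the user instead of continuing.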
During the unconfigure operation, DR unconfigures the board resources from the Solaris operating environment and leaves the board in the unconfigured state.
If the board hosts CPUs and/or memory, DR removes them from the Solaris operating environment, making them unusable to the operating system. If the board is an I/O board, DR detaches the device drivers.
The following paragraphs and examples specifically illustrate the unconfigure operation for permanent memory.
In the following code examples, the permanent memory on board 0 must be moved to another board in the domain, board 1. Board 0 is the source board, and board 1 is the target board.
For brevity, the CPU information has been removed from the code examples. On the domain, the unconfigure operation is started with the cfgadm(1M) command:
First, a block of memory on the target board that resides in the same address range as the permanent memory on the source board must be deleted. During this phase, the source board, the target board, and the memory attachment points are marked as busy. You can display the status with the following command:
After the memory has been deleted on the target board, it is marked as unconfigured. The memory on the source board remains configured, but it is still marked as busy, as in the following example.
Ap_Id         Type     Receptacle   Occupant       Busy
SB0           CPU      connected    configured     y
SB0::memory   memory   connected    configured     y
SB1           CPU      connected    configured     y
SB1::memory   memory   connected    unconfigured   n
The memory from the source board is then copied to the target board. After it has been copied, the occupancy state for the memory is switched. The memory on the source board becomes unconfigured, and the memory on the target board becomes configured. At this point in the process, only the source board remains busy, as in the following example.
Ap_Id         Type     Receptacle   Occupant       Busy
SB0           CPU      connected    configured     y
SB0::memory   memory   connected    unconfigured   n
SB1           CPU      connected    configured     n
SB1::memory   memory   connected    configured     n
After the entire process has been completed, the memory on the source board remains unconfigured, and the attachment points are not busy, as in the following example.
Ap_Id         Type     Receptacle   Occupant       Busy
SB0           CPU      connected    configured     n
SB0::memory   memory   connected    unconfigured   n
SB1           CPU      connected    configured     n
SB1::memory   memory   connected    configured     n
The permanent memory has been moved, and the memory on the source board has been unconfigured. At this point, you can initiate a new status change operation on either board.
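A script that waits for the permanent memory move to finish could read the status listing and check which attachment points are still busy. The following sketch parses a listing in the column layout shown in the examples above. The helper names are hypothetical, and the sample text is the listing from the phase in which memory on the target board has been deleted.

```python
from typing import Dict, List


def parse_status(listing: str) -> Dict[str, Dict[str, str]]:
    """Map each Ap_Id to its remaining columns, keyed by column header."""
    lines = [ln.split() for ln in listing.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    return {row[0]: dict(zip(header[1:], row[1:])) for row in rows}


def busy_points(status: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the attachment points whose Busy column is 'y'."""
    return sorted(ap for ap, cols in status.items() if cols["Busy"] == "y")


SAMPLE = """
Ap_Id         Type     Receptacle   Occupant       Busy
SB0           CPU      connected    configured     y
SB0::memory   memory   connected    configured     y
SB1           CPU      connected    configured     y
SB1::memory   memory   connected    unconfigured   n
"""

status = parse_status(SAMPLE)
print(busy_points(status))                 # ['SB0', 'SB0::memory', 'SB1']
print(status["SB1::memory"]["Occupant"])   # unconfigured
```

When busy_points returns an empty list and the occupant state of the source memory is unconfigured, the move has completed and a new status change operation can be started on either board.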
This section describes the software components that reside on the domain and make DR operations possible. However, it does not contain descriptions of all of the DR components on the Sun Fire 15K platform. Refer to the System Management Services (SMS) 1.2 Dynamic Reconfiguration User Guide for descriptions of the software components that reside on the Sun Fire 15K system controller (SC).
The domain configuration server (DCS) is a daemon process that runs on a Sun Fire 15K domain and is started by inetd(1M) when the first remote DR request is received. A single instance of the DCS runs in each domain on the Sun Fire 15K. The DCS accepts DR requests from the domain configuration agent (DCA) that runs on the SC. After the DCS accepts a DR operation, it performs the request and returns the results to the DCA. Refer to the System Management Services (SMS) 1.2 Dynamic Reconfiguration User Guide for more information about the DCA.
The DR driver consists of a platform-independent driver, named dr, and a platform-specific module, named drmach. The DR driver uses standard features of the Solaris operating environment whenever possible to control DR operations, and it calls the platform-specific module as needed. The DR driver is responsible for creating minor nodes in the file system that are used as attachment points for DR operations.
The reconfiguration coordination manager (RCM) is a daemon process that coordinates DR operations on resources in the domain. The RCM daemon uses generic application program interfaces (APIs) to coordinate DR operations between DR initiators and RCM clients.
The RCM consumers consist of DR initiators, which request DR operations, and DR clients, which react to DR requests. Normally, the DR initiator is the configuration administration command, cfgadm(1M). However, it can also be a GUI such as Sun Management Center.
DR clients include the following:
Software layers that export high-level resources composed of one or more hardware devices (for example, multipathing applications)
Applications that monitor DR operations (for example, Sun Management Center)
Entities on a remote system, such as the system controller on a server
DR uses the Solaris system events framework to notify other software entities of changes that result from a DR operation. DR accomplishes this by sending DR events to the system event daemon, syseventd, which, in turn, sends the events to the subscribers of DR events. For more information about the system events daemon, refer to the syseventd(1M) man page.
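The notification path described above follows a publish and subscribe pattern: DR posts an event once, and the daemon delivers it to every subscriber. The sketch below models that pattern in a few lines. It is not the syseventd interface; the class, the event class name, and the payload fields are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[Dict[str, str]], None]


class EventDaemon:
    """Toy stand-in for an event daemon that fans events out to subscribers."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_class: str, handler: Handler) -> None:
        self._subscribers[event_class].append(handler)

    def post(self, event_class: str, payload: Dict[str, str]) -> int:
        """Deliver `payload` to each subscriber; return how many were notified."""
        handlers = self._subscribers.get(event_class, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


daemon = EventDaemon()
seen: List[Dict[str, str]] = []
daemon.subscribe("dr", seen.append)      # "dr" is an illustrative class name
daemon.post("dr", {"ap_id": "SB0", "change": "unconfigured"})
print(seen)
```

Keeping the sender unaware of its subscribers is the main design point: DR does not need to know which applications are interested in a change, only that the event has been handed to the daemon.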
Copyright © 2002, Sun Microsystems, Inc. All rights reserved.