CHAPTER 1
Introduction to System Management Services
This manual describes the System Management Services (SMS) 1.2 software that is available with the Sun Fire 15K server system.
The Sun Fire 15K server is a member of the next-generation Sun Fire server family.
The system controller (SC) in the Sun Fire 15K is a multifunction, Nordica-based printed circuit board (PCB) that provides critical services and resources required for the operation and control of the Sun Fire system. In this book, the system controller is simply called the SC.
The Sun Fire 15K system is often referred to as the platform. System boards within the platform can be logically grouped together into separately bootable systems called dynamic system domains, or simply domains.
Up to 18 domains can exist simultaneously on a single platform. (Domains are introduced in this chapter, and are described in more detail in Chapter 4.) The system management services (SMS) software lets you control and monitor domains, as well as the platform itself.
SMS software packages are installed on the SC. In addition, SMS communicates with the Sun Fire 15K system over an Ethernet connection. See Management Network Services.
This version of SMS 1.2 supports Sun Fire servers running the Solaris 9 05/02 operating environment.
SMS 1.2 is compatible with Sun Fire 15K domains that are running the Solaris 8 02/02 through Solaris 9 05/02 operating environments. The commands provided with the SMS software can be used remotely.
Note - Graphical user interfaces for many of the commands in SMS are provided by Sun Management Center. For more information, see Sun Management Center.
SMS enables the platform administrator to perform the following tasks:
Administer domains by logically grouping domain configurable units (DCUs) together. DCUs are system boards, such as CPU and I/O boards. Domains are able to run their own operating systems and handle their own workloads. See Chapter 4.
Dynamically reconfigure a domain so that currently installed system boards can be logically attached to or detached from the operating system while the domain continues running in multiuser mode. This feature is known as dynamic reconfiguration and is described in the System Management Services (SMS) 1.2 Dynamic Reconfiguration User Guide. (A system board can be physically swapped in and out when it is not attached to a domain, while the system continues running in multiuser mode.)
Perform automatic dynamic reconfiguration of domains using a script. Refer to the System Management Services (SMS) 1.2 Dynamic Reconfiguration User Guide.
Monitor and display the temperatures, currents, and voltage levels of one or more system boards or domains.
Monitor and control power to the components within a platform.
Execute diagnostic programs such as power-on self-test (POST).
In addition, SMS performs the following functions:
Warns you of impending problems, such as high temperatures or malfunctioning power supplies.
Notifies you when a software error or failure has occurred.
Monitors a dual SC configuration for single points of failure and performs an automatic failover from the main SC to the spare, or from the primary control board to the spare control board, depending on the failure condition detected.
Automatically reboots a domain after a system software failure (such as a panic).
Keeps logs of interactions between the SC environment and the domains.
Provides support for the Sun Fire 15K system dual grid power option.
SMS enables the domain administrator to perform the following tasks:
Administer domains by logically grouping domain configurable units (DCUs) together. DCUs are system boards, such as CPU and I/O boards. Domains are able to run their own operating systems and handle their own workloads. See Chapter 4.
Boot domains for which the administrator has privileges.
Dynamically reconfigure a domain for which the administrator has privileges, so that currently installed system boards can be logically attached to or detached from the operating system while the domain continues running in multiuser mode. This feature is known as dynamic reconfiguration and is described in the System Management Services (SMS) 1.2 Dynamic Reconfiguration User Guide. (A system board can be physically swapped in and out when it is not attached to a domain, while the system continues running in multiuser mode.)
Use a script to perform automatic dynamic reconfiguration of domains for which the administrator has privileges. Refer to the System Management Services (SMS) 1.2 Dynamic Reconfiguration User Guide.
Monitor and display the temperatures, currents, and voltage levels of one or more system boards or domains for which the administrator has privileges.
Execute diagnostic programs, such as power-on self-test (POST), on domains for which the administrator has privileges.
The following features are provided in this release of Sun Fire 15K SMS:
Dynamic System Domain (DSD) Configuration
Configured Domain Services
Domain Control Capabilities
Domain Status Reporting
Hardware Control Capabilities
Hardware Status Monitoring, Reporting and Handling
Hardware Error Monitoring, Reporting and Handling
System Controller (SC) failover
Configurable Administrative Privileges
Dynamic FRUID
SMS architecture is best described as distributed client-server. init(1M) starts (and restarts as necessary) one process: ssd(1M). ssd is responsible for monitoring all other SMS processes and restarting them as necessary. See FIGURE 3-1.
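The monitor-and-restart arrangement described above follows a common supervisor pattern. The following is a minimal sketch of that pattern only, not the ssd implementation: the StubProcess class, the restart_exited function, and the daemon names are hypothetical and exist purely for illustration.

```python
class StubProcess:
    """Stand-in for a child process; poll() mimics subprocess.Popen.poll()."""

    def __init__(self, exit_code=None):
        self.exit_code = exit_code      # None means "still running"

    def poll(self):
        return self.exit_code


def restart_exited(children, spawn):
    """Replace every exited child with a fresh one and report which were restarted.

    children: dict mapping a process name to an object with a poll() method.
    spawn:    callable that takes a name and returns a new running child.
    """
    restarted = []
    for name, child in list(children.items()):
        if child.poll() is not None:    # a non-None exit code means the child died
            children[name] = spawn(name)
            restarted.append(name)
    return restarted


if __name__ == "__main__":
    # Hypothetical daemon names, used only for illustration.
    children = {"daemon_a": StubProcess(), "daemon_b": StubProcess(exit_code=1)}
    print(restart_exited(children, lambda name: StubProcess()))
```

A real supervisor would call a function like this in a loop and pass a spawn callable that starts actual processes. The stub keeps the sketch runnable on any machine.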
The Sun Fire 15K platform, the SC, and other workstations communicate over Ethernet. You perform SMS operations by entering commands on the SC console after remotely logging in to the SC from another workstation on the local area network. You must log in as a user with the appropriate platform or domain privileges if you want to perform SMS operations (such as monitoring and controlling the platform).
Note - The domains require at least one SC to be powered on or they will hang.
Dual control boards are supported within the Sun Fire 15K platform. One board is designated as the primary or main control board, and the other is designated as the spare control board. If the main control board fails, the failover capability automatically switches to the spare control board as described in Chapter 8.
Most domain configurable units are active components, so you need to check the system state before powering off any DCU.
Note - Circuit breakers must be on whenever a board is present, including expander boards, whether or not the board is powered on.
For details, see Power Control .
Administration tasks on the Sun Fire 15K system are secured by group privilege requirements. Upon installation, SMS adds the following 39 UNIX groups to the /etc/group file.
platadmn - Platform administrator
platoper - Platform operator
platsvc - Platform service
dmn[A...R]admn - Domain [domain_id|domain_tag] administrator (18)
dmn[A...R]rcfg - Domain [domain_id|domain_tag] configurator (18)
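The total of 39 groups follows from three platform groups plus an administrator and a configurator group for each of the 18 domains. The sketch below enumerates the default names to make that arithmetic concrete. It assumes the domain letter is lowercase in the actual group name, consistent with the default dmnaadmn for domain A. The function name default_sms_groups is hypothetical.

```python
import string

PLATFORM_GROUPS = ["platadmn", "platoper", "platsvc"]
DOMAIN_IDS = string.ascii_lowercase[:18]    # 'a' through 'r', one per domain


def default_sms_groups():
    """Return the default SMS group names: 3 platform groups + 2 per domain."""
    groups = list(PLATFORM_GROUPS)
    groups += [f"dmn{d}admn" for d in DOMAIN_IDS]   # domain administrators
    groups += [f"dmn{d}rcfg" for d in DOMAIN_IDS]   # domain configurators
    return groups


if __name__ == "__main__":
    names = default_sms_groups()
    print(len(names))
    print(names[:5])
```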
smsconfig(1M) allows an administrator to add, remove, and list members of platform and domain groups, as well as set platform and domain directory privileges, using the -a, -r, and -l options.
smsconfig can also configure SMS to use alternate group names, including NIS-managed groups, using the -g option. Group information entries can come from any of the sources for groups specified in the /etc/nsswitch.conf file (refer to nsswitch.conf(4)). For instance, if domain A was known by its domain tag as the "Production Domain," an administrator could create a NIS group with the same name and configure SMS to use this group as the domain A administrator group instead of the default, dmnaadmn. For more information, refer to the System Management Services (SMS) 1.2 Installation Guide and Release Notes and the smsconfig man page, and see Administration Models.
You can interact with the SC and the domains on the Sun Fire 15K system by using SMS commands.
SMS provides a command line interface to the various functions and features it contains.
For the examples in this guide, the sc_name is sc0 and sms-user is the user-name of the administrator, operator, configurator or service personnel logged onto the system.
The privileges allotted to the user are determined by the platform or domain groups to which the user belongs. In these examples, the sms-user is assumed to have both platform and domain administrator privileges, unless otherwise noted.
For more information on the function and creation of SMS user groups, refer to the System Management Services (SMS) 1.2 Installation Guide and Release Notes and see Administration Models .
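Because privileges are derived from group membership, a user's roles can be read from the names of the groups to which the user belongs. The following sketch shows that mapping for the default group names only. It is illustrative and does not reflect how SMS itself checks privileges; the function name describe_privileges is hypothetical, and alternate group names configured through smsconfig -g are not handled.

```python
import re

PLATFORM_ROLES = {
    "platadmn": "platform administrator",
    "platoper": "platform operator",
    "platsvc": "platform service",
}
DOMAIN_ROLES = {"admn": "administrator", "rcfg": "configurator"}
# Default domain groups: dmn + domain letter (a-r) + role suffix.
DOMAIN_GROUP = re.compile(r"^dmn([a-r])(admn|rcfg)$")


def describe_privileges(group_names):
    """Map a user's group names to SMS roles, ignoring unrelated groups."""
    roles = []
    for name in sorted(group_names):
        if name in PLATFORM_ROLES:
            roles.append(PLATFORM_ROLES[name])
            continue
        match = DOMAIN_GROUP.match(name)
        if match:
            domain, suffix = match.groups()
            roles.append(f"domain {domain.upper()} {DOMAIN_ROLES[suffix]}")
    return roles


if __name__ == "__main__":
    print(describe_privileges({"platadmn", "dmnaadmn", "staff"}))
```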
Note - This procedure assumes that smsconfig -m has already been run. If smsconfig -m has not been run, you will receive the following error when SMS attempts to start and SMS will exit.
2. Log in to the SC and verify that SMS software startup has completed. Type:
3. Wait until showplatform finishes displaying platform status:
At this point you can begin using SMS programs.
An SMS console window provides a command line interface from the SC to the Solaris operating environment on the domain(s).
1. Log in to the SC, if you have not already done so.
Note - You must have domain privileges for the domain on which you wish to run console.
console creates a remote connection to the domain's virtual console driver, making the window in which the command is executed a "console window" for the specified domain (domain_id or domain_tag).
The following options are available:
-f
Opens a domain console window with "locked write" permission, terminates all other open sessions, and prevents new ones from being opened. This constitutes an "exclusive session." Use it only when you need exclusive use of the console (for example, for private debugging). To restore multiple-session mode, either release the lock (~^) or terminate the console session (~.).
-g
Opens a console window with "unlocked write" permission. If another session has "unlocked write" permission, the new console window takes it away. If another session has "locked" permission, this request is denied and a read-only session is started.
-l
Opens a console window with "locked write" permission. If another session has "unlocked write" permission, the new console window takes it away. If another session has "locked" permission, the request is denied and a read-only session is started.
-r
If console is invoked without any options when no other console windows are running for that domain, it comes up as an exclusive "locked write" session.
If console is invoked without any options when one or more non-exclusive console windows are running for that domain, it comes up in "read-only" mode.
Locked write permission is more secure. It can only be taken away if another console is opened using console -f or if ~* (tilde-asterisk) is entered from another running console window. In both cases, the new console session is an "exclusive session", and all other sessions are forcibly detached from the domain virtual console.
console can utilize either IOSRAM or the internal management (I1 MAN) network for domain console communication. You can manually toggle the communication path by using the ~= (tilde-equal sign) command. Doing so is useful if the network becomes inoperable, in which case the console sessions appear to be hung.
Many console sessions can be attached simultaneously to a domain, but only one console will have write permissions; all others will have read-only permissions. Write permissions are in either "locked" or "unlocked" mode.
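The write-permission rules for the -f, -g, and -l options, and for console invoked without options, can be modeled as a small state machine. The sketch below is a simplified model of the rules stated in this section, not the console implementation. The DomainConsole class is hypothetical, and it makes three assumptions the text does not spell out: a session whose write permission is taken away becomes read-only, a first session opened without options is treated as an ordinary locked-write session, and an exclusive session opened with -f refuses every later request except another -f.

```python
class DomainConsole:
    """Simplified model of write-permission arbitration for one domain."""

    def __init__(self):
        self.sessions = {}       # session id -> "locked", "unlocked", or "read-only"
        self.exclusive = False   # True once a -f session has been opened
        self._next_id = 1

    def _add(self, permission):
        session_id = self._next_id
        self._next_id += 1
        self.sessions[session_id] = permission
        return session_id

    def _holder(self, permission):
        """Return the id of the session holding this permission, if any."""
        for session_id, held in self.sessions.items():
            if held == permission:
                return session_id
        return None

    def open(self, option=None):
        """Open a session; option is "f", "g", "l", or None for no option."""
        if option == "f":
            self.sessions.clear()            # all other sessions are terminated
            self.exclusive = True
            return self._add("locked")
        if self.exclusive:
            return None                      # assumption: new sessions are refused
        if option in ("g", "l"):
            if self._holder("locked") is not None:
                return self._add("read-only")         # request denied
            unlocked = self._holder("unlocked")
            if unlocked is not None:
                self.sessions[unlocked] = "read-only"  # write permission taken away
            return self._add("unlocked" if option == "g" else "locked")
        if option is None:
            return self._add("read-only" if self.sessions else "locked")
        raise ValueError(f"unsupported option: {option!r}")


if __name__ == "__main__":
    domain = DomainConsole()
    first = domain.open()        # no other sessions: locked write
    second = domain.open("g")    # a locked session exists: read-only
    print(domain.sessions[first], domain.sessions[second])
```

Only one session holds write permission at any time in this model, which matches the rule that all other attached sessions are read-only.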
In a domain console window, a tilde ( ~ ) that appears as the first character of a line is interpreted as an escape signal that directs console to perform some special action, as follows:
rlogin also processes tilde-escape sequences whenever a tilde is seen at the beginning of a new line. If you need to send a tilde sequence at the beginning of a line and you are connected using rlogin , use two tildes (the first escapes the second for rlogin ). Alternatively, do not enter a tilde at the beginning of a line when running inside of an rlogin window.
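The line-start rule and the rlogin doubling rule can be illustrated with a short sketch. It covers only the four tilde sequences mentioned elsewhere in this chapter (~., ~^, ~*, and ~=), and the function names console_action and type_through_rlogin are hypothetical. It is a model of the behavior described above, not of how console or rlogin parse input.

```python
# Tilde sequences mentioned in this chapter and the action each one requests.
ESCAPES = {
    "~.": "terminate the console session",
    "~^": "release the write lock",
    "~*": "take an exclusive locked-write session",
    "~=": "toggle the communication path",
}


def console_action(line):
    """Return the action for a typed line, or None if it is ordinary input.

    A tilde is treated as an escape only when it is the first character.
    """
    if not line.startswith("~"):
        return None
    return ESCAPES.get(line[:2])


def type_through_rlogin(sequence):
    """Return what to type inside rlogin so the sequence reaches console.

    rlogin consumes the first tilde at the start of a line, so it is doubled.
    """
    return "~" + sequence if sequence.startswith("~") else sequence


if __name__ == "__main__":
    print(console_action("~="))
    print(console_action("ls ~."))
    print(type_through_rlogin("~."))
```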
If you use a kill -9 command to terminate a console session, the window or terminal in which the console command was executed goes into raw mode and appears hung. Type ^j, then stty sane, then ^j to escape this condition.
In the domain console window, vi(1) runs properly and the escape sequences (tilde commands) work as intended only if the environment variable TERM has the same setting as that of the console window.
If you need to resize the window, type:
For more information on domain console, see Domain Console and refer to the console man page.
If a system controller hangs and its console cannot be reached directly, SMS provides the smsconnectsc command to remotely connect to the hung SC. This command works from either the main or spare SC. For more information and examples, refer to the smsconnectsc man page.
Sun Management Center for the Sun Fire 15K is an extensible monitoring and management tool that provides a system administrator with the ability to manage the Sun Fire 15K system. Sun Management Center integrates standard SNMP-based management structures with new intelligent and autonomous agent and management technology based on the client/server paradigm.
Sun Management Center is used as the GUI and SNMP manager/agent infrastructure for the Sun Fire 15K system. The features and functions of Sun Management Center are not covered in this manual. For more information, refer to the Sun Management Center User's Guide.
Copyright © 2002, Sun Microsystems, Inc. All rights reserved.