InfoDoc ID   Synopsis   Date
19997   Procedure to change E10K control board IP addresses   22 Jul 1999

Status Issued

Description
Procedure to change E10K control board IP addresses

The following procedure was tested in the HAS Broomfield lab.


*********
*WARNING*
*********

	The following procedure has been demonstrated in the HAS lab in 
	Broomfield, CO to change the IP address of the primary control
	board.  At the time, no domains were fully up and running. Although
	it is believed that this procedure should pose no threat to running
	domains, it has not been thoroughly tested, wrung out, or completely
	analyzed.

	This procedure is currently U N S U P P O R T E D ! ! !
	Proceed at your own risk.

	Having said that, the wise (i.e. cautious) System Administrator would
	shut down all running domains for the duration of this procedure, and
	then reboot the domains after completing it.


1.  Back up your current /etc/hosts file

	cp /etc/hosts /etc/hosts.bak

2.  Make a working copy of your /etc/hosts file

	cp /etc/hosts /etc/hosts.new

3.  Edit the new hosts file, and change the control board and private SSP interface
    IP addresses to their new values.  Save your changes.

4.  Copy the new hosts file to /etc/hosts

	cp /etc/hosts.new /etc/hosts

5.  Kill the boot daemons, rpc.bootparamd and in.rarpd, so that the SSP will not
    respond to the control boards' broadcast requests for an IP address when we
    reset the control boards in step 7.

	ps -ef | egrep 'rarpd|bootpar'
	[ note the PIDs and then use 'kill' ]
	kill <in.rarpd pid>
	kill <bootparamd pid>

6.  Reconfigure the SSP for the new control boards:

	# /opt/SUNWssp/bin/ssp_config cb

	Answer 'No' to the first question then accept the defaults for everything
	except when the program asks for the IP address:  here you enter the new
	IP address for the cb.

	ssp_config will create new cb entries in the /tftpboot directory.

	[ See the bottom of this document for an example of what running ssp_config cb
	  might look like. ]

7.  Reset the control boards.  It is VERY IMPORTANT to do this now, before the 
    SSP private interfaces are changed.  [ If you FAIL to reset the control boards now,
    you WILL lose contact with them, and the only way to re-establish contact with them
    would be to re-plumb the private network interfaces to the original network IP 
    addresses! ]

	ssp# cb_reset -v name_of_cb0
	ssp# cb_reset -v name_of_cb1

    At this point, the cb's reboot and start asking for their IP addresses, but
    because we killed in.rarpd and rpc.bootparamd in step 5, no one will answer
    these requests.  (If somebody does answer, then we have serious problems.)

8.  At this point, we are ready to re-plumb the SSP private net interfaces.  We can
    do this manually, or we can simply reboot the SSP.  However, before the
    SSP is rebooted, you must update /etc/netmasks if your new private nets are not
    standard class C (255.255.255.0) subnets.  For example, if you are going to have
    cb0 at 10.4.5.1 and cb1 at 10.4.5.9 and they are supposed to be in different
    subnets, then you would need to subdivide the network so that no more than eight
    IPs are in a subnet.  Since 2^3 == 8, you need to reserve the least significant
    3 bits of the IP address for the host portion.  So the last octet of your netmask
    would be 11111000 in binary, which equals 248, and you would need to put the
    following entry in /etc/netmasks:

	    10.4.5.0 	255.255.255.248

    Now you are ready to either reboot the SSP or plumb the interfaces by hand:

    a) reboot SSP

    -or-

    NOTE: for the following commands, I am going to use 'qfe0' as the primary
          network interface to the first control board (cb0), and 'qfe1' as the
          secondary interface to the second control board (cb1).

    b) ifconfig qfe0 down
    c) ifconfig qfe1 down
    d) ifconfig qfe0 up inet <new qfe0 ip address> netmask <new qfe0 netmask> -trailers up
    e) ifconfig qfe1 up inet <new qfe1 ip address> netmask <new qfe1 netmask> -trailers up

    		where netmask would be 255.255.255.248 (or 0xfffffff8) in the 
		example above

9.  If you rebooted the SSP above, this step is unnecessary.  However, if you plumbed
    the interfaces by hand... you will need to manually restart in.rarpd and bootparamd:

		/usr/sbin/in.rarpd -a
		/usr/sbin/rpc.bootparamd

    NOTE:  it can take a seemingly long time before the control boards get their new IP
	   addresses (during which the fans on the E10K platform will be on HIGH).  You
	   can tell that the control boards (or the primary control board) have their new
	   IP addresses, have rebooted, and are up and running, by the fact that the E10K
	   fans will have returned to the lower speed (and are hence quieter).

	   I saw wait times of 5-10 minutes before the control board cycled through all of
	   its network request modes.

	   You can also snoop the private network interface on the SSP to watch for the
	   time when the primary control board gets its new IP address:

	   	ssp# snoop -d qfe0	(if qfe0 is your private net interface to cb0)
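
Step 5 asks you to note the boot daemon PIDs by hand.  As a minimal sketch
(not part of the tested procedure), the PIDs can be picked out of the
'ps -ef' output with awk instead.  The function name boot_daemon_pids is
hypothetical, and the sketch assumes the PID is the second column of the
'ps -ef' output, as it is on Solaris:

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the PIDs
# of in.rarpd and rpc.bootparamd, one per line.
# Assumes column 2 of `ps -ef` is the PID; any line belonging to the awk
# filter itself is skipped.
boot_daemon_pids() {
    awk '/in\.rarpd|rpc\.bootparamd/ && !/awk/ { print $2 }'
}
```

On the SSP, as root, the output could be handed to kill, for example
kill `ps -ef | boot_daemon_pids`.  Look at the PIDs first; if neither daemon
is running, kill is called with no arguments and only prints a usage error.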




Example 'ssp_config cb' session:

    # /opt/SUNWssp/bin/ssp_config cb
    Configuring control boards

    Platform name   = mc10k
    Control Board 0 = mc10k-cb0 => 
    Control Board 1 = mc10k-cb1 => 
    Primary Control Board = mc10k-cb0

    Is this correct? (y/n): n
    Do you have a control board 0? (y/n): y
    Please enter the host name of the control board 0 [mc10k-cb0]: 
    I could not automatically determine the IP address of mc10k-cb0.
    Please enter the IP address of mc10k-cb0: 10.0.1.1			   <---- change IP
    You should make sure that this host/IP address is set up properly in
    the /etc/inet/hosts file or in your local name service system.
    Do you have a control board 1? (y/n): y
    Please enter the host name of the control board 1 [mc10k-cb1]: 
    I could not automatically determine the IP address of mc10k-cb1.
    Please enter the IP address of mc10k-cb1: 10.0.1.9			   <---- change IP
    You should make sure that this host/IP address is set up properly in
    the /etc/inet/hosts file or in your local name service system.

    Please identify the primary control board.
    Is Control Board 0 [mc10k-cb0] the primary? (y/n) y

    Platform name   = mc10k
    Control Board 0 = mc10k-cb0 => 10.0.1.1
    Control Board 1 = mc10k-cb1 => 10.0.1.9
    Primary Control Board = mc10k-cb0

    Is this correct? (y/n): y

    # ls -lt /tftpboot
    total 9286
    -r--r--r--   5 root     root      858682 Jul 18 07:40 0A000101	    <---- new files
    -r--r--r--   5 root     root          10 Jul 18 07:40 0A000101.cb_port  <---/
    -r--r--r--   5 root     root      858682 Jul 18 07:40 0A000109          <--/
    -r--r--r--   5 root     root          10 Jul 18 07:40 0A000109.cb_port  <-/
    -r--r--r--   5 root     root      858682 Jul 18 07:40 C0A8010A         <====== old files
    -r--r--r--   5 root     root          10 Jul 18 07:40 C0A8010A.cb_port <======/
    -r--r--r--   5 root     root      858682 Jul 18 07:40 C0A8020A         <=====/
    -r--r--r--   5 root     root          10 Jul 18 07:40 C0A8020A.cb_port <====/
    -r--r--r--   5 root     root          10 Jul 18 07:40 cb_port
    -r--r--r--   5 root     root      858682 Jul 18 07:40 cbe.ima
    -r--r--r--   1 root     root      201204 Jul 18 07:40 flash_boot.ima
    lrwxrwxrwx   1 root     other         29 Jul 18 07:31 AC140AC5 -> inetboot.SUN4U1.Solaris_2.6-1
    -rw-r--r--   1 root     other        328 Jul 18 07:31 rm.172.20.10.197
    lrwxrwxrwx   1 root     other         29 Jul 18 07:31 AC140AC5.SUN4U1 -> inetboot.SUN4U1.Solaris_2.6-1
    lrwxrwxrwx   1 root     other         29 Jul 15 17:08 AC140AC4 -> inetboot.SUN4U1.Solaris_2.6-1
    lrwxrwxrwx   1 root     other         29 Jul 15 17:08 AC140AC4.SUN4U1 -> inetboot.SUN4U1.Solaris_2.6-1
    -rwxr-xr-x   1 root     other     175808 Jul 15 17:08 inetboot.SUN4U1.Solaris_2.6-1
    -rw-r-----   1 root     other        328 Jul 15 17:08 rm.172.20.10.196
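
The new file names in the listing above appear to be the control board IP
addresses written as eight hexadecimal digits (10.0.1.1 becomes 0A000101).
A small sketch, assuming that naming rule and using a hypothetical helper
named ip_to_hex, shows how to predict the names ssp_config should create
for a given address:

```shell
# Hypothetical helper: print a dotted-quad IPv4 address as eight
# uppercase hex digits, the form used for the file names above.
ip_to_hex() {
    # Turn the dots into spaces so each octet becomes one printf argument.
    printf '%02X%02X%02X%02X\n' $(echo "$1" | tr '.' ' ')
}

ip_to_hex 10.0.1.1    # prints 0A000101 (cb0 in the example session)
ip_to_hex 10.0.1.9    # prints 0A000109 (cb1 in the example session)
```

The same rule gives C0A8010A for 192.168.1.10 and C0A8020A for 192.168.2.10,
which suggests those were the addresses behind the old files in the listing.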




INTERNAL SUMMARY:

Customer called back and said the ticket can be closed, that the procedure went fine,
with one problem.  On Solaris 2.5.1 (which is what *most* SSPs run), there apparently is a
problem with netmasks other than the typical 0xffffff00 mask.  The customer was using a mask
of 0xfffffff8, and was using IP addresses of:

	10.0.1.1	cb0
	10.0.1.9	cb1

which must be in separate subnets since cb0 and cb1 have to be on separate private networks.
Apparently with Solaris 2.5.1, netmasks are not correctly pulled out of the /etc/netmasks
file.  The customer got around this problem either by creating a custom /etc/rcS.d script
or by modifying an existing /etc/rcS.d script.  So, FYI, when using this procedure
to change control board IP addresses, beware of non-standard subnet masks on Solaris 2.5.1.

The customer got around this subnet mask problem on 2.5.1 with some custom scripts, and stated
that the procedure worked well otherwise, that he is up and running, that the CB IPs have been
successfully changed, and that the ticket is good to close.
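
The requirement that cb0 and cb1 fall in separate subnets can be checked with
a little shell arithmetic before touching the SSP.  The sketch below assumes a
shell with $(( )) arithmetic (for example bash on any workstation, not
necessarily the SSP's own /bin/sh); the helper name network_of is hypothetical:

```shell
# Hypothetical helper: print the network number of a dotted-quad
# address ($1) under a dotted-quad netmask ($2).
network_of() {
    # After the split, $1-$4 are the address octets and $5-$8 the mask octets.
    set -- $(echo "$1.$2" | tr '.' ' ')
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# Three host bits give a last mask octet of 256 - 2^3 = 248 (0xf8).
mask="255.255.255.$(( 256 - (1 << 3) ))"

net0=$(network_of 10.0.1.1 "$mask")   # cb0
net1=$(network_of 10.0.1.9 "$mask")   # cb1

if [ "$net0" != "$net1" ]; then
    echo "separate subnets: $net0 and $net1"
else
    echo "same subnet: $net0"
fi
```

With the addresses from the summary, the two networks come out as 10.0.1.0
and 10.0.1.8.  Under the default 0xffffff00 mask, both boards would land in
10.0.1.0, which is why the mask has to be applied correctly at boot.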

SUBMITTER: Stephen Camp
APPLIES TO: Hardware/Ultra Workstations
ATTACHMENTS:


Copyright (c) 1997-2003 Sun Microsystems, Inc.