SRDB ID | Synopsis | Date |
48153 | Sun Fire[TM] 12K/15K: fomd error 8542 | 30 Oct 2002 |
Status | Issued |
Description |
- Problem Statement: Propagation/retrieval of a file during system controller datasync fails with fomd error 8542.
- Symptoms: After a System Controller (SC) failover is activated and file propagation begins, the /var/opt/SUNWSMS/SMS1.2/adm/platform/messages file contains errors similar to:

  Apr 17 23:56:39 2002 xc46-sc0 fomd[390]: [8542 22273080859978 WARNING FOI2Net.cc 1592] Propagation/retrieval of "/var/opt/SUNWSMS/adm/B/post/post020307.1031.02.log" failed - "rcmd: socket: Cannot assign requested address"
  Apr 17 23:56:39 2002 xc46-sc0 fomd[390]: [8542 22273209520193 WARNING FOI2Net.cc 1592] Propagation/retrieval of "/var/opt/SUNWSMS/adm/B/post/post020306.1042.56.log" failed - "rcmd: socket: Cannot assign requested address"
  Apr 17 23:56:39 2002 xc46-sc0 fomd[390]: [8542 22273376277457 WARNING FOI2Net.cc 1592] Propagation/retrieval of "/var/opt/SUNWSMS/adm/B/post/post020307.1211.49.log" failed - "rcmd: socket: Cannot assign requested address"
  Apr 17 23:56:40 2002 xc46-sc0 fomd[390]: [8542 22274024132384 WARNING FOI2Net.cc 1592] Propagation/retrieval of "/var/opt/SUNWSMS/adm/B/post/post020307.1208.58.log" failed - "rcmd: socket: Cannot assign requested address"
  Apr 17 23:56:40 2002 xc46-sc0 fomd[390]: [8542 22274088989722 WARNING FOI2Net.cc 1592] Propagation/retrieval of "/var/opt/SUNWSMS/adm/B/post/post020307.1222.42.log" failed - "rcmd: socket: Cannot assign requested address"
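To gauge how many files were affected, the 8542 entries in the platform messages file can be tallied with a short script. The following is a minimal Python sketch and is not part of SMS: the function name `summarize_fomd_8542` and the regular expression are illustrative assumptions derived only from the sample lines shown in the Symptoms.

```python
import re

# Assumed pattern, derived from the sample fomd 8542 lines in this article.
FOMD_8542 = re.compile(
    r'fomd\[\d+\]: \[8542 \d+ (?P<level>\w+) \S+ \d+\] '
    r'Propagation/retrieval of "(?P<path>[^"]+)" failed - "(?P<reason>[^"]+)"'
)


def summarize_fomd_8542(lines):
    """Return (count, failed_paths, distinct_reasons) for fomd 8542 entries."""
    paths = []
    reasons = set()
    for line in lines:
        match = FOMD_8542.search(line)
        if match:
            paths.append(match.group("path"))
            reasons.add(match.group("reason"))
    return len(paths), paths, reasons


# Two of the sample lines from the Symptoms, used here as test input.
SAMPLE = [
    'Apr 17 23:56:39 2002 xc46-sc0 fomd[390]: [8542 22273080859978 WARNING '
    'FOI2Net.cc 1592] Propagation/retrieval of '
    '"/var/opt/SUNWSMS/adm/B/post/post020307.1031.02.log" failed - '
    '"rcmd: socket: Cannot assign requested address"',
    'Apr 17 23:56:40 2002 xc46-sc0 fomd[390]: [8542 22274088989722 WARNING '
    'FOI2Net.cc 1592] Propagation/retrieval of '
    '"/var/opt/SUNWSMS/adm/B/post/post020307.1222.42.log" failed - '
    '"rcmd: socket: Cannot assign requested address"',
]

if __name__ == "__main__":
    count, paths, reasons = summarize_fomd_8542(SAMPLE)
    print(count, "failed propagation attempts")
    for path in paths:
        print("  ", path)
    print("reasons:", sorted(reasons))
```

On a live SC the same function could be fed the opened messages file instead of `SAMPLE`. A file that appears in this list may still have been propagated on a later attempt.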
SOLUTION SUMMARY:
- Troubleshooting: See the messages in the /var/opt/SUNWSMS/SMS1.2/adm/platform/messages file.
- Resolution: These errors can be safely ignored. This problem is being addressed by SMS bug #4472333. File propagation still works correctly, but it takes longer (see Additional background information at the end of the article).
- Summary of part number and patch IDs: A patch will be available in the near future.
- References and bug IDs: Bug #4472333
- Additional background information:

  When fomd propagates files, rsh uses reserved TCP ports to communicate with the other host. Closing a connection leaves its port in the TIME_WAIT state for a short time. When a large number of files must be propagated, every connection leaves a port in TIME_WAIT as it closes, so most of the reserved ports end up in TIME_WAIT and the system runs out of reserved ports very quickly. bind() fails when the number of free reserved ports is less than 1/2 of the reserved ports. This results in the error message: "socket: Cannot assign requested address".

  At boot time, fomd tries to transfer a large number of files corresponding to all domains using rcp, and uses rsh to change the mode/permissions. This causes the problem at boot time on systems that have many files to be synced.

  A workaround was integrated (bug 4472333) that optimizes the use of rsh(1). It does not eliminate the errors, but file propagation completes more quickly than before. A real fix should completely eliminate the dependence on rsh(1). However, because the rcmd TCP timeout in Solaris 9 is smaller, the problem might go away on its own when/if the SC moves to Solaris[TM] 9.
- Meta-Data/Problem categorization:
  Product/Platform: SF12K/SF15K
  Category:
- Keywords: SMS daemon fomd
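The port-exhaustion mechanism described under Additional background information can be illustrated with a toy model. The sketch below is not fomd or rcmd code. It assumes a pool of 512 reserved ports (the range traditionally used by rresvport), one short-lived connection per file, and TIME_WAIT durations of 240 and 60 seconds chosen purely for illustration. The only rule taken from this article is that bind() fails once fewer than half of the reserved ports are free. The model counts first-attempt failures only and ignores retries.

```python
from collections import deque


def count_bind_failures(num_transfers, interval_ms, time_wait_ms,
                        pool_size=512, min_free_fraction=0.5):
    """Count transfers whose first bind() attempt would fail.

    Each successful transfer opens and immediately closes one connection,
    leaving its reserved port in TIME_WAIT for time_wait_ms milliseconds.
    pool_size and the timing values are assumptions for illustration.
    """
    release_times = deque()  # when each TIME_WAIT port becomes free again
    failures = 0
    for i in range(num_transfers):
        now = i * interval_ms
        # Reclaim ports whose TIME_WAIT period has expired.
        while release_times and release_times[0] <= now:
            release_times.popleft()
        free_ports = pool_size - len(release_times)
        if free_ports < pool_size * min_free_fraction:
            failures += 1  # modeled "Cannot assign requested address"
            continue
        release_times.append(now + time_wait_ms)
    return failures


if __name__ == "__main__":
    # 1000 files, one transfer every 100 ms, two assumed TIME_WAIT lengths.
    for time_wait_ms in (240_000, 60_000):
        failed = count_bind_failures(1000, 100, time_wait_ms)
        print(f"TIME_WAIT {time_wait_ms // 1000}s: {failed} failed first attempts")
```

In this model a shorter TIME_WAIT returns ports to the pool sooner, so fewer first attempts fail. That is consistent with the article's expectation that a smaller timeout might reduce the problem.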
INTERNAL SUMMARY:
SUBMITTER: Vasant Butala
BUG REPORT ID: 4472333
APPLIES TO: Hardware/Sun Fire /15000, Hardware/Sun Fire /12000
ATTACHMENTS: