SRDB ID | Synopsis | Date |
47299 | Sun Fire[TM] 12K/15K: fomd Propagation/retrieval errors; failed - unable to create transfer file | 25 Sep 2002 |
Status | Issued |
Description |
Problem 1:
The Starcat System Controller logs the following error message:
Mar 15 18:24:09 2002 vscpsc01 fomd[427]: [8542 181746764574373 WARNING FOI2Net.cc 1521] Propagation/retrieval of "/var/opt/SUNWSMS/adm/A/dump/dump.Reset.0313.1636.06" failed - unable to create transfer file
Problem 2:
The System Controller logs the following error message:
Mar 23 23:08:00 2002 f15k1sc0-hme0 fomd[114]: [3102 282891203886465 ERR Permissions.cc 493] Invalid Group Info
Mar 23 23:08:00 2002 f15k1sc0-hme0 fomd[114]: [50073 282891204730121 ERR FOMDExec.cc 416] getpwuid(3C) failed: ecode=0
Mar 23 23:08:00 2002 f15k1sc0-hme0 fomd[114]: [8595 282891205453071 ERR FOMDExec.cc 156] Call to FOMDExec::validateUID()" is invalid - check the platform message log
Mar 23 23:08:00 2002 f15k1sc0-hme0 fomd[114]: [8542 282891206205539 WARNING FOI2Net.cc 1525] Propagation/retrieval of "/var/opt/SUNWSMS/adm/B/dump/dump.Reset.0320.1449.09" failed - unable to create transfer file
Mar 23 23:08:00 2002 f15k1sc0-hme0 fomd[114]: [3102 282891207926446 ERR Permissions.cc 493] Invalid Group Info
Problem 3:
The System Controller logs the following error message:
May 3 14:46:41 2002 xc46-sc1 fomd[388]: [8542 24566426433605 WARNING FOI2Net.cc 1525] Propagation/retrieval of "/var/opt/SUNWSMS/data/.failover/chkpt/6.1.1.0" failed - unable to create transfer file
May 3 14:46:47 2002 xc46-sc1 fomd[388]: [8542 24572138401725 WARNING FOI2Net.cc 1525] Propagation/retrieval of "/var/opt/SUNWSMS/data/.failover/chkpt/6.1.1.0" failed - unable to create transfer file
SOLUTION SUMMARY:
Explanation for Problem 1:
This message indicates that the dump files cannot be properly written out to the correct dump directory on the system controller.
Action:
Make sure the /etc/passwd file has a valid UID for dsmd. The /etc/passwd entries for all SMS daemons should be as follows:
sms-codd:x:10:2:SMS Capacity On Demand Daemon::
sms-dca:x:11:2:SMS Domain Configuration Agent::
sms-dsmd:x:12:2:SMS Domain Status Monitoring Daemon::
sms-dxs:x:13:2:SMS Domain Server::
sms-efe:x:14:2:SMS Event Front-End Daemon::
sms-esmd:x:15:2:SMS Environ. Status Monitoring Daemon::
sms-fomd:x:16:2:SMS Failover Management Daemon::
sms-frad:x:17:2:SMS FRU Access Daemon::
sms-osd:x:18:2:SMS OBP Service Daemon::
sms-pcd:x:19:2:SMS Platform Config. Database Daemon::
sms-tmd:x:20:2:SMS Task Management Daemon::
sms-svc:x:6:10:SMS Service User:/export/home/sms-svc:/bin/csh
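The comparison against the entries above can be scripted. The following is a minimal sketch, not a supported tool: the function name check_sms_accounts is hypothetical, and it assumes the UID/GID values listed in this article are the correct ones for your SMS release. It only reads text in /etc/passwd format and reports differences; it changes nothing.

```python
# Expected (uid, gid) per SMS account, copied from the list in this article.
EXPECTED = {
    "sms-codd": (10, 2),
    "sms-dca": (11, 2),
    "sms-dsmd": (12, 2),
    "sms-dxs": (13, 2),
    "sms-efe": (14, 2),
    "sms-esmd": (15, 2),
    "sms-fomd": (16, 2),
    "sms-frad": (17, 2),
    "sms-osd": (18, 2),
    "sms-pcd": (19, 2),
    "sms-tmd": (20, 2),
    "sms-svc": (6, 10),
}


def check_sms_accounts(passwd_text):
    """Return a list of problems; an empty list means every entry matches."""
    found = {}
    for line in passwd_text.splitlines():
        fields = line.split(":")
        # passwd format: name:password:uid:gid:gecos:home:shell
        if len(fields) < 4 or fields[0] not in EXPECTED:
            continue
        try:
            found[fields[0]] = (int(fields[2]), int(fields[3]))
        except ValueError:
            found[fields[0]] = None  # uid or gid field is not a number

    problems = []
    for name, expected in EXPECTED.items():
        if name not in found:
            problems.append(f"{name}: missing")
        elif found[name] != expected:
            actual = found[name]
            shown = "unparseable" if actual is None else f"{actual[0]}:{actual[1]}"
            problems.append(
                f"{name}: expected uid:gid {expected[0]}:{expected[1]}, found {shown}"
            )
    return problems
```

To use it, pass in the contents of /etc/passwd (or of a copy taken from the system controller) and print each returned line. An account reported as missing or mismatched is a candidate cause for the Problem 1 message.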
Explanation for Problem 2:
In this situation, Explorer version 3.5.2 running on the system controller appears to have caused the error. Explorer creates the dump.Reset.xxx files as user 100, a UID that no longer has a matching account once Explorer exits. You will also see data sync problems.
From David Lafko, CPRE engineer:
What you have observed is a side-effect of how explorer collects its data. Explorer 3.5.2 and above creates a temporary user, assigns the necessary SMS permissions through smsconfig and collects output from commands like 'showboards'. When explorer exits, it deletes the temporary account it uses to collect data. That is why you see files with an unknown uid.
'showxirstate', which explorer runs, creates the dump.Reset files you mention. This is expected behavior. You can safely delete the dump.Reset files with unknown uid. This will stop the file propagation failure messages.
Action:
Use Explorer version 3.6.2 for collecting SC explorer data. Version 3.6.2 contains all file propagation fixes of this type. As stated above, the files owned by the unknown UID can be safely deleted.
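Before deleting anything, it can help to list which dump.Reset files are owned by a UID with no passwd entry. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the root directory to scan (for example /var/opt/SUNWSMS/adm, which contains the A/dump and B/dump paths seen in the log messages) is supplied by the caller, and a UID counts as "unknown" when the standard getpwuid lookup fails. The sketch only reports files; it does not remove them.

```python
import os
import pwd


def uid_is_known(uid):
    """True if the UID resolves to an account in the passwd database."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False


def find_orphaned_dumps(root, is_known=uid_is_known):
    """Return sorted (path, uid) pairs for dump.Reset.* files with an unknown owner."""
    orphans = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.startswith("dump.Reset."):
                continue
            path = os.path.join(dirpath, name)
            uid = os.lstat(path).st_uid  # owner UID, without following symlinks
            if not is_known(uid):
                orphans.append((path, uid))
    return sorted(orphans)
```

Review the returned paths by hand and delete only the files you have confirmed were left behind by Explorer. The is_known parameter exists so the lookup can be replaced when checking a copy of the directory tree on another host.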
Explanation for Problem 3:
These errors can be safely ignored. See Article #
Keywords: fomd, propagation, retrieval, transfer, unable, create
INTERNAL SUMMARY:
SUBMITTER: Joshua Freeman
APPLIES TO: AFO Vertical Team Docs/HAS, Hardware/Sun Fire /15000
ATTACHMENTS: