SRDB ID: 17724
Synopsis: CC: Fatal error in ld: Bus Error (core dumped)
Date: 25 Sep 1998
Intermittently (10-50% of the time) when compiling or running make,
the linker will core dump with a bus error.
This usually occurs on very large Enterprise systems (E3000, E4000 or larger).
For example:
+ Code builds on other 2.6 systems with 2 and 3 CPUs, but the link fails
5% of the time on this 4-CPU E5000. They have compared patch lists and say
they are the same. The E5000 has 2 GB of swap and 1 GB of RAM.
+ Problem has also shown up as part of an Oracle install.
We have two E4500 systems that are identical except one has an A3000
disk array and the other has an A7000 disk array. The local storage is
identical, and that is where everything except the oracle tablespace
files are kept. The root and var filesystems are encapsulated UFS,
the primary swap is encapsulated raw, and all other filesystems are VxFS.
Both systems exhibit the same intermittent linking problems,
primarily with the modules: wrap, libclntsh.so and oracle.
The linking of the rdbms module is now failing on an ld command,
reporting Bus Error.
# file core
core: ELF 32-bit MSB core file SPARC Version 1, from 'ld'
# adb core
core file = core -- program ``ld'' on platform SUNW,Ultra-Enterprise
SIGBUS: Bus Error
From the Oracle install log: Bus Error - core dumped
*** Error code 138
+ Using WorkShop 4.2 and the C++ compiler, the linker is core dumping.
The customer is running Solaris 2.6 5/98 on an E10000.
(dbx) where
=>[1] _memcpy(0x0, 0x0, 0x0, 0x0, 0x0, 0x0) at 0xef6e0814
[2] _memcpy(0xd78e0000, 0xe8c77e78, 0x10, 0x580, 0x38, 0xd78d9030),
at 0xef6e0750
[3] xlate(0xefffe1ac, 0x2, 0xef60316c, 0x7618, 0x2, 0x7618), at 0xef60305c
[4] wrt32(0x2, 0x0, 0x1, 0xd783bc48, 0xd91e945c, 0xd91ebf84), at 0xef6120d8
[5] _elf32_update(0x1a8f778, 0x8, 0xd91e8d04, 0xef629b38, 0x1, 0xd911d104),
at 0xef6126b4
[6] create_outfile(0xef6d1328, 0xbfffffff, 0xef635b44, 0x0, 0xefb35b24,
0xd90fe85c), at 0xef6a92ec
[7] ld_main(0x20000000, 0x21x08, 0xef6bf4f2, 0xefd0250, 0xef6d1328, 0x0),
at 0xef6b85cc
[8] main(0x56, 0xefffe3ec, 0xef6b811c, 0x118e5, 0xef7c14b0, 0x0) at 0x110c4
Truss of a failed run shows:
waitid(P_PGID, 5306, 0xEFFFE0B0, WEXITED|WTRAPPED) (sleeping...)
5316: Incurred fault #5, FLTACCESS %pc = 0xEF724524
5316: siginfo: SIGBUS BUS_OBJERR addr=0xEE012000 errno=61441
5316: Received signal #10, SIGBUS [default]
5316: siginfo: SIGBUS BUS_OBJERR addr=0xEE012000 errno=61441
5316: *** process killed ***
5314: waitid(P_PGID, 5306, 0xEFFFE0B0, WEXITED|WTRAPPED) = 0
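The "Error code 138" in the Oracle install log is consistent with the SIGBUS
shown in the truss output (signal #10), assuming make is reporting the usual
shell convention of 128 plus the signal number for a command killed by a
signal. A minimal Python sketch of that arithmetic follows; the function name
decode_exit_status is illustrative only and is not part of any Sun tool.

```python
def decode_exit_status(status):
    """Split a shell-style exit status into (kind, value).

    Assumption: statuses above 128 mean "killed by signal (status - 128)",
    which is the common shell convention.
    """
    if status > 128:
        return ("signal", status - 128)
    return ("exit", status)


# 138 - 128 = 10, the signal number truss reports for SIGBUS above.
print(decode_exit_status(138))  # ('signal', 10)
```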
SOLUTION SUMMARY:
This is a bug in the Veritas VxFS software, not in the linker.
It does not occur on a UFS file system.
Download point release 3.2.4 from this web site:
http://sunsolve2.sun.com/beta/vxfs
The bug is described in bugs 4137397 (4164910 is a duplicate) and 4103710.
The key piece of information here is this line of the truss output:
5316: siginfo: SIGBUS BUS_OBJERR addr=0xEE012000 errno=61441
The Engineer's description from bug 4103710:
The key is the 'SIGBUS BUS_OBJERR': this signal is only returned when
a page fault occurs as we're mapping in the backup storage from the
underlying filesystem. The 'errno=61441' is the error code that
the underlying filesystem is passing up. The errno is the 'error'
returned by VOP_ADDMAP(), which is provided by the underlying filesystem.
The underlying file system in all cases turned out to be VxFS 3.2.1.1.
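When checking truss output from other failing links for the same signature,
something like the following Python sketch may help. It only parses the
siginfo line format shown above; the names parse_siginfo and
is_filesystem_fault are hypothetical helpers, and the regular expression is
an assumption based on that single sample line, not a general truss parser.

```python
import re

# Pattern modelled on the one sample line in this report (assumption).
SIGINFO_RE = re.compile(
    r"siginfo:\s+(?P<signal>SIG\w+)\s+(?P<code>\w+)"
    r"\s+addr=(?P<addr>0x[0-9A-Fa-f]+)\s+errno=(?P<errno>\d+)"
)


def parse_siginfo(line):
    """Return the fields of a truss siginfo line, or None if it does not match."""
    m = SIGINFO_RE.search(line)
    if m is None:
        return None
    return {
        "signal": m.group("signal"),
        "code": m.group("code"),
        "addr": int(m.group("addr"), 16),
        "errno": int(m.group("errno")),
    }


def is_filesystem_fault(info):
    """True for the SIGBUS/BUS_OBJERR signature described in bug 4103710."""
    return (
        info is not None
        and info["signal"] == "SIGBUS"
        and info["code"] == "BUS_OBJERR"
    )


line = "5316: siginfo: SIGBUS BUS_OBJERR addr=0xEE012000 errno=61441"
info = parse_siginfo(line)
# The errno is printed in hex as well, since 61441 is 0xf001.
print(is_filesystem_fault(info), hex(info["errno"]))  # True 0xf001
```

A match only indicates that the error came up from the underlying file
system; the file system type still has to be confirmed separately.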
INTERNAL SUMMARY:
The point release is in a poorly named directory.
It is not beta software but a valid release from Veritas.
FIN I0401 describes why the patch process regarding
Veritas software is different.
Sun is a reseller of the Veritas File System product.
It is not a Sun-branded product, so Sun does nothing with the code.
The product kit (CD and docs) comes directly from Veritas; Sun then
re-packages the product and stocks the distribution center
with the re-packaged product.
Veritas does not develop patches; it creates point releases (each point
release can be viewed as a total package). Since Sun does not have access
to the code, we cannot put it into a patch format, so Sun has followed
Veritas' format and releases "point releases".
SUBMITTER: Richard Barker
APPLIES TO: Hardware, Operating Systems/Solaris/Solaris 2.x
ATTACHMENTS:
Copyright (c) 1997-2003 Sun Microsystems, Inc.