Why do my programs occasionally die with ``Signal 11'' errors?

Signal 11 errors are caused when your process has attempted to access memory which the operating system has not granted it access to. If something like this is happening at seemingly random intervals then you need to start investigating things very carefully.

These problems can usually be attributed to either:

  1. If the problem is occurring only in a specific application that you are developing yourself it is probably a bug in your code.

  2. If it is a problem with part of the base FreeBSD system, it may also be buggy code, but more often than not these problems are found and fixed long before us general FAQ readers get to use these bits of code (that is what -current is for).

In particular, a dead giveaway that this is not a FreeBSD bug is if you see the problem when you are compiling a program, but the activity that the compiler is carrying out changes each time.

For example, suppose you are running ``make buildworld'', and the compile fails while trying to compile ls.c into ls.o. If you then run ``make buildworld'' again, and the compile fails in the same place then this is a broken build -- try updating your sources and try again. If the compile fails elsewhere then this is almost certainly hardware.

What you should do:

In the first case you can use a debugger e.g. gdb to find the point in the program which is attempting to access a bogus address and then fix it.

In the second case you need to verify that it is not your hardware at fault.

Common causes of this include:

  1. Your hard disks might be overheating: Check the fans in your case are still working, as your disk (and perhaps other hardware might be overheating).

  2. The processor running is overheating: This might be because the processor has been overclocked, or the fan on the processor might have died. In either case you need to ensure that you have hardware running at what it is specified to run at, at least while trying to solve this problem. i.e. Clock it back to the default settings.

    If you are overclocking then note that it is far cheaper to have a slow system than a fried system that needs replacing! Also the wider community is not often sympathetic to problems on overclocked systems, whether you believe it is safe or not.

  3. Dodgy memory: If you have multiple memory SIMMS/DIMMS installed then pull them all out and try running the machine with each SIMM or DIMM individually and narrow the problem down to either the problematic DIMM/SIMM or perhaps even a combination.

  4. Over-optimistic Motherboard settings: In your BIOS settings, and some motherboard jumpers you have options to set various timings, mostly the defaults will be sufficient, but sometimes, setting the wait states on RAM too low, or setting the ``RAM Speed: Turbo'' option, or similar in the BIOS will cause strange behavior. A possible idea is to set to BIOS defaults, but it might be worth noting down your settings first!

  5. Unclean or insufficient power to the motherboard. If you have any unused I/O boards, hard disks, or CDROMs in your system, try temporarily removing them or disconnecting the power cable from them, to see if your power supply can manage a smaller load. Or try another power supply, preferably one with a little more power (for instance, if your current power supply is rated at 250 Watts try one rated at 300 Watts).

You should also read the SIG11 FAQ (listed below) which has excellent explanations of all these problems, albeit from a Linux viewpoint. It also discusses how memory testing software or hardware can still pass faulty memory.

Finally, if none of this has helped it is possible that you have just found a bug in FreeBSD, and you should follow the instructions to send a problem report.

There is an extensive FAQ on this at the SIG11 problem FAQ



UNIXguide.net
Suggest a Site