IBM Software Admins -- Customizing Your Linux Core Dumps
Bill Malchisky May 1 2012 09:00:00 PM
Applications crash. It is a fact of life in administration, and we are the ones who clean up the mess when they do. In Linux, memory protection faults, CPU violations, and other software bugs typically leave behind a core file when they occur outside of Domino--which uses its own type of crash files.
In UNIX or Linux, you may see the occasional core file lying around the / filesystem, or throughout your home directory if you were running the program that crashed. But what if you do not want them to appear there? Perhaps you are tight for space on /, and the thought of a 500MB core file appearing at 2am and causing your monitoring tool to trigger a low disk space warning while you are in REM sleep is, well, unpalatable. Here are two points to help you when in Linux.
1) Enabling core files to use the process ID when created
First you want the core file name to include the PID of the crashed process. Otherwise you get a single file entitled core, which is overwritten by each successive core dump. Unless you relocate that file with an automated process to minimize the chance of an overwrite, a single file is of little use for tracking crashes. If you enable the feature with kernels 2.5+, then you have many options.
Let's start with the first step: #vi /etc/sysctl.conf
Note: make a backup of the file first, even though you are making only a one-character change
Search for lines containing core: /core
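kernel.core_uses_pid is by default set to "0"; change it to "1", sans the quotes. The new value takes effect at the next boot, or immediately if you reload the file with #sysctl -p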
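If you would rather script the edit than open vi, the following is a minimal sketch of the same change. The helper name set_sysctl_key is made up for this example, and the demo runs against a scratch file rather than the real /etc/sysctl.conf, so treat it as an illustration rather than a finished tool.

```shell
#!/bin/sh
# set_sysctl_key FILE KEY VALUE
# Hypothetical helper: back up FILE, then set KEY to VALUE,
# rewriting the existing line or appending a new one.
set_sysctl_key() {
    file=$1
    key=$2
    value=$3
    cp -p "$file" "$file.bak"    # the backup the article recommends
    if grep -q "^[[:space:]]*$key[[:space:]]*=" "$file"; then
        # Key already present: rewrite that line from the backup copy
        sed "s|^[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$file.bak" > "$file"
    else
        # Key absent: append it
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Demo on a scratch file that stands in for /etc/sysctl.conf
demo=$(mktemp)
printf 'kernel.sysrq = 0\nkernel.core_uses_pid = 0\n' > "$demo"
set_sysctl_key "$demo" kernel.core_uses_pid 1
cat "$demo"
```

On a real system you would run the helper as root against /etc/sysctl.conf and then load the result with sysctl -p.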
You can also allow setuid programs to dump core, which the kernel does not do by default when the crashing program runs setuid. If you want to do this, append the following string to the sysctl.conf file
kernel.suid_dumpable = 2
Note: the value of "2" makes the dumped core readable by root only, which locks down the file so that regular users are unable to view it. The benefit is that any sensitive information about the program or system which might have been in memory is seen only by the administration account, which helps keep the box safe. If access to the box is already restricted, you can use "1" instead, which is debug mode and dumps all processes when possible, without that protection. On newer kernels this key is named fs.suid_dumpable rather than kernel.suid_dumpable, so check which one your system lists under #sysctl -a
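To check what a box is currently set to, a small sketch like the one below can translate the numeric value into words. The function name is invented for this example, and the two /proc paths are assumptions about where a given kernel exposes the setting; the loop simply skips any path that does not exist.

```shell
#!/bin/sh
# describe_suid_dumpable VALUE
# Hypothetical helper: print the meaning of a suid_dumpable value.
describe_suid_dumpable() {
    case "$1" in
        0) echo "default: setuid programs do not dump core" ;;
        1) echo "debug: all processes dump core when possible" ;;
        2) echo "suidsafe: core is dumped readable by root only" ;;
        *) echo "unknown value: $1" ;;
    esac
}

# Report the live setting from whichever location this kernel provides
for f in /proc/sys/fs/suid_dumpable /proc/sys/kernel/suid_dumpable; do
    if [ -r "$f" ]; then
        echo "$f: $(describe_suid_dumpable "$(cat "$f")")"
        break
    fi
done
```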
2) Setting the default directory and filename for core files
#vi /proc/sys/kernel/core_pattern
This is a one-line file. Set the directory where you want the core files to go, and how you want the file name to be constructed. If vi will not write to /proc on your system, echo the pattern into the file as root instead. Files under /proc are rebuilt at every boot, so to make the setting permanent also add a kernel.core_pattern line to /etc/sysctl.conf. If you are moving the files off of the root filesystem (/), avoid placing them in a directory that still belongs to the / filesystem; otherwise your disk consumption there remains unchanged, and you could fill it without knowing. If you have ample space on /var or /tmp and they are mounted on a different file system than /, you can create a directory under one of them and keep your core dumps in one place.
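Before settling on a directory, it is worth confirming that it really sits on a different filesystem than /. The sketch below compares device numbers with GNU stat; the function name same_fs and the choice of /tmp as the candidate are assumptions made for this example.

```shell
#!/bin/sh
# same_fs A B
# Hypothetical helper: succeed when both paths are on the same
# filesystem, judged by their device number (GNU stat -c %d).
same_fs() {
    [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

dir=/tmp    # candidate parent directory for the core files

if same_fs / "$dir"; then
    echo "$dir shares a filesystem with /, so cores there still consume / space"
else
    echo "$dir is on its own filesystem"
fi

# Show how much room the candidate actually has
df -hP "$dir"
```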
A pattern such as this one sends every core file to /tmp/corefiles:
/tmp/corefiles/core-%e-%p-%t
Here %e is the name of the executable, %p is its process ID (PID), and %t is the time of the dump in seconds since 1 January 1970. If the process 'foo' crashed earlier today while it had the PID 15479, the listing would look something like this:
$ls /tmp/corefiles
core-foo-15479-1335899059
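The epoch timestamp is not very readable at a glance. As a rough sketch, the function below splits a file name built from the core-%e-%p-%t pattern back into its parts and converts the time with GNU date (the -d @seconds form is a GNU extension, so this is an assumption about your date command). The function name is invented for the example.

```shell
#!/bin/sh
# parse_core_name PATH
# Hypothetical helper for names of the form core-<exe>-<pid>-<epoch>.
# Fields are peeled off from the right so that an executable name
# containing hyphens is still handled.
parse_core_name() {
    name=${1##*/}        # drop any leading directory
    epoch=${name##*-}    # text after the last hyphen (%t)
    rest=${name%-*}
    pid=${rest##*-}      # next field from the right (%p)
    exe=${rest%-*}
    exe=${exe#core-}     # what remains after the prefix (%e)
    when=$(date -u -d "@$epoch" '+%Y-%m-%d %H:%M:%S UTC')
    printf 'program=%s pid=%s crashed=%s\n' "$exe" "$pid" "$when"
}

parse_core_name /tmp/corefiles/core-foo-15479-1335899059
# prints: program=foo pid=15479 crashed=2012-05-01 19:04:19 UTC
```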
If you want to prevent certain non-root users from writing core dumps, adjust their write access to the corefiles directory. A core file is written with the permissions of the crashing process, so accounts that cannot write to the directory produce no cores. You can accomplish this either by editing the group membership list or by removing the world permissions on the directory, shown below. Use ls -ld so that you see the directory itself rather than its contents.
$ls -ld /tmp/corefiles
drwxrwxrwx 2 root root 4096 Apr 30 02:30 /tmp/corefiles
#chmod 770 /tmp/corefiles
#ls -ld /tmp/corefiles
drwxrwx--- 2 root root 4096 Apr 30 02:30 /tmp/corefiles
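You can rehearse the permission change without touching the real directory. This sketch uses a scratch directory from mktemp as a stand-in for /tmp/corefiles and GNU stat to print the mode string before and after; nothing here requires root.

```shell
#!/bin/sh
# Scratch directory standing in for /tmp/corefiles
d=$(mktemp -d)

chmod 777 "$d"                  # world-writable, as in the first listing
before=$(stat -c %A "$d")

chmod 770 "$d"                  # owner and group only
after=$(stat -c %A "$d")

echo "$before -> $after"
# prints: drwxrwxrwx -> drwxrwx---

rmdir "$d"
```

For the group-based approach, the same idea applies with chgrp: hand the directory to a group whose members are allowed to leave core files, and keep the mode at 770.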
3) Bonus: Testing Your Changes
If you want to see if you have everything working properly, here is how you can force a core file:
$kill -s SIGSEGV $$
Note: I am suggesting you do this from your user account, not the root account. The command dumps core from your current shell, so the shell exits (if you are using ssh to access the box, you will need to log in again). If you performed steps one and two properly, you will find the core file where you designated it. If no file appears, check the output of ulimit -c, because a core size limit of 0 suppresses core dumps entirely. Type $ls -lh /
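If you would rather not lose your login session, a variation is to crash a throwaway child shell instead. The following is a sketch under that assumption: the child raises its own core size limit where permitted and then sends itself SIGSEGV, and the parent reports the exit status along with the settings that decide whether and where a core file is written.

```shell
#!/bin/sh
# Crash a child shell rather than the current one.
# A process killed by a signal is reported as 128 + signal number,
# and SIGSEGV is signal 11 on Linux, hence 139.
status=0
sh -c 'ulimit -c unlimited 2>/dev/null; kill -s SEGV $$' || status=$?
echo "child exit status: $status"

# Settings that control whether a core file shows up, and where
echo "core size limit in this shell: $(ulimit -c)"
echo "core pattern: $(cat /proc/sys/kernel/core_pattern 2>/dev/null)"
```

A status of 139 only confirms that the child died from the signal. Whether a core file was written still depends on the limit and pattern printed afterwards, so list your core directory to be sure.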
Here's hoping you will see very few core files. But if you do see one, you will have a much better way to manage these files, while determining crash frequency, and by whom, in a secure and useful way.