Firmware assisted dump (fadump) HOWTO

Introduction

Firmware assisted dump is supported only on powerpc architecture. The goal of
firmware-assisted dump is to enable the dump of a crashed system, and to do so
from a fully-reset system, and to minimize the total elapsed time until the
system is back in production use. A complete documentation on implementation
can be found at Documentation/powerpc/firmware-assisted-dump.txt in upstream
linux kernel tree from 3.4 version and above.

Please note that the firmware-assisted dump feature is only available on Power6
and above systems with recent firmware versions.

Overview

Fadump

Fadump is a robust kernel crash dumping mechanism to get reliable kernel crash
dump with assistance from firmware. This approach does not use kexec, instead
firmware assists in booting the kernel while preserving memory contents. Unlike
kdump, the system is fully reset, and loaded with a fresh copy of the kernel.
In particular, PCI and I/O devices are reinitialized and are in a clean,
consistent state.  This second kernel, often called a capture kernel, boots
with very little memory and captures the dump image.

The first kernel registers the sections of memory with the Power firmware for
dump preservation during OS initialization. These registered sections of memory
are reserved by the first kernel during early boot. When a system crashes, the
Power firmware fully resets the system, preserves all the system memory
contents, save the low memory (boot memory of size larger of 5% of system
RAM or 256MB) of RAM to the previous registered region. It will also save
system registers, and hardware PTE's.

Fadump is supported only on ppc64 platform. The standard kernel and capture
kernel are one and the same on ppc64.

If you're reading this document, you should already have kexec-tools
installed. If not, you install it via the following command:

    # yum install kexec-tools

Fadump Operational Flow:

Like kdump, fadump also exports the ELF formatted kernel crash dump through
/proc/vmcore. Hence existing kdump infrastructure can be used to capture fadump
vmcore. The idea is to keep the functionality transparent to end user. From
user perspective there is no change in the way kdump init script works.

However, unlike kdump, fadump does not pre-load kdump kernel and initrd into
reserved memory, instead it always uses default OS initrd during second boot
after crash. Hence, for fadump, the kdump init script holds the steps to capture
vmcore. The second boot reads the /etc/kdump.conf file and captures the vmcore
accordingly.

The control flow of fadump works as follows:
01. System panics.
02. At the crash, kernel informs power firmware that kernel has crashed.
03. Firmware takes the control and reboots the entire system preserving
    only the memory (resets all other devices).
04. The reboot follows the normal booting process (non-kexec).
05. The boot loader loads the default kernel and initrd from /boot
06. The default initrd loads and runs /init
07. kdump init script checks if '/proc/device-tree/rtas/ibm,kernel-dump'
    node exists before executing steps to capture vmcore. This node is
    added to the device-tree by firmware to notify the kernel that it is
    booting after a system crash.
09. Captures dump according to configuration in "/etc/kdump.conf" file.
    [Note: See "Configuration in fadump mode" section for the list of
    configuration options supported in fadump mode]
10. Is dump capture successful (yes goto 13, no goto 11)
11. If a dump target (either filesystem or network based) is specified,
    try capturing vmcore to "path" (if specfied in /etc/kdump.conf) or
    "/var/crash". [Note: This is the fallback dump capture mechanism in
    error scenarios]
12. Perform default action specified in /etc/kdump.conf. If unspecified,
    goto 13. [Note: For the list of supported default actions in fadump
    mode, see "Default action" section]
13. Reboot

How to configure fadump:

Again, we assume if you're reading this document, you should already have
kexec-tools installed. If not, you install it via the following command:

    # yum install kexec-tools

To be able to do much of anything interesting in the way of debug analysis,
you'll also need to install the kernel-debuginfo package, of the same arch
as your running kernel, and the crash utility:

    # yum --enablerepo=\*debuginfo install kernel-debuginfo.$(uname -m) crash

Next up, we need to modify some boot parameters to enable firmware assisted
dump. Append "fadump=on" to the end of your kernel boot parameters. Optionally,
user can also append 'fadump_reserve_mem=X' kernel cmdline to specify size of the
memory to reserve for boot memory dump preservation.

	# edit /etc/yaboot.conf
	# ybin -v

The term 'boot memory' means size of the low memory chunk that is required for
a kernel to boot successfully when booted with restricted memory.  By default,
the boot memory size will be the larger of 5% of system RAM or 256MB.
Alternatively, user can also specify boot memory size through boot parameter
'fadump_reserve_mem=' which will override the default calculated size. Use this
option if default boot memory size is not sufficient for second kernel to boot
successfully.

After making said changes, reboot your system, so that the specified memory is
reserved and left untouched by the normal system. Take note that the output of
'free -m' will show X MB less memory than without this parameter, which is
expected. If you see OOM (Out Of Memory) error messages while loading capture
kernel, then you should bump up the memory reservation size.

Now that you've got that reserved memory region set up, you want to turn on
the kdump init script:

    # chkconfig kdump on

Then, start up kdump as well:

    # service kdump start

This should turn on the firmware assisted functionality in kernel by
echo'ing 1 to /sys/kernel/fadump_registered, leaving the system ready
to capture a vmcore upon crashing. To test this out, you can force-crash
your system by echo'ing a c into /proc/sysrq-trigger:

    # echo c > /proc/sysrq-trigger

You should see some panic output, followed by the system reset and booting into
fresh copy of kernel. When default initrd loads and runs /init, vmcore should
be copied out to disk (by default, in /var/crash/<YYYY.MM.DD-HH:MM:SS>/vmcore),
then the system rebooted back into your normal kernel.

Once back to your normal kernel, you can use the previously installed crash
kernel in conjunction with the previously installed kernel-debuginfo to
perform postmortem analysis:

    # crash /usr/lib/debug/lib/modules/2.6.17-1.2621.el5/vmlinux
    /var/crash/2006-08-23-15:34/vmcore

    crash> bt

and so on...

Saving vmcore-dmesg.txt
----------------------
Kernel log bufferes are one of the most important information available
in vmcore. Now before saving vmcore, kernel log bufferes are extracted
from /proc/vmcore and saved into a file vmcore-dmesg.txt. After
vmcore-dmesg.txt, vmcore is saved. Destination disk and directory for
vmcore-dmesg.txt is same as vmcore. Note that kernel log buffers will
not be available if dump target is raw device.

Dump Triggering methods:

This section talks about the various ways, other than a Kernel Panic, in which
fadump can be triggered. The following methods assume that fadump is configured
on your system, with the scripts enabled as described in the section above.

1) AltSysRq C

FAdump can be triggered with the combination of the 'Alt','SysRq' and 'C'
keyboard keys. Please refer to the following link for more details:

http://kbase.redhat.com/faq/FAQ_43_5559.shtm

In addition, on PowerPC boxes, fadump can also be triggered via Hardware
Management Console(HMC) using 'Ctrl', 'O' and 'C' keyboard keys.

2) Kernel OOPs

If we want to generate a dump everytime the Kernel OOPses, we can achieve this
by setting the 'Panic On OOPs' option as follows:

    # echo 1 > /proc/sys/kernel/panic_on_oops

3) PowerPC specific methods:

On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger(if
XMON is configured). To configure XMON one needs to compile the kernel with
the CONFIG_XMON and CONFIG_XMON_DEFAULT options, or by compiling with
CONFIG_XMON and booting the kernel with xmon=on option.

Following are the ways to remotely issue a soft reset on PowerPC boxes, which
would drop you to XMON. Pressing a 'X' (capital alphabet X) followed by an
'Enter' here will trigger the dump.

3.1) HMC

Hardware Management Console(HMC) available on Power machines allow partitions
to be reset remotely. This is specially useful in hang situations where the
system is not accepting any keyboard inputs.


3.2) Blade Management Console for Blade Center

To initiate a dump operation, go to Power/Restart option under "Blade Tasks" in
the Blade Management Console. Select the corresponding blade for which you want
to initate the dump and then click "Restart blade with NMI". This issues a
system reset and invokes xmon debugger.

Configuration in fadump mode:

The configuration file /etc/kdump.conf provides options like path, nfs, ssh, raw,
disk_timeout, default, kdump_post, extra_bins, etc. Only the options listed below
are supported/applicable in fadump mode:

    ext2
    ext3
    ext4
    raw
    path
    core_collector
    default

All other options in /etc/kdump.conf file are ignored in fadump mode.

Default action

"default" specifies the action to be performed in case dump saving to
intended dump target fails. If user has not configured "default", it
is assumed to be "reboot". That is, if saving dump to dump target fails,
by default system will reboot.

A user can change the default action to suit his/her needs better. Following
are available defaults in fadump mode.

reboot   -> reboot the system

halt     -> halt the system

poweroff -> shutdown the system

Note: When saving dump to the intended target fails, as a last resort
before proceeding with default action, the script tries to save core to
"path" set in /etc/kdump.conf file using makedumpfile tool (with tool
options "-c --message-level 1 -d 31").


Compression and filtering

The 'core_collector' parameter in kdump.conf allows you to specify a custom
dump capture method. The most common alternate method is makedumpfile, which
is a dump filtering and compression utility provided with kexec-tools. This
tool can drastically reduce the size of your vmcore files, which becomes very
useful on systems with large amounts of memory.

A typical setup is 'core_collector makedumpfile -c', but check the output of
'/sbin/makedumpfile --help' for a list of all available options (-i and -g
don't need to be specified, they're automatically taken care of).
