Documentation/kdump/kdump.txt - maze/linux - Git at Google

 Documentation for kdump - the kexec-based crash dumping solution
 ================================================================

 DESIGN
 ======

 Kdump uses kexec to reboot to a second kernel whenever a dump needs to be taken.
 This second kernel is booted with very little memory. The first kernel reserves
 the section of memory that the second kernel uses. This ensures that on-going
 DMA from the first kernel does not corrupt the second kernel.

 All the necessary information about Core image is encoded in ELF format and
 stored in reserved area of memory before crash. Physical address of start of
 ELF header is passed to new kernel through command line parameter elfcorehdr=.

 On i386, the first 640 KB of physical memory is needed to boot, irrespective
 of where the kernel loads. Hence, this region is backed up by kexec just before
 rebooting into the new kernel.

 In the second kernel, "old memory" can be accessed in two ways.

 - The first one is through a /dev/oldmem device interface. A capture utility
   can read the device file and write out the memory in raw format. This is raw
   dump of memory and analysis/capture tool should be intelligent enough to
   determine where to look for the right information. ELF headers (elfcorehdr=)
   can become handy here.

 - The second interface is through /proc/vmcore. This exports the dump as an ELF
   format file which can be written out using any file copy command
   (cp, scp, etc). Further, gdb can be used to perform limited debugging on
   the dump file. This method ensures methods ensure that there is correct
   ordering of the dump pages (corresponding to the first 640 KB that has been
   relocated).

 SETUP
 =====

 1) Download http://www.xmission.com/~ebiederm/files/kexec/kexec-tools-1.101.tar.gz
    and apply http://lse.sourceforge.net/kdump/patches/kexec-tools-1.101-kdump.patch
    and after that build the source.

 2) Download and build the appropriate (latest) kexec/kdump (-mm) kernel
    patchset and apply it to the vanilla kernel tree.

    Two kernels need to be built in order to get this feature working.

   A) First kernel:
    a) Enable "kexec system call" feature (in Processor type and features).
 	CONFIG_KEXEC=y
    b) This kernel's physical load address should be the default value of
       0x100000 (0x100000, 1 MB) (in Processor type and features).
 	CONFIG_PHYSICAL_START=0x100000
    c) Enable "sysfs file system support" (in Pseudo filesystems).
 	CONFIG_SYSFS=y
    d) Boot into first kernel with the command line parameter "crashkernel=Y@X".
       Use appropriate values for X and Y. Y denotes how much memory to reserve
       for the second kernel, and X denotes at what physical address the reserved
       memory section starts. For example: "crashkernel=64M@16M".

   B) Second kernel:
    a) Enable "kernel crash dumps" feature (in Processor type and features).
 	CONFIG_CRASH_DUMP=y
    b) Specify a suitable value for "Physical address where the kernel is
       loaded" (in Processor type and features). Typically this value
       should be same as X (See option d) above, e.g., 16 MB or 0x1000000.
 	CONFIG_PHYSICAL_START=0x1000000
    c) Enable "/proc/vmcore support" (Optional, in Pseudo filesystems).
 	CONFIG_PROC_VMCORE=y
    d) Disable SMP support and build a UP kernel (Until it is fixed).
    	CONFIG_SMP=n
    e) Enable "Local APIC support on uniprocessors".
    	CONFIG_X86_UP_APIC=y
    f) Enable "IO-APIC support on uniprocessors"
    	CONFIG_X86_UP_IOAPIC=y

   Note:   i) Options a) and b) depend upon "Configure standard kernel features
 	     (for small systems)" (under General setup).
 	 ii) Option a) also depends on CONFIG_HIGHMEM (under Processor
 		type and features).
 	iii) Both option a) and b) are under "Processor type and features".

 3) Boot into the first kernel. You are now ready to try out kexec-based crash
    dumps.

 4) Load the second kernel to be booted using:

    kexec -p <second-kernel> --crash-dump --args-linux --append="root=<root-dev>
    init 1 irqpoll"

    Note: i) <second-kernel> has to be a vmlinux image. bzImage will not work,
 	    as of now.
 	ii) By default ELF headers are stored in ELF32 format (for i386). This
 	    is sufficient to represent the physical memory up to 4GB. To store
 	    headers in ELF64 format, specifiy "--elf64-core-headers" on the
 	    kexec command line additionally.
        iii) Specify "irqpoll" as command line parameter. This reduces driver
             initialization failures in second kernel due to shared interrupts.

 5) System reboots into the second kernel when a panic occurs. A module can be
    written to force the panic or "ALT-SysRq-c" can be used initiate a crash
    dump for testing purposes.

 6) Write out the dump file using

    cp /proc/vmcore <dump-file>

    Dump memory can also be accessed as a /dev/oldmem device for a linear/raw
    view.  To create the device, type:

    mknod /dev/oldmem c 1 12

    Use "dd" with suitable options for count, bs and skip to access specific
    portions of the dump.

    Entire memory:  dd if=/dev/oldmem of=oldmem.001

 ANALYSIS
 ========

 Limited analysis can be done using gdb on the dump file copied out of
 /proc/vmcore. Use vmlinux built with -g and run

   gdb vmlinux <dump-file>

 Stack trace for the task on processor 0, register display, memory display
 work fine.

 Note: gdb cannot analyse core files generated in ELF64 format for i386.

 TODO
 ====

 1) Provide a kernel pages filtering mechanism so that core file size is not
    insane on systems having huge memory banks.
 2) Modify "crash" tool to make it recognize this dump.

 CONTACT
 =======

 Vivek Goyal (vgoyal@in.ibm.com)
 Maneesh Soni (maneesh@in.ibm.com)
	Documentation for kdump - the kexec-based crash dumping solution
	================================================================

	DESIGN
	======

	Kdump uses kexec to reboot to a second kernel whenever a dump needs to be taken.
	This second kernel is booted with very little memory. The first kernel reserves
	the section of memory that the second kernel uses. This ensures that on-going
	DMA from the first kernel does not corrupt the second kernel.

	All the necessary information about Core image is encoded in ELF format and
	stored in reserved area of memory before crash. Physical address of start of
	ELF header is passed to new kernel through command line parameter elfcorehdr=.

	On i386, the first 640 KB of physical memory is needed to boot, irrespective
	of where the kernel loads. Hence, this region is backed up by kexec just before
	rebooting into the new kernel.

	In the second kernel, "old memory" can be accessed in two ways.

	- The first one is through a /dev/oldmem device interface. A capture utility
	can read the device file and write out the memory in raw format. This is raw
	dump of memory and analysis/capture tool should be intelligent enough to
	determine where to look for the right information. ELF headers (elfcorehdr=)
	can become handy here.

	- The second interface is through /proc/vmcore. This exports the dump as an ELF
	format file which can be written out using any file copy command
	(cp, scp, etc). Further, gdb can be used to perform limited debugging on
	the dump file. This method ensures methods ensure that there is correct
	ordering of the dump pages (corresponding to the first 640 KB that has been
	relocated).

	SETUP
	=====

	1) Download http://www.xmission.com/~ebiederm/files/kexec/kexec-tools-1.101.tar.gz
	and apply http://lse.sourceforge.net/kdump/patches/kexec-tools-1.101-kdump.patch
	and after that build the source.

	2) Download and build the appropriate (latest) kexec/kdump (-mm) kernel
	patchset and apply it to the vanilla kernel tree.

	Two kernels need to be built in order to get this feature working.

	A) First kernel:
	a) Enable "kexec system call" feature (in Processor type and features).
	CONFIG_KEXEC=y
	b) This kernel's physical load address should be the default value of
	0x100000 (0x100000, 1 MB) (in Processor type and features).
	CONFIG_PHYSICAL_START=0x100000
	c) Enable "sysfs file system support" (in Pseudo filesystems).
	CONFIG_SYSFS=y
	d) Boot into first kernel with the command line parameter "crashkernel=Y@X".
	Use appropriate values for X and Y. Y denotes how much memory to reserve
	for the second kernel, and X denotes at what physical address the reserved
	memory section starts. For example: "crashkernel=64M@16M".

	B) Second kernel:
	a) Enable "kernel crash dumps" feature (in Processor type and features).
	CONFIG_CRASH_DUMP=y
	b) Specify a suitable value for "Physical address where the kernel is
	loaded" (in Processor type and features). Typically this value
	should be same as X (See option d) above, e.g., 16 MB or 0x1000000.
	CONFIG_PHYSICAL_START=0x1000000
	c) Enable "/proc/vmcore support" (Optional, in Pseudo filesystems).
	CONFIG_PROC_VMCORE=y
	d) Disable SMP support and build a UP kernel (Until it is fixed).
	CONFIG_SMP=n
	e) Enable "Local APIC support on uniprocessors".
	CONFIG_X86_UP_APIC=y
	f) Enable "IO-APIC support on uniprocessors"
	CONFIG_X86_UP_IOAPIC=y

	Note: i) Options a) and b) depend upon "Configure standard kernel features
	(for small systems)" (under General setup).
	ii) Option a) also depends on CONFIG_HIGHMEM (under Processor
	type and features).
	iii) Both option a) and b) are under "Processor type and features".

	3) Boot into the first kernel. You are now ready to try out kexec-based crash
	dumps.

	4) Load the second kernel to be booted using:

	kexec -p <second-kernel> --crash-dump --args-linux --append="root=<root-dev>
	init 1 irqpoll"

	Note: i) <second-kernel> has to be a vmlinux image. bzImage will not work,
	as of now.
	ii) By default ELF headers are stored in ELF32 format (for i386). This
	is sufficient to represent the physical memory up to 4GB. To store
	headers in ELF64 format, specifiy "--elf64-core-headers" on the
	kexec command line additionally.
	iii) Specify "irqpoll" as command line parameter. This reduces driver
	initialization failures in second kernel due to shared interrupts.

	5) System reboots into the second kernel when a panic occurs. A module can be
	written to force the panic or "ALT-SysRq-c" can be used initiate a crash
	dump for testing purposes.

	6) Write out the dump file using

	cp /proc/vmcore <dump-file>

	Dump memory can also be accessed as a /dev/oldmem device for a linear/raw
	view. To create the device, type:

	mknod /dev/oldmem c 1 12

	Use "dd" with suitable options for count, bs and skip to access specific
	portions of the dump.

	Entire memory: dd if=/dev/oldmem of=oldmem.001

	ANALYSIS
	========

	Limited analysis can be done using gdb on the dump file copied out of
	/proc/vmcore. Use vmlinux built with -g and run

	gdb vmlinux <dump-file>

	Stack trace for the task on processor 0, register display, memory display
	work fine.

	Note: gdb cannot analyse core files generated in ELF64 format for i386.

	TODO
	====

	1) Provide a kernel pages filtering mechanism so that core file size is not
	insane on systems having huge memory banks.
	2) Modify "crash" tool to make it recognize this dump.

	CONTACT
	=======

	Vivek Goyal (vgoyal@in.ibm.com)
	Maneesh Soni (maneesh@in.ibm.com)