From 8956dc597b561a1bdaff44c847a5cd5344be4999 Mon Sep 17 00:00:00 2001 From: Amrita Date: Mon, 12 Jan 2026 17:14:35 +0530 Subject: [PATCH 01/17] adds toc structure for kdump --- DC-SLES-kdump | 16 +++++ articles/kdump.asm.xml | 134 +++++++++++++++++++++++++++++++++++ concepts/about-kdump.xml | 29 ++++++++ tasks/configure-kdump.xml | 28 ++++++++ tasks/setup-kdump.xml | 28 ++++++++ tasks/troubleshoot-kdump.xml | 28 ++++++++ 6 files changed, 263 insertions(+) create mode 100644 DC-SLES-kdump create mode 100644 articles/kdump.asm.xml create mode 100644 concepts/about-kdump.xml create mode 100644 tasks/configure-kdump.xml create mode 100644 tasks/setup-kdump.xml create mode 100644 tasks/troubleshoot-kdump.xml diff --git a/DC-SLES-kdump b/DC-SLES-kdump new file mode 100644 index 000000000..f64295e9d --- /dev/null +++ b/DC-SLES-kdump @@ -0,0 +1,16 @@ +# This file originates from the project https://github.com/openSUSE/doc-kit +# This file can be edited downstream. + +MAIN="kdump.asm.xml" +# Point to the ID of the of your assembly +#ROOTID="article-example" +SRC_DIR="articles" +IMG_SRC_DIR="images" + +PROFOS="sles" +PROFCONDITION="suse-product" +#PROFCONDITION="suse-product;beta" +#PROFCONDITION="community-project" + +STYLEROOT="/usr/share/xml/docbook/stylesheet/suse2022-ns" +FALLBACK_STYLEROOT="/usr/share/xml/docbook/stylesheet/suse-ns" \ No newline at end of file diff --git a/articles/kdump.asm.xml b/articles/kdump.asm.xml new file mode 100644 index 000000000..7e5c98b0d --- /dev/null +++ b/articles/kdump.asm.xml @@ -0,0 +1,134 @@ + + + + %entities; +]> + + + + + + + + + + + + + + Introduction to &kdump; + + 2026-01-12 + + + Initial version + + + + + + + + + + Smart Docs + + + + Administration + Configuration + Security + + + + + + https://bugzilla.suse.com/enter_bug.cgi + Documentation + SUSE Linux Enterprise Server 16.0 + amrita.sakthivel@suse.com + + yes + + + + + &x86-64; + &power; + &zseries; + &aarch64; + + + + + &productname; + + + + Introduction to kdump + Learn 
how to use kdump when your system crashes. kdump is a + kernel crash dumping mechanism. When a system encounters a fatal error, kdump allows the system to save the contents of its memory to a file so you can analyze exactly what went wrong. + + + + Use kdump to analyze system crashes + + + + + WHAT? + + + + + + + + WHY? + + + + + + + + EFFORT + + +The average reading time of this article is approximately 40 minutes. + + + + + REQUIREMENTS + + + + +Linux fundamentals: Understanding basic Linux commands, file permissions, directory structures +and use of the command line. + + + + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/concepts/about-kdump.xml b/concepts/about-kdump.xml new file mode 100644 index 000000000..1139364d2 --- /dev/null +++ b/concepts/about-kdump.xml @@ -0,0 +1,29 @@ + + + %entities; +]> + + + + + + + About &kdump; + + + + + + + + + + \ No newline at end of file diff --git a/tasks/configure-kdump.xml b/tasks/configure-kdump.xml new file mode 100644 index 000000000..230ce770a --- /dev/null +++ b/tasks/configure-kdump.xml @@ -0,0 +1,28 @@ + + + %entities; +]> + + + + + + + + Configuring &kdump; + + + + + + + + \ No newline at end of file diff --git a/tasks/setup-kdump.xml b/tasks/setup-kdump.xml new file mode 100644 index 000000000..979a5fd22 --- /dev/null +++ b/tasks/setup-kdump.xml @@ -0,0 +1,28 @@ + + + %entities; +]> + + + + + + + + Setting up &kdump; + + + + + + + + \ No newline at end of file diff --git a/tasks/troubleshoot-kdump.xml b/tasks/troubleshoot-kdump.xml new file mode 100644 index 000000000..c58b30597 --- /dev/null +++ b/tasks/troubleshoot-kdump.xml @@ -0,0 +1,28 @@ + + + %entities; +]> + + + + + + + + Common troubleshooting &kdump; issues + + + + + + + + \ No newline at end of file From 9a09a2adbbe41cab6074d76794d9e756e13b98bd Mon Sep 17 00:00:00 2001 From: Amrita Date: Tue, 13 Jan 2026 15:20:46 +0530 Subject: [PATCH 02/17] intro --- articles/kdump.asm.xml | 1 - concepts/about-kdump.xml | 3 ++- 2 files changed, 2 
insertions(+), 2 deletions(-) diff --git a/articles/kdump.asm.xml b/articles/kdump.asm.xml index 7e5c98b0d..423634e0a 100644 --- a/articles/kdump.asm.xml +++ b/articles/kdump.asm.xml @@ -79,7 +79,6 @@ Use kdump to analyze system crashes - diff --git a/concepts/about-kdump.xml b/concepts/about-kdump.xml index 1139364d2..dec94a21e 100644 --- a/concepts/about-kdump.xml +++ b/concepts/about-kdump.xml @@ -20,7 +20,8 @@ - +&kdump; is a kernel crash dumping mechanism that captures the system’s memory state into a vmcore file when +system crash occurs. A vmcore file is a snapshot of your computer's system memory (RAM) taken at the exact moment the Linux kernel crashed. From 02c57dc1bb0af91d12b95b51bea8a3cc8b489cee Mon Sep 17 00:00:00 2001 From: Amrita Date: Tue, 13 Jan 2026 15:32:49 +0530 Subject: [PATCH 03/17] intro --- concepts/about-kdump.xml | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/concepts/about-kdump.xml b/concepts/about-kdump.xml index dec94a21e..498d5ed8e 100644 --- a/concepts/about-kdump.xml +++ b/concepts/about-kdump.xml @@ -25,6 +25,16 @@ system crash occurs. A vmcore file is a snapshot of your computer's system memor - - +
+ Why is &kdump; important? + +
+
+ About the vmcore file + +
+
+ What is &kexec;? + +
\ No newline at end of file From c711428ce3a8bfd5a290ac1dcf3b3f12a924d67b Mon Sep 17 00:00:00 2001 From: Amrita Date: Tue, 20 Jan 2026 13:09:05 +0530 Subject: [PATCH 04/17] vmcore --- concepts/about-kdump.xml | 23 +++++++++++++++++++---- 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/concepts/about-kdump.xml b/concepts/about-kdump.xml index 498d5ed8e..dfb76effd 100644 --- a/concepts/about-kdump.xml +++ b/concepts/about-kdump.xml @@ -27,13 +27,28 @@ system crash occurs. A vmcore file is a snapshot of your computer's system memor
Why is &kdump; important? - -
+ The primary importance of &kdump; lies in its ability to capture a snapshot of a system's memory at the exact moment of a critical failure. + When a Linux kernel experiences a fatal error that halts all operations—standard logging services like syslog or journald usually fail along with it. + This often leaving no record of what went wrong. &kdump; bypasses this limitation by using &kexec; to boot a secondary capture kernel in a reserved slice of RAM. + This allows the system to remain stable enough to save the volatile memory (RAM) into a persistent file, known as a vmcore. + Without this tool, administrators are often left with nothing but a blank screen or a frozen console, making it nearly impossible to diagnose the root cause of intermittent or silent system crashes. + The vmcore file is a snapshot of the RAM and includes: + +The Kernel state:All active kernel data structures, global variables and the call stack, which is what the CPU was doing when it died. +Process information:A list of every process that was running, including their individual stacks and registers. +Memory pages:Depending on your settings, it can contain the actual data held in RAM by applications. +VMCOREINFO:special section that tells analysis tools how the kernel's memory was laid out so they can make sense of the raw data. + +
About the vmcore file - + + A vmcore file is a snapshot of your system's physical memory (RAM) taken at the exact moment the Linux kernel crashed. + When a system panics, the &kdump; service uses &kexec; to boot a small, separate capture kernel that stays in a reserved slice of RAM. + This capture kernel’s main function is to look back at the crashed memory and save it to a vmcore file so that you can figure out what happened after the system reboots. +
-
+
What is &kexec;?
From fa9b2ca09383ef934e20548fdcf3ee634233ed2c Mon Sep 17 00:00:00 2001 From: Amrita Date: Tue, 20 Jan 2026 13:19:03 +0530 Subject: [PATCH 05/17] placement of list --- concepts/about-kdump.xml | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/concepts/about-kdump.xml b/concepts/about-kdump.xml index dfb76effd..0144dc495 100644 --- a/concepts/about-kdump.xml +++ b/concepts/about-kdump.xml @@ -32,14 +32,7 @@ system crash occurs. A vmcore file is a snapshot of your computer's system memor This often leaving no record of what went wrong. &kdump; bypasses this limitation by using &kexec; to boot a secondary capture kernel in a reserved slice of RAM. This allows the system to remain stable enough to save the volatile memory (RAM) into a persistent file, known as a vmcore. Without this tool, administrators are often left with nothing but a blank screen or a frozen console, making it nearly impossible to diagnose the root cause of intermittent or silent system crashes. - The vmcore file is a snapshot of the RAM and includes: - -The Kernel state:All active kernel data structures, global variables and the call stack, which is what the CPU was doing when it died. -Process information:A list of every process that was running, including their individual stacks and registers. -Memory pages:Depending on your settings, it can contain the actual data held in RAM by applications. -VMCOREINFO:special section that tells analysis tools how the kernel's memory was laid out so they can make sense of the raw data. - -
+
About the vmcore file @@ -47,6 +40,13 @@ system crash occurs. A vmcore file is a snapshot of your computer's system memor When a system panics, the &kdump; service uses &kexec; to boot a small, separate capture kernel that stays in a reserved slice of RAM. This capture kernel’s main function is to look back at the crashed memory and save it to a vmcore file so that you can figure out what happened after the system reboots. + The vmcore file is a snapshot of the RAM and includes: + +The kernel state: All active kernel data structures, global variables, and the call stack, which shows what the CPU was doing when the kernel crashed. +Process information: A list of every process that was running, including their individual stacks and registers. +Memory pages: Depending on your settings, the file can contain the actual data held in RAM by applications. +VMCOREINFO: A special section that tells analysis tools how the kernel's memory was laid out so they can make sense of the raw data. +
What is &kexec;? From 4d968d9f349c4dd334b15c35156807149ca11c75 Mon Sep 17 00:00:00 2001 From: Amrita Date: Tue, 20 Jan 2026 14:48:32 +0530 Subject: [PATCH 06/17] installing kdump --- tasks/setup-kdump.xml | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/tasks/setup-kdump.xml b/tasks/setup-kdump.xml index 979a5fd22..7d4224801 100644 --- a/tasks/setup-kdump.xml +++ b/tasks/setup-kdump.xml @@ -17,12 +17,14 @@ xmlns:trans="http://docbook.org/ns/transclusion"> - Setting up &kdump; - - - - - - - + Setting up a dump environment + + To perform a dump when your system crashes, you need the following packages: + + kdump: Contains the scripts, the &systemd; service kdump.service, and the configuration file /etc/sysconfig/kdump. + kexec-tools: The low-level utility that allows the system to boot into the capture kernel without a hard reset. + makedumpfile: Used by the &kdump; scripts to compress the memory dump and filter out unnecessary data (such as zero pages), so that the dump file does not grow as large as your entire RAM. + + You can install the packages: + &prompt.sudo; zypper install kdump kexec-tools makedumpfile \ No newline at end of file From 2912b0bd6a369816fd1b5b3bfa7eb8e0e06c08d2 Mon Sep 17 00:00:00 2001 From: Amrita Date: Tue, 20 Jan 2026 15:21:55 +0530 Subject: [PATCH 07/17] trubleshoot --- tasks/troubleshoot-kdump.xml | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/tasks/troubleshoot-kdump.xml b/tasks/troubleshoot-kdump.xml index c58b30597..5100a3e36 100644 --- a/tasks/troubleshoot-kdump.xml +++ b/tasks/troubleshoot-kdump.xml @@ -20,9 +20,24 @@ Common troubleshooting &kdump; issues - - +When troubleshooting &kdump;, the process usually fails at one of three stages: during boot (memory reservation), during the crash (the dump does not start), or during the save process (the file is not written).
+ + + - + Here are some common troubleshooting scenarios you may encounter: + + The command systemctl status kdump reports the error kdump: failed to load kdump kernel or No crashkernel reserved. + Cause: The main kernel did not set aside enough memory for the capture kernel. + Fix: Check the currently reserved memory with cat /proc/iomem | grep "Crash kernel". If the reservation is too small, increase the crashkernel value in /etc/default/grub. Then rebuild the GRUB configuration and reboot. + + + The system crashes, the capture kernel boots, but the vmcore file is not found in /var/crash after the reboot. + + Cause: The capture kernel is a minimal environment. If you are saving the dump to an NFS share, an SSH server or a specialized partition (like LVM or LUKS), the capture kernel may lack the drivers or network configuration to reach the destination. + + Fix: Check /etc/kdump.conf to see where the dump is being sent. If you are using network storage, ensure the &kdump; service has pre-built the network into the initrd. Then run the command mkdumprd -f to refresh the capture kernel's drivers. + + + + \ No newline at end of file From 01c5092c2372b5b6da465094aa2b03718a34c823 Mon Sep 17 00:00:00 2001 From: Amrita Date: Wed, 21 Jan 2026 15:16:58 +0530 Subject: [PATCH 08/17] trobleshooting --- tasks/troubleshoot-kdump.xml | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/tasks/troubleshoot-kdump.xml b/tasks/troubleshoot-kdump.xml index 5100a3e36..a8dbe6e78 100644 --- a/tasks/troubleshoot-kdump.xml +++ b/tasks/troubleshoot-kdump.xml @@ -39,5 +39,29 @@ When troubleshooting &kdump;, the process usually fails at one of three stages: + + + The dump starts, but the progress bar stops and the system reboots without a complete file. + + Cause: A vmcore can be massive, up to the size of your RAM. If your /var + partition is small, the dump fills the disk and fails. + + Fix: Use compression and filtering. Ensure your /etc/kdump.conf has the line + core_collector makedumpfile -d 31 -c.
Alternatively, you can change the path in kdump.conf to a partition with more space. + + + +The system freezes completely but the &kdump; kernel never boots. + + Cause: This usually happens during a hard lockup, where the CPU is so stuck that it cannot even execute the instruction to switch to the &kdump; kernel. + It can also happen if Secure Boot is blocking &kexec; from loading an unsigned capture kernel. + Fix: Enable the NMI watchdog in /etc/sysctl.conf (kernel.nmi_watchdog = 1) so the system can panic itself if it detects a freeze. You can +also run the kexec -l KERNEL_PATH command manually to see if it returns a permission denied or signature error. + + + + +You can also use the following quick status commands: + +kdumpctl show / kdump-config show: Overall health and memory reservation. +cat /sys/kernel/kexec_crash_loaded: 1 means the capture kernel is loaded and ready, 0 means it is not. \ No newline at end of file From 09f9f4f35419b8bc3a89ca6c382aadea1e72d5fb Mon Sep 17 00:00:00 2001 From: Amrita Date: Wed, 21 Jan 2026 15:29:17 +0530 Subject: [PATCH 09/17] more info --- articles/kdump.asm.xml | 2 ++ glues/more-info-kdump.xml | 42 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 44 insertions(+) create mode 100644 glues/more-info-kdump.xml diff --git a/articles/kdump.asm.xml b/articles/kdump.asm.xml index 423634e0a..334bacef8 100644 --- a/articles/kdump.asm.xml +++ b/articles/kdump.asm.xml @@ -17,6 +17,7 @@ + @@ -125,6 +126,7 @@ and use of the command line.
+ diff --git a/glues/more-info-kdump.xml b/glues/more-info-kdump.xml new file mode 100644 index 000000000..a6f8e63d7 --- /dev/null +++ b/glues/more-info-kdump.xml @@ -0,0 +1,42 @@ + + + + + %entities; +]> + + + For more information + + + + + + + + For information on &kdump;, refer to the following resources: + + + + + Official Linux kernel documentation: + + + + + + Man page for saving kernel dumps in &suse;: + + + + + From 30f208dd331d066b8d62353462401c0674d76235 Mon Sep 17 00:00:00 2001 From: Amrita Date: Mon, 2 Feb 2026 11:27:46 +0530 Subject: [PATCH 10/17] more info --- concepts/about-kdump.xml | 28 ++++++++++++++++++++++++++-- glues/more-info-kdump.xml | 6 ++++++ 2 files changed, 32 insertions(+), 2 deletions(-) diff --git a/concepts/about-kdump.xml b/concepts/about-kdump.xml index 0144dc495..d7b1bf6b3 100644 --- a/concepts/about-kdump.xml +++ b/concepts/about-kdump.xml @@ -33,6 +33,16 @@ system crash occurs. A vmcore file is a snapshot of your computer's system memor This allows the system to remain stable enough to save the volatile memory (RAM) into a persistent file, known as a vmcore. Without this tool, administrators are often left with nothing but a blank screen or a frozen console, making it nearly impossible to diagnose the root cause of intermittent or silent system crashes.
+
+ Understanding the dual-kernel model + Dual-kernel model is the usage of a second, isolated kernel to handle a system crash safely. When the main system kernel fails, you can not trust it to write its own crash logs to disk—the memory that might be corrupted + because the kernel itself is no longer reliable. The dual-kernel approach solves this by jumping into a completely different environment. + The model relies on two distinct kernels residing in memory simultaneously: + + The production (primary) kernel: is the kernel you use every day. It runs your applications and services. + The capture (crash) kernel:is a lightweight, minimal kernel specifically compiled to run in a small, reserved area of RAM. It only wakes up when the primary kernel panics. + +
About the vmcore file @@ -50,6 +60,20 @@ system crash occurs. A vmcore file is a snapshot of your computer's system memor
What is &kexec;? - -
+ &kexec; is a system call that functions as a software-defined boot loader, allowing a running kernel to bypass the hardware BIOS/UEFI stage and directly hand over control to a new kernel. + By loading the secondary kernel's image and parameters into memory while the system is still active, &kexec; performs a warm-up boot that preserves the state of RAM and significantly reduces downtime. + This mechanism is the backbone of the &kdump; dual-kernel model, as it provides a reliable way to jump from a crashing production environment into a clean recovery environment for data capture. + The kexec-tools package contains a script called kexec-bootloader. This script reads the boot loader configuration and runs &kexec; using the same kernel options as the normal boot loader. + The most important component of &kexec; is the /sbin/kexec command. You can load a kernel with &kexec; in two ways: + +Load the kernel to the address space of a production kernel for a regular reboot: +&prompt.sudo; kexec -l KERNEL_IMAGE +You can later boot to this kernel with the command kexec -e. + +Load the kernel to a reserved area of memory: + &prompt.sudo; kexec -p KERNEL_IMAGE + This kernel is booted automatically when the system crashes. 
+ + + \ No newline at end of file diff --git a/glues/more-info-kdump.xml b/glues/more-info-kdump.xml index a6f8e63d7..7e522700e 100644 --- a/glues/more-info-kdump.xml +++ b/glues/more-info-kdump.xml @@ -37,6 +37,12 @@ Man page for saving kernel dumps in &suse;: + + + + Man page for &kexec;: + + From 31c29d7b0e9b860263c1c0614f3a39dc6e4d9b49 Mon Sep 17 00:00:00 2001 From: Amrita Date: Mon, 2 Feb 2026 13:37:26 +0530 Subject: [PATCH 11/17] configure kdump --- concepts/about-kdump.xml | 2 +- tasks/configure-kdump.xml | 53 +++++++++++++++++++++++++++++++++++---- tasks/setup-kdump.xml | 1 + 3 files changed, 50 insertions(+), 6 deletions(-) diff --git a/concepts/about-kdump.xml b/concepts/about-kdump.xml index d7b1bf6b3..600554223 100644 --- a/concepts/about-kdump.xml +++ b/concepts/about-kdump.xml @@ -40,7 +40,7 @@ system crash occurs. A vmcore file is a snapshot of your computer's system memor The model relies on two distinct kernels residing in memory simultaneously: The production (primary) kernel: is the kernel you use every day. It runs your applications and services. - The capture (crash) kernel:is a lightweight, minimal kernel specifically compiled to run in a small, reserved area of RAM. It only wakes up when the primary kernel panics. + The capture (crash) kernel: is a lightweight, minimal kernel specifically compiled to run in a small, reserved area of RAM. It only wakes up when the primary kernel panics.
diff --git a/tasks/configure-kdump.xml b/tasks/configure-kdump.xml index 230ce770a..065942012 100644 --- a/tasks/configure-kdump.xml +++ b/tasks/configure-kdump.xml @@ -20,9 +20,52 @@ Configuring &kdump; - - - - - + To boot another kernel and preserve the data of the production kernel when the system crashes, you need to reserve a dedicated area of the system memory. + The production kernel never loads to this area because it must always be available. It is used for the capture kernel so that the memory pages of the production kernel can be preserved. + + + + To use &kexec; with a capture kernel and to use &kdump; in any way, RAM needs to be allocated for the capture kernel. + To configure the reserved memory for the capture kernel, you must modify the crashkernel= parameter within the GRUB configuration file. + This value defines the specific block of RAM set aside for the secondary kernel. Its optimal size is typically determined by the total physical memory available in the system. + + + Calculating the allocation size + To find the base value for your system, run: +&prompt.sudo; kdumptool calibrate + Total: 49074 + Low: 72 + High: 180 + MinLow: 72 + MaxLow: 3085 + MinHigh: 0 + MaxHigh: 45824 + +Total: Your total system RAM. +Low: The minimum memory required in the low memory zone (first 4 GB) for the kernel to boot. +High: The recommended amount for the high memory zone. This covers the actual work of saving the crash dump. +MinLow/MaxLow: The safe range for the low reservation. In this example, the Low value is at the absolute minimum. +MinHigh/MaxHigh: +The range available for the high reservation. + +All values are in megabytes. Note the Low value. + +Based on your system architecture, adapt the Low or High value from the previous step for the number of LUN kernel paths (paths to storage devices) attached to the system.
+ A sensible value in megabytes can be calculated using this formula: +SIZE_LOW = RECOMMENDATION + (LUNs / 2) +SIZE_HIGH = RECOMMENDATION + (LUNs / 2) + +SIZE_LOW/SIZE_HIGH: The resulting value for Low/High. +RECOMMENDATION: The value recommended by the kdumptool calibrate command for Low/High. +LUNs: The maximum number of LUN kernel paths that you expect to ever create on your system. + Exclude multipath devices from this number, as these are ignored. To get the current number of LUNs available on your system, run: +cat /proc/scsi/scsi | grep Lun | wc -l + + + +Set the values in the correct location. Append the following kernel option to your boot loader configuration: +crashkernel=SIZE_HIGH,high crashkernel= SIZE_LOW,low +crashkernel= SIZE_LOW + + \ No newline at end of file diff --git a/tasks/setup-kdump.xml b/tasks/setup-kdump.xml index 7d4224801..8d333c8f6 100644 --- a/tasks/setup-kdump.xml +++ b/tasks/setup-kdump.xml @@ -27,4 +27,5 @@ You can install the packages: &prompt.sudo; zypper install kdump kexec-tools makedumpfile + \ No newline at end of file From 65b5e3ee5000c427e71baa95ed882b8045537eaa Mon Sep 17 00:00:00 2001 From: Amrita Date: Mon, 2 Feb 2026 13:49:41 +0530 Subject: [PATCH 12/17] configure and asm file --- articles/kdump.asm.xml | 4 ++-- tasks/configure-kdump.xml | 13 ++++++++++++- 2 files changed, 14 insertions(+), 3 deletions(-) diff --git a/articles/kdump.asm.xml b/articles/kdump.asm.xml index 334bacef8..369e65050 100644 --- a/articles/kdump.asm.xml +++ b/articles/kdump.asm.xml @@ -86,7 +86,7 @@ WHAT? - + &kdump; is the standard error-recovery and crash-dumping mechanism for the Linux kernel. Its primary purpose is to capture a snapshot of the system's memory (a vmcore file) at the exact moment a kernel crashes (kernel panic). @@ -94,7 +94,7 @@ WHY? - + Mastering &kdump; is essential for administrators and developers because it leverages a dual-kernel mechanism to capture a memory snapshot during a crash.
This transforms mysterious system failures into diagnosable vmcore files that ensure production stability and reduce troubleshooting time. diff --git a/tasks/configure-kdump.xml b/tasks/configure-kdump.xml index 065942012..122078f55 100644 --- a/tasks/configure-kdump.xml +++ b/tasks/configure-kdump.xml @@ -67,5 +67,16 @@ crashkernel=SIZE_HIGH,high crashkernel= SIZE_LOW,low crashkernel= SIZE_LOW - +The changes won't take effect until the boot loader is rebuilt and the system is restarted to reserve the memory. +&prompt.sudo; grub2-mkconfig -o /boot/grub2/grub.cfg + +After restarting, confirm that the primary kernel has successfully allocated the memory for the secondary capture kernel. +cat /sys/kernel/kexec_crash_size + + +Ensure the &kdump; service is ready to catch a crash: +&prompt.sudo; systemctl enable --now kdump +&prompt.sudo; kdumpctl status + + \ No newline at end of file From 7861f4d55ed2276cab1612d38705af6d0de12516 Mon Sep 17 00:00:00 2001 From: Amrita Date: Fri, 6 Feb 2026 13:07:51 +0530 Subject: [PATCH 13/17] review --- glues/more-info-kdump.xml | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-) diff --git a/glues/more-info-kdump.xml b/glues/more-info-kdump.xml index 7e522700e..9290903d7 100644 --- a/glues/more-info-kdump.xml +++ b/glues/more-info-kdump.xml @@ -32,13 +32,7 @@ - - - Man page for saving kernel dumps in &suse;: - - - - + Man page for &kexec;: From 5410fa75eebd46b160e722dedfc0ac476d47e794 Mon Sep 17 00:00:00 2001 From: Amrita Date: Tue, 12 May 2026 14:02:16 +0530 Subject: [PATCH 14/17] tech review-part 1 --- articles/kdump.asm.xml | 13 ++++++------- concepts/about-kdump.xml | 33 ++++++++++++++++++--------------- 2 files changed, 24 insertions(+), 22 deletions(-) diff --git a/articles/kdump.asm.xml b/articles/kdump.asm.xml index 369e65050..66ada3a83 100644 --- a/articles/kdump.asm.xml +++ b/articles/kdump.asm.xml @@ -74,19 +74,18 @@ Introduction to kdump - Learn how to use kdump when your system crashes. 
kdump is a - kernel crash dumping mechanism. When a system encounters a fatal error, kdump allows the system to save the contents of its memory to a file so you can analyze exactly what went wrong. - + Learn how to configure kdump in case your system crashes. kdump is a + kernel crash dumping mechanism. When a system encounters a fatal error, kdump allows the system to save the contents of its memory to a file for expert analysis. - Use kdump to analyze system crashes + Use kdump to capture data on system crashes WHAT? - &kdump; is the standard error-recovery and crash-dumping mechanism for the Linux kernel. Its primary purpose is to capture a snapshot of the system's memory (a vmcore file) at the exact moment a kernel crashes (kernel panic). + Configure &kdump; in case your system crashes. Its primary purpose is to capture a snapshot of the system's memory (a vmcore file) at the exact moment a kernel crashes (kernel panic). @@ -94,8 +93,8 @@ WHY? - Mastering &kdump; is essential for administrators and developers because it leverages a dual-kernel mechanism to capture a memory snapshot during a crash. This transforms mysterious system failures into diagnosable vmcore files that ensure production stability and reduce troubleshooting time. - +Correctly setting up kdump and obtaining the memory dump may help SUSE support or kernel developers to debug a potential kernel crash. + diff --git a/concepts/about-kdump.xml b/concepts/about-kdump.xml index 600554223..abb2de31b 100644 --- a/concepts/about-kdump.xml +++ b/concepts/about-kdump.xml @@ -28,28 +28,28 @@ system crash occurs. A vmcore file is a snapshot of your computer's system memor
Why is &kdump; important? The primary importance of &kdump; lies in its ability to capture a snapshot of a system's memory at the exact moment of a critical failure. - When a Linux kernel experiences a fatal error that halts all operations—standard logging services like syslog or journald usually fail along with it. - This often leaving no record of what went wrong. &kdump; bypasses this limitation by using &kexec; to boot a secondary capture kernel in a reserved slice of RAM. - This allows the system to remain stable enough to save the volatile memory (RAM) into a persistent file, known as a vmcore. - Without this tool, administrators are often left with nothing but a blank screen or a frozen console, making it nearly impossible to diagnose the root cause of intermittent or silent system crashes. + When the Linux kernel experiences a fatal error, logging services such as syslog or journald usually fail along with it, often leaving no record of what went wrong. &kdump; bypasses this limitation by using &kexec; to boot a secondary capture kernel in a reserved slice of RAM. + The unstable, crashed system is replaced with a freshly started and stable capture kernel. Without this tool, administrators are often left with nothing but a blank screen or a frozen console, making it nearly impossible to diagnose the root cause of intermittent or silent system crashes.
Understanding the dual-kernel model - Dual-kernel model is the usage of a second, isolated kernel to handle a system crash safely. When the main system kernel fails, you can not trust it to write its own crash logs to disk—the memory that might be corrupted - because the kernel itself is no longer reliable. The dual-kernel approach solves this by jumping into a completely different environment. + &kdump; uses a second, isolated kernel, referred to as the capture or crash kernel, to handle a system crash safely. When the main system kernel fails, you cannot trust it to write its own crash logs to disk, because the kernel memory might be corrupted and the kernel itself is no longer reliable. The dual-kernel approach solves this by jumping into a completely different environment. The model relies on two distinct kernels residing in memory simultaneously: The production (primary) kernel: is the kernel you use every day. It runs your applications and services. - The capture (crash) kernel: is a lightweight, minimal kernel specifically compiled to run in a small, reserved area of RAM. It only wakes up when the primary kernel panics. + The capture (crash) kernel: is a second copy of the kernel loaded in a small reserved area of RAM. It only starts when the primary kernel panics. Alongside the crash kernel, a special-purpose initramfs image is loaded into the reserved RAM. +It is built by the &kdump; tool and includes all the drivers, settings, and programs needed to store the vmcore file. +
About the vmcore file - A vmcore file is a snapshot of your system's physical memory (RAM) taken at the exact moment the Linux kernel crashed. - When a system panics, the &kdump; service uses &kexec; to boot a small, separate capture kernel that stays in a reserved slice of RAM. - This capture kernel’s main function is to look back at the crashed memory and save it to a vmcore file so that you can figure out what happened after the system reboots. - +A vmcore file is a snapshot of your system's physical memory (RAM) taken at the exact moment the Linux kernel crashed. +When &kdump; is set up on a system, the &kdump; service loads the capture kernel and initramfs at system boot, using the kexec program, into a pre-reserved area of RAM, the crash kernel area. +If at some point the system crashes, the capture kernel is started. Its memory is restricted to the pre-reserved crash kernel area, so none of the memory used by the crashed production kernel is overwritten. +The job of the capture kernel and initramfs is to save the contents of the production kernel's memory into a vmcore file. + The vmcore file is a snapshot of the RAM and includes: The kernel state: All active kernel data structures, global variables, and the call stack, which shows what the CPU was doing when the kernel crashed. Process information: A list of every process that was running, including their individual stacks and registers. Memory pages: Depending on your settings, the file can contain the actual data held in RAM by applications. VMCOREINFO: A special section that tells analysis tools how the kernel's memory was laid out so they can make sense of the raw data. @@ -61,14 +61,17 @@ system crash occurs. A vmcore file is a snapshot of your computer's system memor
What is &kexec;? &kexec; is a system call that functions as a software-defined boot loader, allowing a running kernel to bypass the hardware BIOS/UEFI stage and directly hand over control to a new kernel. - By loading the secondary kernel's image and parameters into memory while the system is still active, &kexec; performs a warm-up boot that preserves the state of RAM and significantly reduces downtime. + By loading the secondary kernel's image and parameters into memory while the system is still active, &kexec; performs a warm boot that preserves the state of RAM and significantly reduces downtime. This mechanism is the backbone of the &kdump; dual-kernel model, as it provides a reliable way to jump from a crashing production environment into a clean recovery environment for data capture. - The kexec-tools package contains a script called kexec-bootloader. This script reads the boot loader configuration and runs &kexec; using the same kernel options as the normal boot loader. - The most important component of &kexec; is the /sbin/kexec command. You can load a kernel with &kexec; in two ways: + The most important component of &kexec; is the kexec command. You can load a kernel with &kexec; in two ways: Load the kernel to the address space of a production kernel for a regular reboot: &prompt.sudo; kexec -l KERNEL_IMAGE -You can later boot to this kernel with the command kexec -e. +You can later boot to this kernel with the command kexec -e. +Instead of using &kexec; directly, you can use a program called kexec-bootloader. It reads the default kernel, +initrd, and command line options from the boot loader configuration and passes them to &kexec; to load the default kernel properly. 
+ + Load the kernel to a reserved area of memory: &prompt.sudo; kexec -p KERNEL_IMAGE From 865ed40b2e43e2be94800ab085d3fd506f273d24 Mon Sep 17 00:00:00 2001 From: Amrita Date: Tue, 12 May 2026 14:08:43 +0530 Subject: [PATCH 15/17] fdbk --- concepts/about-kdump.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/concepts/about-kdump.xml b/concepts/about-kdump.xml index abb2de31b..58d3e8cbc 100644 --- a/concepts/about-kdump.xml +++ b/concepts/about-kdump.xml @@ -75,7 +75,7 @@ initrd and command line options from the boot loader configuration and passes ev Load the kernel to a reserved area of memory: &prompt.sudo; kexec -p KERNEL_IMAGE - This kernel is booted automatically when the system crashes. + This kernel is booted automatically when the system crashes. This is what &kdump; uses to load the capture kernel.
From 30ec13a5579db27ce167d6103e71497ac479ad82 Mon Sep 17 00:00:00 2001 From: Amrita Date: Wed, 13 May 2026 13:31:46 +0530 Subject: [PATCH 16/17] tech feedback-part1 --- articles/kdump.asm.xml | 4 +-- glues/more-info-kdump.xml | 9 +++++ tasks/configure-kdump.xml | 10 +++++- tasks/setup-kdump.xml | 31 ----------------- tasks/troubleshoot-kdump.xml | 66 ++++++++++++++---------------------- 5 files changed, 45 insertions(+), 75 deletions(-) delete mode 100644 tasks/setup-kdump.xml diff --git a/articles/kdump.asm.xml b/articles/kdump.asm.xml index 66ada3a83..3f3476246 100644 --- a/articles/kdump.asm.xml +++ b/articles/kdump.asm.xml @@ -14,8 +14,7 @@ - - + @@ -122,7 +121,6 @@ and use of the command line. - diff --git a/glues/more-info-kdump.xml b/glues/more-info-kdump.xml index 9290903d7..7628e5def 100644 --- a/glues/more-info-kdump.xml +++ b/glues/more-info-kdump.xml @@ -26,6 +26,15 @@ For information on &kdump;, refer to the following resources: + + + Man pages: + + + man 7 kdump + man 5 kdump + + Official Linux kernel documentation: diff --git a/tasks/configure-kdump.xml b/tasks/configure-kdump.xml index 122078f55..129604747 100644 --- a/tasks/configure-kdump.xml +++ b/tasks/configure-kdump.xml @@ -17,8 +17,16 @@ xmlns:trans="http://docbook.org/ns/transclusion"> - Configuring &kdump; + Installing and configuring &kdump; +You can install &kdump; with the following command: + &prompt.sudo; zypper install kdump + This command installs the following packages: + +kdump +kexec-tools +makedumpfile + To boot another kernel and preserve the data of the production kernel when the system crashes, you need to reserve a dedicated area of the system memory. The production kernel never loads to this area because it must be always available. It is used for the capture kernel so that the memory pages of the production kernel can be preserved. 
diff --git a/tasks/setup-kdump.xml b/tasks/setup-kdump.xml deleted file mode 100644 index 8d333c8f6..000000000 --- a/tasks/setup-kdump.xml +++ /dev/null @@ -1,31 +0,0 @@ - - - %entities; -]> - - - - - - - - Setting up a dump environment - - To perform a dump when your system crashes, you need the following packages: - - kdump: Contains the scripts, &systemd; services kdump.service and configuration files /etc/sysconfig/kdump. - kexec-tools: The low-level utility that allows the system to boot into the capture kernel without a hard reset. - makedumpfile: Used by the &kdump; scripts to compress the memory dump and filter out unnecessary data (like zero pages), preventing the dump file from being as large as your entire RAM. - - You can install the packages: - &prompt.sudo; zypper install kdump kexec-tools makedumpfile - - \ No newline at end of file diff --git a/tasks/troubleshoot-kdump.xml b/tasks/troubleshoot-kdump.xml index a8dbe6e78..42ea7bd7f 100644 --- a/tasks/troubleshoot-kdump.xml +++ b/tasks/troubleshoot-kdump.xml @@ -20,48 +20,34 @@ Common troubleshooting &kdump; issues +Testing and troubleshooting &kdump; is a critical process to ensure that your system can successfully capture a vmcore file during a kernel crash, which is often the only way to diagnose system crashes. When troubleshooting &kdump;, the process usually fails at one of three stages: during boot (memory reservation), during the crash when the dump does not start or during the save process when the file is not written. - Here are some common troubleshooting scenarios, you may encounter: - - The command systemctl status kdump gives an error kdump: failed to load kdump kernel or No crashkernel reserved. - Cause:The main kernel did not set aside enough memory for the capture kernel. - Fix:Check the current reserved memory; at /proc/iomem | grep "Crash kernel" If it is small, increase the crashkernel value in /etc/default/grub. Rebuild the GRUB config and reboot. 
- - - The system crashes, the capture kernel boots, but the vmcore file is not found in /var/crash after the reboot. - - Cause:The capture kernel is a minimal environment. If you are saving the dump to an NFS share, an SSH server or a specialized partition (like LVM or LUKS), the capture kernel may lack the drivers or network configuration to reach the destination. - - Fix:Check /etc/kdump.conf to see where the dump is being sent. If you are using network storage, ensure the &kdump; service has pre-built the network into the initrd. Then run the command mkdumprd -f to refresh the capture kernel's drivers. - - - - The dump starts, but the progress bar stops and the system reboots without a complete file. - - Cause: A vmcore can be massive; equal to your RAM size. If your /var - partition is small, the dump fills the disk and fails. - - Fix:Use compression and filtering;Ensure your /etc/kdump.conf has the line - core_collector makedumpfile -d 31 -c. Alternatively, you can can change the path in kdump.conf to a partition with more space. - - - -The system freezes completely but the &kdump; kernel never boots. - - Cause: This usually happens during a hard lockup, where the CPU is so stuck that it cannot even execute the instruction to switch to the &kdump; kernel. - It can also happen if secure boot is blocking &kexec; from loading an unsigned capture kernel. - Fix: Enable the NMI watchdog in /etc/sysctl.conf (kernel.nmi_watchdog = 1) so the system can panic itself if it detects a freeze. You can -also run kexec -l KERNEL_PATH command manually to see if it returns a permission denied or signature error. - - - - -You can also use the following quick status commands: - -kdumpctl show / kdump-config show: Overall health and memory reservation -cat /sys/kernel/kexec_crash_loaded: 1 means 1 means the gun is loaded and ready to fire, 0 means it is not. - +
+ Testing &kdump; + It is advisable to test &kdump; after configuring it by simulating a kernel crash. +Otherwise, you may only find out that it does not work when an actual kernel crash occurs, leaving you with no possibility to debug the crash. +Ensure no critical workloads are running and no unsaved data is present on the system. Additionally, make sure to +sync the file systems and remount them read-only: +echo s > /proc/sysrq-trigger +echo u > /proc/sysrq-trigger +Then you can simulate a kernel crash: +echo c > /proc/sysrq-trigger +Verify that a new directory was created under your KDUMP_SAVEDIR, which is /var/crash by default. This directory contains the dmesg and +vmcore of the crashed kernel. +
+
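The verification step in the testing section can be scripted. The following is a minimal sketch, not part of the patch, assuming each dump is stored in its own subdirectory of the save directory and that the memory image is a file literally named `vmcore`. The function names `list_dump_dirs` and `has_dump` are hypothetical.

```shell
#!/bin/sh
# List dump directories below a kdump save directory.
# Assumption: each dump lives in its own subdirectory and the memory image
# is a file literally named "vmcore"; adjust the name if your setup differs.
list_dump_dirs() {
    savedir=${1:-/var/crash}        # default save directory named in the text
    for core in "$savedir"/*/vmcore; do
        # An unmatched glob stays literal, so test that the file exists.
        if [ -f "$core" ]; then
            printf '%s\n' "${core%/vmcore}"
        fi
    done
}

# Succeeds (exit status 0) only if at least one dump directory was found.
has_dump() {
    [ -n "$(list_dump_dirs "$1")" ]
}
```

After a test crash, something like `has_dump /var/crash && echo "dump found"` would confirm that a dump was written. If KDUMP_SAVEDIR points elsewhere, pass that directory instead.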
+ Troubleshooting &kdump; + One of the most common reasons &kdump; fails is that the amount of crash kernel + memory reserved is insufficient. Different system configurations may require + more memory than estimated by kdumptool calibrate and set up automatically in + the boot loader config by the kdump-commandline.service. + During &kdump;, if you see error messages that mention low memory or the +Out of Memory (OOM) killer being invoked, this is the likely cause. Even if you do not see such messages, trying an increased crash kernel reservation is a good +first step. + +
\ No newline at end of file From ac3a99893375ea1d146cef0f12139d881c4db2ea Mon Sep 17 00:00:00 2001 From: Amrita Date: Wed, 13 May 2026 14:13:58 +0530 Subject: [PATCH 17/17] troubleshooting --- tasks/troubleshoot-kdump.xml | 55 ++++++++++++++++++++++++++++++++++-- 1 file changed, 53 insertions(+), 2 deletions(-) diff --git a/tasks/troubleshoot-kdump.xml b/tasks/troubleshoot-kdump.xml index 42ea7bd7f..99b8b6d42 100644 --- a/tasks/troubleshoot-kdump.xml +++ b/tasks/troubleshoot-kdump.xml @@ -47,7 +47,58 @@ vmcore of the crashed kernel. the boot loader config by the kdump-commandline.service.
During &kdump;, if you see error messages that mention low memory or the Out of Memory (OOM) killer being invoked, this is the likely cause. Even if you do not see such messages, trying an increased crash kernel reservation is a good -first step. - +first step. +To rectify this, proceed as follows: + + Find the size of the current automatically configured crash kernel reservation: +&prompt.sudo; cat /proc/cmdline +This should contain one or more parameters in the following forms: + +crashkernel=XM +crashkernel=YM,low +crashkernel=ZM,high + +Add up the values of X, Y, and Z. The sum is the size of the current reservation in MiB (current). + + +Manually set the reservation by editing /etc/sysconfig/kdump and changing the value of +KDUMP_CRASHKERNEL: +KDUMP_CRASHKERNEL="crashkernel=<2 * current>M" +Then restart the &kdump; service and reboot: +&prompt.sudo; systemctl restart kdump +&prompt.sudo; reboot + +Repeat a few times until &kdump; works. +Fine-tuning crash kernel memory involves finding the smallest RAM reservation that successfully captures a crash dump without triggering "Out of Memory" errors in the capture kernel. + Fine-tune between the last working and the last non-working value. + +Other common issues include: + + +Issues with switching to a text virtual terminal + + + + makedumpfile or network config errors + + + + Dracut errors + + + + kdump initramfs does not generate correctly + + + +Once you get some output, you can increase &kdump; verbosity. Setting + KDUMP_VERBOSE to 11 turns on debugging output during all stages of the &kdump; + process. It: + + Removes the quiet option from the capture kernel command line. + Runs the kdump-save script with -x. + Runs makedumpfile with debugging. + +
\ No newline at end of file
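The reservation arithmetic from the troubleshooting procedure (add up the `crashkernel=` values, then double the sum) can be sketched as a small shell helper. This is an illustrative sketch under stated assumptions, not part of the patch. The function names `sum_crashkernel_mib` and `suggest_crashkernel` are hypothetical, and only the plain `crashkernel=<N>M[,low|,high]` forms described in the procedure are handled.

```shell
#!/bin/sh
# Sum every crashkernel=<N>M[,low|,high] value in a kernel command line.
# Sketch only: other crashkernel syntaxes (ranges, offsets, G or K suffixes)
# are skipped rather than parsed.
sum_crashkernel_mib() (
    set -f                              # no glob expansion while splitting words
    total=0
    for arg in $1; do
        case "$arg" in
            crashkernel=*) ;;
            *) continue ;;
        esac
        size=${arg#crashkernel=}        # drop the parameter name
        size=${size%%,*}                # drop a trailing ",low" or ",high"
        case "$size" in
            *M) num=${size%M} ;;
            *) continue ;;
        esac
        case "$num" in
            ''|*[!0-9]*) continue ;;    # not a plain decimal number
        esac
        total=$((total + num))
    done
    echo "$total"
)

# Print twice the current reservation in the form used for KDUMP_CRASHKERNEL.
suggest_crashkernel() {
    current=$(sum_crashkernel_mib "$1")
    if [ "$current" -eq 0 ]; then
        echo "no crashkernel=<N>M parameter found" >&2
        return 1
    fi
    echo "crashkernel=$((current * 2))M"
}
```

On a running system, the input would typically be the contents of `/proc/cmdline`, for example `suggest_crashkernel "$(cat /proc/cmdline)"`. The printed value is only a starting point for the repeat-and-fine-tune loop described in the procedure.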