system output
Output surfaces for Linux guests
Related: Linux, linux troubleshooting, kernel patch, virtualized time, system output ja.
System output is not just logs. For Linux and guest OS work on a virtualized platform, useful evidence comes from several surfaces with different lifetimes:
- live kernel messages,
- tracing streams,
- BPF maps and ring buffers,
- crash dumps,
- RAM-backed persistent records,
- host or hypervisor traces,
- raw memory left after failure.
The goal is not to collect everything. The goal is to preserve the evidence that answers one question: which layer owned the failed resource when the system crossed from healthy to broken?
A logging-only view is too small for systems work. A machine can reboot before userspace starts. A guest can blame a driver while the real event is a missed virtual interrupt. RAM can contain the only surviving record. This note is the broader map I want when debugging those systems.
The evidence model
For this note, I split output into four classes.
| Class | Examples | When it helps |
|---|---|---|
| Live logs | dmesg -w, journalctl -f, serial console capture | the system stays alive long enough to stream |
| Live tracing | ftrace, perf, bpftrace, BPF CO-RE tools | the symptom is reproducible and timing matters |
| Persistent RAM records | pstore, ramoops, ftrace-ramoops, console-ramoops | the system reboots, panics, or loses userspace |
| External evidence | serial console, hypervisor trace, monitor dump, RAM dump | the guest is not trustworthy or is already dead |
The classes have different failure modes. A journal can be late. A userspace buffer can be circular. A trace buffer can perturb timing. A ramoops region can be overwritten by the next crash. A host trace can describe a vCPU that the guest thought was running. Keep the surface name with the evidence.
I use this escalation path:
- Event count: count how often something happens.
- Profiling: sample where time is spent.
- Tracing: record timestamped events and reconstruct ordering.
- Postmortem: inspect records left in RAM or a dump after the live system is gone.
Do not start with the most powerful tool. Start with the cheapest surface that can answer the current hypothesis.
Kernel messages: printk, dmesg, and kmsg
Kernel messages are still the first source for boot, driver, and panic work. Kernel code commonly emits messages through printk, pr_*, and dev_* helpers:
pr_info("driver init\n");
pr_warn("retrying register read\n");
dev_err(dev, "probe failed: %d\n", ret);
For device drivers, prefer dev_* where possible. It attaches device context, which matters when the same driver instance exists in several guests or several device-tree nodes.
The path is:
kernel call site
-> printk record
-> kernel log buffer
-> /dev/kmsg
-> dmesg, journald, log plumbing, or a collector
Useful commands:
dmesg --time-format=iso --decode
dmesg --level=err,warn
dmesg --follow
journalctl -k -b --no-pager
cat /proc/sys/kernel/printk
sysctl kernel.printk
console_loglevel controls which messages reach the console. That is separate from whether the record exists in the kernel buffer. During boot-loop work, console visibility is a debugging interface, not just a cosmetic setting.
cat /proc/sys/kernel/printk
# console_loglevel default_message_loglevel minimum_console_loglevel default_console_loglevel
dmesg -s or --buffer-size still matters when reading older notes and compatibility paths:
dmesg --buffer-size 262144
On modern systems, /dev/kmsg is usually the important interface, but the operational rule is older than the implementation detail: a noisy failure can overwrite the earlier message that explained the cause. Preserve complete output before adding more instrumentation.
RAM-backed records: pstore and ramoops
RAM is often the best storage medium left after the crash. pstore is the kernel interface for persistent storage of crash records. ramoops is a pstore backend that writes records into a reserved RAM area.
The important property is lifetime. A normal log needs userspace and storage. A ramoops record needs a reserved memory region that survives reset well enough for the next boot to read it.
Typical records appear under:
mount -t pstore pstore /sys/fs/pstore 2>/dev/null || true
ls -l /sys/fs/pstore
cat /sys/fs/pstore/console-ramoops-0
cat /sys/fs/pstore/dmesg-ramoops-0
cat /sys/fs/pstore/ftrace-ramoops-0
In a device tree, a reserved ramoops region looks conceptually like this:
reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
ranges;
ramoops@8f000000 {
compatible = "ramoops";
reg = <0 0x8f000000 0 0x100000>;
record-size = <0x4000>;
console-size = <0x4000>;
ftrace-size = <0x4000>;
pmsg-size = <0x4000>;
};
};
For platform bring-up or image qualification, I check:
cat /proc/iomem | grep -iE 'ramoops|pstore|reserved'
cat /proc/cmdline | tr ' ' '\n' | grep -iE 'ramoops|pstore|memmap'
zcat /proc/config.gz 2>/dev/null | grep -E 'PSTORE|RAMOOPS'
grep -R . /sys/module/ramoops/parameters 2>/dev/null
If pstore is empty, do not assume the system did not panic. Check the reserved address, overlap with firmware carveouts, whether the reset preserved the memory, whether ECC or boot firmware scrubbed RAM, and whether another crash already overwrote the record.
For shutdown or late panic debugging, ftrace can also be recorded to pstore:
mount -t debugfs debugfs /sys/kernel/debug 2>/dev/null || true
echo 1 > /sys/kernel/debug/pstore/record_ftrace
reboot -f
tail /sys/fs/pstore/ftrace-ramoops-0
That is valuable when the final live console only says "rebooting" but the RAM record still has the function path.
Crash dumps and raw RAM
pstore is small. For memory corruption, page allocator bugs, use-after-free, or a panic after large state buildup, you often need a larger dump.
Common Linux paths:
cat /proc/vmcore # only inside the capture kernel
crash vmlinux vmcore
makedumpfile -d 31 /proc/vmcore vmcore.filtered
Guest and hypervisor paths vary:
- firmware or monitor RAM dump,
- hypervisor guest physical memory dump,
- offline memory capture,
- crash kernel with reserved
crashkernel=, - platform minidump framework,
- core dump plus pstore plus platform RAM dump.
When taking a raw RAM dump, record the metadata next to the file:
machine / SKU / revision
kernel image build id
vmlinux or unstripped symbols
System.map and /proc/kallsyms if available
guest physical address layout
reserved memory map
hypervisor version
reset reason
clock source and boot id
Without address layout and symbols, a RAM image is just bytes. With them, it can answer concrete questions: which task was running, which lock was held, which interrupt was pending, which page was poisoned, and whether a buffer still contains the record that userspace never flushed.
Useful live memory state before a crash:
cat /proc/iomem
cat /proc/meminfo
cat /proc/vmallocinfo | head
cat /proc/slabinfo | head
cat /proc/buddyinfo
cat /proc/pagetypeinfo 2>/dev/null | head
cat /proc/zoneinfo | head
Useful debug features when you can build the kernel:
CONFIG_KASAN
CONFIG_KFENCE
CONFIG_KCSAN
CONFIG_DEBUG_PAGEALLOC
CONFIG_PAGE_OWNER
CONFIG_DEBUG_VM
CONFIG_SLUB_DEBUG
CONFIG_PSTORE
CONFIG_PSTORE_RAM
CONFIG_PSTORE_FTRACE
CONFIG_KEXEC_CORE
CONFIG_CRASH_DUMP
These are not free. On constrained targets they change timing, memory pressure, and sometimes cache behavior. Use them intentionally and record that they were enabled.
GDB for cores and live state
GDB is useful when the evidence is a core dump, a live process, a reproduced crash under controlled conditions, or a small piece of machine state extracted from a larger failure. It is not a replacement for pstore, ftrace, BPF, or a vmcore reader. Use it when you have symbols, registers, mappings, and a concrete question about control flow or data state.
The capture rule is simple: keep the executable, shared libraries, debug files, build ID, core file, and the command line together. A backtrace without matching symbols is often just a timestamped guess.
Opening a core
gdb -q /path/to/program /path/to/core
Inside GDB:
set pagination off
set print pretty on
set print frame-arguments all
set disassemble-next-line on
info files
info proc mappings
info sharedlibrary
info threads
thread apply all bt full
If symbols live outside the target root:
set sysroot /path/to/sysroot
set solib-search-path /path/to/sysroot/lib:/path/to/sysroot/usr/lib
set debug-file-directory /path/to/debug
sharedlibrary
For reproducible local runs, keep the run environment explicit:
set args --config /tmp/repro.conf
set environment RUST_BACKTRACE=1
run
Commands I use constantly
| Command | Use |
|---|---|
bt | short backtrace |
bt full | backtrace with locals |
thread apply all bt full | all thread stacks, usually the first command on a core |
info registers | register state at the selected frame |
info frame | saved registers, frame address, caller information |
frame N | select a stack frame |
up / down | move through frames |
list | show source near the current line |
info locals | local variables |
info args | function arguments |
ptype expr | print type layout |
p expr | print expression |
p/x expr | print expression in hex |
x/16gx addr | examine 16 eight-byte words |
x/32xb addr | examine 32 bytes |
disassemble /m func | mixed source and assembly |
info symbol addr | map an address to a symbol |
maintenance info sections | loaded sections and ranges |
Crash triage pattern
For a native crash, I usually start with:
set pagination off
thread apply all bt full
info registers
info proc mappings
info files
Then select the crashing frame and check arguments and nearby memory:
thread 1
frame 0
info args
info locals
p/x $pc
x/8i $pc-16
x/32gx $sp
If the program counter is inside a shared object but symbols are missing, use the mappings:
info proc mappings
info symbol 0x7f1234567890
For a stripped binary, the build ID and mapping base are more useful than guessing. Find the mapped file and load matching symbols:
add-symbol-file /path/to/libfoo.so.debug 0x7f1234500000
Data corruption pattern
When the symptom is corrupted data, the question is often whether the value is invalid, stale, freed, or written by the wrong owner.
p object
ptype object
p/x object->state
x/64gx object
info symbol object->callback
For lists and queues, inspect both the container and the pointed-to elements. A valid-looking pointer can still point into the wrong lifetime:
p/x node
x/4gx node
info proc mappings
If the address is outside any mapping, the core and binary are not enough. Move to pstore, vmcore, allocator debugging, page owner, KASAN, KFENCE, or a RAM image.
Live attach pattern
Attach only when the pause is acceptable:
gdb -q -p "$(pidof target-service)"
Inside GDB:
set pagination off
info threads
thread apply all bt
continue
For short probes, prefer breakpoint commands that collect and continue:
break suspicious_function
commands
silent
printf "hit suspicious_function arg=%p\n", arg
bt 4
continue
end
If attach changes timing too much, stop using GDB and move to ftrace, perf, or BPF.
Kernel-adjacent use
For a kernel vmcore, use crash or a kernel-aware debugger first. GDB can inspect raw symbols and disassembly, but it does not understand many kernel data structures by itself.
Useful kernel-adjacent GDB cases still exist:
- decoding a userspace core that was produced by a system failure,
- checking a small repro program,
- validating a driver helper compiled as ordinary code,
- inspecting an address or disassembly from a trace,
- attaching through kgdb when the kernel is configured for it.
For kgdb-style sessions, keep the symbol file exact:
gdb -q vmlinux
target remote /dev/ttyS0
info registers
bt
lx-symbols
The lx-* commands depend on the kernel's GDB scripts and the debugging environment. If they are unavailable, do not pretend the view is complete. Fall back to crash, vmcore analysis, ftrace, BPF, and RAM-backed records.
Guest userspace output surfaces
A Linux guest can have a full userspace logging stack, a minimal initramfs, or almost no userspace at all. Do not treat userspace logs as a replacement for kernel evidence.
The main userspace surfaces are:
| Surface | What it gives |
|---|---|
| service logs | records emitted by init systems, daemons, and applications |
dmesg | kernel ring buffer when accessible |
/sys/fs/pstore | kernel crash, console, ftrace, pmsg records after reboot |
| core dumps | native crash state with registers, stacks, mappings, and fds |
| support bundle | packaged state across services and system interfaces |
| service introspection | service-specific state snapshots |
Basic capture:
dmesg --time-format=iso --decode > dmesg.txt
journalctl -b --no-pager > journal.txt
journalctl -k -b --no-pager > journal-kernel.txt
ls -l /sys/fs/pstore
cp -a /sys/fs/pstore ./pstore
coredumpctl list 2>/dev/null
Userspace logs can be structured, buffered, filtered, rate-limited, and delayed. A native crash may produce a core dump or a crash handler record only if the relevant handler and storage path were alive at the time of failure.
For platform work, read crash records like state snapshots, not like ordinary logs. A useful record includes registers, stack traces, memory maps, open file descriptors, and the exact binary build ID. If you have matching unstripped symbols, symbolize the stack instead of guessing from raw program counters.
On a hypervisor, userspace symptoms add one more split:
application or service symptom
-> native service, IPC, or syscall path
-> guest kernel driver / irq / scheduler
-> virtual device model or hypervisor event
-> physical device / firmware
If a service is blocked, userspace logs may show the symptom and nothing about the lower layer. Check IPC state, scheduler traces, guest interrupts, and host-side virtual device events.
Event count, profiling, tracing
A useful progression is:
- Event counting tells you whether a class of event is abnormal.
- Profiling tells you where time is being spent.
- Tracing tells you the order of events.
That progression avoids a common failure: recording a huge trace before you know which signal matters.
Event counting examples:
perf stat -e task-clock,context-switches,cpu-migrations,page-faults -a -- sleep 10
perf stat -e cycles,instructions,branches,branch-misses -a -- sleep 10
perf stat -e cache-references,cache-misses -a -- sleep 10
Profiling examples:
perf record -F 99 -a -g -- sleep 10
perf report
perf script > out.perf
Tracing examples:
trace-cmd record -e sched:sched_switch -e irq:* -e workqueue:* sleep 5
trace-cmd report
For a guest OS, run the count on both sides if possible. A guest can show low CPU usage while the host shows the vCPU is not scheduled. A guest can show block I/O timeout while the host shows a backend queue stall.
ftrace and tracefs
ftrace is the built-in kernel tracing framework exposed through tracefs, usually mounted at /sys/kernel/tracing.
Start by checking the surface:
mount -t tracefs nodev /sys/kernel/tracing 2>/dev/null || true
cd /sys/kernel/tracing
cat available_tracers
cat available_events | head
cat tracing_on
Function graph tracing is useful when you know the function family:
echo 0 > tracing_on
echo function_graph > current_tracer
echo '*read*' > set_ftrace_filter
echo 1 > tracing_on
cat /path/to/file >/dev/null
echo 0 > tracing_on
cat trace > /tmp/read.trace
Event tracing is safer and usually more stable:
echo 0 > tracing_on
echo > trace
echo 1 > events/sched/sched_switch/enable
echo 1 > events/irq/irq_handler_entry/enable
echo 1 > events/irq/irq_handler_exit/enable
echo 1 > tracing_on
sleep 3
echo 0 > tracing_on
cat trace > /tmp/events.trace
For boot problems, boot-time tracing can start earlier than a manual shell session. For latency problems, kernel 6.x tracing documentation includes tracers such as osnoise, timerlat, and hwlat. Those are especially relevant on systems where missed latency budgets are failures, not just performance noise.
A good low-level trace includes:
- scheduler switches,
- IRQ entry and exit,
- softirq,
- workqueue execution,
- power and clock events,
- block or network tracepoints if the symptom is I/O,
- driver-specific tracepoints if available,
- host/hypervisor events for vCPU scheduling and virtual interrupts.
BPF as controlled instrumentation
BPF is what I reach for when printk is too blunt, ftrace is too broad, and GDB changes the system too much.
The mechanism is:
- Compile a BPF program.
- Load it through the
bpfsyscall. - Let the verifier check safety.
- Attach it to tracepoints, kprobes, uprobes, perf events, LSM hooks, or other supported attachment points.
- Move results through maps, perf buffers, or BPF ring buffers.
For quick experiments, bpftrace is the fastest path.
List syscall tracepoints:
bpftrace -l 'tracepoint:syscalls:sys_enter_*'
Count syscalls by process:
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'
Trace open paths:
bpftrace -e 'tracepoint:syscalls:sys_enter_openat { printf("%s %s\n", comm, str(args->filename)); }'
Measure read latency:
bpftrace -e 'kprobe:vfs_read { @start[tid] = nsecs; } kretprobe:vfs_read /@start[tid]/ { @ns[comm] = hist(nsecs - @start[tid]); delete(@start[tid]); }'
Profile kernel stacks:
bpftrace -e 'profile:hz:99 { @[kstack] = count(); }'
Good BPF debugging attaches to request setup and completion, carries a request pointer through a map, then emits latency and operation type. For modern kernels, prefer tracepoints when they exist, because kprobe targets can change across versions. Use kprobes when the internal function is exactly the hypothesis.
Block I/O with stable tracepoints:
bpftrace -e 'tracepoint:block:block_rq_issue { @bytes = hist(args->bytes); }'
bpftrace -e 'tracepoint:block:block_rq_complete { @lat[args->dev] = count(); }'
When writing a reusable tool, use libbpf, BTF, and CO-RE instead of embedding kernel headers at runtime. The running kernel exposes BTF at:
ls -l /sys/kernel/btf/vmlinux
bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h
bpftool prog list
bpftool map list
CO-RE matters for kernel 6.x and vendor kernels because structure layouts vary. It does not make unstable probe points stable. It only makes type access more portable when the target field still exists and BTF is available.
BPF ring buffers are useful when event order matters. Compared with per-CPU perf buffers, a shared BPF ring buffer can preserve ordering across CPUs for sequential events such as fork, exec, exit, request issue, and completion. That matters for guest debugging because cross-CPU order is often the bug.
Guest BPF constraints
On managed or restricted guests, the available BPF surface depends on build type, kernel config, security policy, and operator restrictions. Treat BPF as a platform engineering tool, not as something always available on production systems.
For debug or engineering builds:
mount -t tracefs nodev /sys/kernel/tracing 2>/dev/null
cat /sys/kernel/tracing/available_events | head
bpftool prog list
cat /sys/kernel/btf/vmlinux >/tmp/vmlinux.btf
For platform bugs:
- use userspace logs to locate the service or native symptom,
- use core dumps or crash records for native crashes,
- use ftrace or BPF for scheduler, IPC, IRQ, block, or driver timing,
- use pstore/ramoops for reboot and panic evidence,
- use host or hypervisor traces when the guest cannot observe the event.
IPC and framework behavior can hide behind system calls. A pure syscall trace may not explain why a service is blocked. In that case, combine userspace tags, IPC state, scheduler events, and kernel wait paths.
Hypervisor guest debugging
For a guest OS on a hypervisor, every record should be labeled by layer:
| Layer | Evidence |
|---|---|
| Guest userspace | journald, core dumps, service logs, state snapshots |
| Guest kernel | dmesg, pstore, ftrace, BPF, procfs/sysfs |
| Virtual platform | virtio queues, MMIO traps, injected IRQs, virtual timers |
| Host or management VM | device model logs, backend storage/network errors |
| Hypervisor | vCPU scheduling, stage-2 faults, traps, reset reason |
| Hardware or firmware | watchdog, power, secure monitor, crash dump |
Common mistake: trusting only the guest. If the guest says "block timeout", that can mean the guest driver is broken, but it can also mean the backend never completed the request, the virtual interrupt was not injected, the vCPU was not scheduled, or the guest clock jumped.
For a failure report, I want at least:
guest dmesg with monotonic timestamps
guest pstore after reboot
core dump or crash record, if native crash
guest /proc/interrupts and /proc/iomem
host/hypervisor event trace covering the same window
virtual device queue or backend logs
reset reason
clocksource and timestamp basis for each log
Playbooks
Boot loop before userspace
Use surfaces that do not require userspace:
earlycon
serial console
dmesg from bootloader or monitor capture
/sys/fs/pstore/console-ramoops-0 after reboot
hypervisor reset reason
host virtual console capture
Questions:
- Did the kernel start?
- Did the console start?
- Did init start?
- Did the reboot come from watchdog, panic, explicit reset, or hypervisor policy?
- Is the pstore area valid and not overwritten?
Native service crash
Collect:
journalctl -b --no-pager > journal.txt
coredumpctl list
coredumpctl info <PID-or-COREDUMP>
coredumpctl dump <PID-or-COREDUMP> > core.dump
Then symbolize with matching build artifacts. Do not stop at the last userspace log line. The crash record or core dump is the actual crash artifact.
I/O latency or timeout
Start with count and profile:
perf stat -e task-clock,context-switches,page-faults -a -- sleep 10
cat /proc/interrupts
cat /proc/softirqs
Then trace:
trace-cmd record -e block:block_rq_issue -e block:block_rq_complete -e irq:* sleep 10
bpftrace -e 'tracepoint:block:block_rq_issue { @bytes = hist(args->bytes); }'
In a guest, pair this with host or hypervisor queue events. A guest-side request issue without host-side completion is a boundary problem.
vCPU starvation
Guest symptoms:
- soft lockup,
- RCU stall,
- timer jump,
- delayed workqueue,
- long scheduling gap.
Guest evidence:
dmesg | grep -iE 'soft lockup|rcu|watchdog|sched|clocksource'
trace-cmd record -e sched:sched_switch -e irq:* sleep 10
cat /proc/schedstat
Host evidence:
vCPU run/exit trace
physical CPU load
IRQ routing
timer injection
hypervisor scheduling class
RAM corruption
First decide whether the evidence is live or postmortem.
Live:
cat /proc/slabinfo | head
cat /proc/buddyinfo
cat /proc/vmstat | grep -E 'pgfault|oom|compact|allocstall'
cat /sys/kernel/debug/page_owner 2>/dev/null | head
Postmortem:
pstore panic record
vmcore
raw RAM dump
KASAN report
KFENCE report
page_owner snapshot
hypervisor stage-2 fault trace
For raw RAM, preserve the exact kernel image and symbol set. If KASLR is enabled, record the load offset or extract it from the dump. If the platform has reserved regions, include firmware tables or device tree data when relevant, and always keep /proc/iomem with the dump.
Minimal capture sets
Linux guest
uname -a
cat /etc/os-release 2>/dev/null
cat /proc/cmdline
cat /proc/iomem
cat /proc/interrupts
cat /proc/softirqs
dmesg --time-format=iso --decode > dmesg.txt
journalctl -b --no-pager > journal.txt 2>/dev/null
journalctl -k -b --no-pager > journal-kernel.txt 2>/dev/null
ls -l /sys/fs/pstore 2>/dev/null
cp -a /sys/fs/pstore pstore 2>/dev/null || true
findmnt
lsblk -f
coredumpctl list 2>/dev/null
BPF readiness
zcat /proc/config.gz 2>/dev/null | grep -E 'BPF|BTF|KPROBE|FTRACE|TRACEPOINT'
ls -l /sys/kernel/btf/vmlinux
bpftool feature probe kernel
bpftool prog list
bpftool map list
bpftrace -l 'tracepoint:sched:*' | head
pstore readiness
zcat /proc/config.gz 2>/dev/null | grep -E 'PSTORE|RAMOOPS'
cat /proc/iomem | grep -iE 'pstore|ramoops|reserved'
ls -l /sys/fs/pstore
grep -R . /sys/module/ramoops/parameters 2>/dev/null
Reading output well
An output record is not a cause. It is a statement emitted by one layer at one time through one surface.
Read it with these questions:
- Which layer emitted this?
- Was the layer still healthy when it emitted it?
- Is the timestamp guest time, host time, monotonic time, or wall time?
- Could earlier records have been overwritten?
- Did instrumentation change timing?
- Does another layer confirm the same transition?
- Is this a symptom after the failure, or the first observable state change?
The right evidence set is the one that can survive the failure mode. If userspace dies, use kernel and RAM-backed records. If the kernel dies, use pstore, serial, hypervisor, or dump. If the guest lies because it is not running, use host and hypervisor traces.
References
- Linux tracing documentation
- Ramoops oops/panic logger
- BPF ring buffer
- libbpf overview and BPF CO-RE
- BTF documentation
- dmesg(1)