Systemd hangs on shutdown
Sun, Jan 23 2022
Problem description
I've been experiencing seemingly random hangs during shutdown and reboot sequences. My computer would show a black screen and a blinking underscore cursor in the upper left corner of the screen.
These hangs would take approximately 90 seconds, which is the default waiting time between SIGTERM
and SIGKILL
signals sent by systemd
to processes during shutdown. After those 90 seconds, the computer would proceed with the shutdown.
Pinpointing the problem
After inspecting the output of journalctl -r
, I've found one and a half minutes gap between these two entries:
Jul 17 14:08:48 arch systemd[565]: Stopped target Main User Target.
Jul 17 14:07:20 arch systemd[565]: pulseaudio.service: Succeeded.
Interesting. But let's dig deeper.
I started by adding systemd
debug options to the grub
configuration file located at /etc/default/grub
:
GRUB_CMDLINE_LINUX="systemd.log_level=debug systemd.log_target=kmsg log_buf_len=1M printk.devkmsg=on enforcing=0"
Then I had to update the grub
config:
grub-mkconfig -o /boot/grub/grub.cfg
I needed a script to output the dmesg
log to a file. I called it
/usr/lib/systemd/system-shutdown/debug.sh
and it contained these lines:
#!/bin/sh
mount -o remount,rw /
dmesg > /shutdown-log.txt
mount -o remount,ro /
To make it executable, I ran
chmod +x /usr/lib/systemd/system-shutdown/debug.sh
I then rebooted my computer and inspected the ~/shutdown-log.txt
file for entries containing "timeout". In my case, the offender was openvpn
:
[11500.333742] systemd[1]: systemd-networkd.service: Got notification message from PID 291 (WATCHDOG=1)
[11524.641662] systemd[1]: session-1.scope: Stopping timed out. Killing.
[11524.641773] systemd[1]: session-1.scope: Killing process 18930 (sudo) with signal SIGKILL.
[11524.641815] systemd[1]: session-1.scope: Killing process 18956 (openvpn) with signal SIGKILL.
[11524.641900] systemd[1]: session-1.scope: Failed with result 'timeout'.
[11524.641912] systemd[1]: session-1.scope changed stop-sigterm -> failed
Solution
Since I don't care about why the openvpn
process doesn't shutdown gracefully, I can just decrease the period between
SIGTERM
and SIGKILL
signals sent by systemd
.
I did that by editing the systemd
configuration file located at /etc/systemd/system.conf
.
I looked for the for the DefaultTimeoutStopSec
entry in that file, un-commented it, and set the value to 3s
.
After rebooting the computer, the long hangs were fixed.