Thanks to Michael Neuling at OzLabs, who gave me the heads up and wrote up the patch. One of my pain points doing development on the POWER9 is that hardware watchpoints are disabled at the kernel level. This is because the CPU will checkstop if a watchpoint is set on cache-inhibited memory such as device memory, and a checkstop will invariably bring the system down. The formal name for the special purpose register governing this feature (recall that Power ISA has three classes of registers, i.e., general purpose, floating point and special purpose) is the Data Access Watchpoint Register, or DAWR. There is no software workaround for this problem, and because a malicious local user could bring the system down without privileges by managing to provoke such a situation, setting watchpoints via the DAWR is currently disabled for safety. Unfortunately, software watchpoints are sometimes hundreds of times slower than hardware watchpoints, and for certain debugging tasks (such as JIT code generation) hardware watchpoints are just about indispensable.
IBM notes this issue as an erratum, which implies they see it as a defect and suggests it will be fixed in future hardware (it does not affect POWER8). Until then, Michael's patch enables "DAWR YOLO mode" for those of us (like me) who are single users on a workstation, know what we're doing, need hardware watchpoints to debug our software before the heat death of the universe, and accept the risk of system crashes. It creates a debugfs switch at /sys/kernel/debug/powerpc/dawr_enable_dangerous that lets the superuser (mostly) freely turn access to the DAWR off and on; see the patch for more details. Fortunately, this change has finally been queued for kernel version 5.2, which means I hopefully won't have to screw around with a custom kernel for much longer, and that is very good news for other developers in the same boat. Thanks, Michael!
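For anyone who wants to script the switch rather than poke it by hand, here is a minimal sketch in Python. It assumes the file behaves like an ordinary debugfs boolean (reads back Y or N, accepts Y or N on write), that debugfs is mounted at /sys/kernel/debug, and that you are running as root on a kernel carrying the patch. The helper names dawr_enabled and set_dawr are mine, not anything from the patch, so check the patch itself before relying on this.

```python
from pathlib import Path

# Path of the switch as given in the post. The Y/N file format is an
# assumption based on how debugfs boolean files normally behave.
DAWR_SWITCH = Path("/sys/kernel/debug/powerpc/dawr_enable_dangerous")


def dawr_enabled(switch=DAWR_SWITCH):
    """Return True if the switch currently reads back as enabled."""
    return Path(switch).read_text().strip().upper().startswith("Y")


def set_dawr(enabled, switch=DAWR_SWITCH):
    """Turn DAWR access on or off (hypothetical helper; needs root)."""
    Path(switch).write_text("Y\n" if enabled else "N\n")


if __name__ == "__main__":
    # Report the current state only; flipping it is left as a deliberate act.
    if DAWR_SWITCH.exists():
        print("DAWR enabled:", dawr_enabled())
    else:
        print("No DAWR switch found (unpatched kernel or debugfs not mounted).")
```

Calling set_dawr(True) before a debugging session and set_dawr(False) afterwards keeps the window of risk small, which seems in the spirit of a file with "dangerous" in its name.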
In the interim, you can get the patch with a kernel from https://copr.fedorainfracloud.org/coprs/sharkcz/talos-kernel/ which follows the Fedora Rawhide nodebug kernel stream.
There seems to be something wrong with 5.1 that's using an updated "DMA" patchset, but the 5.1-rc7 build is OK.
Will this patch also be merged into 5.2?
https://lkml.org/lkml/2019/4/5/101
Doesn't seem so; a new iteration is in progress to address the review.