From 640cd908df47bcfff02b61dc167f096d0f1f10ee Mon Sep 17 00:00:00 2001
From: tommy
Date: Wed, 6 May 2026 21:29:16 -0500
Subject: [PATCH] =?UTF-8?q?docs:=20P5-11=20=E2=80=94=20compute5=20nvme1=20?=
 =?UTF-8?q?PCIe=20quirk=20verified,=20no=20action=20needed?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

platform quirk 'simple suspend' is applied by PVE kernel automatically
for i7-13700T platform (both nvme0 and nvme1). Not a cmdline parameter;
/etc/kernel/cmdline absent. Persists across kernel updates by default.
Verified: dmesg confirms quirk active on both drives at current boot.
P5-11 status: monitor only, no user action required.

Co-Authored-By: Claude Sonnet 4.6
---
 runbooks/phase5-incident-log.md | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/runbooks/phase5-incident-log.md b/runbooks/phase5-incident-log.md
index 6df38f9..fe9f79c 100644
--- a/runbooks/phase5-incident-log.md
+++ b/runbooks/phase5-incident-log.md
@@ -220,15 +220,12 @@ After P5-01 completes and PBS is confirmed stable for 48h.
 ### P5-10 — Pi4 node-exporter ARM64 Deploy
 See P5-08.
 
-### P5-11 — Compute5 SK Hynix PC711 PCIe Power Management
-nvme1n1 on compute5 (SK Hynix PC711 1TB, `0000:03:00.0`) has 2,362 power cycles and 84 unsafe shutdowns — indicative of PCIe runtime PM aggressively power-cycling the drive. Kernel applied `platform quirk: setting simple suspend` in the current boot. Verify this persists:
-```bash
-# Check if quirk is active post-reboot:
-dmesg | grep -i 'nvme.*simple suspend\|03:00.*quirk'
-# If not applied, add to kernel cmdline or create modprobe conf:
-# nvme_core.default_ps_max_latency_us=0
-```
-WD PC SN740 (nvme0) on same node has 56 unsafe shutdowns in 1,407h — likely from pre-journal setup period and PCIe PS behavior. No action unless counts accumulate.
+### P5-11 — Compute5 SK Hynix PC711 PCIe Power Management (monitor only)
+nvme1n1 on compute5 (SK Hynix PC711 1TB, `0000:03:00.0`) has 2,362 power cycles and 84 unsafe shutdowns — indicative of PCIe runtime PM aggressively power-cycling the drive.
+
+**Quirk status verified 2026-05-06:** The kernel applies `platform quirk: setting simple suspend` automatically to both nvme0 and nvme1 — it is a built-in driver quirk for this CPU/chipset (i7-13700T platform), not a cmdline parameter. `/etc/kernel/cmdline` does not exist; `/proc/cmdline` has no nvme_core flags. The quirk persists across kernel updates by default. No user action required.
+
+Monitor: check `Unsafe Shutdowns` and `Power Cycles` in SMART at each health check. If counts continue accumulating after the quirk is active, escalate to drive replacement or PCIe slot investigation. WD PC SN740 (nvme0) on same node: 56 unsafe shutdowns in 1,407h — attributed to pre-journal setup period and PCIe PS interaction; no action unless accumulating.
 
 ---
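
The monitoring step in P5-11 compares two SMART counters between health checks. A minimal sketch of extracting them is given here, assuming smartmontools is installed and that `smartctl -A` prints the NVMe fields with the labels `Power Cycles` and `Unsafe Shutdowns`. The helper name `nvme_counter` is hypothetical and is not part of the runbook.

```bash
# Hypothetical helper: read `smartctl -A` NVMe output on stdin and print
# the numeric value of one labelled field, e.g. "Unsafe Shutdowns".
# Assumes "Label:   value" lines; spaces and thousands commas are stripped.
nvme_counter() {
  awk -F: -v key="$1" '$1 == key { gsub(/[ ,]/, "", $2); print $2 }'
}
```

A possible invocation on compute5 (needs root; the device path is assumed from the `nvme1n1` name above) would be `smartctl -A /dev/nvme1n1 | nvme_counter "Unsafe Shutdowns"`. Recording the value at each health check makes it easy to see whether the count is still climbing with the quirk active.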