Add option to persist boot/update failure info to flash #809
mattia-moffa wants to merge 3 commits into
When boot/update partition verification fails during boot or update, with this option the event is logged to flash in an ad-hoc partition. Information about logged failures is made available to the application through an API.
Pull request overview
This PR adds an optional “persistent failure diagnostics” feature to wolfBoot: when enabled, boot/update/rollback verification failures are recorded to a dedicated flash region and made available to the application via a small read/clear API.
Changes:
- Add a flash-backed, circular log for failure records and expose read/clear APIs in libwolfboot.
- Record verification failures during update/boot and record rollback-not-confirmed events when rollback occurs.
- Add build options and document the new feature and API.
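To make the idea of a flash-backed failure log concrete, here is a minimal sketch that appends fixed-size records to a RAM buffer standing in for the reserved flash region. It is not the PR's implementation: the record layout, `DIAG_SLOTS`, and the `diag_*` helper names are hypothetical, and wrap-around is simplified to erasing the whole region when it fills.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DIAG_RECORD_SIZE 16u  /* the PR states records are exactly 16 bytes */
#define DIAG_SLOTS       4u   /* hypothetical: record slots in the region */

/* Hypothetical record layout; the real one is defined in wolfboot.h. */
struct diag_record {
    uint32_t seq;      /* increasing sequence number, orders the records */
    uint8_t  phase;    /* failure phase */
    uint8_t  cause;    /* failure cause */
    uint8_t  pad[10];  /* pads the record to 16 bytes */
};
static_assert(sizeof(struct diag_record) == DIAG_RECORD_SIZE,
              "record must fill exactly one slot");

/* RAM stand-in for the reserved flash region; erased flash reads as 0xFF. */
static uint8_t diag_flash[DIAG_SLOTS * DIAG_RECORD_SIZE];

static void diag_erase(void)
{
    memset(diag_flash, 0xFF, sizeof(diag_flash));
}

static int diag_slot_is_erased(unsigned slot)
{
    unsigned i;
    for (i = 0; i < DIAG_RECORD_SIZE; i++) {
        if (diag_flash[slot * DIAG_RECORD_SIZE + i] != 0xFF)
            return 0;
    }
    return 1;
}

/* Number of records currently stored: scan until the first erased slot. */
static unsigned diag_count(void)
{
    unsigned n = 0;
    while (n < DIAG_SLOTS && !diag_slot_is_erased(n))
        n++;
    return n;
}

/* Copy record `index` (0 = oldest) into `out`; returns 0 on success. */
static int diag_get(unsigned index, struct diag_record *out)
{
    if (index >= diag_count())
        return -1;
    memcpy(out, &diag_flash[index * DIAG_RECORD_SIZE], sizeof(*out));
    return 0;
}

/* Append a record; when the region is full, erase it and start over
 * (a simplification of a true circular log). */
static void diag_append(uint8_t phase, uint8_t cause)
{
    struct diag_record rec, last;
    unsigned n = diag_count();
    uint32_t seq = 0;

    if (n > 0 && diag_get(n - 1, &last) == 0)
        seq = last.seq + 1;
    if (n == DIAG_SLOTS) {
        diag_erase();
        n = 0;
    }
    memset(&rec, 0, sizeof(rec));
    rec.seq = seq;
    rec.phase = phase;
    rec.cause = cause;
    memcpy(&diag_flash[n * DIAG_RECORD_SIZE], &rec, sizeof(rec));
}
```

An application-side reader would call something like `diag_get()` in a loop until it returns an error, then clear the region once the records have been reported. The sequence number survives the erase here, so the reader can still tell how many failures occurred in total.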
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/update_flash.c | Records boot/update verification failures and rollback-not-confirmed events when diagnostics are enabled. |
| src/libwolfboot.c | Implements the on-flash diagnostics log format, scanning/ordering logic, and the public read/clear APIs. |
| options.mk | Adds build-time options/macros to enable diagnostics and configure the reserved flash region. |
| include/wolfboot/wolfboot.h | Defines failure phases/causes, the persisted record structure, and the new public API prototypes. |
| docs/API.md | Documents the new failure diagnostics feature and how applications consume the records. |
danielinux left a comment
Check the word size comment.
Secondary notes (non-blocking):
- No overlap/alignment validation. Address-must-be-sector-aligned and must-not-overlap-partitions are documented but not enforced at compile time. A BUILD_BUG-style check against partition bounds would prevent a footgun.
- App-side wolfBoot_clear_failures() needs the flash driver. get_failure/clear_failures compile outside the __WOLFBOOT guard, so an application calling clear pulls in hal_flash_unlock/erase/lock. That's presumably intended, but worth a doc note that clearing requires the HAL flash driver linked into the app.
- diag_read non-ext path ignores unmapped/secure memory faults. This is fine for memory-mapped internal flash; just noting that the XMEMCPY assumes the region is always readable.
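The compile-time check suggested in the first note could look roughly like the sketch below. The `DIAG_ADDRESS` and `DIAG_SIZE` names are hypothetical placeholders for whatever macros the PR adds in options.mk, and all numeric values are made-up examples; the partition macros follow wolfBoot's usual configuration names.

```c
#include <assert.h>
#include <stdint.h>

/* Example values only; a real build takes these from the target config. */
#define WOLFBOOT_SECTOR_SIZE              0x1000u
#define WOLFBOOT_PARTITION_SIZE           0x20000u
#define WOLFBOOT_PARTITION_BOOT_ADDRESS   0x08020000u
#define WOLFBOOT_PARTITION_UPDATE_ADDRESS 0x08040000u
#define WOLFBOOT_PARTITION_SWAP_ADDRESS   0x08060000u

/* Hypothetical names for the reserved diagnostics region. */
#define DIAG_ADDRESS 0x08061000u
#define DIAG_SIZE    0x1000u

/* True when [a, a + a_len) and [b, b + b_len) share at least one byte. */
#define RANGES_OVERLAP(a, a_len, b, b_len) \
    (((a) < ((b) + (b_len))) && ((b) < ((a) + (a_len))))

static_assert((DIAG_ADDRESS % WOLFBOOT_SECTOR_SIZE) == 0,
              "diagnostics region must start on a sector boundary");
static_assert((DIAG_SIZE % WOLFBOOT_SECTOR_SIZE) == 0,
              "diagnostics region must span whole sectors");
static_assert(!RANGES_OVERLAP(DIAG_ADDRESS, DIAG_SIZE,
                              WOLFBOOT_PARTITION_BOOT_ADDRESS,
                              WOLFBOOT_PARTITION_SIZE),
              "diagnostics region overlaps the BOOT partition");
static_assert(!RANGES_OVERLAP(DIAG_ADDRESS, DIAG_SIZE,
                              WOLFBOOT_PARTITION_UPDATE_ADDRESS,
                              WOLFBOOT_PARTITION_SIZE),
              "diagnostics region overlaps the UPDATE partition");
static_assert(!RANGES_OVERLAP(DIAG_ADDRESS, DIAG_SIZE,
                              WOLFBOOT_PARTITION_SWAP_ADDRESS,
                              WOLFBOOT_SECTOR_SIZE),
              "diagnostics region overlaps the SWAP sector");
```

With checks of this kind, a misconfigured address fails the build instead of silently corrupting a partition at runtime. Configurations that place a partition on external flash would need the corresponding check skipped, since the address spaces differ.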
    #define WOLFBOOT_FAILURE_CAUSE_NOT_CONFIRMED 4 /* image never confirmed via
                                                    * wolfBoot_success() */

    /* Persisted failure record. Exactly 16 bytes so it maps to a single 128-bit
The 128-bit word assumption does not hold for all flash models. Some microcontrollers (e.g. STM32H7) have 256-bit flash words, so wolfBoot_record_failure would fail.
Suggested fix: pad DIAG_HDR_SIZE/DIAG_RECORD_SIZE to the platform flash write granularity (or expose a WOLFBOOT_DIAGNOSTICS_RECORD_SIZE that defaults to max(16, flash_word_size)), and update the size static-asserts accordingly.
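One possible shape for this fix is sketched below. `WOLFBOOT_DIAGNOSTICS_RECORD_SIZE` is the name proposed in the comment; `FLASH_WORD_SIZE`, the payload layout, and `diag_slot_init()` are hypothetical and only illustrate padding a 16-byte payload up to the flash write granularity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: flash write granularity in bytes (32 = 256-bit words). */
#ifndef FLASH_WORD_SIZE
#define FLASH_WORD_SIZE 32u
#endif

#define DIAG_PAYLOAD_SIZE 16u  /* meaningful bytes of one record */

/* Defaults to max(16, flash word size), as suggested in the review. */
#ifndef WOLFBOOT_DIAGNOSTICS_RECORD_SIZE
#define WOLFBOOT_DIAGNOSTICS_RECORD_SIZE \
    ((FLASH_WORD_SIZE > DIAG_PAYLOAD_SIZE) ? FLASH_WORD_SIZE : DIAG_PAYLOAD_SIZE)
#endif

/* Hypothetical payload layout, 16 bytes. */
struct diag_payload {
    uint32_t seq;
    uint8_t  phase;
    uint8_t  cause;
    uint8_t  reserved[10];
};

/* One on-flash slot: the payload padded to the record size. */
union diag_slot {
    struct diag_payload payload;
    uint8_t raw[WOLFBOOT_DIAGNOSTICS_RECORD_SIZE];
};

static_assert(sizeof(struct diag_payload) == DIAG_PAYLOAD_SIZE,
              "payload must stay 16 bytes");
static_assert(sizeof(union diag_slot) == WOLFBOOT_DIAGNOSTICS_RECORD_SIZE,
              "slot must match the configured record size");
static_assert((WOLFBOOT_DIAGNOSTICS_RECORD_SIZE % FLASH_WORD_SIZE) == 0,
              "record size must be a multiple of the flash word size");

/* Build a full slot image: payload first, remaining bytes left as 0xFF.
 * The 0xFF padding value is an assumption, chosen to match erased flash. */
static void diag_slot_init(union diag_slot *slot, const struct diag_payload *p)
{
    memset(slot->raw, 0xFF, sizeof(slot->raw));
    memcpy(slot->raw, p, sizeof(*p));
}
```

Each record is then written as one or more whole flash words, so a platform with 256-bit words programs a 32-byte slot in a single operation while a 128-bit platform keeps the original 16-byte layout.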
When boot/update partition verification fails during boot or update, with this option the event is logged to flash in an ad-hoc partition. Information about logged failures is made available to the application through an API.
For a more detailed description, see the changes to docs/API.md.