Fixing the pi
Speculation
I have a Raspberry Pi 3B that runs Home Assistant and Z-Wave containers to control some switches. It ran fine for about a year, but it has started failing intermittently. I'm not sure of the cause; my suspects so far:
- Power: this has been a problem before. The adapter sometimes supplied low or no voltage to the Pi, and it drains power if plugged into a phone. Could this have damaged the board in some way?
- Board: it has problems of its own. The Wi-Fi is broken and requires a dongle. The SD card slot is damaged and was reinforced with glue (multiple attempts to solder it failed, and I made a mess of it). It currently works only because it is tightly secured with screws that keep the slot sandwiched between the case and the board.
- SD card corruption: I suspected this, but it turned out not to be the case.
- OS: I suspect this one. piOS (Bookworm 12) and DietPi are unstable when accessed via SSH. Bullseye 11 seems alright initially, but after a few weeks it starts encountering this issue:
This issue occurred once before (last month). I forgot what I did to fix it, so documenting it here might help my future self. I have also been meaning to start a blog for self-improvement (and as a way to distract myself from some thoughts; a lot has happened lately), which I have procrastinated on for years. 😔
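One way to test the power suspicion from the list above is to ask the firmware whether it has seen under-voltage. The sketch below decodes the value printed by `vcgencmd get_throttled`, using the bit meanings from the Raspberry Pi documentation. It assumes `vcgencmd` is available on the host (it ships with Raspberry Pi OS; I have not checked DietPi), and the helper names `decode_throttled` and `read_throttled` are made up for illustration.

```python
import subprocess

# Bit positions documented for `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected now",
    1: "ARM frequency capped now",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
    19: "soft temperature limit has occurred since boot",
}


def decode_throttled(output: str) -> list[str]:
    """Turn a string like 'throttled=0x50005' into readable flag names."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [name for bit, name in sorted(THROTTLE_FLAGS.items())
            if value & (1 << bit)]


def read_throttled() -> list[str]:
    """Run vcgencmd on the Pi itself (hypothetical helper, needs vcgencmd)."""
    result = subprocess.run(["vcgencmd", "get_throttled"],
                            capture_output=True, text=True, check=True)
    return decode_throttled(result.stdout)


if __name__ == "__main__":
    # Example value only, not taken from my Pi.
    print(decode_throttled("throttled=0x50005"))
```

An empty list means the firmware has not flagged anything since boot. Any "under-voltage" entry would point back at the adapter or cable rather than the OS.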
The fix and result
The first time this happened, an AI chat recommended using a Linux command to repair the SD card partition. I booted an Ubuntu VM in VirtualBox to run it. Judging from the command history, I think this is the one that helped before: sudo fsck -y /dev/sdb2
and it gave the output:
fsck from util-linux 2.39.3
e2fsck 1.47.0 (5-Feb-2023)
rootfs: recovering journal
JBD2: Invalid checksum recovering data block 524592 in log
JBD2: Invalid checksum recovering data block 125 in log
JBD2: Invalid checksum recovering data block 1 in log
JBD2: Invalid checksum recovering data block 267 in log
JBD2: Invalid checksum recovering data block 0 in log
JBD2: Invalid checksum recovering data block 3146836 in log
JBD2: Invalid checksum recovering data block 3156566 in log
Journal checksum error found in rootfs
rootfs contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(3280697--3280895) +(3287359--3287377) +(3287380--3287381) +(3287392--3287399) +(3287408--3287411) +(3287414--3287415) +(3287422--3287423) +(3287432--3287433) +3287439 +3287449 +(3287451--3287457) +(3287459--3287463) +3287537 +(3287539--3287541) +(3287544--3287546) +(3287716--3287724) +3287749 +3287793 +(3287828--3287835) +(3287858--3287860) +3287875 +3287877 +(3287879--3287881) +(3287887--3287892) +3287941 +(3287964--3287967) +(3287976--3287984) +3288046 +(3288048--3288059) +(3288076--3288079) +(3288125--3288482) +(3288484--3288485) +(3288487--3288491) +(3288495--3288499) +(3288502--3288503) +3288505 +(3288509--3288516) +(3288518--3288525) +(3288834--3289035) -(3289088--3290122) -(3290124--3290167) -(3290170--3290260) -(3290263--3291083) -(3291090--3291135) +(3293138--3293143) +(3295191--3295200) +(3295244--3295694) +3297743 +(3299792--3301119) +(3311356--3311359) +(3320946--3320959) +(3320982--3320989) +(3321076--3321087) -3325951 +(3331782--3331784) +3332768 +3333848 +(3333884--3333887) +(3334338--3334344) +(3334346--3334350) +(3334369--3334370) +(3334389--3334399) +3334623 +3335423 +(3335715--3335719) +(3336120--3336129) -3336191 +(3336193--3336200) +(3336237--3336252) +3336336 +3336339 +(3336433--3336447) +(3336562--3336569) +(3336943--3336959) +(3337042--3337044) +3337299 +3337311 +(3337516--3337521) +(3337976--3337983) +(3337995--3337996) +(3338893--3338904) +(3338923--3338924) +(3338996--3339007) +(3339190--3339197) +(3339563--3339621) +(3339947--3339968) +(3339970--3339977) +(3340102--3340177) +(3340198--3340360) +(3340365--3340562) +(3340564--3341087) +(3341090--3341299) +(3341572--3341802) +(3341805--3342079) +(3342081--3345151) -(3364862--3364863) -(3373224--3374847)
Fix? yes
Free blocks count wrong (12096504, counted=12096452).
Fix? yes
Free inodes count wrong (3495484, counted=3495481).
Fix? yes
rootfs: ***** FILE SYSTEM WAS MODIFIED *****
rootfs: 223943/3719424 files (0.1% non-contiguous), 3104316/15200768 blocks
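Since `-y` answers "yes" to every repair prompt, it can be worth doing a read-only pass first and looking at the exit status before letting the tool modify anything. Below is a minimal sketch, assuming the card's root partition shows up as `/dev/sdb2` (as it did in my VM) and is not mounted. The status meanings come from the e2fsck(8) man page; `decode_status` and `dry_run` are hypothetical helper names.

```python
import subprocess

# Exit-status bits as listed in the e2fsck(8) man page.
E2FSCK_STATUS = {
    1: "file system errors corrected",
    2: "file system errors corrected, system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "canceled by user request",
    128: "shared library error",
}


def decode_status(code: int) -> list[str]:
    """Translate an e2fsck exit status (a bitmask) into messages."""
    if code == 0:
        return ["no errors"]
    return [text for bit, text in E2FSCK_STATUS.items() if code & bit]


def dry_run(device: str) -> list[str]:
    """Check `device` read-only: -n opens the file system read-only
    and answers 'no' to every prompt, so nothing is written."""
    result = subprocess.run(["sudo", "e2fsck", "-n", device])
    return decode_status(result.returncode)


if __name__ == "__main__":
    # Assumed device name; confirm with lsblk before running.
    print(dry_run("/dev/sdb2"))
```

A result of "file system errors left uncorrected" from the dry run is the cue to take an image of the card before running the real repair.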
Sadly, that fix only worked the first time. This time the Pi booted only as far as the login prompt, and I couldn't log in: it just repeated the login prompt after I entered my credentials. Rebooting didn't clear the error. 😞
Somehow, I found one of my spare SD cards with piOS installed on it, though I don't recall how it got there. Surprisingly, it has the same setup and everything works; it is just an older version. I was about to call it a fix until I saw this message:
Distributor ID: Raspbian
Description: Raspbian GNU/Linux 11 (bullseye)
Release: 11
Codename: bullseye
xxx@pi:~$ [ 4141.045355] EXT4-fs error (device mmcblk0p2): ext4_validate_block_bitmap:390: comm bluetoothd: bg 61: bad block bitmap checksum
[ 4141.049934] EXT4-fs error (device mmcblk0p2) in ext4_mb_clear_bb:6077: Filesystem failed CRC
[ 4158.727792] EXT4-fs error (device mmcblk0p2): ext4_validate_block_bitmap:390: comm kworker/u8:1: bg 51: bad block bitmap checksum
[ 4158.732518] EXT4-fs (mmcblk0p2): Delayed block allocation failed for inode 274862 at logical offset 0 with max blocks 8 with error 74
[ 4158.737299] EXT4-fs (mmcblk0p2): This should not happen !! Data will be lost
[ 4158.737299]
Sure enough, this card has its own problems: it turns out I migrated away from it a couple of months ago and forgot to format it. It is working at the moment, but I know it won't stay that way for long.
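To keep an eye on how quickly this card degrades, the kernel messages above can be tallied per device. This is a rough sketch that assumes the log has been saved as text with one message per line (for example from `dmesg`); the regular expression only matches the "EXT4-fs error (device ...)" form seen above, and `count_ext4_errors` is a name I made up.

```python
import re
from collections import Counter

# Matches lines such as:
# [ 4141.045355] EXT4-fs error (device mmcblk0p2): ext4_validate_block_bitmap:390: ...
EXT4_ERROR = re.compile(r"EXT4-fs error \(device (?P<device>[^)]+)\)")


def count_ext4_errors(log_text: str) -> Counter:
    """Count 'EXT4-fs error' lines per block device."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EXT4_ERROR.search(line)
        if match:
            counts[match.group("device")] += 1
    return counts


# Lines copied from the console output above, with wrapped lines joined.
SAMPLE = """\
[ 4141.045355] EXT4-fs error (device mmcblk0p2): ext4_validate_block_bitmap:390: comm bluetoothd: bg 61: bad block bitmap checksum
[ 4141.049934] EXT4-fs error (device mmcblk0p2) in ext4_mb_clear_bb:6077: Filesystem failed CRC
[ 4158.737299] EXT4-fs (mmcblk0p2): This should not happen !! Data will be lost
"""

if __name__ == "__main__":
    print(count_ext4_errors(SAMPLE))
```

A count that keeps climbing between checks would suggest copying the data off sooner rather than later.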
Next steps
I will continue trying to revive the original SD card. If that fails, I may abandon the data and try out another lightweight Linux OS - Tiny Core Linux looks promising. I'll update on this if I make progress.