A bad hard drive gives very confusing errors
I have been having random errors that would take down my RAID. The same drive was always implicated, as determined by its ID in SoftRAID. Sometimes it would work fine for several hours, and sometimes (yesterday) I could provoke the error within 30 seconds just by doing a Finder copy or a diglloydTools IntegrityChecker verify. I started to worry that it was my 2017 iMac 5K, but that worry proved baseless when I was able to reproduce the issue on the 2016 MacBook Pro.
I spent the better part of my day tracking the problem down to this drive. I tried three different cables, two different Macs, two different OWC Thunderbay 4 enclosures (and both ports on the chassis), macOS 10.13 and 10.14, daisy-chained and standalone, confirmed the file system…
In all cases, the same drive was always flagged as the culprit that failed to complete an I/O, causing all drives in the enclosure to be disconnected. This is a bit weird, and I hope it's just that drive.
Replacing a failed drive in RAID-5 or RAID-4
RAID-5 and RAID-4 both provide fault tolerance by storing parity information from which the actual data can be reconstructed. With the loss of one drive, a RAID-5 is effectively reduced to a RAID-0. So if one drive fails, nothing is lost and you can continue to work.
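To make the parity idea concrete, here is a minimal sketch of single-parity reconstruction using XOR, the general technique behind RAID-4 and RAID-5. It is an illustration only, not SoftRAID's actual implementation: the function name `xor_blocks`, the tiny 4-byte blocks, and the four-drive layout are all assumptions made for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical four-drive array: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the drive that held data[1].
# XOR of the surviving data blocks and the parity block yields the missing block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

The same property explains why a second failure is fatal: with two blocks of a stripe missing, the single parity block no longer carries enough information to recover either one.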
With maybe fifty (50) errors over the last two days (all caused by the one bad drive going AWOL), I got to see SoftRAID's clever rebuild capability at work, with a rebuild taking only a minute or so. This bad drive just didn't play well, so I physically removed it.
Getting the RAID-5 back to full fault tolerance means replacing the failed drive; the screenshots show the process.
The broken RAID 5
Here, "degraded" means that one of the drives has gone away and the RAID-5 is now effectively a RAID-0 stripe: one more failure and everything is toast, but that's what backups are for. In this case, I physically removed the problem drive, given all the problems it caused as discussed above.
SoftRAID: Degraded RAID-5
Adding a Replacement Device
With the bad drive physically removed, I bolted a replacement into a drive carrier and hot-inserted it into the OWC Thunderbay 4. I then chose to tell SoftRAID to add this disk to the RAID-5.
SoftRAID: select which drive to use to replace the failed RAID 5 drive
A confirmation dialog confirms the selection above:
SoftRAID: confirm use of new drive
With the replacement drive in place, SoftRAID goes to work rebuilding the RAID-5 to restore full fault tolerance. With 12 TB drives (about 10.9 TiB), this takes a while (about 11 hours), since the entire capacity of each drive must be read in order to generate the correct data to write to the replacement. There is no downtime: the volumes remain completely usable during the rebuild, albeit with a performance loss.
SoftRAID: Rebuilding RAID-5 with newly-added drive
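As a rough sanity check on the rebuild time, the figures quoted above (12 TB per drive, about 11 hours) imply an average sustained rate per drive. The sketch below just does that arithmetic. The function name is hypothetical, and it assumes the whole drive is processed at a steady rate with no other load on the volume, which real rebuilds will not match exactly.

```python
def implied_rebuild_rate_mb_s(capacity_tb: float, hours: float) -> float:
    """Average throughput (decimal MB/s) needed to cover capacity_tb in the given hours."""
    capacity_bytes = capacity_tb * 1e12  # drive vendors use decimal terabytes
    seconds = hours * 3600
    return capacity_bytes / seconds / 1e6


# Figures taken from the text: 12 TB drives, roughly 11 hours to rebuild.
rate = implied_rebuild_rate_mb_s(12, 11)
print(f"Implied average rate: {rate:.0f} MB/s per drive")
```

Since the time scales linearly with capacity at a given rate, the same function can be used to guess how long a larger or smaller drive might take.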