Mismatches reported by Linux Software RAID
A few months ago I built a new file server. It uses software RAID1 managed by
mdadm for its storage. Not
long after I installed it I noticed concerning messages from mdadm in the system log:
Nov 6 01:43:19 seneca kernel: [1270392.116711] md: md0: data-check done.
Nov 6 01:43:19 seneca mdadm: RebuildFinished event detected on md device /dev/md/0, component device mismatches found: 2944 (on raid level 1)
mdadm runs a data check ("scrubbing") on the array. That
month the check seemed to be telling me there was a problem.
After reading about this I came to the conclusion that it was not something
to be concerned about. The
md(4) man page has a section about this
entitled "SCRUBBING AND MISMATCHES". It says:
However on RAID1 and RAID10 it is possible for software issues to cause a mismatch to be reported. This does not necessarily mean that the data on the array is corrupted. It could simply be that the system does not care what is stored on that part of the array - it is unused space.
Thus the mismatch_cnt value can not be interpreted very reliably on RAID1 or RAID10, especially when the device is used for swap.
It also says:
If an array is created with --assume-clean then a subsequent check could be expected to find some mismatches.
Well, the mismatches weren't on a swap device (or I wouldn't care).
I could not recall whether I had used
--assume-clean when setting up the
array, but thought it was possible.
I decided to trust the man page, and left it at that. I have separate data consistency checks, so in theory I would know about actual corruption.
9 months later
This weekend I had another report of mismatches:
Aug 6 02:44:21 seneca mdadm: RebuildFinished event detected on md device /dev/md/0, component device mismatches found: 6144 (on raid level 1)
The number of mismatches went up! This knocked out my theory that it was a
result of creating the array with
--assume-clean and made me worry again
that there was something wrong with the array or the disks.
Were the mismatches in unused space?
As you saw in the excerpt from
md(4), another reason for this is
inconsistency in unused space. I hoped this was what was happening, but I
wanted to verify that.
To do that I came up with the following plan:
- Fill up the entire disk
- Free the space again
- Re-run a data check
The theory being that if I wrote data to the entire array then unused space would become consistent across disks. I hoped this would make the number of mismatches go to 0.
I became root and filled up the disk:
# cat /dev/zero > bigfile
This took a while to fill up ~500 GiB. After that I removed the file and started a scrub:
# rm bigfile
# echo check > /sys/block/md0/md/sync_action
This took about 1.5 hours to run. I monitored the status by checking
/proc/mdstat (watch -n 5 cat /proc/mdstat).
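Instead of watching /proc/mdstat by hand, the wait can be scripted. This is a minimal sketch, not something from the original setup: the function name wait_for_scrub is hypothetical, and it assumes the array's md sysfs directory (for example /sys/block/md0/md) exposes sync_action and mismatch_cnt, with sync_action reading "idle" once the check is over.

```shell
# Hypothetical helper: block until the md array is idle, then print the
# mismatch count from the check that just finished.
#   $1  md sysfs directory, e.g. /sys/block/md0/md (assumed layout)
#   $2  polling interval in seconds (optional, default 5)
wait_for_scrub() {
    md_dir=$1
    interval=${2:-5}

    # sync_action reads "check" while a scrub runs and "idle" afterwards.
    while [ "$(cat "$md_dir/sync_action")" != "idle" ]; do
        sleep "$interval"
    done

    cat "$md_dir/mismatch_cnt"
}

# Example usage (as root, after starting a check):
#   echo check > /sys/block/md0/md/sync_action
#   wait_for_scrub /sys/block/md0/md
```

Passing the directory as an argument keeps the array name out of the function, so the same helper would work for an array other than md0.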
Afterwards things looked a lot better:
Aug 6 11:53:49 seneca mdadm: RebuildFinished event detected on md device /dev/md/0, component device mismatches found: 128 (on raid level 1)
This cut down the mismatches in a big way. According to
md(4), 128 could
mean that a single check failed.
I was still not satisfied though. Why was there still a single mismatch?
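For a sense of scale, the counts in the log lines above can be converted into blocks and sizes. This is only a back-of-the-envelope sketch: it assumes the count is in 512-byte sectors and that RAID1 mismatches are counted in units of 128 sectors, which is my reading of the documentation rather than something I verified on this array. The function name mismatch_summary is made up for the example.

```shell
# Hypothetical helper: express a mismatch count as 128-sector blocks and KiB.
# Assumptions: 512-byte sectors, 128 sectors per compared block.
mismatch_summary() {
    sectors=$1
    blocks=$(( sectors / 128 ))
    kib=$(( sectors * 512 / 1024 ))
    echo "$sectors sectors = $blocks block(s) of 128 sectors = $kib KiB"
}

mismatch_summary 2944   # 2944 sectors = 23 block(s) of 128 sectors = 1472 KiB
mismatch_summary 6144   # 6144 sectors = 48 block(s) of 128 sectors = 3072 KiB
mismatch_summary 128    # 128 sectors = 1 block(s) of 128 sectors = 64 KiB
```

All three reported counts are exact multiples of 128, which fits the idea that the remaining 128 was a single mismatched block.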
I rebooted and ran through the whole process again.
# reboot
# cat /dev/zero > bigfile
# rm bigfile
# echo check > /sys/block/md0/md/sync_action
After doing this a second time there were no more mismatches:
Aug 6 14:59:17 seneca mdadm: RebuildFinished event detected on md device /dev/md/0
# cat /sys/block/md0/md/mismatch_cnt
0
I'm not sure why doing this twice was necessary. Perhaps while I was
running the scrub the first time another process created and deleted a
file. Or perhaps failing to
sync the first time meant
bigfile was not
fully flushed to the underlying disks. Anyway, it looks like the
mismatches were only in unused space.
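If the missing sync was the culprit, a variant of the fill step that flushes explicitly might avoid the second pass. This is an untested guess expressed as a minimal sketch: the function name fill_and_release and its optional size cap are hypothetical, it uses GNU dd's bs=1M syntax, and whether the extra sync calls actually change the result is not something the runs above establish.

```shell
# Hypothetical helper: fill free space with zeros, flush, then free it again.
#   $1  path of the temporary file, on the filesystem that sits on the array
#   $2  optional cap in MiB, for a trial run that does not fill the disk
fill_and_release() {
    target=$1
    max_mib=$2

    # Without a cap, dd stops with an error once the filesystem is full.
    # That is the expected way for the fill to end, so the status is ignored.
    dd if=/dev/zero of="$target" bs=1M ${max_mib:+count=$max_mib} 2>/dev/null || true

    # Flush the zeros to the underlying disks before freeing the space.
    sync

    rm -f -- "$target"
    sync
}

# Example usage (as root, from a directory on the array):
#   fill_and_release ./bigfile
#   echo check > /sys/block/md0/md/sync_action
```

Running it as root matters for the same reason as before: on filesystems that reserve blocks for root, an unprivileged fill would leave that reserved space untouched.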
I'm not happy with there being mismatches reported like this. It makes me nervous. I'd be happier if unused space was kept in a consistent state, or at least not reported as a mismatch. It's a bit hand-wavy to have to assume things are fine and that it's just unused space that has a mismatch, or that it was due to how I created the array. What if things aren't fine? How would I know?
I initially planned to use FreeBSD for this server so that I could use ZFS and benefit from its consistency checks. Unfortunately other requirements forced me to abandon the idea of using FreeBSD (the server has to host a non-free software print server). I know that using ZFS on Linux is possible, but having it as some kind of bolted-on third-party module (ZFS on Linux) is not ideal.
In order to have some confidence that my data isn't bitrotting away, I use a program I wrote called checksummer to scan all of my important files twice a day and compare their checksums. This is a poor man's filesystem consistency check, but I think it's better than nothing (which is what I get with ext4 and RAID1).
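The general idea can be sketched with standard tools. To be clear, this is not checksummer and says nothing about how it works internally; it is a minimal illustration using GNU find, sort, xargs and sha256sum, with made-up function names (make_manifest, verify_manifest). It assumes the manifest path is absolute and lives outside the directory being scanned.

```shell
# Hypothetical helper: record a SHA-256 checksum for every file under a directory.
#   $1  directory to scan
#   $2  absolute path of the manifest to write (must be outside $1)
make_manifest() {
    dir=$1
    manifest=$2
    # NUL-separated names keep odd filenames intact; sorting keeps the
    # manifest stable between runs.
    ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# Hypothetical helper: re-hash the files and compare against the manifest.
# Returns non-zero if any file is missing or its checksum has changed.
verify_manifest() {
    dir=$1
    manifest=$2
    ( cd "$dir" && sha256sum --check --quiet "$manifest" )
}

# Example usage:
#   make_manifest /srv/files /var/lib/checksums/files.sha256
#   verify_manifest /srv/files /var/lib/checksums/files.sha256
```

A scheme this simple cannot tell corruption from a legitimate edit, so a real tool needs some way to account for files that changed on purpose (modification times, for instance).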
Tags: Linux, servers, raid, filesystems, FreeBSD, silent corruption, data