The One and the Many

Mismatches reported by Linux Software RAID

A few months ago I built a new file server. It uses software RAID1 managed through mdadm for its storage. Not long after I installed it, I noticed concerning messages from mdadm:

Nov  6 01:43:19 seneca kernel: [1270392.116711] md: md0: data-check done.
Nov  6 01:43:19 seneca mdadm[995]: RebuildFinished event detected on md
device /dev/md/0, component device  mismatches found: 2944 (on raid level 1)

Every month mdadm runs a data check ("scrubbing") on the array. That month the check seemed to be telling me there was a problem.

After reading about this I came to the conclusion that it was not something to be concerned about. The md(4) man page has a section about this entitled "SCRUBBING AND MISMATCHES". It says:

However on RAID1 and RAID10 it is possible for software issues to cause a mismatch to be reported. This does not necessarily mean that the data on the array is corrupted. It could simply be that the system does not care what is stored on that part of the array - it is unused space.

[..]

Thus the mismatch_cnt value can not be interpreted very reliably on RAID1 or RAID10, especially when the device is used for swap.

It also says:

If an array is created with --assume-clean then a subsequent check could be expected to find some mismatches.

Well, the mismatches weren't on a swap device (if they had been, I wouldn't have cared).

I could not recall whether I had used --assume-clean when setting up the array, but thought it was possible.

I decided to trust the man page, and left it at that. I have separate data consistency checks, so in theory I would know about actual corruption.

Nine months later

This weekend I had another report of mismatches:

Aug  6 02:44:21 seneca mdadm[912]: RebuildFinished event detected on md
device /dev/md/0, component device  mismatches found: 6144 (on raid level 1)

The number of mismatches went up! This knocked out my theory that it was a result of creating the array with --assume-clean and made me worry again that there was something wrong with the array or the disks.

Were the mismatches in unused space?

As you saw in the excerpt from md(4), another reason for this is inconsistency in unused space. I hoped this was what was happening, but I wanted to verify that.

To do that I came up with the following plan:

  1. Fill up the entire disk
  2. Free the space again
  3. Re-run a data check

The theory was that writing data across the entire array would make the unused space consistent on both disks. I hoped this would make the number of mismatches go to 0.

I became root and filled up the disk:

# cat /dev/zero > bigfile

This took a while to fill up ~500 GiB. After that I removed the file and started a scrub:

# rm bigfile
# echo check > /sys/block/md0/md/sync_action

This took about 1.5 hours to run. I monitored the status by checking the contents of /proc/mdstat (watch -n 5 cat /proc/mdstat).
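Rather than watching /proc/mdstat by hand, the wait can be scripted by polling sync_action until it returns to "idle". This is a small sketch; the md0 sysfs path is an assumption to adjust for your array:

```shell
#!/bin/sh
# Poll an md array's sync_action until the running check finishes,
# then print the resulting mismatch count. The default path assumes
# the array is md0; override MD_SYSFS for a different device.
MD_SYSFS="${MD_SYSFS:-/sys/block/md0/md}"

wait_for_check() {
    # sync_action reads "idle" once the check (or resync) has completed
    while [ "$(cat "$MD_SYSFS/sync_action")" != "idle" ]; do
        sleep 5
    done
    echo "mismatch_cnt: $(cat "$MD_SYSFS/mismatch_cnt")"
}
```

Calling wait_for_check while a scrub is running blocks until the array goes idle, then reports the final count.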

Afterwards things looked a lot better:

Aug  6 11:53:49 seneca mdadm[912]: RebuildFinished event detected on md
device /dev/md/0, component device  mismatches found: 128 (on raid level 1)

This cut the mismatches down dramatically. According to md(4) the count is in sectors, so 128 sectors (64 KiB) could correspond to a single failed comparison.

I was still not satisfied though. Why was there still a mismatch at all?

I rebooted and ran through the whole process again.

# reboot
# cat /dev/zero > bigfile
# rm bigfile
# echo check > /sys/block/md0/md/sync_action

After doing this a second time there were no more mismatches:

Aug  6 14:59:17 seneca mdadm[912]: RebuildFinished event detected on md
device /dev/md/0

Indeed:

# cat /sys/block/md0/md/mismatch_cnt
0

I'm not sure why doing this twice was necessary. Perhaps another process created and deleted a file while the first scrub was running. Or perhaps, because I never ran sync, bigfile had not been fully flushed to the underlying disks before I removed it. Either way, it looks like the mismatches really were just in unused space.

Closing thoughts

I'm not happy with mismatches being reported like this. It makes me nervous. I'd be happier if unused space were kept in a consistent state, or at least not reported as a mismatch. It's a bit hand-wavy to have to assume things are fine and that the mismatch is just in unused space, or that it was due to how I created the array. What if things aren't fine? How would I know?

I initially planned to use FreeBSD for this server so that I could use ZFS and benefit from its consistency checks. Unfortunately other requirements forced me to abandon the idea of using FreeBSD (the server has to host a non-free software print server). I know that using ZFS on Linux is possible, but having it as some kind of bolted-on third-party module (ZFS on Linux) is not ideal.

In order to have some confidence that my data isn't bitrotting away, I use a program I wrote called checksummer to scan all of my important files twice a day and compare their checksums. This is a poor man's filesystem consistency check, but I think it's better than nothing (which is what I get with ext4 and RAID1).
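checksummer itself is my own program, but the core idea can be sketched in a few lines of shell using GNU coreutils. The paths here are illustrative placeholders, not what checksummer actually uses:

```shell
#!/bin/sh
# Sketch of a poor man's consistency check: record SHA-256 checksums
# for every file under DATA_DIR, then verify them on later runs.
# Both default paths are illustrative assumptions.
DATA_DIR="${DATA_DIR:-/srv/data}"
SUM_FILE="${SUM_FILE:-/var/lib/checksums.sha256}"

record_sums() {
    # Build (or rebuild) the checksum baseline
    (cd "$DATA_DIR" && find . -type f -exec sha256sum {} +) > "$SUM_FILE"
}

verify_sums() {
    # Re-hash everything and compare against the baseline;
    # sha256sum -c exits non-zero if any file's contents changed
    (cd "$DATA_DIR" && sha256sum -c --quiet "$SUM_FILE")
}
```

Run record_sums once to take a baseline, then schedule verify_sums from cron; a non-zero exit means some file no longer matches its recorded checksum. (Unlike ZFS, this can't tell bitrot apart from a legitimate edit, so it only works well for files that rarely change.)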

Tags: Linux, servers, raid, filesystems, FreeBSD, silent corruption, data
