Creating a RAID of USB pendrives in Linux

USB hub and pendrives used as RAID10 with my laptop.

If you’re not familiar with the RAID (Redundant Array of Inexpensive Disks) concept and the different types of array, the article ‘RAID 0, RAID 1, RAID 5, RAID 10 Explained with Diagrams’ gives a quick summary (and links to another article, ‘RAID 2, RAID 3, RAID 4, RAID 6 Explained with Diagram’). Another helpful article is ‘RAID Levels Explained’.

A few years ago I came across a YouTube video by a Mac user, titled ‘Use a bunch of USB Flash drives in a RAID array’. Purely out of interest he had experimented with creating RAIDs using USB pendrives (also known as ‘USB flash drives’ or ‘USB memory sticks’). The creation of a RAID using USB pendrives for his Apple Macs was very easy, and, since then, I had wanted to try this using one of my laptops running Linux, just to satisfy my curiosity. I have previously created software RAIDs in a Linux server using internal 3.5-inch HDDs, for the root, home and swap partitions, and for file storage partitions for a Cloud server and NAS. However, I had never created a RAID using external USB drives. This week I happened to have a spare four-port USB 3.0 hub and four old 4GB USB 2.0 pendrives, so I finally got the chance to create a RAID with USB pendrives (see photo).

I decided to use my main laptop, which has Gentoo Linux with OpenRC, elogind, eudev and KDE installed. That installation does not have an initramfs, so I did not need to rebuild an initramfs to assemble the RAID. In any case, early assembly of a RAID by an initramfs would only be needed if the RAID were being used to hold the directories required by the OS (the root partition, for example). As my RAID would be pluggable external storage, I wanted to mount it manually rather than adding it to /etc/fstab to be mounted automatically at boot.

As I had not used a RAID on this laptop before, I had not enabled the RAID drivers in the kernel configuration, so I needed to do that and rebuild the kernel. I opted to make the RAID drivers kernel modules rather than built into the kernel, so that I could load only the relevant module for whichever type of RAID I wished to create.

I had to decide which filesystem to use in the RAID. I have always used ext4 in my RAIDs using HDDs. However, F2FS is an interesting filesystem developed by Samsung for devices using flash memory, such as SD cards, USB pendrives and SSDs. So I decided to format the pendrives to use F2FS, and create an F2FS RAID. As I had not used F2FS previously on this laptop, I had not enabled the F2FS driver in the kernel configuration, so I enabled the F2FS driver in the kernel at the same time as I enabled the RAID drivers. As with the RAID drivers, I opted to make the F2FS driver a kernel module rather than built into the kernel, so that I could load it and unload it whenever I wanted.

Not only did it turn out to be easy to create a RAID using USB pendrives, but I also found that the Linux RAID module gets loaded automatically when I connect the USB hub. Furthermore, the RAID is recognised by KDE and listed under ‘Places’ in the Dolphin file manager’s windows, where I can click on it to mount and unmount the RAID. So I did not even need to configure the OS to load the RAID module at boot (the OS does not load the module automatically at boot if the hub is not connected).

DigitalOcean produced a good tutorial on creating RAIDs in Ubuntu: ‘How To Create RAID Arrays with mdadm on Ubuntu 16.04’. The procedure is essentially the same in Gentoo Linux, the only differences being the path of the mdadm.conf file and the method of updating an initramfs (which I did not need to do anyway in this particular installation).

As I had four spare USB pendrives and a four-port hub, I decided to create a RAID10 array. Below is a summary of the steps I took.

1. I rebuilt the kernel in order to build the RAID and F2FS modules. The relevant kernel configuration parameters I set are shown below:

root # grep RAID /usr/src/linux/.config | grep -v "#"
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=m
CONFIG_MD_RAID10=m
CONFIG_MD_RAID456=m
CONFIG_ASYNC_RAID6_RECOV=m
CONFIG_RAID6_PQ=m
root # grep F2FS /usr/src/linux/.config | grep -v "#"
CONFIG_F2FS_FS=m
CONFIG_F2FS_STAT_FS=y
CONFIG_F2FS_FS_XATTR=y
CONFIG_F2FS_FS_POSIX_ACL=y
root # uname -a
Linux clevow230ss 4.19.72-gentoo #2 SMP Tue Oct 15 01:36:57 BST 2019 x86_64 Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz GenuineIntel GNU/Linux
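
Before rebuilding, the options listed above could also be checked with a small script rather than by eye. The following is only a sketch: the function name missing_kconfig is a hypothetical helper of my own choosing, and the .config path in the usage comment is the one used in the commands above.

```shell
# missing_kconfig FILE OPTION...
# Prints each OPTION that is not set to 'y' or 'm' in the kernel config FILE.
# (Hypothetical helper name; it is not part of any kernel tooling.)
missing_kconfig() {
    config_file=$1
    shift
    for opt in "$@"; do
        # An enabled option appears as e.g. CONFIG_MD_RAID10=m or CONFIG_MD_RAID10=y.
        if ! grep -Eq "^${opt}=(y|m)\$" "$config_file"; then
            echo "$opt"
        fi
    done
}

# Example usage (prints nothing if both options are enabled):
# missing_kconfig /usr/src/linux/.config CONFIG_MD_RAID10 CONFIG_F2FS_FS
```

An option that is commented out ("# CONFIG_F2FS_FS is not set") or absent is reported as missing.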

2. I installed the mdadm tool:

root # eix -I mdadm
[I] sys-fs/mdadm
     Available versions:  4.1^t {static}
     Installed versions:  4.1^t(01:52:17 15/10/19)(-static)
     Homepage:            https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/
     Description:         Tool for running RAID systems - replacement for the raidtools

3. I installed the F2FS tools:

root # eix -I f2fs
[I] sys-fs/f2fs-tools
     Available versions:  1.10.0(0/4) 1.11.0-r1(0/5) 1.12.0-r1(0/6) ~1.13.0(0/6) {selinux}
     Installed versions:  1.12.0-r1(0/6)(02:05:17 15/10/19)(-selinux)
     Homepage:            https://git.kernel.org/cgit/linux/kernel/git/jaegeuk/f2fs-tools.git/about/
     Description:         Tools for Flash-Friendly File System (F2FS)

4. I rebooted the laptop.

5. The f2fs module was not loaded automatically, therefore I loaded it manually and edited /etc/conf.d/modules to add the module name so that it would be loaded automatically in future:

root # modprobe f2fs
root # lsmod | grep f2fs
f2fs                  466944  0
root # nano /etc/conf.d/modules
root # grep ^modules /etc/conf.d/modules
modules="fuse bnep rfcomm hidp uvcvideo cifs mmc_block snd-seq-midi iptable_raw xt_CT uinput f2fs"
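
The edit made with nano above could be scripted instead. The sketch below assumes the file already contains a single modules="..." line, as shown in the grep output, and that GNU sed (for the -i option) is available. The function name add_openrc_module is hypothetical.

```shell
# add_openrc_module FILE NAME
# Appends NAME to the modules="..." line in FILE unless it is already listed.
# (Hypothetical helper; assumes one modules="..." line exists in FILE.)
add_openrc_module() {
    conf=$1
    name=$2
    # If NAME is already present as a whole word on the modules= line, do nothing.
    if grep -Eq "^modules=\"([^\"]* )?${name}( [^\"]*)?\"" "$conf"; then
        return 0
    fi
    # Otherwise insert NAME just before the closing quote of the modules= line.
    sed -i "s/^modules=\"\(.*\)\"/modules=\"\1 ${name}\"/" "$conf"
}

# Example usage (as root):
# add_openrc_module /etc/conf.d/modules f2fs
```

Running it a second time leaves the file unchanged, so it is safe to call from a setup script.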

6. I plugged the four USB pendrives into the USB hub, and connected the hub to the laptop.

7. I launched GParted, deleted the existing partition on each pendrive (three had been formatted as FAT32, one as exFAT), reformatted them individually as F2FS and gave them each a label (USBPD01 to USBPD04). I could have done all that from the command line but it is easier using GParted, and I like an easy life.

Note that, in Gentoo Linux, GParted must have been merged with the mdadm USE flag set; if it was not, it needs to be re-merged with USE="mdadm". Furthermore, GParted will only include F2FS in the list of available filesystems if either the F2FS module is loaded or the F2FS driver has been built into the kernel.

8. I ascertained the names of the USB pendrives:

root # lsblk -o NAME,SIZE,FSTYPE,TYPE,MOUNTPOINT
NAME     SIZE FSTYPE TYPE MOUNTPOINT
sda    698.7G        disk
├─sda1   128M ext2   part
├─sda2    16G swap   part [SWAP]
├─sda5   128G ext4   part /
├─sda6   256G ext4   part /home
└─sda7 298.5G ntfs   part /media/NTFS
sdb      3.8G        disk
└─sdb1   3.8G f2fs   part
sdc      3.8G        disk
└─sdc1   3.8G f2fs   part
sdd      3.8G        disk
└─sdd1   3.8G f2fs   part
sde      3.8G        disk
└─sde1   3.8G f2fs   part

As you can see above, the four USB pendrives are sdb to sde.
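
Rather than picking the pendrives out of the listing by size, lsblk can also report the transport of each disk in its TRAN column. The filter below is a sketch that reads "NAME TRAN" pairs and keeps the USB ones. The function name usb_disks is hypothetical, and I am assuming that drives behind a hub are reported as "usb" in the same way as directly connected ones.

```shell
# usb_disks
# Reads lines of the form "NAME TRAN" on standard input and prints /dev/NAME
# for every disk whose transport is usb. (Hypothetical helper name.)
usb_disks() {
    awk '$2 == "usb" { print "/dev/" $1 }'
}

# Example usage: -d lists whole disks only, -n suppresses the header line.
# lsblk -dn -o NAME,TRAN | usb_disks
```

It is still worth checking the result against the sizes in the lsblk output before passing any device names to mdadm, because creating the array destroys the contents of every listed device.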

9. I loaded the raid10 module:

root # modprobe raid10
root # lsmod | grep raid
raid10                 57344  1

10. I created the RAID10 array:

root # mdadm --create --verbose /dev/md0 --level=10 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
mdadm: layout defaults to n2
mdadm: layout defaults to n2
mdadm: chunk size defaults to 512K
mdadm: partition table exists on /dev/sdb
mdadm: partition table exists on /dev/sdb but will be lost or
       meaningless after creating array
mdadm: partition table exists on /dev/sdc
mdadm: partition table exists on /dev/sdc but will be lost or
       meaningless after creating array
mdadm: partition table exists on /dev/sdd
mdadm: partition table exists on /dev/sdd but will be lost or
       meaningless after creating array
mdadm: partition table exists on /dev/sde
mdadm: partition table exists on /dev/sde but will be lost or
       meaningless after creating array
mdadm: size set to 3913728K
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

It takes a while for the RAID to be created, so I checked progress periodically as follows:

root # cat /proc/mdstat
Personalities : [raid10]
md0 : active raid10 sde[3] sdd[2] sdc[1] sdb[0]
      7827456 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      [>....................]  resync =  2.8% (222272/7827456) finish=23.8min speed=5308K/sec
      
unused devices: <none>
root # cat /proc/mdstat
Personalities : [raid10]
md0 : active raid10 sde[3] sdd[2] sdc[1] sdb[0]
      7827456 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      [========>............]  resync = 44.0% (3449856/7827456) finish=12.9min speed=5637K/sec
      
unused devices: <none>
root # cat /proc/mdstat
Personalities : [raid10]
md0 : active raid10 sde[3] sdd[2] sdc[1] sdb[0]
      7827456 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      [==============>......]  resync = 74.0% (5797760/7827456) finish=5.9min speed=5698K/sec
      
unused devices: <none>
root # cat /proc/mdstat
Personalities : [raid10]
md0 : active raid10 sde[3] sdd[2] sdc[1] sdb[0]
      7827456 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]
      
unused devices: <none>
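
Instead of re-running cat by hand, the percentage could be pulled out of /proc/mdstat with a short function. This is a sketch based only on the line format visible in the output above; the function name resync_percent is hypothetical, and if several arrays are resyncing it reports only the first.

```shell
# resync_percent [FILE]
# Prints the resync or recovery percentage found in FILE (default /proc/mdstat),
# or "done" if no resync or recovery line is present.
# (Hypothetical helper; pattern derived from the mdstat output shown above.)
resync_percent() {
    mdstat=${1:-/proc/mdstat}
    pct=$(sed -nE 's/.*(resync|recovery) *= *([0-9.]+)%.*/\2/p' "$mdstat" | head -n 1)
    echo "${pct:-done}"
}

# Example usage: poll every 30 seconds until the resync has finished.
# while [ "$(resync_percent)" != "done" ]; do sleep 30; done
```

For simply watching progress interactively, 'watch cat /proc/mdstat' does the same job without any scripting.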

11. I formatted the RAID:

root # sudo mkfs.f2fs -f /dev/md0

        F2FS-tools: mkfs.f2fs Ver: 1.12.0 (2018-11-12)

Info: Disable heap-based policy
Info: Debug level = 0
Info: Trim is enabled
Info: Segments per section = 1
Info: Sections per zone = 1
Info: sector size = 512
Info: total sectors = 15654912 (7644 MB)
Info: zone aligned segment0 blkaddr: 512
Info: format version with
  "Linux version 4.19.72-gentoo (root@clevow230ss) (gcc version 8.3.0 (Gentoo 8.3.0-r1 p1.1)) #2 SMP Tue Oct 15 01:36:57 BST 2019"
Info: [/dev/md0] Discarding device
Info: This device doesn't support BLKSECDISCARD
Info: This device doesn't support BLKDISCARD
Info: Overprovision ratio = 2.300%
Info: Overprovision segments = 179 (GC reserved = 94)
Info: format successful

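The option ‘-f’ forces mkfs.f2fs to overwrite any existing filesystem. (The DigitalOcean tutorial uses ‘-F’ instead because it formats the array with mkfs.ext4, which spells its force option that way; the difference lies in the mkfs tool rather than in Ubuntu.)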

12. I created a mount point so I could mount the RAID from the command line if I wanted:

root # mkdir -p /mnt/md0

13. I mounted the RAID from the command line and checked its size. In the case of RAID10 I would expect the size to be double the size of one of the formatted USB pendrives (i.e. approximately 2 x 3.8GB = 7.6GB):

root # mount /dev/md0 /mnt/md0
root # df -h -x devtmpfs -x tmpfs
Filesystem      Size  Used Avail Use% Mounted on
/dev/root       126G   36G   84G  31% /
/dev/sda6       252G  137G  103G  57% /home
/dev/sda7       299G  257G   43G  86% /media/NTFS
/dev/md0        7.5G  419M  7.1G   6% /mnt/md0
root # blkid | grep -v sda
/dev/md0: UUID="d565c117-37e0-48eb-b635-a2fe70b83272" TYPE="f2fs"
/dev/sdb: UUID="d1288120-a161-4809-3e89-bb5f967df69b" UUID_SUB="45a488a0-5126-0b95-0c28-eb1f743f77c7" LABEL="clevow230ss:0" TYPE="linux_raid_member"
/dev/sdc: UUID="d1288120-a161-4809-3e89-bb5f967df69b" UUID_SUB="ef7de228-cf4d-c6bf-c74a-462a0e27f8bd" LABEL="clevow230ss:0" TYPE="linux_raid_member"
/dev/sdd: UUID="d1288120-a161-4809-3e89-bb5f967df69b" UUID_SUB="b5dd5c41-3ab2-fa38-bd28-0b965883775c" LABEL="clevow230ss:0" TYPE="linux_raid_member"
/dev/sde: UUID="d1288120-a161-4809-3e89-bb5f967df69b" UUID_SUB="16149e7e-5a96-ece6-65ba-25721bcee49f" LABEL="clevow230ss:0" TYPE="linux_raid_member"

So /dev/md0 looked correct.
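
The expected size can be worked out from figures mdadm printed earlier: each device contributes 3913728K (‘size set to 3913728K’), there are four devices, and the n2 layout keeps two copies of every block. A minimal sketch of that arithmetic follows; the function name raid10_capacity_kib is hypothetical.

```shell
# raid10_capacity_kib DEVICES SIZE_KIB COPIES
# Usable capacity of a RAID10 array in KiB: the total raw space divided by
# the number of copies kept of each block. (Hypothetical helper name.)
raid10_capacity_kib() {
    echo $(( $1 * $2 / $3 ))
}

raid10_capacity_kib 4 3913728 2   # prints 7827456
```

The result, 7827456, is the block count reported for md0 in /proc/mdstat, and it is about 7.5GiB, which agrees with the size shown by df once filesystem overhead is allowed for.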

14. I checked that nothing was already configured in mdadm.conf and added the array’s details to it:

root # grep -v "#" /etc/mdadm.conf
root # mdadm --detail --scan | sudo tee -a /etc/mdadm.conf
ARRAY /dev/md0 metadata=1.2 name=clevow230ss:0 UUID=d1288120:a1614809:3e89bb5f:967df69b
root # grep -v "#" /etc/mdadm.conf
ARRAY /dev/md0 metadata=1.2 name=clevow230ss:0 UUID=d1288120:a1614809:3e89bb5f:967df69b
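
One thing to watch with ‘tee -a’ is that running the command a second time appends the same ARRAY line again. A more cautious approach would be to append a line only if its UUID is not already in the file. The following is a sketch under that assumption; the function name append_array_line is hypothetical.

```shell
# append_array_line CONF LINE
# Appends LINE to CONF unless CONF already has an ARRAY entry with the same UUID.
# Returns 1 if LINE contains no UUID= field. (Hypothetical helper name.)
append_array_line() {
    conf=$1
    line=$2
    # Extract the UUID=... field from the ARRAY line.
    uuid=$(printf '%s\n' "$line" | sed -n 's/.*\(UUID=[0-9a-fA-F:]*\).*/\1/p')
    if [ -z "$uuid" ]; then
        return 1
    fi
    # Skip the append if an ARRAY entry with this UUID is already present.
    if grep -q "^ARRAY .*$uuid" "$conf" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' "$line" >> "$conf"
}

# Example usage (as root):
# mdadm --detail --scan | while read -r l; do append_array_line /etc/mdadm.conf "$l"; done
```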

15. As the RAID will have only a partition for file storage, and as the RAID array will not always be connected to the laptop, it does not need to be assembled automatically early during boot, so there is no need to add mdadm.conf to an initramfs (which this laptop does not have anyway) and no need to specify /dev/md0 in /etc/fstab to be mounted at boot.

16. I left the USB hub connected to the laptop and rebooted.

17. I checked that the modules were loaded at boot:

root # lsmod | grep raid
raid10                 57344  1
root # lsmod | grep f2fs
f2fs                  466944  0
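
The same check can be made in a script without grepping the lsmod table, because lsmod takes its information from /proc/modules, where the module name is the first field of each line. A minimal sketch, with a hypothetical function name:

```shell
# module_loaded NAME [FILE]
# Succeeds (exit status 0) if NAME appears as a module name in FILE
# (default /proc/modules). (Hypothetical helper name.)
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Example usage:
# module_loaded raid10 && echo "raid10 is loaded"
```

Unlike ‘lsmod | grep raid’, this matches the whole name, so asking for raid1 would not be satisfied by raid10.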

18. I checked that the RAID had been assembled correctly at boot:

root # blkid | grep -v sda
/dev/sdb: UUID="d1288120-a161-4809-3e89-bb5f967df69b" UUID_SUB="45a488a0-5126-0b95-0c28-eb1f743f77c7" LABEL="clevow230ss:0" TYPE="linux_raid_member"
/dev/sdc: UUID="d1288120-a161-4809-3e89-bb5f967df69b" UUID_SUB="ef7de228-cf4d-c6bf-c74a-462a0e27f8bd" LABEL="clevow230ss:0" TYPE="linux_raid_member"
/dev/sdd: UUID="d1288120-a161-4809-3e89-bb5f967df69b" UUID_SUB="b5dd5c41-3ab2-fa38-bd28-0b965883775c" LABEL="clevow230ss:0" TYPE="linux_raid_member"
/dev/md0: UUID="d565c117-37e0-48eb-b635-a2fe70b83272" TYPE="f2fs"
/dev/sde: UUID="d1288120-a161-4809-3e89-bb5f967df69b" UUID_SUB="16149e7e-5a96-ece6-65ba-25721bcee49f" LABEL="clevow230ss:0" TYPE="linux_raid_member"

19. I rebooted a few times with and without the USB hub connected. The module raid10 only gets loaded if the USB hub is connected. If I reboot without the hub connected, raid10 is no longer loaded automatically at boot. If I plug in the hub after the laptop has booted, raid10 gets loaded and the RAID array is recognised by the OS.

20. I mounted the RAID from the command line and copied a file to it as root user:

root # mount /dev/md0 /mnt/md0
root # ls -la /mnt/md0
total 8
drwxr-xr-x 2 root root 4096 Oct 15 07:40 .
drwxr-xr-x 7 root root 4096 Oct 15 07:42 ..
root # cp ./Paper_sheet_sizes.png /mnt/md0
root # ls -la /mnt/md0
total 268
drwxr-xr-x 2 root root   4096 Oct 15 08:07 .
drwxr-xr-x 7 root root   4096 Oct 15 07:42 ..
-rw-r--r-- 1 root root 265760 Oct 15 08:07 Paper_sheet_sizes.png
root # umount /dev/md0
root # ls -la /mnt/md0
total 8
drwxr-xr-x 2 root root 4096 Oct 15 07:42 .
drwxr-xr-x 7 root root 4096 Oct 15 07:42 ..

However, /mnt/md0/ is owned by the root user, so user fitzcarraldo cannot copy files into it. Therefore I changed the ownership:

root # mount /dev/md0 /mnt/md0
root # ls -la /mnt/
total 28
drwxr-xr-x  7 root root 4096 Oct 15 07:42 .
drwxr-xr-x 22 root root 4096 Oct  6 08:31 ..
-rw-r--r--  1 root root    0 Apr  9  2015 .keep
drwxr-xr-x  2 root root 4096 Apr 19  2015 cdrom
drwxr-xr-x  2 root root 4096 Jan 16  2017 floppy
drwxr-xr-x  2 root root 4096 Oct 15 08:07 md0
drwxr-xr-x  2 root root 4096 Apr 17  2015 pendrive
drwxr-xr-x  2 root root 4096 Mar 18  2016 usbstick
root # chown fitzcarraldo:fitzcarraldo /mnt/md0
root # ls -la /mnt/
total 28
drwxr-xr-x  7 root         root         4096 Oct 15 07:42 .
drwxr-xr-x 22 root         root         4096 Oct  6 08:31 ..
-rw-r--r--  1 root         root            0 Apr  9  2015 .keep
drwxr-xr-x  2 root         root         4096 Apr 19  2015 cdrom
drwxr-xr-x  2 root         root         4096 Jan 16  2017 floppy
drwxr-xr-x  2 fitzcarraldo fitzcarraldo 4096 Oct 15 08:07 md0
drwxr-xr-x  2 root         root         4096 Apr 17  2015 pendrive
drwxr-xr-x  2 root         root         4096 Mar 18  2016 usbstick
root # umount /dev/md0

21. ‘Places’ in Dolphin shows /mnt/md0 as ‘7.5 GiB Hard Drive’.

22. I can still mount the RAID from the command line:

root # mount /dev/md0 /mnt/md0
root # df -h /dev/md0
Filesystem      Size  Used Avail Use% Mounted on
/dev/md0        7.5G  420M  7.1G   6% /mnt/md0
root # umount /dev/md0

23. If I want to use the RAID in KDE I must use Dolphin to mount it, not mount it from the command line. To do this I click on the RAID ‘7.5 GiB Hard Drive’ listed under ‘Places’, and a window pops up prompting me to enter the root user’s password.

If I mount /dev/md0 via Dolphin instead of via the command line, KDE mounts it on a different directory:

root # df -h /run/media/fitzcarraldo/d565c117-37e0-48eb-b635-a2fe70b83272/
Filesystem      Size  Used Avail Use% Mounted on
/dev/md0        7.5G  420M  7.1G   6% /run/media/fitzcarraldo/d565c117-37e0-48eb-b635-a2fe70b83272

If I want to unmount it, I right-click on the RAID in ‘Places’ and select ‘Unmount’ in the right-click menu. Once it has been unmounted, I can unplug the hub from the laptop. If I plug the hub back into the laptop, the RAID is detected and can be mounted as usual.
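
Because the mount point differs depending on whether the RAID was mounted from the command line or via Dolphin, a script that uses the RAID may want to look the directory up rather than assume it. The sketch below reads /proc/mounts, in which the first field is the device and the second is the mount point. The function name mountpoint_of is hypothetical.

```shell
# mountpoint_of DEVICE [FILE]
# Prints the directory on which DEVICE is mounted according to FILE
# (default /proc/mounts); prints nothing if it is not mounted.
# (Hypothetical helper name.)
mountpoint_of() {
    awk -v dev="$1" '$1 == dev { print $2 }' "${2:-/proc/mounts}"
}

# Example usage:
# mountpoint_of /dev/md0
```

Note that /proc/mounts writes a space in a path as \040, so a mount point containing spaces would need decoding. The findmnt tool from util-linux (‘findmnt -n -o TARGET /dev/md0’) should give the same answer without that caveat.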

So, it works! A USB hub and pendrives are a handy way to:

  • experiment with creating the various types of RAID;
  • compare the capacity of the RAID with the capacity of the USB pendrives used;
  • measure the time to write and read a large file to/from the RAID and compare those times with the time to write and read the same file to/from a single USB pendrive of the same model.
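
For the last of these, a rough write timing could be taken with dd. The following is only a sketch and not something I ran for this post: the function name time_write and the test file name are hypothetical, it writes zeros rather than real data, and it measures whole seconds only, which should be adequate given the few MB/s seen during the resync.

```shell
# time_write DIR SIZE_MIB
# Writes SIZE_MIB MiB of zeros to a temporary file in DIR, flushes it to the
# device, deletes the file, and prints the elapsed time in whole seconds.
# (Hypothetical helper; the file name below is arbitrary.)
time_write() {
    target=$1/raid_write_test.bin
    start=$(date +%s)
    # conv=fsync makes dd wait until the data has reached the device,
    # so the page cache does not flatter the result.
    dd if=/dev/zero of="$target" bs=1M count="$2" conv=fsync 2>/dev/null
    end=$(date +%s)
    rm -f "$target"
    echo $(( end - start ))
}

# Example usage: compare the RAID with a single pendrive mounted elsewhere.
# time_write /mnt/md0 512
# time_write /mnt/pendrive 512
```

Timing reads fairly is harder, because a file that has just been written is likely to be served from the page cache. Unmounting and remounting the filesystem between the write and the read is one way around that.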