Solaris Volume Manager (SVM)
Solaris Volume Manager, or SVM, is software that you use to create volumes from physical or logical (LUN) disks. That is the simplest way that I can explain SVM.
This page is only an introduction to SVM and its features. I will use examples to show you how SVM works.
We will discuss the following features:
What is Solaris Volume Manager and what are state database replicas?
Creating simple raid 0 stripe/concat volumes
Creating mirrored volumes
Resizing mirrored volumes
Creating raid 5 volumes
Creating soft partitions
Working with hotspare pools
What is Solaris Volume Manager and state database replicas
What is a Volume Manager then? Volume managers have been around for years. Some vendors have integrated them into their operating systems, and others make them available as extra software you can install.
In essence, it's software that manages multiple disks and presents them as volumes that you can create, mount and use.
In Solaris the volume manager is the Solaris Volume Manager, or SVM. In previous releases it was called DiskSuite or Solstice DiskSuite (SDS). In Solaris 10 it is included in the operating system when you install the OS.
Solaris also offers ZFS, the Zettabyte File System. I discuss this great filesystem on another of my Solaris 10 pages. You can actually choose between UFS and ZFS when you install Solaris 10.
How does it work? You first partition your hard disks in Solaris, and then use the slices you created to build Solaris Volume Manager volumes that can be mounted as normal UFS filesystems.
Solaris Volume Manager uses a concept called metadevices. It takes a Solaris logical device path such as /dev/dsk/c2t0d0s0 and creates a metadevice from it, for example /dev/md/dsk/d0. You specify the md device number yourself, such as d0, d1 or d20. Below is a simple diagram that shows this relationship between logical device paths and metadevices.
[Diagram: a simple Solaris Volume Manager metadevice]
The /dev/md/dsk/d0 metadevice actually points to /dev/dsk/c2t0d0s0. This type of volume is called a simple volume. So why would you use the md device instead of the normal Solaris logical device? Well, with Solaris Volume Manager you can use this concept of metadevices to create more complex volumes such as stripes, raid 5 volumes and mirrors.
State database replicas or metadb's
Before you can create any volumes, you need a place where the configuration information for your volumes is kept. These are called metadb's or state database replicas. They are kept on a slice of your disks, and by default one replica is 4 Mbytes in size. This is enough for most configurations, but the size can be changed when you create the metadb's.
The minimum number of state database replicas is 3. So, if you place 3 metadb's on one slice you would need 12 Mbytes of space on that slice. It's also recommended to spread these replicas over different disks. Keep redundancy in mind when placing them.
Let's look at the rules that govern these state database replicas. Like I said already, you need at least 3. This is the minimum. Where you put them is up to you.
If fewer than half of the metadb's are available, the system will panic. Let me explain. Let's say you created 7 replicas: 3 are placed on slice c2t0d0s7 and 4 on slice c2t1d0s7. If the disk holding c2t1d0s7 fails, more than half of your replicas are unavailable, because only 7 - 4 = 3 remain.
If exactly half of the replicas are available, the system will continue to run, but when you try to boot it, the boot will stop with an error that you have to fix before you can continue.
For example, you have 3 replicas on c2t0d0s7 and 3 on c2t1d0s7. If either disk fails, you only have 3 replicas left, which is exactly half. When you try to boot, the system will give you an error. You can fix it by logging in as root in single user mode and deleting the replicas on the failed disk using Solaris Volume Manager commands. The 3 replicas that are left then form the new quorum.
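A minimal sketch of that recovery, assuming the failed disk in this example was c2t1d0 and you are at the single user prompt:
bash-3.00# metadb -d c2t1d0s7
bash-3.00# metadb
The -d option deletes the replicas on the slice you name (add -f if metadb refuses because of the replica count), and running metadb afterwards confirms that only the healthy replicas remain. The system can then boot normally.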
With Solaris Volume Manager you can create raid 0 simple volumes (stripes/concatenations), raid 5 volumes and raid 0+1 volumes. You can also use hot spares and soft partitions. I will try to show examples of all of these types of volumes.
Let's first look at a raid 0 volume. Below is a diagram of one built with Solaris Volume Manager. I have 2 disks that I partitioned exactly the same: slice 0 is 50 Gbyte, slice 1 is 10 Gbyte and slice 3 is 10 Gbyte.
As you can see, the disks have to be partitioned using the format command before you can use them in Solaris Volume Manager. This is very important. Planning is probably the most important thing when setting up Solaris Volume Manager volumes.
[Diagram: a Solaris Volume Manager simple volume, raid 0]
With a simple concat volume you can have a one to one mapping, meaning a slice on a Solaris disk maps to a single metadevice or volume in Solaris Volume Manager. This is raid 0 concatenation. You could also use multiple slices in a concatenated volume, or you could use a stripe.
The above example could either have been a concatenation or a stripe. It stays a simple volume whether it's striped or concatenated. The big difference comes with how you create these simple volumes from the command line.
Creating simple raid 0 stripe/concat volumes
Enough talk, let's create some Solaris Volume Manager volumes. I found that the best way to learn Solaris Volume Manager is by doing it.
First of all, let's discuss the setup. I have Solaris 10 running in an Oracle VM VirtualBox VM. I created 4 SATA disks and assigned them to the VM. They are numbered c2t0d0, c2t1d0, c2t2d0 and c2t3d0. I then partitioned the disks in the following way for use with Solaris Volume Manager:
slice 0 200M
slice 1 200M
slice 3 200M
slice 4
slice 5 remaining space
slice 7 32M
Let's create 2 simple volumes. The one will be a stripe and the other a concatenation.
We use the metainit command to create volumes. But first, we need to create the metadb's that I talked about earlier.
I will put 3 replicas on each disk's slice 7. I usually use slice 7 for replicas, and some documentation also recommends this, but you can put them anywhere you like.
Below are the commands I used.
bash-3.00# metadb
metadb: qserver: there are no existing databases
bash-3.00# metadb -a -c 3 -f c2t0d0s7
bash-3.00# metadb -a -c 3 c2t1d0s7
bash-3.00# metadb -a -c 3 c2t2d0s7
bash-3.00# metadb -a -c 3 c2t3d0s7
bash-3.00# metadb
flags first blk block count
a u 16 8192 /dev/dsk/c2t0d0s7
a u 8208 8192 /dev/dsk/c2t0d0s7
a u 16400 8192 /dev/dsk/c2t0d0s7
a u 16 8192 /dev/dsk/c2t1d0s7
a u 8208 8192 /dev/dsk/c2t1d0s7
a u 16400 8192 /dev/dsk/c2t1d0s7
a u 16 8192 /dev/dsk/c2t2d0s7
a u 8208 8192 /dev/dsk/c2t2d0s7
a u 16400 8192 /dev/dsk/c2t2d0s7
a u 16 8192 /dev/dsk/c2t3d0s7
a u 8208 8192 /dev/dsk/c2t3d0s7
a u 16400 8192 /dev/dsk/c2t3d0s7
bash-3.00#
Some people are very scared of Solaris Volume Manager, but as you can see, it's very simple to work with the commands. Let's look at the first create command.
metadb -a -c 3 -f c2t0d0s7. The command is metadb. The -a means add. The -c gives the number of replicas; by default only 1 replica is created per slice, so here I asked for 3 replicas on the slice. The -f is a force option: the first replicas you create and the last ones you delete must include the -f option. Finally, I specified the Solaris slice where the state database replicas should be created.
With the other metadb commands there's no -f. That's because I had already created the first replicas on the first slice. metadb, without options, just shows you your metadb information.
We are turning and burning. Let's create some volumes. Like I said, I will create 2 raid 0 volumes. The first will be a concatenation and the second a stripe.
bash-3.00# metainit d10 2 1 c2t0d0s0 1 c2t1d0s0
d10: Concat/Stripe is setup
bash-3.00# metastat d10
d10: Concat/Stripe
Size: 819200 blocks (400 MB)
Stripe 0:
Device Start Block Dbase Reloc
c2t0d0s0 0 No Yes
Stripe 1:
Device Start Block Dbase Reloc
c2t1d0s0 0 No Yes
bash-3.00# metainit d20 1 2 c2t2d0s0 c2t3d0s0
d20: Concat/Stripe is setup
bash-3.00# metastat d20
d20: Concat/Stripe
Size: 819200 blocks (400 MB)
Stripe 0: (interlace: 32 blocks)
Device Start Block Dbase Reloc
c2t2d0s0 0 No Yes
c2t3d0s0 0 No Yes
Let's look at the command and options we used. metainit is the command line interface (cli) for creating volumes. d10 is the number of the metadevice or volume. Keep the d numbering logical and don't jump around when naming these; I used d10, but I could have used d100 or d30 or d-whatever. It's just the name of the metadevice. You can create up to 128 volumes by default; if you want more, you need to edit a driver configuration file and reboot, as sketched below.
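The file in question is /kernel/drv/md.conf. Roughly what the line you adjust looks like, with example values only, so check the comments in the file itself before changing anything:
name="md" parent="pseudo" nmd=256 md_nsets=4;
nmd is the maximum number of metadevices and md_nsets the maximum number of disk sets. A reconfiguration reboot is needed before the new limits take effect.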
The 2 1 c2t0d0s0 1 c2t1d0s0 part is the complicated bit. The first number, 2, specifies the number of stripes (the height). Each stripe is then described by a width (the number of components in that stripe) followed by the Solaris device names, which are known as the components. So here we have 2 stripes, each 1 component wide. This is a typical way to create a concatenation. Let me use a drawing to explain this.
[Diagram: the Solaris Volume Manager metainit command]
A concatenation usually has more than one stripe, each with one component, while a striped volume has a single stripe with all the slices specified as components of that stripe.
If you look at the diagram and then the command you will see that it's actually quite easy to create striped and concatenated volumes.
We use the metastat command to display the volumes. You can give it a specific volume, or run metastat without options to display all volumes.
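Another handy form is metastat -p, which prints the configuration in the compact md.tab format, one line per volume, in the same syntax you would feed to metainit. I like to save that output somewhere safe so the volumes can be recreated if the configuration is ever lost:
bash-3.00# metastat -p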
Mirroring volumes
Let's mirror the two volumes. What! You can't mirror a concat and a stripe, you say. Well, you can with Solaris Volume Manager. It's not a good idea, but for illustration purposes that's what I'll do.
I'll use the d10 and d20 metadevices and create a d30. d30 will be the mirror device.
Here goes!
bash-3.00# metainit d30 -m d20
d30: Mirror is setup
The -m option is used to create a mirror device. So, the above command says, create a mirror called d30 and add d20 as the first submirror. Let's have a look at the d30 mirror.
bash-3.00# metastat d30
d30: Mirror
Submirror 0: d20
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 819200 blocks (400 MB)
d20: Submirror of d30
State: Okay
Size: 819200 blocks (400 MB)
Stripe 0: (interlace: 32 blocks)
Device Start Block Dbase State Reloc Hot
The above output shows the attributes of the mirror d30. We now have a one-way mirror, because there's only one submirror attached to d30. To add the second submirror, we use the metattach command, like this.
bash-3.00# metattach d30 d10
d30: submirror d10 is attached
bash-3.00# metastat d30
d30: Mirror
Submirror 0: d20
State: Okay
Submirror 1: d10
State: Resyncing
Resync in progress: 13 % done
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 819200 blocks (400 MB)
d20: Submirror of d30
State: Okay
Size: 819200 blocks (400 MB)
Stripe 0: (interlace: 32 blocks)
Device Start Block Dbase State Reloc Hot
d10: Submirror of d30
State: Resyncing
Size: 819200 blocks (400 MB)
Stripe 0:
Device Start Block Dbase State Reloc Hot
Now the metastat output shows that d30 has two submirrors, d10 and d20. It also shows that the newly attached submirror is resyncing.
We can now go and create a filesystem on d30 and mount it from the command line or place it in the /etc/vfstab.
bash-3.00# newfs /dev/md/rdsk/d30
newfs: construct a new file system /dev/md/rdsk/d30: (y/n)? y
/dev/md/rdsk/d30: 819200 sectors in 400 cylinders of 64
tracks, 32 sectors 400.0MB in 25 cyl groups (16 c/g,
16.00MB/g, 7680 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 32832, 65632, 98432, 131232, 164032, 196832, 229632,
262432, 295232,
492032, 524832, 557632, 590432, 623232, 656032, 688832,
721632, 754432, 787232
bash-3.00# mkdir /mirvol
bash-3.00# mount /dev/md/dsk/d30 /mirvol
bash-3.00# df -h /mirvol
Filesystem size used avail capacity Mounted on
/dev/md/dsk/d30 376M 1.0M 338M 1% /mirvol
bash-3.00#
How easy is that? From now on we only use the /dev/md device paths for this volume; you never use the underlying /dev/dsk Solaris device paths directly. If you look at the df -h /mirvol output, you now have almost 400 Mbyte available.
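If you want the mirror mounted automatically at boot, add it to /etc/vfstab using the md paths. A sketch of the entry for this example (the mount options are up to you):
/dev/md/dsk/d30    /dev/md/rdsk/d30    /mirvol    ufs    2    yes    -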
Resizing mirrored volumes
Now let's do something totally cool. Let's say that someone created the volume like this and the performance is really bad. They ask us to change the volume so that both submirrors are striped. They also say that the volume is in use and cannot be unmounted. Hm, let's see what we can do here with Solaris Volume Manager.
d20 is the stripe and d10 is the concat. We need to detach and remove the d10 volume and recreate it as a stripe. Then we just reattach the d10 volume to the d30 mirror. Sounds cool! Let's do it.
bash-3.00# metadetach d30 d10
d30: submirror d10 is detached
bash-3.00# metaclear d10
d10: Concat/Stripe is cleared
bash-3.00# metainit d10 1 2 c2t0d0s0 c2t1d0s0
d10: Concat/Stripe is setup
bash-3.00# metattach d30 d10
d30: submirror d10 is attached
bash-3.00# metastat d30
d30: Mirror
Submirror 0: d20
State: Okay
Submirror 1: d10
State: Resyncing
Resync in progress: 12 % done
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 819200 blocks (400 MB)
d20: Submirror of d30
State: Okay
Size: 819200 blocks (400 MB)
Stripe 0: (interlace: 32 blocks)
Device Start Block Dbase State Reloc Hot
d10: Submirror of d30
State: Resyncing
Size: 819200 blocks (400 MB)
Stripe 0: (interlace: 32 blocks)
Device Start Block Dbase State Reloc Hot
I did all of this while the volume was mounted. Pretty decent volume manager. Lots of people don't take Solaris Volume Manager seriously, but if you know how it works you can do some pretty cool stuff with it.
I'm going to show you an even cooler thing with Solaris Volume Manager. Let's say that we need more space on d30. We have run out of space and the users need another 200M. Whenever you attach a component to a metadevice, the extra space is added as a concat. This might be ok for some configurations, but we are using a stripe and a concat would impact performance.
I'm going to show you how you can increase the space on d30 and keep the stripe layout for the volume. All of this while the filesystem stays mounted on the system. We will do this live!
The procedure is as follows. We first detach and clear d10. We then recreate d10 as a 3-slice stripe, adding a slice from another disk. We then reattach d10 to d30 and repeat the process for d20. So, let's give it a go!
bash-3.00# metadetach d30 d10
d30: submirror d10 is detached
bash-3.00# metaclear d10
d10: Concat/Stripe is cleared
bash-3.00# metainit d10 1 3 c2t0d0s0 c2t1d0s0 c2t4d0s1
d10: Concat/Stripe is setup
bash-3.00# metastat d10
d10: Concat/Stripe
Size: 1228800 blocks (600 MB)
Stripe 0: (interlace: 32 blocks)
Device Start Block Dbase Reloc
c2t0d0s0 0 No Yes
c2t1d0s0 0 No Yes
c2t4d0s1 0 No Yes
bash-3.00# metattach d30 d10
d30: submirror d10 is attached
bash-3.00# metadetach d30 d20
d30: submirror d20 is detached
bash-3.00# metaclear d20
d20: Concat/Stripe is cleared
bash-3.00# metainit d20 1 3 c2t2d0s0 c2t3d0s0 c2t5d0s1
d20: Concat/Stripe is setup
bash-3.00# metattach d30 d20
d30: submirror d20 is attached
bash-3.00# metastat d30
d30: Mirror
Submirror 0: d10
State: Okay
Submirror 1: d20
State: Resyncing
Resync in progress: 6 % done
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 819200 blocks (400 MB)
d10: Submirror of d30
State: Okay
Size: 1228800 blocks (600 MB)
Stripe 0: (interlace: 32 blocks)
Device Start Block Dbase State Reloc Hot
d20: Submirror of d30
State: Resyncing
Size: 1228800 blocks (600 MB)
Stripe 0: (interlace: 32 blocks)
Device Start Block Dbase State Reloc Hot
bash-3.00# df -h /mirvol
Filesystem size used avail capacity Mounted on
/dev/md/dsk/d30 376M 1.0M 338M 1% /mirvol
bash-3.00#
Hm, df -h still reports 376M, and it should be 600M or thereabouts. What we need to do is tell d30 that there's some extra space and then tell the filesystem that we have added more space.
To make d30 use the extra space we run the metattach command on the mirror, and to grow the mounted UFS filesystem we use the growfs command.
bash-3.00# metattach d30
bash-3.00# metastat d30
d30: Mirror
Submirror 0: d10
State: Okay
Submirror 1: d20
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 1228800 blocks (600 MB)
... output omitted ...
bash-3.00# growfs -M /mirvol /dev/md/rdsk/d30
/dev/md/rdsk/d30: 1228800 sectors in 600 cylinders of 64
tracks, 32 sectors
600.0MB in 38 cyl groups (16 c/g, 16.00MB/g, 7680 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 32832, 65632, 98432, 131232, 164032, 196832, 229632,
262432, 295232,
918432, 951232, 984032, 1016832, 1049632, 1082432, 1115232,
1148032, 1180832,
1213632
bash-3.00# df -h /mirvol
Filesystem size used avail capacity Mounted on
/dev/md/dsk/d30 564M 1.0M 525M 1% /mirvol
bash-3.00#
There you go. We have added some extra space to the volume while it was mounted. The best part is that d10 and d20 are still striped volumes, not concats. Cool stuff!
Creating raid 5 volumes
Next I want to look at creating raid 5 volumes. This is very simple and is much easier to understand than raid 0 volumes.
Again, we use the metainit command.
bash-3.00# metainit d20 -r c2t0d0s0 c2t1d0s0 c2t2d0s0 c2t3d0s0
d20: RAID is setup
bash-3.00# metastat d20
d20: RAID
State: Initializing
Initialization in progress: 38.5% done
Interlace: 32 blocks
Size: 1226752 blocks (599 MB)
Original device:
Size: 1227744 blocks (599 MB)
Device Start Block Dbase State Reloc Hot Spare
c2t0d0s0 330 No Initializing Yes
c2t1d0s0 330 No Initializing Yes
c2t2d0s0 330 No Initializing Yes
c2t3d0s0 330 No Initializing Yes
Easy. Just remember that you can also specify an interlace or stripe unit size. Just use the -i option and the size. The default is 16 Kbytes or 32 blocks.
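For example, to create the same raid 5 volume with a 64 Kbyte interlace instead of the default, the command would look like this (the interlace value is just an illustration):
bash-3.00# metainit d20 -r c2t0d0s0 c2t1d0s0 c2t2d0s0 c2t3d0s0 -i 64k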
Creating soft partitions
Let's look at soft partitions. In Solaris you are limited to 7 usable slices per disk, which basically means you can only have 7 filesystems per disk. The same limit applies to Solaris Volume Manager volumes, because you build them from Solaris slices, so the same rules apply. How can we get past that? We use soft partitions.
With soft partitions you can take either a Solaris slice, such as /dev/dsk/c2t0d0s0, or a Solaris Volume Manager volume and slice it up into more than 7 volumes. You can, theoretically, create any number of soft partitions on a slice or volume, depending on its size of course.
I will demonstrate both ways. I will use the slice 6 of c2t0d0 and the d20 raid 5 volume we created in the previous example.
First of all, I will use the Solaris slice and then the d20 volume.
bash-3.00# metainit d50 -p c2t0d0s6 50m
d50: Soft Partition is setup
bash-3.00# metainit d51 -p c2t0d0s6 50m
d51: Soft Partition is setup
bash-3.00# metainit d52 -p c2t0d0s6 50m
d52: Soft Partition is setup
bash-3.00# metastat d50 d51 d52
d50: Soft Partition
Device: c2t0d0s6
State: Okay
Size: 102400 blocks (50 MB)
Device Start Block Dbase Reloc
c2t0d0s6 0 No Yes
Extent Start Block Block count
0 1 102400
d51: Soft Partition
Device: c2t0d0s6
State: Okay
Size: 102400 blocks (50 MB)
Device Start Block Dbase Reloc
c2t0d0s6 0 No Yes
Extent Start Block Block count
0 102402 102400
d52: Soft Partition
Device: c2t0d0s6
State: Okay
Size: 102400 blocks (50 MB)
Device Start Block Dbase Reloc
c2t0d0s6 0 No Yes
Extent Start Block Block count
0 204803 102400
In the above example I used stock standard Solaris slices to create the soft partitions. I only use soft partitions directly on Solaris slices if the disks are LUNs from hardware raid arrays. The reason is that those LUNs already have redundancy from the raid group they were built on. Sometimes you get big LUNs from the storage device, and then using soft partitions makes a lot of sense.
The next example creates soft partitions on an existing Solaris Volume Manager volume.
bash-3.00# metainit d60 -p /dev/md/rdsk/d20 100m
d60: Soft Partition is setup
bash-3.00# metainit d61 -p /dev/md/rdsk/d20 50m
d61: Soft Partition is setup
bash-3.00# metainit d62 -p /dev/md/rdsk/d20 50m
d62: Soft Partition is setup
bash-3.00# metastat d60 d61 d62
d60: Soft Partition
Device: d20
State: Okay
Size: 204800 blocks (100 MB)
Extent Start Block Block count
0 32 204800
d61: Soft Partition
Device: d20
State: Okay
Size: 102400 blocks (50 MB)
Extent Start Block Block count
0 204864 102400
d62: Soft Partition
Device: d20
State: Okay
Size: 102400 blocks (50 MB)
Extent Start Block Block count
0 307296 102400
You can now go and create filesystems on these soft partitions and mount them like any normal filesystem.
Let's do this with one of the soft partitions. Let's use d50. After that I want to show you how to resize the soft partition as well.
bash-3.00# newfs /dev/md/rdsk/d50
newfs: construct a new file system /dev/md/rdsk/d50: (y/n)? y
/dev/md/rdsk/d50: 102400 sectors in 50 cylinders of 64
tracks, 32 sectors
50.0MB in 4 cyl groups (16 c/g, 16.00MB/g, 7680 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 32832, 65632, 98432,
bash-3.00# mkdir /softvol
bash-3.00# mount /dev/md/dsk/d50 /softvol
bash-3.00# df -h /softvol
Filesystem size used avail capacity Mounted on
/dev/md/dsk/d50 46M 1.0M 41M 3% /softvol
bash-3.00#
Cool, the filesystem is mounted. Let's say that we run out of space and we need to increase the size. We can use the metattach and growfs commands to achieve this.
bash-3.00# metattach d50 50M
d50: Soft Partition has been grown
bash-3.00# df -h /softvol
Filesystem size used avail capacity Mounted on
/dev/md/dsk/d50 46M 1.0M 41M 3% /softvol
bash-3.00# growfs -M /softvol /dev/md/rdsk/d50
/dev/md/rdsk/d50: 204800 sectors in 100 cylinders of 64
tracks, 32 sectors 100.0MB in 7 cyl groups (16 c/g,
16.00MB/g, 7680 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 32832, 65632, 98432, 131232, 164032, 196832,
bash-3.00# df -h /softvol
Filesystem size used avail capacity Mounted on
/dev/md/dsk/d50 93M 1.0M 88M 2% /softvol
bash-3.00#
Great stuff. I hope you can start to see the use for soft partitions. It's really a great tool in the Solaris Volume Manager toolkit.
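One last note on soft partitions: when you no longer need one, unmount it and remove it with metaclear. A sketch, using the soft partition we mounted above:
bash-3.00# umount /softvol
bash-3.00# metaclear d50
The space the soft partition occupied on the underlying slice or volume then becomes available for new soft partitions.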
Working with hot spare pools
A hot spare pool, in Solaris Volume Manager, is a collection of slices that can be used when a disk fails in a redundant volume such as raid 1 (mirror) or raid 5 (distributed parity). When a disk fails, Solaris Volume Manager checks whether a hot spare pool is assigned to the volume. If there is, SVM looks for a spare slice that is large enough to replace the failed slice and starts rebuilding the data onto that hot spare slice.
Below is a simple diagram showing how hot spare pools can be assigned to a raid 1 volume.
[Diagram: Solaris Volume Manager hot spare pools]
In this example we use one pool, called hsp00, which contains 2 slices. This pool is then assigned to the submirrors and not to the mirror volume itself. This is important for mirrored volumes: you assign the hot spare pool to the submirrors.
For raid 5 volumes you just assign the pool to the raid volume itself. Nothing funny with raid 5.
Keep in mind that a slice in a hot spare pool must be the same size as, or bigger than, the slice it is going to replace. I usually just take an extra disk, partition it exactly like the data disks, add its slices to a hot spare pool and then assign the pool to the volumes. This ensures that my spare slices are always the right size if a disk fails.
Let me show you some examples of hot spares. I will create two volumes. One will be a raid 1 with two slices, and the other a raid 5 with 3 slices. I will then create a hot spare pool and assign this pool to both the raid 5 and raid 1 volumes.
This is just to give you an idea of how it works.
Let's create the volumes first. Below are the commands I used.
bash-3.00# metainit d11 1 1 c2t0d0s0
d11: Concat/Stripe is setup
bash-3.00# metainit d12 1 1 c2t1d0s0
d12: Concat/Stripe is setup
bash-3.00# metainit d10 -m d11
d10: Mirror is setup
bash-3.00# metattach d10 d12
d10: submirror d12 is attached
bash-3.00# metainit d20 -r c2t0d0s1 c2t1d0s1 c2t2d0s1 -i 8k
d20: RAID is setup
Easy peasy. I created the raid 5 and specified the interlace size with the -i 8k option. This means use an interlace or stripe unit size of 8 Kilobytes.
Now let's create the pool and assign it to the volumes.
bash-3.00# metainit hsp00 c2t3d0s0 c2t3d0s1 c2t3d0s3 c2t3d0s6
hsp000: Hotspare pool is setup
bash-3.00# metastat hsp00
hsp000: 4 hot spares
Device Status Length Reloc
c2t3d0s0 Available 409600 blocks Yes
c2t3d0s1 Available 409600 blocks Yes
c2t3d0s3 Available 409600 blocks Yes
c2t3d0s6 Available 792576 blocks Yes
bash-3.00# metaparam -h hsp00 d11
bash-3.00# metaparam -h hsp00 d12
bash-3.00# metaparam -h hsp00 d20
bash-3.00# metastat
d10: Mirror
Submirror 0: d11
State: Okay
Submirror 1: d12
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 409600 blocks (200 MB)
d11: Submirror of d10
State: Okay
Hot spare pool: hsp000
Size: 409600 blocks (200 MB)
Stripe 0:
Device Start Block Dbase State Reloc Hot Spare
c2t0d0s0 0 No Okay Yes
d12: Submirror of d10
State: Okay
Hot spare pool: hsp000
Size: 409600 blocks (200 MB)
Stripe 0:
Device Start Block Dbase State Reloc Hot Spare
c2t1d0s0 0 No Okay Yes
d20: RAID
State: Okay
Hot spare pool: hsp000
Interlace: 16 blocks
Size: 817152 blocks (399 MB)
Original device:
Size: 818848 blocks (399 MB)
Device Start Block Dbase State Reloc Hot Spare
c2t0d0s1 170 No Okay Yes
c2t1d0s1 170 No Okay Yes
c2t2d0s1 170 No Okay Yes
hsp000: 4 hot spares
Device Status Length Reloc
c2t3d0s0 Available 409600 blocks Yes
c2t3d0s1 Available 409600 blocks Yes
c2t3d0s3 Available 409600 blocks Yes
c2t3d0s6 Available 792576 blocks Yes
We used the metainit command to create the hot spare pool and added all the slices we wanted. You could also use the metahs command afterwards to add slices to an existing pool.
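For example, adding one more slice to the pool later might look like this (the slice name here is hypothetical):
bash-3.00# metahs -a hsp00 c2t3d0s4
You can also use metahs -d to remove a spare from a pool and metahs -i to display the status of your hot spare pools.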
We then used the metaparam command to assign the pool to the volumes. Notice that I added hsp00 to the submirrors of the raid 1 volume instead of to the mirror volume itself.
The metastat command shows the results. The volumes now have a hot spare pool assigned to them. To make this more realistic, let's simulate a disk failure. Let's fail c2t1d0 and see what happens.
bash-3.00# fmthard -s /dev/null /dev/rdsk/c2t1d0s2
The previous command wipes the VTOC (the disk label) of the disk, which simulates a failure. It is very dangerous, so don't use it on a system you care about!
Next I will run newfs on both volumes to generate some I/O on them.
bash-3.00# newfs /dev/md/rdsk/d20
bash-3.00# newfs /dev/md/rdsk/d10
Let's have a look at the hot spare pool and the volumes with metastat.
bash-3.00# metastat d20
d20: RAID
State: Resyncing
Resync in progress: 23.5% done
Hot spare pool: hsp000
Interlace: 16 blocks
Size: 817152 blocks (399 MB)
Original device:
Size: 818848 blocks (399 MB)
Device Start Block Dbase State Reloc Hot Spare
c2t0d0s1 170 No Okay Yes
c2t1d0s1 170 No Resyncing Yes c2t3d0s0
c2t2d0s1 170 No Okay Yes
bash-3.00# metastat hsp00
hsp000: 4 hot spares
Device Status Length Reloc
c2t3d0s0 In use 409600 blocks Yes
c2t3d0s1 Available 409600 blocks Yes
c2t3d0s3 Available 409600 blocks Yes
c2t3d0s6 Available 792576 blocks Yes
bash-3.00# metastat d10
d10: Mirror
Submirror 0: d11
State: Okay
Submirror 1: d12
State: Resyncing
Resync in progress: 51 % done
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 409600 blocks (200 MB)
d11: Submirror of d10
State: Okay
Hot spare pool: hsp000
Size: 409600 blocks (200 MB)
Stripe 0:
Device Start Block Dbase State Reloc Hot Spare
c2t0d0s0 0 No Okay Yes
d12: Submirror of d10
State: Resyncing
Hot spare pool: hsp000
Size: 409600 blocks (200 MB)
Stripe 0:
Device Start Block Dbase State Reloc Hot Spare
c2t1d0s0 0 No Resyncing Yes c2t3d0s1
bash-3.00# metastat hsp00
hsp000: 4 hot spares
Device Status Length Reloc
c2t3d0s0 In use 409600 blocks Yes
c2t3d0s1 In use 409600 blocks Yes
c2t3d0s3 Available 409600 blocks Yes
c2t3d0s6 Available 792576 blocks Yes
bash-3.00#
As you can see, the failed slices started resyncing to the hot spares as soon as the disk failed. We should still be able to mount the filesystems and write to them. If that works, then the hot spare configuration did its job.
bash-3.00# mkdir /raid1
bash-3.00# mount /dev/md/dsk/d10 /raid1
bash-3.00# mkdir /raid5
bash-3.00# mount /dev/md/dsk/d20 /raid5
bash-3.00# mkfile 100m /raid1/testfile
bash-3.00# ls -l /raid1/testfile
-rw------T 1 root root 104857600 Jun 17 14:33 /raid1/testfile
bash-3.00# mkfile 100m /raid5/testfile
bash-3.00# ls -l /raid5/testfile
-rw------T 1 root root 104857600 Jun 17 14:34 /raid5/testfile
bash-3.00# df -h /raid1 /raid5
Filesystem size used avail capacity Mounted on
/dev/md/dsk/d10 188M 101M 68M 60% /raid1
/dev/md/dsk/d20 375M 101M 237M 30% /raid5
bash-3.00#
Yep, we can still access the filesystems. No problem. Solaris Volume Manager is performing as it should so far.
Let's say the disk was faulty and we replaced it. How do we get the original slices back the way they were? Great question. Let's see.
Remember one thing: the disk failed, so after we replace it we need to recreate the slices exactly the way they were before. Solaris Volume Manager does not do this for you.
If you don't do this you might run into problems where slices are too small to relocate back onto. After you finish an installation, write down the slice information of the disks.
This is all part of the planning phase. Be careful with this!
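One way to capture that information is to save the disk's VTOC with prtvtoc while everything is still healthy, and write it back onto the replacement disk with fmthard. A sketch, using the disk that failed in this example:
bash-3.00# prtvtoc /dev/rdsk/c2t1d0s2 > /var/tmp/c2t1d0.vtoc
(replace the failed disk)
bash-3.00# fmthard -s /var/tmp/c2t1d0.vtoc /dev/rdsk/c2t1d0s2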
Next I recreate the slices on the replacement disk and use the metareplace command to relocate the slices back from the hot spare pool.
bash-3.00# metareplace -e d20 c2t1d0s1
d20: device c2t1d0s1 is enabled
bash-3.00# metareplace -e d10 c2t1d0s0
d10: device c2t1d0s0 is enabled
bash-3.00#
bash-3.00# metastat d20
d20: RAID
State: Resyncing
Resync in progress: 98.0% done
Hot spare pool: hsp000
Interlace: 16 blocks
Size: 817152 blocks (399 MB)
Original device:
Size: 818848 blocks (399 MB)
Device Start Block Dbase State Reloc Hot Spare
c2t0d0s1 170 No Okay Yes
c2t1d0s1 170 No Resyncing Yes c2t3d0s0
c2t2d0s1 170 No Okay Yes
bash-3.00# metastat hsp00
hsp000: 4 hot spares
Device Status Length Reloc
c2t3d0s0 Available 409600 blocks Yes
c2t3d0s1 Available 409600 blocks Yes
c2t3d0s3 Available 409600 blocks Yes
c2t3d0s6 Available 792576 blocks Yes
Now we can see that all volumes are back to normal. The hot spare did its job perfectly. Notice I used the metareplace command with the -e option. This is what you use when you have replaced the failed disk in the same physical location and want to enable the original slices again, which releases the hot spares back to the pool.
You could also relocate the data to another slice instead of the original one if you wanted to. I don't like doing this, because you then start moving slices around and at some stage you will get very confused. Keep it simple.
I don't want to overload this page with Solaris Volume Manager features and functions. I wanted to show you what I usually use when I configure Solaris Volume Manager on Solaris servers.
Solaris Volume Manager also has a graphical user interface, part of the Solaris Management Console, that you can use to create volumes. I don't use it because the command line is so easy.
I might add some other things you can do with Solaris Volume Manager at a later stage.