The ZFS pool on my server was showing a degraded state. After checking the SMART status of the constituent drives and finding no problems, I discovered that there’s a bug in Solaris 10.5 where the system reports a growing number of errors and eventually fails the pool. dmesg shows the error "unable to kmem_alloc enough memory for scatter/gather list", but there is actually nothing wrong with the pool. Running zpool status shows the degraded state:
root@fs:~# zpool status
pool: rpool
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM CAP Product /Disks IOstat mess SN/LUN
rpool ONLINE 0 0 0
c1t0d0 ONLINE 0 0 0 32.2 GB VMware Virtual S S:5 H:25 T:0 000000000000000
errors: No known data errors
pool: tank
state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://illumos.org/msg/ZFS-8000-9P
scan: scrub repaired 0 in 12h15m with 0 errors on Fri Dec 21 00:08:43 2020
config:
NAME STATE READ WRITE CKSUM CAP Product /Disks IOstat mess SN/LUN
tank DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
c0t50014EE20BF0750Dd0 ONLINE 0 0 0 4 TB WDC WD40EFRX-68W S:0 H:0 T:0 WDWCC4E6NAXVAS
c0t50014EE263348A3Ed0 ONLINE 0 0 0 4 TB WDC WD40EFRX-68W S:0 H:0 T:0 WDWCC4E0FRRRRP
c0t50014EE2B69D2D68d0 DEGRADED 0 0 20 too many errors 4 TB WDC WD40EFRX-68W S:0 H:0 T:0 WDWCC4E3AN2Y99
errors: No known data errors
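On a pool with many disks it can be easier to filter the status output than to read it by eye. The following is only a rough sketch: the function name zpool_error_devices is made up for illustration, and it assumes the row layout shown above (NAME, STATE, READ, WRITE, CKSUM as the first five columns) with plain integer counters.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS): read `zpool status` text on stdin
# and print every device row whose READ, WRITE or CKSUM counter is non-zero.
zpool_error_devices() {
    awk '
        # Config rows look like: NAME STATE READ WRITE CKSUM [extra columns].
        # Header lines such as "state: DEGRADED" have no numeric third field,
        # so they are skipped.
        $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ &&
        $3 ~ /^[0-9]/ && $4 ~ /^[0-9]/ && $5 ~ /^[0-9]/ {
            if ($3 != "0" || $4 != "0" || $5 != "0")
                printf "%s %s read=%s write=%s cksum=%s\n", $1, $2, $3, $4, $5
        }
    '
}

# Usage on a live system:
#   zpool status tank | zpool_error_devices
```

Fed the status output above, this should report only the c0t50014EE2B69D2D68d0 row, since it is the only one with a non-zero counter.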
Running zpool clear recovers the pool:
root@fs:~# zpool clear tank
root@fs:~# zpool status
pool: rpool
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
rpool ONLINE 0 0 0
c1t0d0 ONLINE 0 0 0
errors: No known data errors
pool: tank
state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://illumos.org/msg/ZFS-8000-9P
scan: none requested
config:
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
c0t50014EE20BF0750Dd0 ONLINE 0 0 2
c0t50014EE263348A3Ed0 ONLINE 0 0 0
c0t50014EE2B69D2D68d0 ONLINE 0 0 0
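Clearing only resets the counters, so it may be worth following it with a scrub to confirm the data really is intact. A minimal sketch of that sequence is below. The function name clear_and_verify and the ZPOOL override variable are my own inventions for illustration; the zpool subcommands themselves (clear, scrub, status -x) are standard.

```shell
#!/bin/sh
# Hypothetical helper: clear a pool's error counters, start a scrub,
# then print the pool's health summary.
# ZPOOL is an assumed override for dry runs; it defaults to the real zpool.
clear_and_verify() {
    pool=$1
    zpool_cmd=${ZPOOL:-zpool}
    if [ -z "$pool" ]; then
        echo "usage: clear_and_verify POOL" >&2
        return 2
    fi
    "$zpool_cmd" clear "$pool" || return 1   # reset READ/WRITE/CKSUM counters
    "$zpool_cmd" scrub "$pool" || return 1   # start a scrub (runs in the background)
    "$zpool_cmd" status -x "$pool"           # health summary for this pool
}

# Usage:
#   clear_and_verify tank
# Dry run that only prints the arguments each call would receive:
#   ZPOOL=echo clear_and_verify tank
```

The scrub runs asynchronously, so the status printed right away only reflects the start of the scrub; run zpool status again once it has finished to see whether any errors came back.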