Troubleshooting
Dell™ PowerEdge™ Expandable RAID Controller 5/i and 5/E User's Guide
To get help with your Dell™ PowerEdge™ Expandable RAID Controller (PERC) 5 controller, you can contact your Dell Technical Service representative or access the Dell Support website at support.dell.com.
Virtual Disks Degraded
A redundant virtual disk is in a degraded state when one physical disk has failed or is inaccessible. For example, a RAID 1 virtual disk consisting of two physical disks can sustain one physical disk in a failed or inaccessible state and become a degraded virtual disk.
To recover from a degraded virtual disk, rebuild the physical disk in the inaccessible state. Upon successful completion of the rebuild process, the virtual disk state changes from degraded to optimal. For the rebuild procedure, see Performing a Manual Rebuild of an Individual Physical Disk in RAID Configuration and Management.
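The relationship between inaccessible physical disks and the virtual disk state can be sketched as a small classification function. This is only an illustrative model, not controller firmware logic: the names `VirtualDiskState` and `virtual_disk_state` and the `tolerated_failures` parameter are assumptions introduced here, and the OFFLINE outcome for losing more disks than the virtual disk can tolerate is inferred from the state names used later in this chapter.

```python
from enum import Enum


class VirtualDiskState(Enum):
    OPTIMAL = "optimal"
    DEGRADED = "degraded"
    OFFLINE = "offline"


def virtual_disk_state(inaccessible_disks: int, tolerated_failures: int) -> VirtualDiskState:
    """Classify a virtual disk from its count of failed or inaccessible members.

    tolerated_failures is 1 for the two-disk RAID 1 example in the text and
    0 for a non-redundant virtual disk (hypothetical parameter).
    """
    if inaccessible_disks < 0 or tolerated_failures < 0:
        raise ValueError("counts must be non-negative")
    if inaccessible_disks == 0:
        return VirtualDiskState.OPTIMAL
    if inaccessible_disks <= tolerated_failures:
        return VirtualDiskState.DEGRADED
    return VirtualDiskState.OFFLINE


# Two-disk RAID 1: one member is lost, then rebuilt successfully.
print(virtual_disk_state(1, 1).value)  # degraded
print(virtual_disk_state(0, 1).value)  # optimal
```

In this model a successful rebuild simply returns the inaccessible count to zero, which is why the state moves from degraded back to optimal.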
Memory Errors
Memory errors can corrupt cached data, so the controllers are designed to detect and attempt to recover from these memory errors. Single-bit memory errors can be handled by the firmware and do not disrupt normal operation. A notification will be sent if the number of single-bit errors exceeds a threshold value.

Multi-bit errors are more serious, as they result in corrupted data and data loss. The following are the actions that occur in the case of multi-bit errors:
- If an access to data in cache memory causes a multi-bit error when the controller is started with dirty cache, the firmware will discard the cache contents. The firmware will generate a warning message to the system console to indicate that the cache was discarded and will generate an event.
- If a multi-bit error occurs at run-time either in code/data or in the cache, the firmware will stop.
- The firmware will log an event to the firmware internal event log and will log a message during POST indicating that a multi-bit error has occurred.
|   | NOTE: In case of a multi-bit error, contact Dell Technical Support. | 
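The memory error handling described above can be summarized as a decision sketch. This models the documented behavior only. It is not the firmware's implementation. The class name `MemoryErrorPolicy`, its method names, the returned strings, and the default threshold of 10 are all hypothetical, since the guide does not state the actual threshold value.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MemoryErrorPolicy:
    single_bit_threshold: int = 10  # hypothetical value, not taken from the guide
    single_bit_count: int = 0
    events: List[str] = field(default_factory=list)

    def on_single_bit_error(self) -> str:
        """Single-bit errors are corrected; a notification follows past the threshold."""
        self.single_bit_count += 1
        if self.single_bit_count > self.single_bit_threshold:
            self.events.append("single-bit error threshold exceeded")
            return "corrected, notification sent"
        return "corrected"

    def on_multi_bit_error(self, at_startup_with_dirty_cache: bool) -> str:
        """Multi-bit errors are always logged; the response depends on when they occur."""
        self.events.append("multi-bit error")
        if at_startup_with_dirty_cache:
            return "cache discarded, warning sent to console"
        return "firmware stopped"


policy = MemoryErrorPolicy(single_bit_threshold=2)
for _ in range(3):
    print(policy.on_single_bit_error())
print(policy.on_multi_bit_error(at_startup_with_dirty_cache=True))
```

The sketch keeps the single-bit and multi-bit paths separate because only the multi-bit path involves data loss and a call to Dell Technical Support.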
General Problems
Table 6-1 describes general problems you might encounter, along with suggested solutions.

| Problem | Suggested Solution |
|---|---|
| The device displays in Device Manager but has a yellow bang (exclamation point). | Reinstall the driver. See the driver installation procedures in the section Driver Installation. | 
| The device does not appear in Device Manager. | Turn off the system and reseat the controller. | 
| No Hard Drives Found message appears during a CD installation of Microsoft® Windows® 2000 Server, Windows Server® 2003, or Windows XP because of the following causes: | The corresponding solutions to the three causes of the message are: |
Physical Disk Related Issues
Table 6-2 describes physical disk-related problems you might encounter, along with suggested solutions.
Physical Disk Failures and Rebuilds
Table 6-3 describes issues related to physical disk failures and rebuilds.

| Issue | Suggested Solution |
|---|---|
| Rebuilding a physical disk after one of them is in an inaccessible state. | If you have configured hot spares, the PERC 5 controller automatically tries to use one to rebuild a physical disk that is in an inaccessible state. Manual rebuild is necessary if no hot spares with enough capacity to rebuild the inaccessible physical disks are available. You must insert a physical disk with enough storage into the subsystem before rebuilding the physical disk. You can use the BIOS Configuration Utility or Dell OpenManage™ Storage Management application to perform a manual rebuild of an individual physical disk. See the section Performing a Manual Rebuild of an Individual Physical Disk in RAID Configuration and Management for procedures to rebuild a single physical disk. |
| Rebuilding the physical disks after multiple disks become simultaneously inaccessible. | Multiple physical disk errors in a single array typically indicate a failure in cabling or connection and could involve the loss of data. It is possible to recover the virtual disk after multiple physical disks become simultaneously inaccessible. Perform the following steps to recover the virtual disk. Follow the safety precautions to prevent electrostatic discharge. Ensure that all the drives are present in the enclosure. If the virtual disk (VD) is redundant and transitioned into the DEGRADED state before going OFFLINE, a rebuild operation starts automatically after the configuration is imported. If the VD went directly into the OFFLINE state because of a cable pull or power loss, the VD is imported in its OPTIMAL state without a rebuild occurring. You can use the BIOS Configuration Utility or the Dell OpenManage Storage Management application to perform a manual rebuild of multiple physical disks. See the section Performing a Manual Rebuild of an Individual Physical Disk in RAID Configuration and Management for procedures to rebuild a single physical disk. |
| A virtual disk fails during rebuild while using a global hot spare. | The global hot spare goes back into HOTSPARE state and the virtual disk goes into FAIL state. | 
| A virtual disk fails during rebuild while using a dedicated hot spare. | The dedicated hot spare goes into READY state and the virtual disk goes into FAIL state. | 
| A physical disk becomes inaccessible during a reconstruction process on a redundant virtual disk that has a hot spare. | The rebuild operation for the inaccessible physical disk starts automatically after the reconstruction is completed. | 
| A physical disk is taking longer than expected to rebuild. | A physical disk takes longer to rebuild when under high stress. For example, there is one rebuild input/output (I/O) operation for every five host I/O operations. | 
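The last row of Table 6-3 gives a ratio of one rebuild I/O operation for every five host I/O operations. A rough arithmetic sketch of what that ratio implies is shown below. The function names, the idea of a fixed per-disk operation rate, and the sample figures are assumptions for illustration only, and real rebuild times depend on factors the guide does not quantify.

```python
from fractions import Fraction


def rebuild_share(rebuild_ops: int = 1, host_ops: int = 5) -> Fraction:
    """Fraction of all I/O operations that go to the rebuild under the given ratio."""
    if rebuild_ops <= 0 or host_ops < 0:
        raise ValueError("rebuild_ops must be positive and host_ops non-negative")
    return Fraction(rebuild_ops, rebuild_ops + host_ops)


def estimated_rebuild_seconds(total_rebuild_ops: int, disk_ops_per_second: int,
                              share: Fraction) -> float:
    """Estimate rebuild duration when only `share` of the disk's operations serve the rebuild."""
    return float(Fraction(total_rebuild_ops) / (disk_ops_per_second * share))


share = rebuild_share()  # 1 rebuild operation per 5 host operations
print(share)             # 1/6

# Hypothetical workload: 1,200,000 rebuild operations on a disk doing 200 operations/s.
print(estimated_rebuild_seconds(1_200_000, 200, share))        # 36000.0 under load
print(estimated_rebuild_seconds(1_200_000, 200, Fraction(1)))  # 6000.0 with no host I/O
```

Under these assumptions the rebuild takes six times longer at the 1:5 ratio than on an idle system, which matches the guide's point that a disk under heavy load rebuilds more slowly.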
SMART Error
Table 6-4 describes issues related to the Self-Monitoring Analysis and Reporting Technology (SMART). SMART monitors the internal performance of all motors, heads, and physical disk electronics and detects predictable physical disk failures.
|   | NOTE: For information about where to find reports of SMART errors that could indicate hardware failure, see the Dell OpenManage Storage Management documentation. |

| Problem | Suggested Solution | 
|---|---|
| A SMART error is detected on a physical disk in a redundant virtual disk. | Perform the following steps: See Performing a Manual Rebuild of an Individual Physical Disk for rebuild procedures. |
| A SMART error is detected on a physical disk in a non-redundant virtual disk. | Perform the following steps: See Deleting Virtual Disks for information on deleting a virtual disk. See Setting Up Virtual Disks for information on creating virtual disks. |
PERC 5 POST Error Messages
In PERC 5 controllers, the BIOS (read-only memory, ROM) provides INT 13h functionality (disk I/O) for the virtual disks connected to the controller, so that you can boot from or access the physical disks without the need of a driver. Table 6-5 describes the error messages and warnings that display for the BIOS.
Red Hat Enterprise Linux Operating System Errors
Table 6-6 describes an issue related to the Red Hat® Enterprise Linux operating system.

| Error Message | Suggested Solution |
|---|---|
|   | This error message displays when the Linux Small Computer System Interface (SCSI) mid layer asks for physical disk cache settings. Because the PERC 5 controller firmware manages the virtual disk cache settings on a per-controller and per-virtual-disk basis, the firmware does not respond to this command. Thus, the Linux SCSI mid layer assumes that the virtual disk's cache policy is write-through. SDB is the device node for a virtual disk. This value changes for each virtual disk. See the section Setting Up Virtual Disks for more information about write-through cache. Except for this message, there is no side effect to this behavior. The cache policy of the virtual disk and the I/O throughput are not affected by this message. The cache policy settings for the PERC 5 SAS RAID system remain the settings you have already chosen. |
| Driver does not auto-build into new kernel after customer updates. | This error is a generic problem for DKMS and applies to all DKMS-enabled driver packages. This issue occurs when you perform the following steps: The driver running in the new kernel is the native driver in the new kernel. The driver package you once installed in the new kernel does not take effect in the new kernel. Perform the following procedure to make the driver auto-build into the new kernel: dkms build -m  dkms install -m  DKMS The following details appear: |
| smartd[2338] Device: /dev/sda, Bad IEC (SMART) mode page, err=-5, skip device smartd[2338] Unable to register SCSI device /dev/sda at line 1 of file /etc/smartd.conf | These error messages are caused by an unsupported command coming directly from the user application. This is a known issue in which user applications try to direct Command Descriptor Blocks to RAID volumes. This error message has no effect on the user and there is no loss of functionality due to this error. The Mode Sense/Select command is supported by firmware on the PERC 5. However, the Linux kernel daemon is issuing the command to the virtual disk instead of to the driver IOCTL node. This action is not supported. |
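For the DKMS issue in Table 6-6, the command arguments are not shown in the table, so the following is only a sketch of how the rebuild commands might be assembled. It uses the standard `dkms` options `-m` (module), `-v` (version), and `-k` (kernel). The helper name `dkms_rebuild_commands`, the module name `example_driver`, the version `1.0`, and the use of `dkms status` as the final check are assumptions, not values taken from the guide. The code only prints the command lines and does not run them.

```python
import shlex
from typing import List


def dkms_rebuild_commands(module: str, version: str, kernel: str) -> List[List[str]]:
    """Return the dkms command lines to build and install a module for one kernel."""
    target = ["-m", module, "-v", version, "-k", kernel]
    return [
        ["dkms", "build"] + target,
        ["dkms", "install"] + target,
        ["dkms", "status"],  # assumed verification step; lists module build state
    ]


# Placeholder values; substitute the driver package and kernel actually in use.
for command in dkms_rebuild_commands("example_driver", "1.0", "2.6.9-42.EL"):
    print(shlex.join(command))
```

Building the commands as argument lists rather than a single string avoids quoting problems if they are later passed to `subprocess.run`.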
LED Behavior Patterns
The external SAS ports on the PERC 5/E Adapter have a port status LED per x4 SAS port. This bi-color LED displays the status of any external SAS port. The LED indicates whether all links are functional or only partial links are functional. Table 6-7 describes the patterns for the port status.
Audible Alarm Warnings
An audible alarm is available on the PERC 5/E Adapter to alert you to critical and warning events involving virtual disk or physical disk problems. You can use the Basic Input/Output System (BIOS) Configuration Utility to enable, disable, or silence the on-board alarm tone.
|   | NOTE: Silencing the alarm stops only the current alarm; future alarms will still sound. To permanently disable the alarm, select the disable alarm option. |
|   | NOTE: If the PERC 5/E alarm was already beeping due to a previous failure and a new virtual disk is created on the same controller, then the previous alarm will be silenced. This is expected behavior. | 
 
   
 