The server bvmhost-x86-01.stg.iad2.fedoraproject.org has a failed drive that needs to be replaced. The process involves contacting Dell support, providing a hardware report, and coordinating with IT for drive replacement.
Contact Dell Support:
- Call Dell support to report the failed drive.
- Provide the necessary hardware report exported from the server.

Generate the Hardware Report:
- Access the server’s management interface at bvmhost-x86-01-stg.mgmt.iad2.fedoraproject.org (IP: 10.3.160.191).
- Log in using the <CHECK_EMAIL> password.
- Retrieve the serial number and confirm the drive failure.
- Generate and send the hardware report to Dell.

Notify IT:
- Inform the IT team that a new drive will be shipped.
- Ensure IT is prepared to accept the delivery and replace the drive.
- Notify pcole@redhat.com about the incoming drive for the IAD2 datacenter.
Metadata Update from @phsmoura:
- Issue priority set to: Waiting on Assignee (was: Needs Review)
- Issue tagged with: medium-gain, medium-trouble, ops
Update: This is in progress. We reached Dell and opened a support ticket; they confirmed that the drive had indeed failed and sent us a form for the replacement. Shipping is coordinated with the internal Red Hat team; the form has been filled in and sent back to Dell.
An engineer is scheduled to visit the datacenter today, Wed 29th May, between 8am and 6pm.
Will update once this work is completed.
Metadata Update from @dkirwan:
- Issue assigned to jnsamyak
HD has been replaced successfully.
Metadata Update from @dkirwan:
- Issue close_status updated to: Fixed
- Issue status updated to: Closed (was: Open)
The final step is re-adding the drive to the RAID arrays:

First, on the host, look at dmesg to find the new drive:
    [Wed May 29 17:57:42 2024] sd 0:2:4:0: [sdj] 1170997248 512-byte logical blocks: (600 GB/558 GiB)
It's sdj. Next, look at another drive, say sdi:
    fdisk -l /dev/sdi
Now copy that exact partition setup to sdj (there's a parted way to just copy it, but I never remember it). Then re-add it to all the RAID arrays:
    mdadm /dev/md0 --add /dev/sdj1
(and likewise md1 and md2 with sdj2 and sdj3), and it's rebuilding away:
    [>....................] recovery = 0.4% (2812416/583852032) finish=72.3min speed=133924K/sec
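The partition copy and re-add steps above can be sketched as a small script. This is only a sketch under assumptions: it uses sfdisk's dump and restore to copy the partition table (not the parted approach mentioned above), it takes sdi as the healthy disk and sdj as the new one, and it pairs md0/md1/md2 with partitions 1/2/3 as the comment implies. The function names plan_readd and eta_minutes are made up for illustration. The script only prints the commands, so they can be reviewed before anything touches a disk.

```shell
#!/bin/sh
# Sketch: print (not run) the commands to copy a partition layout and
# re-add the new disk's partitions to the md arrays.
# Assumptions: healthy disk /dev/sdi, new disk /dev/sdj, and md0/md1/md2
# pair with partitions 1/2/3. Check lsblk and /proc/mdstat on the real
# host before running any of the printed commands.

plan_readd() {
    src=$1
    new=$2
    # sfdisk -d dumps the partition table in a format sfdisk can read back.
    echo "sfdisk -d $src | sfdisk $new"
    part=1
    for md in md0 md1 md2; do
        echo "mdadm /dev/$md --add $new$part"
        part=$((part + 1))
    done
    # Rebuild progress is reported in /proc/mdstat.
    echo "cat /proc/mdstat"
}

# Estimate the remaining rebuild time in minutes from the numbers in the
# recovery line: blocks done, total blocks (1K units) and speed in K/sec.
eta_minutes() {
    LC_ALL=C awk -v completed="$1" -v total="$2" -v speed="$3" \
        'BEGIN { printf "%.1f\n", (total - completed) / speed / 60 }'
}

plan_readd /dev/sdi /dev/sdj
eta_minutes 2812416 583852032 133924   # prints 72.3, matching finish=72.3min
```

Once the printed commands have been checked against the actual disk and array layout, they would be run as root on the host. The ETA helper just reproduces the arithmetic behind the finish= field, which is useful for a sanity check when the rebuild speed changes.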