[volunteergrid2-l] Gratch is down -- hard, and probably for two weeks

Steve Dodson steve.dodson at gmail.com
Sun Mar 13 22:04:23 PDT 2011


I'm glad this is your machine and not mine ;-P

No worries here - I haven't really started to use the grid yet; in its
current form (primarily the number of nodes), it still seems somewhat
experimental to me...

On Sun, Mar 13, 2011 at 10:47 PM, Shawn Willden <shawn at willden.org> wrote:

> So, I feel really bad about this, especially since I was driving the
> requirement to maintain good uptime, but I don't see a way around it.
>
> Yesterday I updated my file server from Debian Lenny to Squeeze.  I
> actually don't know if that had anything to do with what happened, or if it
> was coincidental, but this evening the upgrade was completed, so I decided to
> reboot.  I knew this was a little risky because I'm flying to Colorado in
> the morning, but I figured I had several hours to deal with any breakage,
> and I have over a decade of Debian upgrades under my belt, so I was
> confident I could handle it.
>
> On boot, my BIOS didn't recognize any of my seven SATA drives, just the
> lonely IDE drive, which wasn't configured as a boot drive.  I reset the BIOS
> config to "failsafe defaults" and restarted.  It saw the SATA drives.  When
> I tried to boot, it found GRUB (with the newly-installed GRUB2 stuff) but got
> an error trying to read the partition table of the drive with the root file
> system.  I actually had the machine configured to be able to boot off of any
> one of four drives (all of which have a partition which is part of a
> mirrored array containing the root file system), so by editing the boot
> parameters I was able to find a drive it could boot from.
>
> When it came up, though, all of the RAID5 and RAID6 arrays failed to start
> because they were unable to find enough drives.
>
> It was able to get to a command line, though, and when I started looking it
> quickly became apparent what the source of the problem was:  FOUR of my
> seven SATA drives claim to have no partition table.  Apparently the
> partition tables have gotten corrupted somehow.  Hopefully that's all that
> got lost.  My most crucial data is backed up onto multiple laptop drives,
> but there's a LOT of less critical stuff which is nowhere else.
>
> Anyway, I decided to shut the machine off and walk away while I think about
> what I need to do, and research the tools that are available to recover lost
> partition tables.  My disks all have identical partition tables, so I think
> I can just rebuild the tables based on the known config which I can get from
> the others.  But I want to proceed very cautiously, and think everything
> through THOROUGHLY before I start making any changes.
>
> Which leads back to:  I'm flying to Colorado tomorrow, and won't be back
> home for two full weeks.  Eventually I plan to take this machine to
> Colorado, but I can't do it now.  So I see no way to avoid a two-week
> downtime.
>
> --
> Shawn
>
> _______________________________________________
> volunteergrid2-l mailing list
> volunteergrid2-l at tahoe-lafs.org
> http://tahoe-lafs.org/cgi-bin/mailman/listinfo/volunteergrid2-l
> http://bigpig.org/twiki/bin/view/Main/WebHome
>



-- 
soli Deo gloria,

Steve Dodson