It's funny.  Laugh. Data Storage IT

10 Computer Mishaps 898

Ant writes "ZDNet UK posted Ontrack Data Recovery's 2004 list of the 10 strangest and funniest computer mishaps... Some of them are funny!" My best mishap was installing the alpha video driver on an NT 3.51 box thinking that it was just an alpha driver. Of course since this Alpha meant DEC and this was an x86 box, the server barfed pretty hard. Also the time I spilled an 8oz glass of water on my laptop and lost all my email from 1994 to 1999 and my backup was corrupted. That I liked too.
  • Beer + Keyboard (Score:4, Interesting)

    by markmcb ( 855750 ) on Tuesday August 23, 2005 @11:36AM (#13379890) Homepage
    I don't know about that article being funny, but I knew a guy in college who woke up to a random dude pissing in his keyboard. I'm not sure if the keyboard was ruined, but I do know that it was trashed (much like the random dude). Cops were involved and the guy ended up having to buy a whole new system for my friend. So if you're in college and you're not locking your dorm room door, you might want to put a towel or something over your keyboard at night.
  • by skroz ( 7870 ) on Tuesday August 23, 2005 @11:39AM (#13379924) Homepage
    The worst/oddest I've seen went something like this:

    1. Someone ran rsync with -r at the end, intending to do something recursive. This option was treated as an argument, causing a file called -r to be created. This was done in / on an HP-UX workstation.

    2. Two years later, someone wrote a script to be run from cron that would run as root then change to a directory containing data files, erase them, and create new ones. This directory of data files was NFS mounted on the workstation in 1 above. Many, many other filesystems were also mounted on this workstation, all rw, all as root.

    3. Some time after that, someone rebooted the workstation. Not all of the NFS mounts came up, so when the script in 2 ran as root and did not check that the destination directory existed, it was not able to cd and ran in /.

    4. The script executed "rm -f *", intending to delete the data files. Unfortunately, the file called -r was still in / and was included in the argument list. rm of course interpreted this as an option, so the command became "rm -f -r (everything else in /)".

    5. 3 and 4 happened on a Saturday night when no one was around, so no one noticed all of the data disappearing until Monday, when it was all gone.

    6. Several people had a very, very long day. Actually, several long days. A few weeks, actually.

    Can you count the number of gross and avoidable administration mistakes, boys and girls?
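For anyone who wants the cleanup job from steps 2 to 4 without the foot-guns, here is a minimal Python sketch. It is not the original script. The marker file name `.data-dir` and the function name `clean_data_dir` are made up for illustration. The assumption is that the marker lives on the NFS export, so it is simply absent when the mount did not come up.

```python
import os


def clean_data_dir(data_dir: str, marker: str = ".data-dir") -> int:
    """Delete regular files in data_dir and return how many were removed."""
    # Refuse to run unless the marker file is present. On a bare mount
    # point (NFS share not mounted) the marker is missing, so we stop here
    # instead of deleting whatever directory we happen to be in.
    if not os.path.isfile(os.path.join(data_dir, marker)):
        raise RuntimeError(f"{data_dir!r} is missing or is not the data directory")

    removed = 0
    for name in os.listdir(data_dir):
        path = os.path.join(data_dir, name)
        # os.unlink takes a path, not a command line, so a file named "-r"
        # is just another file. Directories and symlinks are left alone.
        if name != marker and os.path.isfile(path) and not os.path.islink(path):
            os.unlink(path)
            removed += 1
    return removed
```

The two ideas are independent: check that you are where you think you are, and never let file names pass through an option parser. In a shell script the rough equivalents would be `cd "$dir" || exit 1` and `rm -f -- ./*`.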
  • Re:#1 Works! (Score:4, Interesting)

    by RealityMogul ( 663835 ) on Tuesday August 23, 2005 @11:40AM (#13379930)
    I tried freezing a drive that wasn't working once. Didn't help any.
     
    What did help was taking the cover off and physically holding the arm in place so the head couldn't jump back and forth. Drive worked well enough to get data off after that.
     
    It should be noted that this solution was simply a result of getting really pissed off at the drive because nothing else would work.
  • by yagu ( 721525 ) * <{yayagu} {at} {gmail.com}> on Tuesday August 23, 2005 @11:43AM (#13379956) Journal

    I have to agree with first posters... these aren't very good stories. But, thinking maybe it's phishing for better stories, I'll byte:

    I once created an extremely complex script, crafted lovingly to do something at the time I'm sure I thought important. As always I incrementally built and tested, assuring myself of one more self-anointed masterpiece. Finally, finished, as an afterthought...

    I inserted a variable to point to a directory node below which I would clean up all of my work (even though I knew I had no need for the variable and would never tweak it). It was such a simple addition. No need to test.

    Fired up the script, it ran a couple of seconds, I was prepared to enjoy the fruits of my labor. Hmmmm, I don't remember ANY of the test runs running so long. Why is the hard drive light flickering so much? And why still? And why so long?...

    Yeah, the

    rm -fr c:/$CleanupDir (I was using MKS Toolkit in a windows environment)

    command worked perfectly. Except I defined the variable initially as: cleanupdir=dirname

    So, everything was lost except for the frigging "masterpiece".

    Undaunted, (I'm no idiot, golllll!), I calmly inserted the QIC backup tape with my prerun backup.

    No, wait!, I'll not be caught with that error again! I quickly edited the only remaining file in my tree of files, the offending script and smugly fixed the rogue spelling. I hadn't been working in this industry this long without knowing how to take safeguards!

    Now, twenty minutes later, my script fixed... my files restored... let's try this again. Yeah... something about the chronology of fixing the script, then restoring the broken version over it from the backup tape. At least I proved the error was repeatable. So, I am an idiot after all!

    disclaimer: this happened over ten years ago, so I'm a bit short on exact detail of the snafu, but it really did happen. And, even though I repeated my idiocy, the fact I had the backup tape at all with only the one error to fix in the script saved my butt... so not all was lost in the lunacy.
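The failure above comes down to a misspelled variable expanding to nothing, so the path collapsed to the drive root. A small Python sketch of a guard against that follows. The function name `resolve_cleanup_dir` and the idea of passing the environment as a dict are assumptions for illustration, not anything from the MKS script.

```python
import os


def resolve_cleanup_dir(root: str, env: dict, var: str = "CleanupDir") -> str:
    """Return the directory to clean, or raise if it would be root itself."""
    name = env.get(var, "").strip()
    if not name:
        # A misspelled or unset variable must stop the script,
        # not silently expand to an empty string.
        raise ValueError(f"{var} is unset or empty")

    root_norm = os.path.normpath(root)
    target = os.path.normpath(os.path.join(root_norm, name))
    # Reject ".", "..", and absolute paths that equal or escape the root.
    if target == root_norm or os.path.commonpath([root_norm, target]) != root_norm:
        raise ValueError(f"refusing to clean {target!r}")
    return target
```

Variable names are case-sensitive here just as they are in the shell, so `{"cleanupdir": "dirname"}` raises instead of quietly returning the root. In a POSIX-style shell, `${CleanupDir:?}` or `set -u` gives a similar stop-on-unset behaviour.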

  • Re:Beer - cleanup (Score:3, Interesting)

    by saskboy ( 600063 ) on Tuesday August 23, 2005 @11:49AM (#13380023) Homepage Journal
    You could try pouring distilled water into the keyboard, while it's unplugged naturally, and let it sit for a while then drain it. It should remove the stickiness, and not leave any residue or rust the connections if you're fortunate.
  • by xtracto ( 837672 ) on Tuesday August 23, 2005 @11:51AM (#13380040) Journal
    Hehe, that reminds me of the time I wanted to upgrade the hard disk of my father's computer. I was like 13 or something.

    I did everything "almost" OK: unplug IDE cable, unplug DC cable, take out old HD and install new HD... everything smooth.

    After that I decided to install the old HD as a slave disk; again, just install HD, plug in IDE and DC cables...

    Then, turn on the computer and whoops, old CD not working... so I tried some jumper configurations and *here I go* different ways of connecting the IDE cable (in those days IDE cables didn't have the small plastic key that prevents you from connecting them the wrong way... and the BIOS didn't have any protection to keep you from frying them :( )...

    Of course, after some time of trying to use the computer with the drive (turned on, tested in the BIOS, configured the HD heads, cylinders, etc. params) the only thing that happened is that my father's HD got fried...

    Now, the only detail I missed so far is that that disk contained nothing less than my father's PhD thesis =oS.

    Yep, you can guess how I felt after I took the disk to a friend (he was like 30 or something and was the expert in computers then) and he told me that my disk was totally RIP...

    Fortunately, for me, my father had backup of his thesis in floppy disks ...

    Oh! or other time when I erased all my information when making my first FreeBSD installation! that was back in 1994 ... cute.
  • by tero ( 39203 ) on Tuesday August 23, 2005 @11:57AM (#13380100)
    Well, it can happen to the best of us.. This is from Lasu's Linux Anecdotes [liw.iki.fi]
    At one point, Linus had implemented device files in /dev, and wanted to dial up the university computer and debug his terminal emulation code again. So he starts his terminal emulator program and tells it to use /dev/hda. That should have been /dev/ttyS1. Oops. Now his master boot record started with "ATDT" and the university modem pool phone number. I think he implemented permission checking the following day.
  • Re:#1 Works! (Score:3, Interesting)

    by TheLoneIguana ( 126589 ) on Tuesday August 23, 2005 @12:06PM (#13380186)
    Indeed- I did this about a week ago. I was able to recover data from a seriously twitchy drive by sticking it in the freezer in the employee break room. It took a couple of sessions to get everything off the 60GB drive, because once it warmed up it just vanished from the system...
    It had a broken interface pin as well, so it was quite an adventure making the sucker work long enough to recover the user's documents.
  • Re:Dull dull dull (Score:3, Interesting)

    by Marc2k ( 221814 ) on Tuesday August 23, 2005 @12:08PM (#13380202) Homepage Journal
    Uh, agreed. Even the title's inane: "...list of the 10 strangest and funniest computer mishaps... Some of them are funny!"

    Let me guess..the rest are strange?
  • No E? No problem (Score:3, Interesting)

    by joebutton ( 788717 ) on Tuesday August 23, 2005 @12:11PM (#13380233)
  • Re:Dull dull dull (Score:3, Interesting)

    by wiredlogic ( 135348 ) on Tuesday August 23, 2005 @12:13PM (#13380254)
    This article is really lame, uninformative and about as funny as colon cancer.

    That's because this "article" is really an advertisement in disguise.
  • Re:My ones (Score:3, Interesting)

    by Cyberax ( 705495 ) on Tuesday August 23, 2005 @12:15PM (#13380273)
    "ifdown eth0" while working through SSH is my best one.
  • Re:My ones (Score:1, Interesting)

    by theufo ( 575732 ) on Tuesday August 23, 2005 @12:20PM (#13380310) Homepage
    On a server I needed to remotely manually replace libc with an older version file from another machine. Of course you have to remember to do everything in a single command, otherwise if you delete the old version you cannot run anything else. (I am sure there must be a simpler solution to that than take the disk out and do it on another machine)


    Leave Midnight Commander running on another terminal. Since it doesn't rely on outside commands, you can still use it to recover the backup, just in case.

    Some of my most embarrassing whoopsONOOOO-moments:

    1) Doing ls some-dir in /, then doing rm -rf * while thinking that some-dir was my cwd.

    2) When recovering the backup after that, it appeared that I'd broken the backup script three months ago when I patched it to skip /tmp.

    3) Leaving the computer on while "repairing" the CPU fan and smashing the motherboard up beyond any recognition when the screwdriver slips.

    4) Defragging a disk under Windows 98 which has tons of large data files from Linux apps on it. Since the standard win98 defrag has a rather low partition size limit, I use the Microsoft BackOffice defragger. For some reason it corrupts the entire drive irreparably.

    5) Drinking coffee while installing RAM into a box lying down on the floor below me... Not much imagination needed for this one.
  • Re:#1 Works! (Score:5, Interesting)

    by Fishstick ( 150821 ) on Tuesday August 23, 2005 @12:24PM (#13380374) Journal
    Hate to do a "me too" post, but I did get desperate enough to try this once and it did work.

    Had an old NT box that I had used long ago as a domain controller at home (don't ask). Sucker had been running a long time not doing much other than acting as a logon & print server when the power went out. When the power came back and I went to start everything back up, the BIOS saw the drive, but it never spun up and I was left with the 'operating system not found' message.

    The drive was pretty old (Seagate 3.5 gig, I think) and there wasn't any really valuable data on it (or I would obviously have backed it up), but I wanted to at least boot the box one more time to see if there was anything I wanted to recover. I put the drive in a ziploc and stuck it in the freezer for like 20 minutes. Took it out, hooked it back up (leaving it in the bag to try to prevent as much condensation as possible), and it spun right up.

    Turned out there wasn't anything of any real interest on the drive, and it refused to ever spin up again, but I can vouch for the fact that this does indeed seem plausible.
  • by Black Perl ( 12686 ) on Tuesday August 23, 2005 @12:27PM (#13380417)
    Freezing will not help with a head crash or key sectors going bad. But there have been cases where it works. Back in the early 90's there was a problem with many Quantum-brand drives called "stiction", where the platter would not spin up after having been powered down. An internal lubricant (or adhesive, I forget which) basically got slightly runny when the drive got hot and re-solidified a bit out of place when cooled. This provided just enough friction to prevent the low-torque motor from being able to spin the drive up. Sometimes just rotating the drive quickly by snapping your wrist back and forth would do it. Freezing is another technique that worked (sometimes a combination of the two).
  • by Odonian ( 730378 ) on Tuesday August 23, 2005 @12:31PM (#13380462)
    Well if you're looking for the 10 best, you have to include the famous Digital Equipment Corp tale about the monkey that got fried when field service calibrated the PDP-12 it was hooked up to in some bio lab. Thus leading to the phrase 'Always Mount a Scratch Monkey'..

    I remember this one going around DEC 20 yrs ago in the NOTES files.

    This borders on urban legend it's so old / well distributed, but should probably be included. google for it or check out: scratch-mokey.html [www.xinu.nl]

  • Re:my mishap (Score:3, Interesting)

    by qwijibo ( 101731 ) on Tuesday August 23, 2005 @12:32PM (#13380479)
    I had to use MySQL for a work project where I did exactly that style of oopsie on my project, the company's now primary DNS database. Fortunately, I was dumping the database into RCS every 15 minutes, so I promptly restored the database and tried to remember what I had been doing right before the oops.

    I use and recommend PostgreSQL, but that particular company was big on using MySQL for everything, including financial transactions.
  • by Enigma_Man ( 756516 ) on Tuesday August 23, 2005 @12:42PM (#13380596) Homepage
    In high school, we had Macs primarily. The one server we had had an external drive (old hardware). If the drive was powered down for more than a couple of minutes, when you powered it back up, it would appear that the drive wasn't connected, not recognized, not even there as far as the computer was concerned. If you let it sit long enough, then rebooted, it'd come up just fine. We eventually determined that there were some cracked solder traces on the board that would expand just enough when warm to make the connection well enough.

    -Jesse
  • My fun experiences (Score:2, Interesting)

    by mart459 ( 240857 ) on Tuesday August 23, 2005 @12:44PM (#13380633)
    During desert storm one, had an emergency order for a replacement machined part that HAD to get out. I get a call that the PC running the machine tool for part of the process went down. A quick look had it as the CRT was bad. Union shop - had to have a union guy do it or face consequences. Waited 45 minutes for him to show up and then tell me that he was taking the computer. My reply was that it just needed a new display. Argued with the idiot for five minutes even swapping displays so that he could see that the PC was working (over his objections). Still he wanted to take the PC. I left the working display on it, went to the tool crib and checked out a hammer, put the hammer through the old display and told the jerk that the display needed replacing. He took the display without comment. Probably because I had the hammer and a really angry face at that point. (hey - minutes counted here...)

    Initial install there was a blast -
    installs had to be done from government cut tapes. After four tries with us sitting there with thumbs up our behinds for two weeks we were told that I could use MY personal backup tapes since the government procedures could not provide us with valid source. Loaded them without incident. However, the exhaust fan on the roof jammed, and the firefighters dousing it with water flooded our computer room. Wait for replacement parts. Kitchen underneath had a grease fire - computer room flooded again. Wait for replacement parts. Harrier crashes and parts hit the roof. Computer room flooded again, wait for replacement parts. AC in computer room dies. So old that we have to wait for parts to be fabricated. Finally, after two months of sitting on our behinds due to all of these disasters, get everything working, have all of the acceptance folks fly in for testing to show up after the kick off meeting to find out that between our 8AM start and the 9:30 "let's start the testing scripts" the WHOLE BUILDING WAS CONDEMNED and we were not allowed in. My whole team was laughing so hard we were crying.
  • by GodLived ( 517520 ) on Tuesday August 23, 2005 @12:48PM (#13380679) Journal
    Question - how did you figure all this out? I expect it must have taken at least a couple people to piece together this history.

    I'm thinking that the '-r' rsync file must have not gotten erased, because they were able to tell it was an rsync file, and saw it sitting there and knew rm -f * would have picked it up as "-r" as an option.

    Still, wouldn't rm -rf * also have deleted the '/-r' file? Maybe someone caught it in time to keep that around, or had a backup of the root directory (but I'm thinking not, not if the data loss caused weeks of headaches.)
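One plausible answer to the question above is that the file named -r was never a deletion target at all: it was consumed as an option, so it was not among the names rm was asked to remove. The sketch below uses Python's `getopt` module as a stand-in for rm's option parsing. The directory listing is illustrative, and real rm implementations may differ in detail.

```python
import getopt


def parse_rm_args(argv):
    """Split an rm-style argument list into option flags and file operands."""
    options, operands = getopt.getopt(argv, "fr")
    return [flag for flag, _ in options], operands


# What the shell hands over after expanding "rm -f *" in a directory that
# contains a file literally named "-r" (assuming it sorts ahead of the rest).
flags, files = parse_rm_args(["-f", "-r", "bin", "etc", "home"])
print(flags)  # ['-f', '-r']
print(files)  # ['bin', 'etc', 'home']

# With "--", option parsing stops and "-r" is treated as a file name.
flags, files = parse_rm_args(["-f", "--", "-r", "bin"])
print(flags)  # ['-f']
print(files)  # ['-r', 'bin']
```

In the first call the name -r does not appear in the list of files, which would explain why it was still sitting in / afterwards for someone to find.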
  • Re:My ones (Score:4, Interesting)

    by _Sharp'r_ ( 649297 ) <sharper@@@booksunderreview...com> on Tuesday August 23, 2005 @01:10PM (#13380905) Homepage Journal
    Circa 1996, I started a local ISP with a buddy of mine. I was the "technical" guy and he was the "sales" guy. I kept my day job until we had enough revenue to pay for us both.

    Our main system was a BSDi box that handled user authentication and POP3 email. Since he had to deal with signing up users in the office while I worked my day job, I showed my buddy how to add and edit users on the system.

    So one day he calls me and tells me that users have started complaining that they can't login. I start looking around and finally figure out the problem after some questioning.

    That day he was bored, so he decided to "clean up" the passwd file. There were some deleted users removed from the file, so the uid's were no longer in sequence. He merrily went through and renumbered them all so that they'd be in sequence in the file.

    The good news is that the users' mail directories were named after their username, so I could quickly use that as a reference to recreate which UID went with which username originally.
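A sketch of how that kind of recovery could be scripted today, assuming (this is a guess, the post does not say) that each mail directory was still owned by the user's original UID. The function name and the directory layout are illustrative only.

```python
import os


def uid_by_username(mail_root: str) -> dict:
    """Map each directory name under mail_root to the UID that owns it."""
    mapping = {}
    for entry in os.scandir(mail_root):
        # Only real directories count; symlinks are not followed.
        if entry.is_dir(follow_symlinks=False):
            mapping[entry.name] = entry.stat(follow_symlinks=False).st_uid
    return mapping
```

The resulting dict of username to UID could then be compared against the renumbered passwd file to see which entries need to be put back.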

    In the summer of 1994, I was trying to fix a broken Compaq while working in an authorized service center. Generally, Compaq would credit us with 1.5 hours labor for a bad motherboard and usually it only took me 30 minutes to replace one. In this case, it was taking forever.

    I replaced the motherboard, but it wouldn't power up. After a little fiddling to double-check everything, I decided the new motherboard might be DOA and replaced it with another new one.

    Same result. Now I wondered if something else was wrong, like the power supply, since I couldn't even get any POST codes out of it. Still, the fans spun and such, so it was getting at least some power.

    So I hooked the power supply up to another machine. Worked fine, so I put it back. Still dead. At this point, nothing but the power supply, motherboard, cpu, ram and video card were connected, so I tried it without the video card. From previous tests when it first came in, I knew the cpu and ram were ok. Still nothing.

    Finally, I grabbed another new motherboard and plugged it into the power supply without even bothering to put it into the case. Started up just fine with me standing there holding it in the air.

    So relieved, I shut it down and put the new motherboard in the computer, asking myself what the odds were of having two DOA motherboards in a row.

    Apparently pretty slim, since once again I turned the computer on and got nothing. Pulled the Motherboard back out and held it and it worked fine again. Put it back in, got nothing.

    At this point, I obviously decided it was something with the case and went looking. Sure enough, there was an extra small metal clip that was supposed to help attach the motherboard to the case that had come loose and then wedged itself into a corner. It was in just the right position to make contact with a couple of the solder points on the motherboard, shorting them and causing the motherboard to shut itself down immediately without even POSTing.

    Once removed, the whole thing worked fine. Later, I tried the original motherboard and it also worked fine, so somehow that clip worked its way out while it was running.
  • Re:My ones (Score:3, Interesting)

    by bonehead ( 6382 ) on Tuesday August 23, 2005 @01:34PM (#13381141)
    I've done that one myself, to the file server at the office, even.

    It was about 3am, I had a 12 pack of beer in me, and I was trying to get a new wifi card working in my home PC. At one point I hit alt-F2 to switch to a different terminal (completely forgetting that I was still logged into the file server on that terminal) and typed "ifdown eth0", thinking I was killing the local eth0.

    I was, of course, too drunk to drive to the office and fix it right away and had to wait until the next day, but luckily it was a Saturday night so the only e-mail we missed was a bunch of spam.
  • by stonewolf ( 234392 ) on Tuesday August 23, 2005 @01:40PM (#13381203) Homepage
    I originally heard this story from Art during a lull in a seminar on programming implementation when he was a visiting professor at the UofU. It is the best story I ever heard for proving that no good deed goes unpunished. It is also the funniest computer story I have ever heard.

    Stonewolf

    Read on....

    Subject: Always Mount a Scratch Monkey

    Date: Wednesday, 3 September 1986 16:46-EDT
    From: "Art Evans"
    To: Risks@CSL.SRI.COM

    In another forum that I follow, one correspondent always adds the comment

            Always Mount a Scratch Monkey

    after his signature. In response to a request for explanation, he replied somewhat as follows. Since I'm reproducing without permission, I have disguised a few things.

    My friend Bud used to be the intercept man at a computer vendor for calls when an irate customer called. Seems one day Bud was sitting at his desk when the phone rang.

            Bud: Hello. Voice: YOU KILLED MABEL!!
            B: Excuse me? V: YOU KILLED MABEL!!

    This went on for a couple of minutes and Bud was getting nowhere, so he decided to alter his approach to the customer.

            B: HOW DID I KILL MABEL? V: YOU PM'ED MY MACHINE!!

    Well to avoid making a long story even longer, I will abbreviate what had happened. The customer was a Biologist at the University of Blah-de-blah, and he had one of our computers that controlled gas mixtures that Mabel (the monkey) breathed. Now Mabel was not your ordinary monkey. The University had spent years teaching Mabel to swim, and they were studying the effects that different gas mixtures had on her physiology. It turns out that the repair folks had just gotten a new Calibrated Power Supply (used to calibrate analog equipment), and at their first opportunity decided to calibrate the D/A converters in that computer. This changed some of the gas mixtures and poor Mabel was asphyxiated. Well Bud then called the branch manager for the repair folks:

            Manager: Hello
            B: This is Bud, I heard you did a PM at the University of
                            Blah-de-blah.
            M: Yes, we really performed a complete PM. What can I do
                    for You?
            B: Can You Swim?

    The moral is, of course, that you should always mount a scratch monkey.

    There are several morals here related to risks in use of computers. Examples include, "If it ain't broken, don't fix it." However, the cautious philosophical approach implied by "always mount a scratch monkey" says a lot that we should keep in mind.

    Art Evans
    Tartan Labs
  • by aliensporebomb ( 1433 ) on Tuesday August 23, 2005 @01:43PM (#13381235) Homepage
    Back in the early 1990s I worked for a company that sold computer systems, peripherals and printers.

    I was working technical support at the time and received a call from someone up near the Arctic Circle. They were a print shop or some such, and had a critical job they needed to print but had run out of toner.

    They had no spare toner.

    The closest spare toner they could get was several hundred miles away and accessible only by helicopter.

    We set up an arrangement so that they would get several toner cartridges, though they would miss the deadline.

    A little while later, the woman I spoke to called me back and said there were giant black streaks on anything they wanted to print.

    Apparently, in utter desperation to print, they took an electric drill, took a toner cartridge for their copy machine and used a drinking straw to place the liquid toner for the copier into the empty container for the printer, which used a dry toner system.

    What resulted is what our production people called "toner bombing" a printer.

    You can sandblast it all you like but it's not going to ever print like it did before, and it's all but destined for the landfill at that point.

    They RUINED a high-end, $10,000+ printer for volume production.

    Thus endeth the lesson.
  • Re:Isopropyl alcohol (Score:3, Interesting)

    by Internet_Communist ( 592634 ) on Tuesday August 23, 2005 @01:44PM (#13381249) Homepage
    I use isopropyl a lot myself as well. I usually buy the 99% stuff so then I can make my own mixtures easier (just add water!) but the CD skipping is one I use it for a lot as well. However, for deep scratches, nothing works better than some metal polish. The worst is scratches from the label side, since it's not as thick as the bottom and typically means you're screwed.

    I had a cellphone go through the washing machine before. And it worked OK afterwards (though there was some blotches on the screen) I wouldn't try that again...
  • by parryFromIndia ( 687708 ) on Tuesday August 23, 2005 @01:45PM (#13381257)
    In our computer lab we had a very unfriendly admin. We used to hate him like anything. Our revenge was to screw up the Win 95 boxes assigned to us. They were the so-called protected ones - hooked on to the Novell netware server, had no floppy / cd drives and no internet access. But since they were running Win95 we were easily able to achieve our goal of hosing the OS. We had to put in effort only until we figured out the following, after which OS destruction was automagic! -

    As I said the boxes didn't use to have floppy/cd drives and so re-installation of OS was very problematic. The net admin figured out a 'slick' way to deal with the situation -

    a. Boot 95 to command prompt
    b. Enable network and CD to the novell netware share which hosted the Win 95 CDROM
    c. His theory being that reformatting destroys the hard drive - fdisk it. (Actual reason being that he won't be able to access the 95 CDROM share after the format) Remove all partitions and recreate them as-is.
    d. Run win95 setup from the netware share.
    e. Profit!

    The machines generally remained on unless a software install required a reboot, in which case the partition table was re-read on boot and things were screwed up, once again.

    The admin used to feel better by cursing at the software installation program which hosed the OS!!

    He never found out why this was happening and we never bothered to tell him!!
  • by commonchaos ( 309500 ) on Tuesday August 23, 2005 @01:46PM (#13381268) Homepage Journal
    Monday, 19-Oct-1987 [wikipedia.org]
  • Some of mine (Score:3, Interesting)

    by Jhan ( 542783 ) on Tuesday August 23, 2005 @02:15PM (#13381515) Homepage
    1. Dynamic SQL is fun:
      sprintf(sql, "update veryImportantTable set seriousMoney=y-%d where foo", adjustment);
      Of course, it turned out that "adjustment" was sometimes negative... And "--" starts a SQL comment...
    2. Hm, the contents of $IMPORTANT_DIR on server X seem to be older than that of $IMPORTANT_DIR on master server... OK, NFS mount master dir and
      cp -rp $MASTER $COPY
      Unfortunately, the files weren't really out of date since the $IMPORTANT_DIR was NFS-mounted from master to "copy" in the first place. In other words, a cp -rp between two NFS mounts from the same source, leaving me with a ton of files of zero bytes.
    3. After three solid days of installing $NEW_MEGA_SERVER:
      1. First, unpacking tar of "/nfs" in root dir, when it was created inside /nfs. A million files/dirs now pollute /.
      2. Create a "trash" dir, cd / and attempt to move the bad files into "trash".
        mv file anotherFile stillMoreFiles filesFilesWhenWillTheyStop tryToGetCleverWith * /trash
        Yep. Of all the times to have fat fingers. A space before the *. Bye, bye filesystem
    4. <FOF>A humble work station. A file named -rf in the root directory... Cron script with unchecked cd, involving rm *... Root of 10+ enterprise servers NFS-mounted r/w...</FOF>
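Item 1 in the list above (a negative number pasted after a minus sign becomes the SQL comment marker "--") is easy to reproduce. The sketch below uses Python's built-in sqlite3 with a made-up `accounts` table. The table, column, and function names are all illustrative, not the original schema.

```python
import sqlite3


def unsafe_sql(adjustment: int) -> str:
    # Same pattern as the sprintf call: the number is pasted into the SQL text.
    return "UPDATE accounts SET balance = credit-%d WHERE id = 1" % adjustment


def safe_update(conn: sqlite3.Connection, account_id: int, adjustment: int) -> int:
    # The value travels as a bound parameter, so it can never turn into "--".
    cur = conn.execute(
        "UPDATE accounts SET balance = credit - ? WHERE id = ?",
        (adjustment, account_id),
    )
    return cur.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, credit INTEGER)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?, ?)", [(1, 100, 10), (2, 200, 20)]
    )
    # With -5 the text contains "credit--5", and everything after "--",
    # including the WHERE clause, is a comment: every row gets updated.
    unsafe_rows = conn.execute(unsafe_sql(-5)).rowcount
    safe_rows = safe_update(conn, 1, -5)
    balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
    return unsafe_sql(-5), unsafe_rows, safe_rows, balance
```

Calling `demo()` shows the unsafe statement touching both rows while the parameterized one touches only the intended row. In C the same idea applies: use the database library's bound parameters rather than sprintf.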
  • Check Backups! (Score:3, Interesting)

    by greed ( 112493 ) on Tuesday August 23, 2005 @02:18PM (#13381536)

    Everyone knows backups are important, right? Well, pretend you do if you don't....

    So, my first full-time job, I wound up being de facto system operator of the department's AIX machines, since I'd learned the Magic Incantations necessary to upgrade AIX 3.2.3 to 3.2.4 or 3.2.5... or even, this was really great, install AIX on a new computer!

    After my first 6 months in that group, the guy who had been doing the backups left for hopefully-greener pastures. No big deal, you just had to run this program which told you which tape numbers to load in to which drives, and printed out the label sheet for the previous night's backups. Everything was done with the standard dump command, which is a nice and robust way to back up.

    Everything seemed good; every now and then, a few nodes would fail to back up, but no big deal, they were done on an earlier night, and the tape selection program made sure that a "recent" backup was available for each node before it allowed a tape to be re-used.

    If only I had noticed that it was always 3 machines in a row that had backup failures....

    So, the inevitable happens... we need to restore an important filesystem. (Anyone who worked with IBM 857mb SCSI drives knows that the inevitable happened about once every 6 months. Per drive.)

    Pulled the latest tape, seeked to the right record... funny, this looks like a different filesystem, but that should be a much later record on the tape.

    Try the next-earlier one, same thing.

    One before that... found the right filesystem. Restore went good, fortunately it was a read-mostly system, and we didn't lose any important changes since that dump.

    But having filesystems in the wrong place... I couldn't figure that one out. I went through the backup script (which had been adapted from a magazine article...). Added a whole pile of logging and tracing, especially putting stderr somewhere where it could be read back (it had been sent to /dev/null... of course.)

    So those three failures in a row? They went something like this:

    DUMP: Backing up /dev/hd9var...
    DUMP: No space left on device.
    DUMP: DUMP ABORTED
    DUMP: Backing up /dev/hd1
    DUMP: I/O ERROR
    DUMP: DUMP ABORTED
    DUMP: Backing up /dev/hd2
    DUMP: Device not ready
    DUMP: DUMP ABORTED
    DUMP: Backing up /dev/hd3
    DUMP: Dump complete!

    What the error messages didn't explain, but some experimenting revealed, was that the operation that hit end-of-tape returned end-of-tape, as expected. The next operation got an I/O error, because the last operation resulted in an I/O error, and the tape had not been rewound or ejected.

    The thing is, these tape drives would automatically re-wind when you read back an I/O error from them...

    So device not ready was obtained while the drive was rewinding. (Normally, it should just block until the rewind is complete. In this case, the NEXT command after it started rewinding would block.)

    Then the remaining backups would go fine... overstriking the earlier ones on the same tape.

    I've had a fetish for proper error-checking in scripts ever since... and I don't accept scripts written by others unless they will run with #!/bin/sh -e or equivalent.
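    A minimal sketch of the same "stop at the first error" idea, written in Python rather than sh. The function name run_backups and the command lists passed to it are hypothetical; nothing here is the original backup script. The point is only that a failed step raises immediately, so later dumps never run against a tape in an unknown state.

```python
import subprocess


def run_backups(commands):
    """Run each backup command in order and stop at the first failure.

    Returns the list of commands that completed. Raises
    subprocess.CalledProcessError on the first non-zero exit status.
    """
    completed = []
    for cmd in commands:
        # check=True turns a non-zero exit status into an exception.
        # stderr is not redirected, so error text goes to the caller's
        # stderr instead of being thrown away.
        subprocess.run(cmd, check=True)
        completed.append(cmd)
    return completed
```

    A caller would catch subprocess.CalledProcessError, log which command failed, and leave the tape alone until someone has looked at it.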

  • by KillerBob ( 217953 ) on Tuesday August 23, 2005 @02:27PM (#13381598)
    Back when I was working at Compaq, pre-merger, my most memorable call was from a lawyer. Seems he had gotten so totally frustrated with the quality of support at HP that he'd thrown his computer through his 6th floor window and into the street below.

    Not out the window, mind you... through the window. Shattering the glass on the way by. That afternoon, he was on the phone with me, at Compaq, for help installing his software and restoring backups. (amazingly enough, he'd been smart enough to make backups just before throwing the damned thing out the window :), and didn't mind paying the $40 fee for help with unsupported software, as long as I was able to get the thing working.)

    The story does not end there. He was so happy with the support that he asked to talk to my supervisor.... A half hour later, the supervisor comes by and asks if I'm busy. He's just finished talking to the lawyer, and found out that the cause of his problem was his HP laser printer that didn't have driver support for his new Windows XP-based computer, and he didn't like being told by HP that they didn't support that printer with XP yet.... So my boss asks if I'm busy, and I say no, so he hands the guy off to me again to fix his printer, on the house.

    How did I fix it? I sent him to HP's website to download the Windows 2000 drivers for the printer. I explain to him that yes, I'm aware that he's running Windows XP, but that Windows XP shares a kernel with Windows 2000, and that because of it, XP Home will not install as an upgrade over 2k Pro. Basically, they're the same OS, except that XP has flashy graphical enhancements. (at the time, that was true). So we download the 2k driver for the printer, and 5 minutes later, he's printing again, and asks to talk to the supervisor again. :)

    Long story short... $100 gift certificate for a local steakhouse, and a plaque that reads "top letter generator" is all I have to show for it. Oh, and the satisfaction of knowing that 4 months later, HP announced the Compaq acquisition.... I bet he was peeved at that. :)
  • by Frank T. Lofaro Jr. ( 142215 ) on Tuesday August 23, 2005 @02:51PM (#13381885) Homepage
    I don't know if that would be a good school or a bad one to go to.

    Pro:

    Lunch lasts a few hours!

    Con:

    You get zapped with electricity from bizarre electrified asphalt (or was it just a ground and your mike was electrified :)

    I once was in a house where neutral and hot were reversed to a light fixture. Worked fine until we decided to upgrade it to something better and the electrician got zapped. :)
  • Re:My ones (Score:1, Interesting)

    by Anonymous Coward on Tuesday August 23, 2005 @02:59PM (#13381978)
    On my primary machine, I have a full set of standard utilities linked against uclibc in another directory, and a static pivot_root and shell. I'm not all that worried if I ever break something that badly.
  • MTC... (Score:5, Interesting)

    by King_TJ ( 85913 ) on Tuesday August 23, 2005 @03:01PM (#13382003) Journal
    Actually, though I'm sure you're correct in some cases about the cold helping with a malfunctioning temp. sensor in the drive - I think the freezer trick also sometimes just works because of defective IC chips on the controller board portion of the drive.

    (Every IDE hard drive actually has the drive controller electronics bolted to a circuit board on the bottom of it. That's why the "IDE interface" is such a basic thing on your PC, whether it's integrated onto the motherboard or is a separate PCI card. Most of the real work is done on the drive's electronics.)

    With some malfunctioning electronics, you can manage to keep them working properly as long as you keep them cold enough. (One of the old tricks for troubleshooting bad parts in TV sets and the like was to selectively spray them with a can of compressed air, chilling them temporarily.)
  • by Anonymous Coward on Tuesday August 23, 2005 @03:11PM (#13382089)
    We've recovered drives that way for long periods of time.

    If the bearings on the bottom of the drive are going out, just flip the drive upside down and it will mainly use the other bearings.
  • Couple of mine (Score:2, Interesting)

    by scoobrs ( 779206 ) on Tuesday August 23, 2005 @03:19PM (#13382146)
    An underling at a used computer store I worked at in college was building a PC for the city fire chief. He chose a Macintosh monitor off of the used monitor shelf and attached it to the PC's serial port, which matches the connector and carries a charge. Ironically, this lit the monitor on fire. The fire chief was a good sport about it.

    A college roommate was impressed by how easily I built my own computers and got so anxious to install his DDR memory in his new PC when he got home that he didn't wait for an expert to arrive. He managed to jam the memory in backwards against the key as hard as he could until he heard a snap. Both the motherboard and memory caught fire.

    This is a great site to get other people's stories from: http://rinkworks.com/stupid/ [rinkworks.com]

  • Two of mine (Score:1, Interesting)

    by Anonymous Coward on Tuesday August 23, 2005 @03:56PM (#13382482)

    Here are a couple of personal favorites. These were 100% my fault so I'm going for atonement here.

    #1: Few years back I'm doing a remote import of a large database onto a client's system. The idea here was to replicate a good db then import the data into the tables via sql embedded in csh scripts.

    After the import (~300 GB), I needed to go in and delete some records that the client wasn't supposed to have. I wrote some slick PL/SQL to loop through the table and delete the records. In order to speed it up, I also included a COMMIT command after every 50 records (this means no rollbacks to you non-DBA's out there).

    Well, the lesson here is to always double check your logic. Of course I deleted all of the good data and left the bad. Compound this with the fact that they had never tested their backup system so all of the tapes were bad. Three weeks later, the system is finally back online. That sucked.
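    A rough sketch of one way to double check a delete before it becomes permanent, using Python's sqlite3 module instead of PL/SQL. The helper name delete_checked and the table and column names in the usage note are made up for illustration. It counts the rows the predicate matches first, refuses to proceed if that count is not what you expected, and keeps the whole delete in one transaction so it can still be rolled back.

```python
import sqlite3


def delete_checked(conn, table, where_sql, params, expected_count):
    """Delete rows matching where_sql only if exactly expected_count match.

    table and where_sql are trusted, hand-written strings in this sketch;
    only the values in params are bound by the driver.
    """
    cur = conn.cursor()
    matched = cur.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {where_sql}", params
    ).fetchone()[0]
    if matched != expected_count:
        # Nothing has been deleted yet, so there is nothing to undo.
        raise ValueError(
            f"predicate matches {matched} rows, expected {expected_count}"
        )
    cur.execute(f"DELETE FROM {table} WHERE {where_sql}", params)
    if cur.rowcount != expected_count:
        conn.rollback()
        raise RuntimeError("deleted row count changed; rolled back")
    conn.commit()
    return cur.rowcount
```

    For example, delete_checked(conn, "records", "restricted = ?", (1,), 2) would delete two restricted rows, while an inverted predicate that matched a different number of rows would raise before touching anything. A single large transaction is slower than committing every 50 rows, which is the tradeoff the story above ran into.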

    #2: Early 90's and I'm in an all Mac shop. We were experimenting with AppleTalk and one of the graphic designers was sharing her system with mine.

    I don't know about you guys but the way I had learned to eject a disk was to drag the icon over to the trash can. I dragged her share folder icon over the trash can and proceeded to watch her system files get deleted. That one was fun.

  • by dragondm ( 30289 ) on Tuesday August 23, 2005 @04:41PM (#13382897) Homepage
    Yeh, I once had two $1000+ specialized video cards explode on me due to that.

    The GPU blew right off the board.
    Twice.

    (These cards were "multiconsole cards", essentially 4 video cards crammed onto one board, plus a usb-like serial bus. The card had 4 rj45 connectors, and you used cat-5 to connect 4 port expander boxes w/ standard svga, mouse, and keyboard connectors on 'em. It allowed you to put 4 extra monitor+keyboard+mouse combos on a single computer. Useful for POS stuff.) The first time it happened, we didn't know what was going on. We thought it was a short or something in the board. The second time, tho, I got a nasty shock as I was plugging the cable in, and the chip exploded inches from my nose (the case was open). I recognized the feel of 120vac, and checked the outlets. Turns out that someone took a shortcut wiring the building. They only used 2 wire cable in the walls, so they wired the ground prong of the outlets to the neutral. That would have been fine, except for the fact that the cable for the outlet the computer was plugged into had its hot & neutral flipped at the fuse box. Thus the ground of the computer was 120vac off from the ground on the monitor I was plugging in. Yikes.

    Oddly enough, the computer survived both incidents just fine.
  • Re:#1 Works! (Score:4, Interesting)

    by jandrese ( 485 ) * <kensama@vt.edu> on Tuesday August 23, 2005 @04:59PM (#13383056) Homepage Journal
    Interesting. I wonder if the on-disk firmware for the drive (yes HDDs run mostly off of code stored on the disk) got corrupted and by holding the arm back you were forcing the drive to run in a minimally functional PIO mode or something. I had a batch of Maxtors that were terrible about corrupting their on-drive firmware. The problem manifested as the drives silently returning corrupt data too, which was highly annoying. Fortunately it's a dead giveaway when you reboot the machine and see:

    ...
    ad10: 76319MB <MAXTOR 4K080H4/A08.1500> [155061/16/63] at ata5-master UDMA100
    ad12: 76319MB <MAXxo`yk.@#l2fv9!..3u> [155061/16/63] at ata6-master UDMA100
    ad14: 76319MB <MAXTOR 4K080H4/A08.1500> [155061/16/63] at ata7-master UDMA100
    ...
  • by DragonHawk ( 21256 ) on Tuesday August 23, 2005 @09:59PM (#13385637) Homepage Journal
    "Hey, did you know RPM will let you remove every package from the system?"

    I once had cause to utter the above sentence. I was working on a customer's web server remotely. I was performing some maintenance, upgrading this, migrating that. At one point, I had a list of installed packages I wanted to remove from the system. Well, I screwed up something and somehow managed to run "rpm --erase" with a list of every package currently installed on the system. I was multi-tasking and had switched to other things, so I didn't really clue in to the fact that my RPM transaction was taking way too long to run until some of the scripts tied to the uninstall action started complaining because things like "perl" were missing. I started pounding on [CTRL]+[C] but it was already too late. Almost everything was gone. I couldn't even scp files in.

    That was a fun drive to the client site. At least the data was all still there, since only the software was removed.
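    One possible guard against that kind of mistake, sketched in Python. It does not call rpm at all; check_removal, the 25% threshold, and the short list of protected package names are all assumptions picked for illustration. The idea is to sanity-check a removal list against the installed set before handing it to the package manager.

```python
def check_removal(installed, to_remove, max_fraction=0.25,
                  protected=("glibc", "rpm", "bash")):
    """Return the set of packages to remove, or raise ValueError.

    Refuses the request if it names packages that are not installed,
    touches a protected package, or covers more than max_fraction of
    everything installed.
    """
    installed = set(installed)
    to_remove = set(to_remove)

    unknown = to_remove - installed
    if unknown:
        raise ValueError(f"not installed: {sorted(unknown)}")

    hit = to_remove & set(protected)
    if hit:
        raise ValueError(f"refusing to remove protected packages: {sorted(hit)}")

    if len(to_remove) > max_fraction * len(installed):
        raise ValueError(
            f"refusing to remove {len(to_remove)} of {len(installed)} packages"
        )
    return to_remove
```

    Feeding it a list of every installed package fails on the protected check, and a list that is merely too long fails on the fraction check.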
  • by farble1670 ( 803356 ) on Wednesday August 24, 2005 @02:40AM (#13387030)
    i appreciate that this topic has gotten a lot of folks to relay their interesting and sometimes funny computer tragedies, but does anyone else think this post is not a lot more than an attempt to get folks to look at the banner ads on the initial "top 10" link? really, a guy got mad at his laptop and put it in the toilet and flushed? is that funny? more like the guy making up the list was running out of ideas towards the end of the list.

