53

I feel like a kid in the principal's office explaining that the dog ate my homework the night before it was due, but I'm staring some crazy data loss bug in the face and I can't figure out how it happened. I would like to know how git could eat my repository whole! I've put git through the wringer many times and it's never blinked. I've used it to split a 20 Gig Subversion repo into 27 git repos and filter-branched the foo out of them to untangle the mess and it's never lost a byte on me. The reflog is always there to fall back on. This time the carpet is gone!

From my perspective, all I did is run git pull and it nuked my entire local repository. I don't mean it "messed up the checked out version" or "the branch I was on" or anything like that. I mean the entire thing is gone.

Here is a screen-shot of my terminal at the incident:

incident screen shot

Let me walk you through that. My command prompt includes data about the current git repo (using prezto's vcs_info implementation) so you can see when the git repo disappeared. The first command is normal enough:

  » caleb » jaguar » ~/p/w/incil.info » ◼  zend ★ »
❯❯❯ git co master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.

There you can see I was on the 'zend' branch, and checked out master. So far so good. You'll see in the prompt before my next command that it successfully switched branches:

  » caleb » jaguar » ~/p/w/incil.info » ◼  master ★ »
❯❯❯ git pull
remote: Counting objects: 37, done.
remote: Compressing objects: 100% (37/37), done.
remote: Total 37 (delta 25), reused 0 (delta 0)
Unpacking objects: 100% (37/37), done.
From gitlab.alerque.com:ipk/incil.info
 + 7412a21...eca4d26 master     -> origin/master  (forced update)
   f03fa5d..c8ea00b  devel      -> origin/devel
 + 2af282c...009b8ec verse-spinner -> origin/verse-spinner  (forced update)
First, rewinding head to replay your work on top of it...
>>> elapsed time 11s

And just like that it's gone. The elapsed time marker outputs before the next prompt if more than 10 seconds have elapsed. Git did not give any output beyond the notice that it was rewinding to replay. No indication that it finished.

The next prompt includes no data about what branch we are on or the state of git.

Not noticing it had failed I obliviously tried to run another git command only to be told I wasn't in a git repo. Note the PWD has not changed:

  » caleb » jaguar » ~/p/w/incil.info »
❯❯❯ git fetch --all
fatal: Not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

After this a look around showed that I was in a completely empty directory. Nothing. No '.git' directory, nothing. Empty.

My local git is at version 2.0.2. Here are a couple tidbits from my git config that might be relevant to making out what happened:

[branch]
        autosetuprebase = always
        rebase = preserve
[pull]
        rebase = true
[rebase]
        autosquash = true
        autostash = true
[alias]
        co = checkout

For example I have git pull set to always do a rebase instead of a merge, so that part of the output above is normal.

I can recover the data. I don't think there were any git objects other than some unimportant stashes that hadn't been pushed to other repos, but I'd like to know what happened.

I have checked for:

  • Messages in dmesg or the systemd journal. Nothing even remotely relevant.
  • There is no indication of drive or file system failure (LVM + LUKS + EXT4 all look normal). There is nothing in lost+found.
  • I didn't run anything else. There is nothing in the history I'm not showing above, and no other terminals were used during this time. There are no rm commands floating around that might have executed in the wrong CWD, etc.
  • Poking at another git repo in another directory shows no apparent abnormality executing git pulls.

What else should I be looking for here?

smheidrich
Caleb
  • I would contact upstream, but also look at anything that could modify the fs (NFS etc.): parallel processes such as virus scanners, backup scripts, automated cleanup scripts, etc. – Ulrich Dangel Jul 24 '14 at 09:51
  • Hi Caleb. Sorry to hear about your git horror story. I'd suggest also asking on #git on freenode and also the git ml. You could point them to this question. Don't you have other copies of the repos scattered around? I try to keep at least three copies of a local repos in different places. You forgot to mention your git version and OS. You could also indicate whether making your data available for others to reproduce is an option. – Faheem Mitha Jul 24 '14 at 11:32
  • @UlrichDangel No NFS, virus scanner, backup scripts or cleanup is involved. It's a local EXT4 file system on LUKS on LVM on local SATA drives. Cron is empty. – Caleb Jul 24 '14 at 12:21
  • 1
    @FaheemMitha I already mentioned in the question that the only thing lost will be some items in stash, but that's not a big deal. I still want to know how this happened and make it not do so again! – Caleb Jul 24 '14 at 12:23
  • Does the `.git` directory still exist? Are all the normal files in it (`HEAD`, `objects/`, `refs/`, etc)? – phemmer Jul 24 '14 at 12:37
  • 4
    @Patrick As I explained in the question already, no `.git` does not exist. Nothing does—what used to be the git root directory has nothing in it at all. – Caleb Jul 24 '14 at 12:39
  • Caleb, can you reproduce the problem? – Faheem Mitha Jul 24 '14 at 13:01
  • Have you looked at the interaction with `vcs_info`? I don't know it myself, but I used something like it for mercurial for a while to get a nicer prompt, but eventually did not like the idea of vcs commands being issued to get the information, and it was slow. (BTW I only had `git` throw away things once, as explained to me later, because it purged non-saved data after a few weeks during which I was away, but nothing instantaneous like your example). – Anthon Jul 24 '14 at 13:02
  • @FaheemMitha Asking this question wouldn't make much sense if I could. I only have the one instance to go on. As mentioned in the question I did try using other repos on this machine and don't see any problems. – Caleb Jul 24 '14 at 13:25
  • @Anthon Git has 'porcelain' commands that make retrieving meta info like that fairly straightforward (and fast). I've been using this shell implementation for months, and it's a widely used implementation. I don't think that's the culprit. From the terminal history it seems apparent that the git command aborted (took its ball and went home) during the rebase. There should have been more output from git for that operation before it turned things back over to the shell. The `vcs_info()` routine would not have even gotten called until after it returned. – Caleb Jul 24 '14 at 13:29
  • 1
    Is the repository you're pulling from a “native” git repository or is it fed by git-svn? Is or was there a file called `.git` in the upstream repository somewhere? – Gilles 'SO- stop being evil' Jul 24 '14 at 21:49
  • 1
    @Gilles The project was imported from subversion so it has some git-svn stuff in history but that was a one shot deal and svn is no longer involved. I was pulling from a native git repo (gitlab over ssh). No, there is no `.git` file upstream or anywhere in history. – Caleb Jul 25 '14 at 04:31
  • Why does the output say "(forced update)"? Is that normal, or is `git pull` an alias or shell function? – Alexander Aug 04 '14 at 11:38
  • 2
    @Alexander The pull operation is normal (other than being a rebase rather than a merge). The notice about a forced update indicates that the repo I'm pulling from had a force push that reset it to a different position than where the local repo last saw it. This is normal because I'm syncing actively developed and frequently rebased material between my own computers, not a public branch that other developers will see. – Caleb Aug 04 '14 at 13:27
  • @Caleb - That's Arch, right? Have you looked at `journalctl`'s logs for the time frame? You've got a lot of filesystem layers going on there, and there is definitely a pointer to something at that level. Maybe `git` tripped on a physical boundary - or your filesystem did - when it should have been spanning a logical one. – mikeserv Aug 06 '14 at 02:48
  • @Caleb As far as I can tell, when git did the rebase, it never applied any commits after rewinding head. I'm not sure if that has anything to do with it, but it may be related. – Tyzoid Aug 14 '14 at 14:24
  • Is there any chance that the commit you pulled was deliberately and maliciously altered in order to eat your homework? That is, can you trust the people who can modify `origin` not to try to pull something on you? – pqnet Aug 21 '14 at 21:14
  • @pqnet None. I am the only one with access to the git server in question and have since audited that it was not tampered with. – Caleb Aug 21 '14 at 23:11
  • 3
    @Caleb your shell prompt includes git branch indication, which means forming PS1 includes git commands not exposed in your log. They can principally change the picture and can be the issue source. You should update the question describing precisely how your shell prompt is formed, what commands are run to get a current branch, and reconsider how they could spoil your repo. – Netch Aug 30 '14 at 06:22
  • @Netch My question already identifies exactly what code is running to make that happen. – Caleb Aug 30 '14 at 07:06
  • 1
    @Caleb Did you set or change one or both of `$GIT_DIR` and `$GIT_WORK_TREE` anywhere? Setting those in the wrong way could cause problems like this, I think (It's unavoidable). – Volker Siegel Oct 25 '14 at 04:44
  • @Volker no. The only time I mess with those values is with `vcsh` managed repos of which this is not one. Those env vars would have been per default git behavior in this scenario. – Caleb Oct 25 '14 at 06:08
  • 2
    @Caleb You should really ask on the git development mailing list. You could write it as a bug report or just ask informally; it's the same anyway. There are some developers who know git impressively well, and they can probably tell by intuition what could have happened. (If not, they will just follow the discussion quietly.) They will also know whether it has happened before. (Reporting it there is the "official" way to report bugs for git.) – Volker Siegel Oct 25 '14 at 06:21
  • Honestly this looks like a `mv` more than anything else. If a process moves your CWD out from under you, you will be left in an apparently empty directory. Fascinating question, but I doubt we will ever know for certain what really happened that fateful day. ;) – Wildcard Nov 22 '16 at 10:49
  • 7
    @Wildcard Actually I've been meaning to put together an answer to this as I actually did figure out what happened. The system had recently come up from sleep and the network had been out for days before it had gone to sleep. Somewhere in that process I'd left a pacman process running that was trying to upgrade something on the system. To make a long story short it turns out glibc got updated and the git binary got overwritten. Because of the way it forked itself, one instance ended up being different than the other and they ate each-other's lunch. The directory really was empty (not just appar – Caleb Nov 22 '16 at 11:01
  • 1
    @Caleb, aha. Yes, I'd love to see that answer. Help clear up the mystery for everyone. :) (P.S.: If you just make a scratch answer with the content of that comment, you can always update it as time goes on.) – Wildcard Nov 22 '16 at 11:06

4 Answers

6

Yes, git ate my homework. All of it.

I made a dd image of this disk after the incident and messed around with it later. Reconstructing the series of events from system logs, I deduce what happened was something like this:

  1. A system update command (pacman -Syu) had been issued days before this incident.
  2. An extended network outage meant that it was left re-trying to download packages. Frustrated at the lack of internet, I'd put the system to sleep and gone to bed.
  3. Days later the system was woken up and it started finding and downloading packages again.
  4. Package download finished sometime just before I happened to be messing around with this repository.
  5. The system glibc installation got updated after the git checkout and before the git pull.
  6. The git binary got replaced after the git pull started and before it finished.
  7. And on the seventh day, git rested from all its labors. And deleted the world so everybody else had to rest too.

I don't know exactly what race condition occurred that made this happen, but swapping out binaries in the middle of an operation is certainly not nice nor a testable / repeatable condition. Usually a copy of a running binary is stored in memory, but git is weird and something about the way it re-spawns versions of itself I'm sure led to this mess. Obviously it should have died rather than destroying everything, but that's what happened.
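The general hazard is easy to show in isolation, even though this does not reproduce whatever git actually did. Below is a minimal Python sketch, assuming a POSIX file system and assuming the upgrade swaps the file by renaming a new one into place (I have not checked how pacman does it). The file name `tool` and the function `replace_while_open` are made up for the illustration. An already-open handle stands in for the running process, and a fresh open stands in for a helper spawned after the upgrade.

```python
import os
import tempfile


def replace_while_open(directory):
    """Return (content seen by the old handle, content seen by a fresh open)."""
    path = os.path.join(directory, "tool")  # hypothetical stand-in for a binary
    with open(path, "w") as f:
        f.write("version 1")

    # Stands in for a process that started before the upgrade.
    old_handle = open(path)

    # Assumption: the upgrade writes a new file and renames it into place,
    # so the same path now refers to a different inode.
    new_path = path + ".new"
    with open(new_path, "w") as f:
        f.write("version 2")
    os.replace(new_path, path)

    with old_handle:
        seen_by_old = old_handle.read()
    # Stands in for a helper that is spawned after the upgrade.
    with open(path) as fresh:
        seen_by_new = fresh.read()
    return seen_by_old, seen_by_new


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(replace_while_open(d))
```

On Linux the old handle still reads the old content while the fresh open gets the new one, which is the same kind of split that a parent command and the helpers it launches later could end up with.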

Caleb
  • 1
    git could have failed because git is made up of different commands instead of a single binary. A simple git pull executes `git-fetch`, `git-rebase` or `git-merge`, and `git gc` – Ferrybig Dec 05 '18 at 08:26
2

Possibly by failing at defining the file path to be deleted.

Your case reminds me of the beautiful day when my homemade remove(path) method tried to remove the root folder because the given parameter was an empty string, which the OS corrected (!) to the root folder.

This may be a similar git bug, along these lines:

  1. The rebase command wanted to delete a file with something like remove(project_folder + file_path) (pseudo code).
  2. Somehow file_path was empty at the time.
  3. The command evaluated to something like remove(project_folder).
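The steps above can be sketched in a few lines of Python. This is only an illustration of the failure mode I am guessing at, not git's actual code: `resolve_target` is a hypothetical helper and `/tmp/project` is a made-up path. It shows that joining a folder with an empty string collapses to the folder itself, and one way a guard could refuse such a target before anything is deleted.

```python
import os


def resolve_target(project_folder, file_path):
    """Join the parts and refuse anything not strictly inside the project."""
    root = os.path.normpath(project_folder)
    target = os.path.normpath(os.path.join(root, file_path))
    # Reject the project folder itself and anything that escapes it.
    if target == root or not target.startswith(root + os.sep):
        raise ValueError(f"refusing to remove {target!r}")
    return target


if __name__ == "__main__":
    # An empty file_path silently collapses to the project folder itself.
    print(os.path.normpath(os.path.join("/tmp/project", "")))
    print(resolve_target("/tmp/project", "src/main.c"))
    try:
        resolve_target("/tmp/project", "")
    except ValueError as err:
        print(err)
```

Nothing is removed here; the sketch only computes paths, so it is safe to run.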
maliayas
1

With luck, you can fix this with the following command:

git reset --hard ORIG_HEAD  

Before a potentially dangerous operation starts, git records the previous position of HEAD in ORIG_HEAD. With it you can undo a merge or rebase.

Git Manual: Undoing a Merge

Routhinator
  • 4
    I don't think you read the whole question. **This sort of fix is completely out of the question** because _there is no git meta-data_. Doing a reset like this requires an extant .git directory and some objects in it to work from. I have nothing. It's not just a messed up working directory, it is no longer a repository of any kind. – Caleb Aug 28 '14 at 15:19
  • Ahh, my apologies. That is very unusual. If git repo is gone I suppose there will be no way to recover unless you are shadowing files in linux and have the fs backups of the files. I will delete my answer as it is irrelevant. – Routhinator Aug 28 '14 at 15:31
  • Yes I know it's an unusual problem (and I do have backups). My question here is how it went wrong ... where to look for the bug in git or my file-system driver or whatever else could have borked to eat a directory in the middle of an operation like this. – Caleb Aug 28 '14 at 15:36
  • I am also very curious. I would hate for something like this to happen to my repos. – Routhinator Aug 28 '14 at 16:10
-1

Looks like someone ran git push --force on this repo, and you pulled down those changes. Try cloning the repo fresh, that should get you back into a clean working state again.

conorsch
  • 1
    The forced push rebased the last handful of commits. That wasn't what I pulled down (the working directory is no longer a working directory!) and even if it was re-cloning wouldn't make any sense. – Caleb Aug 21 '14 at 20:55
  • 4
    I don't think you can remove someone's `.git` directory with a forced push – Greg Sep 20 '17 at 04:35