Motivation

After a project has come to completion, the code repository is most likely littered with deprecated code branches, tangential development versions and obsolete directories.

How do you take this mess and convert it to a production repository without losing any history?

Setup

To demonstrate, let’s use a toy repo with branch structure:

- c0 --- c1 --- c2 --- c3 (master)
    \                    \
     \                    \
      \                    c4' --- c5' --- c6' (dev2)
       \                             \
        c1' --- c2' (dev0)            \
                                      c6'' --- c7'' (dev3)

and directory structure (assume all files present for all branches):

project/
   |-- code/
   |     |-- client1/
   |     |     |-- script1.sh
   |     |     `-- script2.sh
   |     |
   |     `-- client2/
   |           |-- script1.sh
   |           `-- script2.sh
   |
   `-- research/
         |-- tests/
         `-- method/
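
If you would like to follow along without touching a real project, the toy repo can be approximated with a short script. This is only a sketch: the scratch directory, the placeholder identity, the notes.txt files (git does not track empty directories) and the commit helper are all assumptions made for illustration, and which files each commit touches is arbitrary.

```shell
set -eu

# Hypothetical scratch location; any empty directory would do.
toy="$(mktemp -d)/project"
mkdir -p "$toy"
cd "$toy"

git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch "master"
git config user.name "Toy User"            # placeholder identity for the commits
git config user.email "toy@example.com"

# c0: every file from the directory tree (notes.txt stands in for the
# research/ contents, since git does not track empty directories)
mkdir -p code/client1 code/client2 research/tests research/method
for f in code/client1/script1.sh code/client1/script2.sh \
         code/client2/script1.sh code/client2/script2.sh \
         research/tests/notes.txt research/method/notes.txt; do
  echo "# $f" > "$f"
done
git add -A
git commit -q -m "c0"

# Helper (hypothetical): append a line to a tracked file and commit it.
commit() {
  echo "# $1" >> "$2"
  git commit -q -am "$1"
}

commit "c1" research/tests/notes.txt
commit "c2" research/method/notes.txt
commit "c3" code/client1/script1.sh

git checkout -q -b dev0 master~3     # dev0 forks from c0
commit "c1'" code/client1/script2.sh
commit "c2'" code/client1/script2.sh

git checkout -q -b dev2 master       # dev2 forks from c3
commit "c4'" code/client2/script1.sh
commit "c5'" code/client2/script2.sh
commit "c6'" research/tests/notes.txt

git checkout -q -b dev3 dev2~1       # dev3 forks from c5'
commit "c6''" code/client2/script1.sh
commit "c7''" code/client2/script2.sh

git checkout -q master
git log --graph --oneline --all      # inspect the resulting branch structure
```

The final git log call draws the branch graph so you can compare it with the diagram above.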

Filtering

The git filter-branch command (see the git-filter-branch documentation) lets us keep only a chosen subdirectory of a repo, making it the new project root, and purge any commits that do not touch the files it contains.

The first thing you should do is move to a benign directory and clone your repo to it, just in case you make a hash of it all.

cd /tmp
git clone https://github.com/you/project.git

Let’s say we want client2’s scripts from the dev3 branch. We first check out that branch:

cd project/
git checkout dev3

Note: On Linux you can skip this step and specify the branch directly in the filter-branch command.

Now let’s extract the subdirectory we’re interested in:

git filter-branch --prune-empty --subdirectory-filter code/client2/
# for Linux, from any branch:
# git filter-branch --prune-empty --subdirectory-filter code/client2/ -- dev3

Our repository on the dev3 branch now looks like this:

project/
   |-- script1.sh
   `-- script2.sh
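
Before going further it is worth checking that the filter did what we expect. The sketch below is a self-contained dry run on a throwaway repo, not the project itself: the scratch path, placeholder identity, file contents and commit messages are all made up for illustration. It sets FILTER_BRANCH_SQUELCH_WARNING, which recent git versions honour to skip the deprecation notice that filter-branch otherwise prints.

```shell
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip git's deprecation notice and its delay

# Throwaway repo (hypothetical path) with a single branch named dev3
work="$(mktemp -d)/project"
mkdir -p "$work"
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/dev3
git config user.name "Toy User"          # placeholder identity
git config user.email "toy@example.com"

# Commit 1 touches both clients
mkdir -p code/client1 code/client2
echo 'echo one' > code/client1/script1.sh
echo 'echo one' > code/client2/script1.sh
echo 'echo two' > code/client2/script2.sh
git add -A
git commit -q -m "add all scripts"

# Commit 2 touches client1 only, so it should vanish after filtering
echo 'echo changed' >> code/client1/script1.sh
git commit -q -am "touch client1 only"

# Commit 3 touches client2 only, so it should survive
echo 'echo changed' >> code/client2/script2.sh
git commit -q -am "touch client2 only"

# Same command as above, run against the throwaway repo
git filter-branch --prune-empty --subdirectory-filter code/client2 > /dev/null

git ls-files            # the scripts now sit at the repo root
git log --format=%s     # subjects of the commits that survived
```

On your real clone, the same two closing commands (git ls-files and git log) are a quick way to confirm that only client2’s files and history remain.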

Production Repo

Now that we have the branch and directory structure we want, it’s time to create the production repo. The first step is to create the repo on your server (e.g. GitHub, GitLab, or Gogs).

Once you have that (e.g. project_client2), set up a remote:

# the clone already has an "origin" pointing at the old repo, so repoint it
git remote set-url origin http://server.com/you/project_client2.git

Lastly, we push to the master branch of the repo:

git push origin +dev3:master

Now your production repo has dev3 as its master branch and can go forward as the base version.
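
If you want to rehearse the push before pointing at a real server, a local bare repository can stand in for it. The following is a minimal sketch under that assumption: the paths, the placeholder identity and the single commit are invented for illustration, and the bare repo plays the role of project_client2 on the server.

```shell
set -eu

base="$(mktemp -d)"                               # hypothetical scratch area
git init -q --bare "$base/project_client2.git"    # stand-in for the server-side repo

# Stand-in for the filtered working repo, with one commit on dev3
git init -q "$base/project"
cd "$base/project"
git symbolic-ref HEAD refs/heads/dev3
git config user.name "Toy User"                   # placeholder identity
git config user.email "toy@example.com"
echo 'echo one' > script1.sh
git add -A
git commit -q -m "filtered history"

# This repo was created with git init, not git clone, so no "origin" exists yet
git remote add origin "$base/project_client2.git"

# Push local dev3 to the remote's master; the leading + forces the update
git push -q origin +dev3:master

git ls-remote --heads origin                      # list the branches the remote now has
```

The commit hash that git ls-remote reports for refs/heads/master should match git rev-parse dev3 in the working repo.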