Git Submodules: A Visual Introduction

Author: Paramvir Singh Karwal

Published: 2021-08-03 04:58:15.0



Git is by far the most advanced version control system (VCS) out there. It gives you a great deal of control over how files are stored and versioned, without having to worry about their integrity. One of the many capabilities that make git shine among VCSs is the ability to keep one module inside another. Let's go through the basics of git submodules and then see how we can leverage them.


Why use git submodules?

A git submodule is just a module (a git repository) that is kept inside another module. But why would we do that? Suppose you are working on a project that rates the candidates of an online test based on the answers they submit to your questions. Let's say that generating the ratings is complex and requires specific algorithms, which are handled by a rating engine module that a third party team is developing, whereas your module just has to record the responses and show the final results.

At this point it is obvious that you will have to use the rating engine module. The two modules are maintained in different repositories. You could of course copy the code of the rating engine repo into your module and start using it. But what if the third party team pushes another enhancement to their repo? You will either keep using their old code, assuming you are not aware of the change, or you will have to copy the code again. This approach is inefficient and is not recommended.

The other thing you can do is add a submodule within your module that points to the remote repo of the third party module. Git keeps a clone of that repo as a subdirectory of your module. Now if the third party team pushes a change, you can pull it using git like you would with any other git repo. This makes the whole process painless and avoids the issues above.


How to use git submodules?

Let's try to understand it with two git repos, foo and bar. We will keep the bar repo as a submodule inside the foo module.

[/temp/git/demo]$ ls -lrt
total 0
drwxr-xr-x 1 PARAM 197121 0 Aug 21 02:15 foo

As of now the foo module contains just one file, foo_readme.txt.

[/temp/git/demo/foo]$ ls -lrt
total 0
-rw-r--r-- 1 PARAM 197121 0 Aug 21 02:30 foo_readme.txt

Similarly, we have a bar_readme.txt file inside the bar module.

[/temp/git/demo/bar]$ ls -lrt
total 0
-rw-r--r-- 1 PARAM 197121 0 Aug 21 02:46 bar_readme.txt

We will use colours to indicate the current state of each module with respect to its remote repository. This will help to visualize the changes. The outline of a module will be blue if the module is up to date with the remote repo, red if it is behind the remote repo, and green if the local repo is ahead of the remote repo. At the start we have two modules, foo and bar, which are both up to date with their remote repos.

[Figure: foo and bar as separate modules, both outlined in blue]

Adding a submodule bar inside foo:

[/temp/git/demo/foo]$ git submodule add https://github.com/paramvirkarwal/bar.git
Cloning into 'C:/demo/foo/bar'...
remote: Counting objects: 3, done.
remote: Total 3 (delta 0), reused 3 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), done.
warning: LF will be replaced by CRLF in .gitmodules.
The file will have its original line endings in your working directory.

 

Check status:

[/temp/git/demo/foo]$ git status
On branch master
Your branch is up to date with 'origin/master'.
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
new file:   .gitmodules
new file:   bar

Here you will notice that a .gitmodules file is created, if it is not already there, and the details of the submodule are added to it. Let's see the contents of this file.

[submodule "bar"]
path = bar
url = https://github.com/paramvirkarwal/bar.git

This file holds the information about the submodule: its local path and its remote URL. It is version controlled just like any other file.
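Since .gitmodules uses the same format as a git config file, its entries can be read back with git config. Here is a minimal sketch of that, under a few assumptions that are not part of the walkthrough above: it works in a throwaway temp directory, a local bar repo stands in for the GitHub remote, the commit identity is a made-up "demo" user, and protocol.file.allow=always is passed because recent git versions refuse to add a submodule from a local path by default.

```shell
set -e
# Throwaway workspace (assumption: local repos stand in for the GitHub remotes)
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A tiny "remote" bar repo and a foo repo, each with one readme file
git init -q bar
touch bar/bar_readme.txt
git -C bar add .
git -C bar commit -q -m "initial bar commit"
git init -q foo
touch foo/foo_readme.txt
git -C foo add .
git -C foo commit -q -m "initial foo commit"

# Add bar as a submodule of foo
cd foo
git -c protocol.file.allow=always submodule --quiet add "$work/bar" bar

# Read the two entries back from .gitmodules
sub_path=$(git config -f .gitmodules --get submodule.bar.path)
sub_url=$(git config -f .gitmodules --get submodule.bar.url)
echo "submodule bar: path=$sub_path url=$sub_url"
```

In a real project the url entry would be the remote address, such as the GitHub URL shown above.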

 

Let's do a git commit

[/temp/git/demo/foo]$ git commit -m "adding module bar inside foo"
[master 72d0c90] adding module bar inside foo
2 files changed, 4 insertions(+)
create mode 100644 .gitmodules
create mode 160000 bar
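The mode 160000 in the output above is worth a closer look: foo does not store the files of bar, it stores a pointer to one specific bar commit. The sketch below is one way to check this, with the same assumptions as any local demo: a temp directory, a local bar repo standing in for the GitHub remote, a made-up "demo" identity, and protocol.file.allow=always to permit a local-path submodule on recent git versions.

```shell
set -e
# Throwaway workspace (assumption: local repos stand in for the GitHub remotes)
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q bar
touch bar/bar_readme.txt
git -C bar add .
git -C bar commit -q -m "initial bar commit"
git init -q foo
touch foo/foo_readme.txt
git -C foo add .
git -C foo commit -q -m "initial foo commit"

cd foo
git -c protocol.file.allow=always submodule --quiet add "$work/bar" bar
git commit -q -m "adding module bar inside foo"

# The tree entry for bar has mode 160000 and type "commit", not "tree" or "blob".
# Prints a line of the form: 160000 commit <sha>	bar
git ls-tree HEAD bar

# The commit recorded by foo is the commit currently checked out in bar
recorded=$(git ls-tree HEAD bar | awk '{print $3}')
checked_out=$(git -C bar rev-parse HEAD)
echo "foo records: $recorded"
echo "bar is at:   $checked_out"
```

This is why, later on, moving bar to a newer commit shows up in foo as a change to the single entry bar.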

The illustration below shows the current state of both modules:

[Figure: foo outlined in green, with the bar submodule inside it outlined in blue]

As we have just committed changes in the foo module, its outline is green: the local repo is ahead of the remote repo, because the changes in foo are not yet pushed. Notice that the outline of the bar submodule is still blue, as no changes were made to this module. Now let's push the foo changes to the remote repo.

[/temp/git/demo/foo]$ git push origin master
Counting objects: 3, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 403 bytes | 403.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To https://github.com/paramvirkarwal/foo.git
25ef758..72d0c90  master -> master
[/temp/git/demo/foo]$ git status
On branch master
Your branch is up to date with 'origin/master'.


nothing to commit, working tree clean

Let's look at the diagram again. At this point both modules are up to date with their remote repos, so both outlines are blue.

[Figure: foo with the bar submodule inside it, both outlined in blue]

Now let's say that another developer or the third party made a change in the bar repo by adding a new file, instructions.txt, to the remote repo. Try to imagine what the state of these modules will be with respect to their remote repos. Interestingly, git status on foo will show that the foo module is up to date with its remote repo, because nothing has changed in the foo repo itself.

[/temp/git/demo/foo]$ git remote update
Fetching origin
[/temp/git/demo/foo]$ git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean

Now try the same inside the bar submodule of the foo module.

[/temp/git/demo/foo/bar]$ git remote update
Fetching origin
remote: Counting objects: 2, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 2 (delta 0), reused 2 (delta 0), pack-reused 0
Unpacking objects: 100% (2/2), done.
From https://github.com/paramvirkarwal/bar
95b2f8a..2be626a  master     -> origin/master
[/temp/git/demo/foo/bar]$ git status
On branch master
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
(use "git pull" to update your local branch)
nothing to commit, working tree clean

It says that the local repo is behind the remote repo. Now let's put a diagram to it. The bar repo is shown with a red outline as it is behind the remote repo.

[Figure: foo outlined in blue, with the bar submodule inside it outlined in red]

Now use git pull inside the bar submodule to get the latest code and do a git status.

[/temp/git/demo/foo/bar]$ git pull
Updating 95b2f8a..2be626a
Fast-forward
instructions.txt | 0
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 instructions.txt
[/temp/git/demo/foo/bar]$ git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean

Now it shows that the local bar repo is up to date with the remote bar repo. That's good! But wait, what happens to the foo module, which is the parent of the bar submodule? Keep in mind that foo records the exact bar commit it uses, and we have just moved bar to a newer commit. So foo now has a local change that is not yet committed. Try doing a git status for the foo module.

[/temp/git/demo/foo]$ git status
On branch master
Your branch is up to date with 'origin/master'.
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified:   bar (new commits)
no changes added to commit (use "git add" and/or "git commit -a")

Notice that foo sees this as new commits in the bar repo and, at this point, treats bar as a single entity. It does not show the individual changes in the bar repo. (Of course, you can see them using the git diff --submodule command, or git diff --cached --submodule once the change is staged.)
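To see what "new commits" looks like from foo's point of view, here is a small sketch that replays the steps so far and then asks foo for a submodule-aware diff. It is only a local stand-in for the real setup: it assumes a temp directory, a local bar repo in place of the GitHub remote, a made-up "demo" identity, an illustrative commit message, and protocol.file.allow=always so that recent git versions accept a local-path submodule.

```shell
set -e
# Throwaway workspace (assumption: local repos stand in for the GitHub remotes)
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q bar
touch bar/bar_readme.txt
git -C bar add .
git -C bar commit -q -m "initial bar commit"
git init -q foo
touch foo/foo_readme.txt
git -C foo add .
git -C foo commit -q -m "initial foo commit"
cd foo
git -c protocol.file.allow=always submodule --quiet add "$work/bar" bar
git commit -q -m "adding module bar inside foo"

# Someone adds instructions.txt to the "remote" bar repo
touch "$work/bar/instructions.txt"
git -C "$work/bar" add .
git -C "$work/bar" commit -q -m "add instructions.txt"

# Pull the new commit inside the submodule (foo/bar)
git -C bar pull -q --ff-only

# Unstaged view: lists the bar commits that foo does not record yet
unstaged=$(git diff --submodule)
# Staged view: still empty, because bar has not been added to the index
staged=$(git diff --cached --submodule)
echo "$unstaged"
```

The first line of the output names the submodule and the old and new commit, followed by one line per new commit in bar.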

Let's commit the changes:

[/temp/git/demo/foo]$ git add bar
[/temp/git/demo/foo]$ git commit -m "updated submodule"
[master afdcd73] updated submodule
1 file changed, 1 insertion(+), 1 deletion(-)
[/temp/git/demo/foo]$ git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
(use "git push" to publish your local commits)
nothing to commit, working tree clean

The local foo module is now ahead of the remote repo. Let's put a diagram to it, as we have been doing. The foo module is shown in green as it is ahead of the remote repo, and bar is shown with a blue outline because we already got the latest code for bar using git pull.

[Figure: foo outlined in green, with the bar submodule inside it outlined in blue]
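Another way to watch this step is git submodule status, which prefixes a submodule with "+" while its checked-out commit differs from the one recorded in foo's index. The sketch below replays the update and captures the status before and after git add bar. As before, the temp directory, the local bar repo standing in for the GitHub remote, the "demo" identity, the commit messages, and protocol.file.allow=always are assumptions made only for this demo.

```shell
set -e
# Throwaway workspace (assumption: local repos stand in for the GitHub remotes)
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q bar
touch bar/bar_readme.txt
git -C bar add .
git -C bar commit -q -m "initial bar commit"
git init -q foo
touch foo/foo_readme.txt
git -C foo add .
git -C foo commit -q -m "initial foo commit"
cd foo
git -c protocol.file.allow=always submodule --quiet add "$work/bar" bar
git commit -q -m "adding module bar inside foo"

# A new commit lands in the "remote" bar and is pulled into foo/bar
touch "$work/bar/instructions.txt"
git -C "$work/bar" add .
git -C "$work/bar" commit -q -m "add instructions.txt"
git -C bar pull -q --ff-only

# Before staging: bar's checked-out commit differs from foo's index ("+" prefix)
before=$(git submodule status)
git add bar
# After staging: the index matches again (the line starts with a space)
after=$(git submodule status)
git commit -q -m "updated submodule"

echo "before: $before"
echo "after:  $after"
```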

Now go ahead and push the change for foo to the remote repo as well.

[/temp/git/demo/foo]$ git push origin master
Counting objects: 2, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (2/2), 301 bytes | 301.00 KiB/s, done.
Total 2 (delta 0), reused 0 (delta 0)
To https://github.com/paramvirkarwal/foo.git
72d0c90..afdcd73  master -> master
[/temp/git/demo/foo]$ git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean

Now the diagram should look like the one below. Both modules are up to date with their remote repos, which is why they are shown with a blue outline.

[Figure: foo with the bar submodule inside it, both outlined in blue]

There is much more that you can do with git submodules. You can refer to the man page for details. Here is the usage summary that git prints:

$ git submodule help
usage: git submodule [--quiet] add [-b <branch>] [-f|--force] [--name <name>] [--reference <repository>] [--] <repository> [<path>]
or: git submodule [--quiet] status [--cached] [--recursive] [--] [<path>...]
or: git submodule [--quiet] init [--] [<path>...]
or: git submodule [--quiet] deinit [-f|--force] (--all| [--] <path>...)
or: git submodule [--quiet] update [--init] [--remote] [-N|--no-fetch] [-f|--force] [--checkout|--merge|--rebase] [--[no-]recommend-shallow] [--reference <repository>] [--recursive] [--] [<path>...]
or: git submodule [--quiet] summary [--cached|--files] [--summary-limit <n>] [commit] [--] [<path>...]
or: git submodule [--quiet] foreach [--recursive] <command>
or: git submodule [--quiet] sync [--recursive] [--] [<path>...]
or: git submodule [--quiet] absorbgitdirs [--] [<path>...]
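As a taste of two of these subcommands, here is a hedged sketch of what a teammate might see after cloning foo: the bar directory starts out empty, git submodule status marks it with "-", and git submodule update --init fetches the recorded commit. The teammate scenario, the foo_clone name, the temp directory, the local bar repo standing in for the GitHub remote, the "demo" identity, and protocol.file.allow=always are all assumptions made for this demo.

```shell
set -e
# Throwaway workspace (assumption: local repos stand in for the GitHub remotes)
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q bar
touch bar/bar_readme.txt
git -C bar add .
git -C bar commit -q -m "initial bar commit"
git init -q foo
touch foo/foo_readme.txt
git -C foo add .
git -C foo commit -q -m "initial foo commit"
git -C foo -c protocol.file.allow=always submodule --quiet add "$work/bar" bar
git -C foo commit -q -m "adding module bar inside foo"

# A plain clone of foo (foo_clone is a hypothetical name) gets an empty bar directory
git clone -q foo foo_clone
cd foo_clone
before=$(git submodule status)   # "-" prefix: submodule not initialized yet

# Initialize the submodule and check out the commit that foo records
git -c protocol.file.allow=always submodule --quiet update --init
after=$(git submodule status)

echo "before: $before"
echo "after:  $after"
ls bar
```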

Please feel free to leave your comments.

Comments:-

Kamal jeet

18-Sep-2019 23:43

Very helpful, thanks for sharing!!


Paramvir

21-Sep-2019 22:12

Thanks Kamal, like always! :)
