Initially, we had the DevOps framework, in which the Development and Operations teams collaborated to create an agile development ecosystem. Then came a new wave named “DevSecOps”, which integrated security into the existing DevOps process. Nowadays a newer term, “GitOps”, is gaining popularity because of its “Single Source of Truth” nature, to the point that it was a trending topic at KubeCon.
Git is essentially a content-addressable file system: you can insert any kind of data into Git, and Git hands you back a unique key that you can use later to retrieve that content. We will explore #gitinsideout in this blog.
The Git object model has three main types: blobs (for files), trees (for folders) and commits.
Objects are immutable (they are added but never changed), and every object is identified by its unique SHA-1 hash.
A blob is just the contents of a file. By default, every new version of a file gets a new blob, which is a snapshot of the file (not a delta like many other versioning systems).
A tree is a list of references to blobs and trees.
A commit is a reference to a tree, a reference to parent commit(s) and some decoration (message, author).
Then there are branches and tags, which are typically just references to commits.
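To make the "unique key" idea concrete, here is a minimal Python sketch of how an object id can be derived: Git hashes a small header (`<type> <size>` followed by a NUL byte) together with the content. The function name `git_object_sha1` is made up for this illustration; it is not part of Git.

```python
import hashlib


def git_object_sha1(obj_type: str, payload: bytes) -> str:
    """Return the SHA-1 that Git assigns to an object of the given type.

    Git hashes '<type> <size>' + NUL + payload, not the payload alone.
    """
    header = f"{obj_type} {len(payload)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()


if __name__ == "__main__":
    # The same content always yields the same id, whatever the file is called.
    print(git_object_sha1("blob", b"hello\n"))
    print(git_object_sha1("blob", b"hello\n") == git_object_sha1("blob", b"hello\n"))
```

For a file containing `hello` plus a newline, the value should match what `git hash-object` reports for that file, which is why two files with identical content share a single blob.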
Git stores its data in the .git/objects directory. Initialising a Git repository automatically creates .git/objects/pack and .git/objects/info, with no regular files in them. Once some files are added and committed, the new objects show up in the .git/objects/ folder.
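The layout of a loose object can be sketched in Python as follows. This is a simplified illustration, not Git's implementation: it ignores pack files, and a temporary directory stands in for .git/objects. The helper names `write_loose_object` and `read_loose_object` are hypothetical.

```python
import hashlib
import os
import tempfile
import zlib
from typing import Tuple


def write_loose_object(objects_dir: str, obj_type: str, payload: bytes) -> str:
    """Store an object as <objects_dir>/<first 2 hex chars>/<other 38>, zlib-compressed."""
    store = f"{obj_type} {len(payload)}".encode("ascii") + b"\x00" + payload
    sha = hashlib.sha1(store).hexdigest()
    folder = os.path.join(objects_dir, sha[:2])
    os.makedirs(folder, exist_ok=True)
    with open(os.path.join(folder, sha[2:]), "wb") as handle:
        handle.write(zlib.compress(store))
    return sha


def read_loose_object(objects_dir: str, sha: str) -> Tuple[str, int, bytes]:
    """Return (type, size, content), roughly what git cat-file -t / -s / -p report."""
    with open(os.path.join(objects_dir, sha[:2], sha[2:]), "rb") as handle:
        store = zlib.decompress(handle.read())
    header, _, payload = store.partition(b"\x00")
    obj_type, size = header.decode("ascii").split(" ")
    return obj_type, int(size), payload


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as objects_dir:  # stand-in for .git/objects
        sha = write_loose_object(objects_dir, "blob", b"hello\n")
        print("directory:", sha[:2], "file:", sha[2:])
        print(read_loose_object(objects_dir, sha))
```

The first two hex characters of the SHA become the sub-directory name, which is why .git/objects fills up with two-character folders as you commit.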
A blob stores the content of a file, and we can check that content with the command
git cat-file -p <object SHA>
or git show <object SHA>
A commit is defined by its tree, parent(s), author, committer and message.
All three objects (blob, tree, commit) are explained in detail with the help of a pictorial diagram.
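As a rough illustration of those fields, the sketch below parses the text that `git cat-file -p` prints for a commit. The SHAs, names and timestamps in `SAMPLE_COMMIT` are invented for the example, and `parse_commit` is a hypothetical helper. A merge commit is simply a commit with more than one `parent` line.

```python
def parse_commit(raw: str) -> dict:
    """Split a raw commit body into tree, parents, author, committer and message."""
    headers, _, message = raw.partition("\n\n")
    commit = {"tree": "", "parents": [], "author": "", "committer": "",
              "message": message.strip()}
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            commit["parents"].append(value)  # zero, one, or several parents
        elif key in ("tree", "author", "committer"):
            commit[key] = value
    return commit


# Hypothetical merge commit; every value here is made up for illustration.
SAMPLE_COMMIT = (
    "tree " + "1" * 40 + "\n"
    "parent " + "2" * 40 + "\n"
    "parent " + "3" * 40 + "\n"
    "author Example Author <author@example.com> 1500000000 +0530\n"
    "committer Example Author <author@example.com> 1500000000 +0530\n"
    "\n"
    "Merge branch 'test'\n"
)

if __name__ == "__main__":
    parsed = parse_commit(SAMPLE_COMMIT)
    print(parsed["tree"])
    print(len(parsed["parents"]), "parent(s)")
    print(parsed["message"])
```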
We often make changes to our code and push them to the SCM. Once, after making multiple changes, I thought it would be great if I could see the details of those changes through the local repository itself instead of going to the remote repository server. That pushed me to explore Git more deeply.
I set up a bare Git repository locally to act as the remote, made some changes, and tracked them (type, content, size, etc.).
The example below will help you understand the concept behind it.
Suppose we have cloned a repository named kunal:
Go to the kunal folder inside the directory where we cloned the repository, then:
I added the content “hello” to readme.md and made several more changes in the same repository:
adding 2 files modifying one
Go to the refs folder inside .git and take the SHA value of the master head:
We can explore this commit object further with cat-file, which shows the type and content of the commit and tree objects:
Now we can see a tree object inside the tree object. Going further, we can see the details of that nested tree object, which in turn contains a blob object, as below:
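The nesting described above follows from how a tree object is laid out: a sequence of entries, each holding a mode, a name and the 20-byte id of a blob or another tree. The Python sketch below builds and parses such a body. The entry names and the subtree id are hypothetical, and the helpers are illustrative rather than Git's own code.

```python
from typing import List, Tuple

BLOB_HELLO = "ce013625030ba8dba906f756967f9e9ca394464a"  # blob id of b"hello\n"
SUBTREE_ID = "1" * 40  # made-up id standing in for a nested tree


def build_tree_body(entries: List[Tuple[str, str, str]]) -> bytes:
    """Serialise (mode, name, hex_sha) entries as '<mode> <name>' NUL <20 raw bytes>."""
    body = b""
    for mode, name, hex_sha in entries:
        body += f"{mode} {name}".encode("utf-8") + b"\x00" + bytes.fromhex(hex_sha)
    return body


def parse_tree_body(body: bytes) -> List[Tuple[str, str, str, str]]:
    """Return (mode, kind, hex_sha, name) per entry, similar to git cat-file -p <tree>."""
    entries = []
    pos = 0
    while pos < len(body):
        nul = body.index(b"\x00", pos)
        mode, name = body[pos:nul].decode("utf-8").split(" ", 1)
        hex_sha = body[nul + 1:nul + 21].hex()
        if mode == "40000":
            kind = "tree"      # a sub-directory
        elif mode == "160000":
            kind = "commit"    # a submodule entry
        else:
            kind = "blob"      # a regular file, executable or symlink
        entries.append((mode, kind, hex_sha, name))
        pos = nul + 21
    return entries


if __name__ == "__main__":
    body = build_tree_body([("100644", "README.md", BLOB_HELLO),
                            ("40000", "docs", SUBTREE_ID)])
    for entry in parse_tree_body(body):
        print(*entry)
```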
Below is the pictorial representation for the same:
A more elaborate representation of the same:
Below are the commands for checking the content, type and size of objects (blob, tree and commit).
We can find these details (size, type, content) with the help of #git cat-file.
git-cat-file: Provide content, type or size information for repository objects
You can verify the content of a commit object and its type with git cat-file as below:
kunal@work:/home/git/test/kunal/.git # cat logs/refs/heads/master
Checking the content of the blob objects (README.md, kunal and sandy):
As we can see, the first entry adds the README, so it has a null parent (00000…000), and its unique SHA-1 is 912a4e85afac3b737797b5a09387a68afad816d6.
Below are the details that we can fetch for the above SHA-1 with the help of git cat-file:
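Each line in logs/refs/heads/master records the previous SHA, the new SHA, who made the change, when, and a message. The following Python sketch parses one such line and flags the all-zero "null parent". The identity, timestamp and message in `SAMPLE_LINE` are invented; only the new SHA is taken from the text above.

```python
import re

NULL_SHA = "0" * 40

REFLOG_LINE = re.compile(
    r"^(?P<old>[0-9a-f]{40}) (?P<new>[0-9a-f]{40}) "
    r"(?P<name>.*?) <(?P<email>[^>]*)> (?P<timestamp>\d+) (?P<tz>[+-]\d{4})"
    r"\t(?P<message>.*)$"
)


def parse_reflog_line(line: str) -> dict:
    """Parse one reflog line into its fields and mark the very first commit."""
    match = REFLOG_LINE.match(line.rstrip("\n"))
    if match is None:
        raise ValueError(f"not a reflog line: {line!r}")
    entry = match.groupdict()
    entry["is_root"] = entry["old"] == NULL_SHA  # no previous commit on this ref
    return entry


# Author, timestamp and message are hypothetical.
SAMPLE_LINE = (
    NULL_SHA + " 912a4e85afac3b737797b5a09387a68afad816d6 "
    "Example User <user@example.com> 1500000000 +0530"
    "\tcommit (initial): adding read me"
)

if __name__ == "__main__":
    entry = parse_reflog_line(SAMPLE_LINE)
    print(entry["new"], entry["is_root"], entry["message"])
```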
Consider an example of a merge:
We created a test branch, made changes, and merged it into master.
Here you can notice that the commit has two parents because of the merge.
You can further see the content, size and type of repository #gitobjects like:
This is a pretty lengthy article, but I’ve tried to make it as transparent and clear as possible. Once you work through it and understand the concepts shown here, you will be able to work with Git more effectively.
This explanation covers the tree data structure and the internal storage of objects. You can check the content of files (differences/commits) through the local .git repository, which stores each object under a unique SHA hash. That should clarify the basic internal working of Git.
Hopefully, this blog helps you understand Git inside out and troubleshoot Git-related issues.
Rocket science has always fascinated me, but one thing that totally blows my mind is the concept of modules, a.k.a. modular rockets. The literal definition states: “A modular rocket is a type of multistage rocket which features components that can be interchanged for specific mission requirements.” In simple terms, the Super Rocket depends upon those submodules to get things done.
The same is true in the software world, where superprojects have multiple dependencies on other projects. And when we talk about managing projects, Git can’t be ignored. Git has its own concept of submodules, which echoes the modular design of rockets.
Hour of Need
As DevOps specialists, we provision infrastructure for our clients, and much of it is common across clients. We decided to automate it, as DevOps engineers habitually do. Hence, Opstree Solutions initiated an internal project named OSM, in which we create Ansible roles for different open-source software, with contributions from each member of our organization, so that those roles can be used to provision our clients’ infrastructure.
This makes the client projects dependent on OSM, which creates the problem of managing dependencies that may be updated over time. Doing that by hand means a lot of copying and pasting, deleting repositories and cloning them again to get the updated version, which is a hair-pulling task and obviously not best practice.
Here comes the git-submodule as a modular rocket to take our Super Rocket to its destination.
Let’s Liftoff with Git-Submodules
In simple terms, a submodule is a Git repository inside a superproject’s Git repository. It has its own .git entry (a folder, or in newer Git versions a file pointing into the superproject’s .git/modules), which holds everything needed to keep it under version control: its commits, remote repository address, and so on. It is like an attached repository inside your main repository, whose code can be reused as a “module”.
Let’s look at a practical use case of submodules.
We have a client, let’s call it “Armstrong”, who needs a few of our OSM Ansible roles to provision their infrastructure. Let’s have a look at their Git repository below.
With the above command, we are adding a submodule named osm_java whose URL is email@example.com:oosm/osm_java.git and whose branch is armstrong. The branch is named armstrong because, to keep each client’s configuration isolated, we created individual branches in OSM’s repositories named after each client.
Now if we take a look at our superproject provisioner, we can see a file named .gitmodules, which holds the information about the submodules.
Here you can clearly see that a submodule osm_java has been attached to the superproject provisioner.
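Since .gitmodules is an INI-style file, its entries can be read with a few lines of Python. The sketch below is only an illustration: the sample text is reconstructed from the name, URL and branch mentioned above, and the `path` value is an assumption (Git usually defaults it to the submodule name).

```python
import configparser

# Reconstructed sample; the path value is assumed, not taken from the post.
SAMPLE_GITMODULES = (
    '[submodule "osm_java"]\n'
    "\tpath = osm_java\n"
    "\turl = email@example.com:oosm/osm_java.git\n"
    "\tbranch = armstrong\n"
)


def read_gitmodules(text: str) -> dict:
    """Return {submodule name: {path, url, branch, ...}} from .gitmodules text."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    # Strip the leading tabs so configparser does not read them as continuations.
    parser.read_string("\n".join(line.strip() for line in text.splitlines()))
    prefix = 'submodule "'
    modules = {}
    for section in parser.sections():
        if section.startswith(prefix) and section.endswith('"'):
            modules[section[len(prefix):-1]] = dict(parser[section])
    return modules


if __name__ == "__main__":
    for name, settings in read_gitmodules(SAMPLE_GITMODULES).items():
        print(name, settings["url"], settings.get("branch"))
```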
What if there was no submodule?
In that case, we would need to clone the repository from OSM, paste it into provisioner, then add and commit it to provisioner. Phew... that would also have worked.
But what if an update is made to osm_java that has to be used in provisioner? We could not easily sync with OSM. We would need to delete osm_java, then clone, copy, and paste it into provisioner again, which sounds clumsy and is not the best way to automate the process.
With osm_java as a submodule, we can easily update this dependency without messing things up.
With the above update command, we have successfully updated the submodule, which pulled the changes from the armstrong branch of OSM’s origin.
What have we learned?
What is Gitolite?
where vagrant is the user of my virtual machine & its IP is 192.168.0.20
Now we will install gitolite and create a gitolite user on the remote machine that will host it.
2 nitin nitin 4096 Jan 10 17:52 conf/
2 nitin nitin 4096 Jan 9 13:43 keydir/
# Group name & members
@admin = nitin
Here ‘@’ denotes a user group, i.e. @staff is a group and jatin and james are its users. These names must match the key names stored in the keydir directory.
For example, the user “jatin” must have a public key named “jatin.pub”.
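The naming rule above can be checked mechanically. The Python sketch below reads group definitions from gitolite.conf text and lists the key files that should exist in keydir. The `@staff` line in the sample is inferred from the description above, and both helpers are hypothetical; only simple `@group = members` lines are handled.

```python
def parse_groups(conf_text: str) -> dict:
    """Collect '@group = user1 user2' lines; repeated definitions accumulate."""
    groups = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if line.startswith("@") and "=" in line:
            name, _, members = line.partition("=")
            groups.setdefault(name.strip(), []).extend(members.split())
    return groups


def expected_key_files(groups: dict) -> list:
    """Every user named in a group needs a matching <user>.pub in keydir."""
    users = {member for members in groups.values()
             for member in members if not member.startswith("@")}
    return sorted(f"{user}.pub" for user in users)


# The @staff line is an assumption based on the text above.
SAMPLE_CONF = """\
# Group name & members
@admin = nitin
@staff = jatin james
"""

if __name__ == "__main__":
    groups = parse_groups(SAMPLE_CONF)
    print(groups)
    print(expected_key_files(groups))
```

Comparing this list against the actual contents of keydir is a quick way to spot a user whose key file is missing or misnamed.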
Let’s have a quick test of our setup
4 (delta 0), reused 0 (delta 0)
master -> master
- Maven-release plugin creates .backup and release.properties files in your working directory, which can be committed by mistake when they should not be. jgit-flow maven plugin doesn’t create these or any other files in your working directory.
- Maven-release plugin creates two tags.
- Maven-release plugin does a build in the prepare goal and another in the perform goal, causing tests to run twice, whereas jgit-flow maven plugin builds the project once, so tests run only once.
- If something goes wrong during Maven-release plugin execution, it becomes very tough to roll back. On the other hand, jgit-flow maven plugin makes all its changes in a branch, and if you want to roll back you just delete that branch.
- jgit-flow maven plugin doesn’t run site-deploy
- jgit-flow maven plugin provides option to turn on/off maven deployment
- jgit-flow maven plugin provides option to turn on/off remote pushes/tagging
- jgit-flow maven plugin keeps the master branch always at latest release version.
How to use Jgit-flow maven Plugin for Release
- Add the following lines in your pom.xml for source code management access
- Add these lines to resolve the jgit-flow maven plugin, along with the other options required during the build
com.atlassian.maven.plugins maven-jgitflow-plugin 1.0-m4.3 true false true true true true true true master-test deploy-test
The above code snippet performs the following steps:
- Maven resolves the jgitflow plug-in dependency.
- In the configuration section, we describe how the jgit-flow plug-in will behave.
- pushRelease XML tag enables or disables pushing the intermediate release branches to the remote Git repository.
- keepBranch XML tag enables or disables keeping the intermediate branch.
- noTag XML tag enables or disables creation of the release tag in Git.
- allowUntracked XML tag controls whether untracked files are allowed during the check.
- flowInitContext XML tag is used to override the plug-in’s default branch names.
- In the above code snippet there are only two branches: master, from which the code is pulled, and an intermediate branch used by the jgit-flow plug-in. As discussed, the jgit-flow plug-in uses branches to keep its records, so the plug-in creates a development branch that lives locally, not remotely, to track the release version and so on.
- To put your releases into the repository manager add these lines
<distributionManagement>
  <repository>
    <id><auth id></id>
    <url><repo url of repository managers></url>
  </repository>
  <snapshotRepository>
    <id><auth id></id>
    <url><repo url of repository managers></url>
  </snapshotRepository>
</distributionManagement>
- Put the following lines into your .m2/settings.xml with your repository manager credentials
<settings>
  <servers>
    <server>
      <id><PUT THE ID OF THE REPOSITORY OR SNAPSHOTS ID HERE></id>
      <username><USERNAME></username>
      <password><PASSWORD></password>
    </server>
  </servers>
</settings>
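Maven picks the credentials for a deployment by matching the repository id in distributionManagement against a server id in settings.xml, so a mismatch between the two is a common cause of failed deploys. The Python sketch below compares the two sets of ids. The ids and URLs in the samples are invented, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def _local(tag: str) -> str:
    """Drop an XML namespace prefix such as '{http://maven.apache.org/POM/4.0.0}'."""
    return tag.rsplit("}", 1)[-1]


def _child_ids(root: ET.Element, parent_tag: str, depth: int) -> set:
    """Collect <id> values found 'depth' levels below every <parent_tag> element."""
    ids = set()
    for element in root.iter():
        if _local(element.tag) != parent_tag:
            continue
        level = [element]
        for _ in range(depth - 1):
            level = [child for node in level for child in node]
        for node in level:
            for child in node:
                if _local(child.tag) == "id" and child.text:
                    ids.add(child.text.strip())
    return ids


def missing_credentials(pom_xml: str, settings_xml: str) -> list:
    """Repository ids declared in the pom that have no <server> entry in settings."""
    repo_ids = _child_ids(ET.fromstring(pom_xml), "distributionManagement", 2)
    server_ids = _child_ids(ET.fromstring(settings_xml), "server", 1)
    return sorted(repo_ids - server_ids)


# Hypothetical ids and URLs, for illustration only.
SAMPLE_POM = """<project><distributionManagement>
  <repository><id>example-releases</id><url>http://repo.example.com/releases</url></repository>
  <snapshotRepository><id>example-snapshots</id><url>http://repo.example.com/snapshots</url></snapshotRepository>
</distributionManagement></project>"""

SAMPLE_SETTINGS = """<settings><servers>
  <server><id>example-releases</id><username>user</username><password>secret</password></server>
</servers></settings>"""

if __name__ == "__main__":
    print(missing_credentials(SAMPLE_POM, SAMPLE_SETTINGS))
```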
Start Release jgit-flow maven plugin command
Finish Release jgit-flow maven plugin command
As an example, I have created a repository on github.com for testing, with two branches, master-test and deploy-test. It is assumed that you have configured Maven and Git on your system.
This command asks you for the release version and creates a release branch named release/. It then pushes this release branch to the GitHub repository temporarily, because we are not keeping the intermediate branches.
Finally, run this command:
$ mvn -Dmaven.test.skip=true jgitflow:release-finish
After this command finishes, it deletes release/ from both local and remote.
Now you can check the changes made to the pom file by jgitflow. In the above snapshot, which shows the master-test branch, you can see that the version tag no longer carries the snapshot suffix and that the version has been increased. It holds the current version of the application.
And the deploy-test branch shows the new version on which developers are working.
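The version changes described above can be pictured with a small Python sketch. This is only an assumption about the usual Maven convention (drop `-SNAPSHOT` for the release, then bump the last numeric component for the next development version); it is not the plugin's actual code, and the function names are made up.

```python
SNAPSHOT_SUFFIX = "-SNAPSHOT"


def release_version(development_version: str) -> str:
    """'1.0-SNAPSHOT' becomes '1.0'; a version without the suffix is left as-is."""
    if development_version.endswith(SNAPSHOT_SUFFIX):
        return development_version[: -len(SNAPSHOT_SUFFIX)]
    return development_version


def next_development_version(released: str) -> str:
    """Increment the last numeric component and re-attach the snapshot suffix.

    Assumes a purely numeric dotted version such as '1.0' or '2.3.4'.
    """
    parts = released.split(".")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts) + SNAPSHOT_SUFFIX


if __name__ == "__main__":
    released = release_version("1.0-SNAPSHOT")
    print("release branch of record:", released)
    print("development branch:", next_development_version(released))
```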
What we intend to do
What all we will be doing to achieve it
- Finalize the SCM tool we are going to use: Puppet, Chef or Ansible.
- Automated setup of Jenkins using SCM tool.
- Automated setup of Nexus/Artifactory/Archiva using SCM tool.
- Automated setup of Sonar using SCM tool.
- Dev Environment setup using SCM tool: Since this is a web app project, our Dev environment will have Nginx & Tomcat.
- QA Environment setup using SCM tool: Since this is a web app project, our QA environment will have Nginx & Tomcat.
- Creation of various build jobs
- Code Stability Job.
- Code Quality Job.
- Code Coverage Job.
- Functional Test Job on dev environment.
- Creation of release Job.
- Creation of deployment job to do deployment on Dev & QA environment.
The reason behind this issue is that when you use Git over the SSH protocol, it tries to use your private key to perform Git operations, and it expects to find that key in the .ssh folder in the user’s home directory. To fix this, create a HOME environment variable pointing to the home directory where your .ssh folder exists, then restart Jenkins; it should now work fine.
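A quick way to confirm the diagnosis is to check, from the same environment Jenkins runs in, whether HOME is set and whether it contains a .ssh folder. The Python sketch below is a hypothetical helper for that check; it does not inspect Jenkins itself.

```python
import os
from typing import Mapping


def diagnose_ssh_home(env: Mapping[str, str]) -> str:
    """Report whether HOME is set and whether <HOME>/.ssh exists."""
    home = env.get("HOME")
    if not home:
        return "HOME is not set"
    ssh_dir = os.path.join(home, ".ssh")
    if not os.path.isdir(ssh_dir):
        return f"{ssh_dir} does not exist"
    return f"{ssh_dir} found"


if __name__ == "__main__":
    # Run this as the user that owns the Jenkins process.
    print(diagnose_ssh_home(os.environ))
```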