We often face complications after a certain point when we can not change the foundation layer of our code because we haven’t thought it through and didn’t plan or strategize the way of writing code in the beginning, there are certain points which should be taken under consideration similarly there are some common mistakes which we should avoid.
For those who have surfed straight to this blog, please check out the previous part of this series Unix File Tree Part-1 and those who have stayed tuned for this part, welcome back.In the previous part, we discussed the philosophy and the need for file tree. In this part, we will dive deep into the significance of each directory.
Dayum!! that’s a lot of stuff to gulp at once, we’ll kick out things one after the other.
Let’s talk about the crucial directories which play a major role.
- /bin: When we started crawling on Linux this helped us to get on our feet yes, you read it right whether you want to copy any file, move it somewhere, create a directory, find out date, size of a file, all sorts of basic operations without which the OS won’t even listen to you (Linux yawning meanwhile) happens because of the executables present in this directory. Most of the programs in /bin are in binary format, having been created by a C compiler, but some are shell scripts in modern systems.
- /etc: When you want things to behave the way you want, you go to /etc and put all your desired configuration there (Imagine if your girlfriend has an /etc life would have been easier). whether it is about various services or daemons running on your OS it will make sure things are working the way you want them to.
- /var: He is the guy who has kept an eye over everything since the time you have booted the system (consider him like Heimdall from Thor). It contains files to which the system writes data during the course of its operation. Among the various sub-directories within /var are /var/cache (contains cached data from application programs), /var/games(contains variable data relating to games in /usr), /var/lib (contains dynamic data libraries and files), /var/lock (contains lock files created by programs to indicate that they are using a particular file or device), /var/log (contains log files), /var/run (contains PIDs and other system information that is valid until the system is booted again) and /var/spool (contains mail, news and printer queues).
- /proc: You can think of /proc just like thoughts in your brain which are illusions and virtual. Being an illusionary file system it does not exist on disk instead, the kernel creates it in memory. It is used to provide information about the system (originally about processes, hence the name). If you navigate to /proc The first thing that you will notice is that there are some familiar-sounding files, and then a whole bunch of numbered directories. The numbered directories represent processes, better known as PIDs, and within them, a command that occupies them. The files contain system information such as memory (meminfo), CPU information (cpuinfo), and available filesystems.
- /opt: It is like a guest room in your house where the guest stayed for prolong period and became part of your home. This directory is reserved for all the software and add-on packages that are not part of the default installation.
- /usr: In the original Unix implementations, /usr was where the home directories of the users were placed (that is to say, /usr/someone was then the directory now known as /home/someone). In current Unixes, /usr is where user-land programs and data (as opposed to ‘system land’ programs and data) are. The name hasn’t changed, but its meaning has narrowed and lengthened from “everything user related” to “user usable programs and data”. As such, some people may now refer to this directory as meaning ‘User System Resources’ and not ‘user’ as was originally intended.
Potato or Potaaato what is the difference?
We’ll be discussing those directories which confuse us always, which have almost a similar purpose but still are in separate locations and when asked about them we go like ummmm…….
/bin vs /usr/bin vs /sbin vs /usr/local/bin
This might get almost clear out when I explained the significance of /usr in the above paragraph. Since Unix designers planned /usr to be the local directories of individual users so it contained all of the sub-directories like /usr/bin, /usr/sbin, /usr/local/bin. But the question remains the same how the content is different?
- /usr/bin is a standard directory on Unix-like operating systems that contains most of the executable files that are not needed for booting or repairing the system.
- A few of the most commonly used are awk, clear, diff, du, env, file, find, free, gzip, less, locate, man, sudo, tail, telnet, time, top, vim, wc, which, and zip.
- The /usr/sbin directory contains non-vital system utilities that are used after booting.
- This is in contrast to the /sbin directory, whose contents include vital system utilities that are necessary before the /usr directory has been mounted (i.e., attached logically to the main filesystem).
- A few of the more familiar programs in /usr/sbin are adduser, chroot, groupadd, and userdel.
- It also contains some daemons, which are programs that run silently in the background, rather than under the direct control of a user, waiting until they are activated by a particular event or condition such as crond and sshd.
I hope I have covered most of the directories which you might come across frequently and your questions must have been answered.
Now that we know about the significance of each UNIX directory, It’s time to use them wisely the way they are supposed to be.
Please feel free to reach me out for any suggestions.
Goodbye till next time!
Nature has its own way to reach out for perfection and the same should be our instinct to make our creations perfect.
Consider an ideal software or package which requires multiple files to function properly.
- Binary files
- Configuration files
- Log files
- Data files
- Metadata files during execution
For now, let’s consider there is just one directory and I am keeping all of the dependent files in that directory.
$ ls package-1.binary package-1.conf package-1.data package-1.lib package-1.log package-1.tmp
Another software comes in the picture which has its own dependent files.
$ ls package-1.binary package-1.data package-1.log package-2.binary package-2.data package-2.log package-1.conf package-1.lib package-1.tmp package-2.conf package-2.lib package-2.tmp
Things will get messy while dealing with various software since handling them won’t be easy and will lead to a chaotic situation.
- Binary files –> /dir-1
- Configuration files –> /dir-2
- Log files –> /dir-3
- Data files –> /dir-4
- Meta files –> /dir-5
- Libraries –> /dir-6
As the work gets overloaded I need more admins to support they won’t be able to relate with the naming convention as I did.
To escape this situation the creator of Unix decided to follow a philosophy “Convention over Configuration”.
As the name suggest giving priority to defined convention over individual’s configuration. So that everyone should be on the same page and keeping that in mind everyone else will follow.
And the simulation of the philosophy was like this
- Binary files –> /bin
- Configuration files –> /etc
- Log files –> /log
- Data files –> /var
- Meta files –> /tmp
- Libraries –> /lib
Which resulted in the Unix File Tree
$ tree -d -L 1 . ├── apps ├── bin ├── boot ├── dev ├── etc ├── home ├── lib ├── lib64 ├── lost+found ├── media ├── mnt ├── opt ├── proc ├── root ├── run ├── sbin ├── snap ├── srv ├── sys ├── tmp ├── usr └── var 22 directories
You might be thinking that how will Unix figure out where is the configuration file, where is the binary and rest of the stuff of the software.
Here comes the role of the PATH variable
$ echo $PATH /home/dennis/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
So now we have a proper understanding of why do we need a File Tree.
For diving deep into the significance of each one of the directory stay tuned for Unix File Tree Part-2.
Rocket Science has always fascinated me, but one thing which totally blows my mind is the concept of modules aka. modular rockets. The literal definition of modules states “A modular rocket is a type of multistage rocket which features components that can be interchanged for specific mission requirements.” In simple terms, you can say that the Super Rocket depends upon those Submodules to get the things done.
Similarly is the case in the Software world, where super projects have multiple dependencies on other objects. And if we talk about managing projects Git can’t be ignored, Moreover Git has a concept of Submodules which is slightly inspired by the amazing rocket science of modules.
Hour of Need
Being a DevOps Specialist we need to do provisioning of the Infrastructure of our clients which is sometimes common for most of the clients. We decided to Automate it, which a DevOps is habitual of. Hence, Opstree Solutions initiated an Internal project named OSM. In which we create Ansible Roles of different opensource software with the contribution of each member of our organization. So that those roles can be used in the provisioning of the client’s infrastructure.
This makes the client projects dependent on our OSM. Which creates a problem statement to manage all dependencies which might get updated over the period. And to do that there is a lot of copy paste, deleting the repository and cloning them again to get the updated version, which is itself a hair-pulling task and obviously not the best practice.
Here comes the git-submodule as a modular rocket to take our Super Rocket to its destination.
Let’s Liftoff with Git-Submodules
In simple terms, a submodule is a git repository inside a Superproject’s git repository, which has its own .git folder which contains all the information that is necessary for your project in version control and all the information about commits, remote repository address etc. It is like an attached repository inside your main repository, which can be used to reuse a code inside it as a “module“.
Let’s get a practical use case of submodules.
We have a client let’s call it “Armstrong” who needs few of our ansible roles of OSM for their provisioning of Infrastructure. Let’s have a look at their git repository below.
With the above command, we are adding a submodule named osm_java whose URL is firstname.lastname@example.org:oosm/osm_java.git and branch is armstrong. The name of the branch is coined armstrong because to keep the configuration of each of our client’s requirement isolated, we created individual branches of OSM’s repositories on the basis of client name.
Now if take a look at our superproject provisioner we can see a file named .gitmodules which has the information regarding the submodules.
Here you can clearly see that a submodule osm_java has been attached to the superproject provisioner.
What if there was no submodule?
If that was a case, then we need to clone the repository from osm and paste it to the provisioner then add & commit it to the provisioner phew….. that would also have worked.
But what if there is some update has been made in the osm_java which have to be used in provisioner, we can not easily sync with the OSM. We would need to delete osm_java, again clone, copy, and paste in the provisioner which sounds clumsy and not a best way to automate the process.
Being a osm_java as a submodule we can easily update that this dependency without messing up the things.
By using the above update command we have successfully updated the submodule which actually pulled the changes from OSM’s origin armstrong branch.
What have we learned?
Recently I was asked to set up a CI- Pipeline for a Spring based application.
I said “piece of cake”, as I have already worked on jenkins pipeline, and knew about maven so that won’t be a problem. But there was a hitch, “pipeline of Gitlab CI“. I said “no problem, I’ll learn about it” with a Ninja Spirit.
So for starters what is gitlab-ci pipeline. For those who have already work on Jenkins and maven, they know about the CI workflow of Building a code , testing the code, packaging, and deploy it using maven. You can add other goals too, depending upon the requirement.
The CI process in GitLab CI is defined within a file in the code repository itself using a YAML configuration syntax.
The work is then dispatched to machines called runners, which are easy to set up and can be provisioned on many different operating systems. When configuring runners, you can choose between different executors like Docker, shell, VirtualBox, or Kubernetes to determine how the tasks are carried out.
What we are going to do?
We will be establishing a CI/CD pipeline using gitlab-ci and deploying artifacts to NEXUS Repository.
- Gitlab server, I’m using gitlab to host my code.
- Runner server, It could be vagrant or an ec2 instance.
- Nexus Server, It could be vagrant or an ec2 instance.
Before going further, let’s get aware of few terminologies.
- Artifacts: Objects created by a build process, Usually project jars, library jar. These can include use cases, class diagrams, requirements, and design documents.
- Maven Repository(NEXUS): A repository is a directory where all the project jars, library jar, plugins or any other project specific artifacts are stored and can be used by Maven easily, here we are going to use NEXUS as a central Repository.
- CI: A software development practice in which you build and test software every time a developer pushes code to the application, and it happens several times a day.
- Gitlab-runner: GitLab Runner is the open source project that is used to run your jobs and send the results back to GitLab. It is used in conjunction with GitLab CI, the open-source continuous integration service included with GitLab that coordinates the jobs.
- .gitlab-ci.yml: The YAML file defines a set of jobs with constraints stating when they should be run. You can specify an unlimited number of jobs which are defined as top-level elements with an arbitrary name and always have to contain at least the script clause. Whenever you push a commit, a pipeline will be triggered with respect to that commit.
Strategy to Setup Pipeline
Step-1: Setting up GitLab Repository.
# Now lets start pushing this code to gitlab
Step-2: Install GitLab Runner manually on GNU/Linux
# Give it permissions to execute:
# Optionally, if you want to use Docker, install Docker with:
# Create a GitLab CI user:
# Install and run as service:
Step-3: Registering a Runner
# Run the following command:
# Please enter the gitlab-ci coordinator URL (e.g. https://gitlab.com/):
# Please enter the gitlab-ci token for this runner:
# Please enter the gitlab-ci description for this runner:
# Please enter the gitlab-ci tags for this runner (comma separated):
# Please enter the executor: docker, docker-ssh, shell, ssh, virtualbox, docker+machine, parallels, docker-ssh+machine, kubernetes:
# Please enter the default Docker image (e.g. ruby:2.1):
# You can also create systemd service in /etc/systemd/system/gitlab-runner.service.
[Unit] Description=GitLab Runner After=syslog.target network.target ConditionFileIsExecutable=/usr/local/bin/gitlab-runner [Service] StartLimitInterval=5 StartLimitBurst=10 ExecStart=/usr/local/bin/gitlab-runner "run" "--working-directory" "/home/gitlab-runner" "--config" "/etc/gitlab-runner/config.toml" "--service" "gitlab-runner" "--syslog" "--user" "gitlab-runner" Restart=always RestartSec=120 [Install] WantedBy=multi-user.target
Step-4: Setting up Nexus Repository
You can setup a repository installing the open source version of Nexus you need to visit Nexus OSS and download the TGZ version or the ZIP version.
But to keep it simple, I used docker container for that.
# Install docker
# Launch a NEXUS container and bind the port
You can access your nexus now on http://:8081/nexus.
And login as admin with password admin123.
Step-5: Configure the NEXUS deployment
Clone your code and enter the repository
# Create a folder called
.m2 in the root of your repository
# Create a file called
settings.xml in the
# Copy the following content in settings.xml
Username and password will be replaced by the correct values using variables.
# Updating Repository path in pom.xml
Step-6: Configure GitLab CI/CD for simple maven deployment.
.gitlab-ci.yml, to read the definitions for jobs that will be executed by the configured GitLab Runners.
- NEXUS_REPO_URL: http://:8081/nexus/content/repositories/
- NEXUS_REPO_USER: admin
- NEXUS_REPO_PASS: admin123
Now it’s time to define jobs in
.gitlab-ci.yml and push it to the repo:
Now add the changes, commit them and push them to the remote repository on gitlab. A pipeline will be triggered with respect to your commit. And if everything goes well our mission will be accomplished.
Note: You might get some issues with maven plugins, which will need to managed in pom.xml, depending upon the environment.
In this blog, we covered the basic steps to use a Nexus Maven repository to automatically publish and consume artifacts.