Unix File Tree Part-2

For those who have surfed straight to this blog, please check out the previous part of this series Unix File Tree Part-1 and those who have stayed tuned for this part, welcome back.In the previous part, we discussed the philosophy and the need for file tree. In this part, we will dive deep into the significance of each directory.

Image result for horizontal file tree linux

Dayum!! that’s a lot of stuff to gulp at once, we’ll kick out things one after the other.

Major directories

Let’s talk about the crucial directories which play a major role.

  • /bin: When we started crawling on Linux this helped us to get on our feet yes, you read it right whether you want to copy any file, move it somewhere, create a directory, find out date, size of a file, all sorts of basic operations without which the OS won’t even listen to you (Linux yawning meanwhile) happens because of the executables present in this directory. Most of the programs in /bin are in binary format, having been created by a C compiler, but some are shell scripts in modern systems.
  • /etc: When you want things to behave the way you want, you go to /etc and put all your desired configuration there (Imagine if your girlfriend has an /etc life would have been easier). whether it is about various services or daemons running on your OS it will make sure things are working the way you want them to.
  • /var: He is the guy who has kept an eye over everything since the time you have booted the system (consider him like Heimdall from Thor). It contains files to which the system writes data during the course of its operation. Among the various sub-directories within /var are /var/cache (contains cached data from application programs), /var/games(contains variable data relating to games in /usr), /var/lib (contains dynamic data libraries and files), /var/lock (contains lock files created by programs to indicate that they are using a particular file or device), /var/log (contains log files), /var/run (contains PIDs and other system information that is valid until the system is booted again) and /var/spool (contains mail, news and printer queues).
  • /proc: You can think of /proc just like thoughts in your brain which are illusions and virtual. Being an illusionary file system it does not exist on disk instead, the kernel creates it in memory. It is used to provide information about the system (originally about processes, hence the name). If you navigate to /proc The first thing that you will notice is that there are some familiar-sounding files, and then a whole bunch of numbered directories. The numbered directories represent processes, better known as PIDs, and within them, a command that occupies them. The files contain system information such as memory (meminfo), CPU information (cpuinfo), and available filesystems.
  • /opt: It is like a guest room in your house where the guest stayed for prolong period and became part of your home. This directory is reserved for all the software and add-on packages that are not part of the default installation.
  • /usr: In the original Unix implementations, /usr was where the home directories of the users were placed (that is to say, /usr/someone was then the directory now known as /home/someone). In current Unixes, /usr is where user-land programs and data (as opposed to ‘system land’ programs and data) are. The name hasn’t changed, but its meaning has narrowed and lengthened from “everything user related” to “user usable programs and data”. As such, some people may now refer to this directory as meaning ‘User System Resources’ and not ‘user’ as was originally intended.

Potato or Potaaato what is the difference? 

We’ll be discussing those directories which confuse us always, which have almost a similar purpose but still are in separate locations and when asked about them we go like ummmm…….

/bin vs /usr/bin vs /sbin vs /usr/local/bin

This might get almost clear out when I explained the significance of /usr in the above paragraph. Since Unix designers planned /usr to be the local directories of individual users so it contained all of the sub-directories like /usr/bin, /usr/sbin, /usr/local/bin. But the question remains the same how the content is different?

/usr/bin:

  • /usr/bin is a standard directory on Unix-like operating systems that contains most of the executable files that are not needed for booting or repairing the system. 
  • A few of the most commonly used are awk, clear, diff, du, env, file, find, free, gzip, less, locate, man, sudo, tail, telnet, time, top, vim, wc, which, and zip.

/usr/sbin:

  • The /usr/sbin directory contains non-vital system utilities that are used after booting.
  • This is in contrast to the /sbin directory, whose contents include vital system utilities that are necessary before the /usr directory has been mounted (i.e., attached logically to the main filesystem). 
  • A few of the more familiar programs in /usr/sbin are adduser, chroot, groupadd, and userdel. 
  • It also contains some daemons, which are programs that run silently in the background, rather than under the direct control of a user, waiting until they are activated by a particular event or condition such as crond and sshd.

I hope I have covered most of the directories which you might come across frequently and your questions must have been answered.
Now that we know about the significance of each UNIX directory, It’s time to use them wisely the way they are supposed to be.
Please feel free to reach me out for any suggestions.
Goodbye till next time!

References: https://www.tldp.org/LDP/Linux-Filesystem-Hierarchy/html/usr.htmlhttps://askubuntu.com/questions/130186/what-is-the-rationale-for-the-usr-directoryhttps://askubuntu.com/questions/308045/differences-between-bin-sbin-usr-bin-usr-sbin-usr-local-bin-usr-localhttp://index-of.es/Varios-2/How%20Linux%20Works%20What%20Every%20Superuser%20Should%20Know.pdf
https://imgflip.com/memegenerator

Jenkins Pipeline Global Shared Libraries

Although, the coding language used here is groovy but Jenkins does not allow us to use Groovy to its fullest,  so we can say that Jenkins Pipelines are not exactly groovy. Classes that you may write in src, they are processed in a “special Jenkins way” and you have no control over this. Depending on the various scenarios objects in Groovy don’t behave as you would expect objects to work.

Our thought is putting all pipeline functions in vars is much more practical approach, while there is no other good way to do inheritance, we wanted to use Jenkins Pipelines the right way but it has turned out to be far more practical to use vars for global functions.

Practical Strategy
As we know Jenkins Pipeline’s shared library support allows us to define and develop a set of shared pipeline helpers in this repository and provides a straightforward way of using those functions in a Jenkinsfile.This simple example will just illustrate how you can provide input to a pipeline with a simple YAML file so you can centralize all of your pipelines into one library. The Jenkins shared library example:And the example app that uses it:

Directory Structure

You would have the following folder structure in a git repo:

└── vars
    ├── opstreePipeline.groovy
    ├── opstreeStatefulPipeline.groovy
    ├── opstreeStubsPipeline.groovy
    └── pipelineConfig.groovy

Setting up Library in Jenkins Console.

This repo would be configured in under Manage Jenkins > Configure System in the Global Pipeline Libraries section. In that section Jenkins requires you give this library a Name. Example opstree-library

Pipeline.yaml

Let’s assume that project repository would have a pipeline.yaml file in the project root that would provide input to the pipeline:Pipeline.yaml

ENVIRONMENT_NAME: test
SERVICE_NAME: opstree-service
DB_PORT: 3079
REDIS_PORT: 6079

Jenkinsfile

Then, to utilize the shared pipeline library, the Jenkinsfile in the root of the project repo would look like:

@Library ('opstree-library@master') _
opstreePipeline()

PipelineConfig.groovy

So how does it all work? First, the following function is called to get all of the configuration data from the pipeline.yaml file:

def call() {
  Map pipelineConfig = readYaml(file: "${WORKSPACE}/pipeline.yaml")
  return pipelineConfig
}

opstreePipeline.groovy

You can see the call to this function in opstreePipeline(), which is called by the Jenkinsfile.

def call() {
    node('Slave1') {

        stage('Checkout') {
            checkout scm
        }

         def p = pipelineConfig()

        stage('Prerequistes'){
            serviceName = sh (
                    script: "echo ${p.SERVICE_NAME}|cut -d '-' -f 1",
                    returnStdout: true
                ).trim()
        }

        stage('Build & Test') {
                sh "mvn --version"
                sh "mvn -Ddb_port=${p.DB_PORT} -Dredis_port=${p.REDIS_PORT} clean install"
        }

        stage ('Push Docker Image') {
            docker.withRegistry('https://registry-opstree.com', 'dockerhub') {
                sh "docker build -t opstree/${p.SERVICE_NAME}:${BUILD_NUMBER} ."
                sh "docker push opstree/${p.SERVICE_NAME}:${BUILD_NUMBER}"
            }
        }

        stage ('Deploy') {
            echo "We are going to deploy ${p.SERVICE_NAME}"
            sh "kubectl set image deployment/${p.SERVICE_NAME} ${p.SERVICE_NAME}=opstree/${p.SERVICE_NAME}:${BUILD_NUMBER} "
            sh "kubectl rollout status deployment/${p.SERVICE_NAME} -n ${p.ENVIRONMENT_NAME} "

    }
}

You can see the logic easily here. The pipeline is checking if the developer wants to deploy on which environment what db_port needs to be there.

Benefits

The benefits of this approach are many, some of them are as mentioned below:

  • How to write groovy code is now none of the developer’s perspective.
  • Structure of the Pipeline.yaml is really flexible, where entire data structures can be passed as input to the pipeline.
  • Code redundancy saved to a large extent.

 Jenkinsfiles could actually just look more commonly, like this:

@Library ('opstree-library@master') _
opstreePipeline()

and opstreePipeline() would just read the the project type from pipeline.yaml and dynamically run the exact function, like opstreeStatefulPipeline(), opstreeStubsPipeline.groovy() . since pipeline are not exactly groovy, this isn’t possible. So one of the drawback is that each project would have to have a different-looking Jenkinsfile. The solution is in progress!So, what do you think?

Reference links: 
Image: Google image search (jenkins.io)

Docker-Compose As A Bundled Application

When docker was released as a new containerization tool, it took the market by a storm. With its lightweight images, multi-os support, and ability to ship containers, it’s popularity only roared. I have been using it for more than six months now, I can see why it is so. Hypervisors, another type of virtualizing tools,  have been hard on hardware. Which means they require a lot of resources to run. This increases the cost of running applications way more than those running on containers. This is the problem docker solved and hence, it’s popularity. Docker engine just sits on host OS and translates the instructions from an application to the underlying OS. It does not need one extra layer of virtual OS, just the binaries and libraries of application bundled in the image. Right? Now, hold on to that thought. We all have been working with docker and an extension with docker-compose. Why? Because it makes our job easy, We are spared from typing hundreds of ad-hoc commands in terminal to set up a slightly or very complicated application with certain dependencies. We can just describe it in a `docker-compose.yml` file and our job is done. However, the problem arises when we have to share that compose file:

  • Other users might need to use the file in a different environment, so they will need to edit all the values pertaining to their need, manually, and keep separate compose files for each environment.
  • Troubleshooting various configuration issues can be a tedious task since there is no single place where the configuration of the application can be stored. Changes will have to be made in the file.
  • This also makes communication between Dev and Ops team more tricky than it has to be resulting in communication gap and time wastage.

To have a more clear picture of the issue, we can have look at the below image:

We have compose file and configuration for separate environments, we make changes according to environment needs in different compose files, which could be a long manual task depending on the size of our project.


All of this points to the fact that there is no way to bundle the applications that use efficiently-bundled docker images. See the irony here? Well, there “was” no way, until there was. Enter ‘docker-app’. This, relatively, new tool is the answer to packaging docker-compose applications. I came across it when I was, myself, struggling to re-use a docker-compose application I had written in another environment. As soon as I read about it, I had to try it, which I did and loved. It made the task much easier as it provided a template of compose file and a key-value store for environment dependent parameters.


Now, we have an artefact with extention of ‘.dockerapp’. We can pass configuration values either through CLI or files or both and it will render compose file according to those values.

Let us now go through an example of how the docker app works. I am going to deploy a dummy application Spring3hibernate from Opstree Github repository in QA env and later in PROD by making simple configuration changes.
Installing docker-app is easy, though, there is one thing one should keep in mind: it can be installed as a plugin in docker-CLI or as standalone CLI tool itself. I will be installing it as a standalone CLI tool on linux. If you wish to install it as a plugin to docker-CLI and/or on another OS, visit their Github page: https://github.com/docker/app (Also, please visit github page for basics)
Before continuing, please ensure you have docker-CLI and docker-compose installed.
Please follow below steps to install docker-app:

$ export OSTYPE="$(uname | tr A-Z a-z)"
$ curl -fsSL --output "/tmp/docker-app-${OSTYPE}.tar.gz" \
"https://github.com/docker/app/releases/download/v0.8.0/docker-app-${OSTYPE}.tar.gz"
$ tar xf "/tmp/docker-app-${OSTYPE}.tar.gz" -C /tmp/
$ install -b "/tmp/docker-app-standalone-${OSTYPE}" /usr/local/bin/docker-app

Create a new directory in your home, we’ll call it app home:

$ cd ~
$ mkdir spring3hibernate-app
$ cd spring3hibernate-app/

Now, clone the app from Opstree Github repository. This app needs only mysql as a dependency.

$ git clone https://github.com/opstree/spring3hibernate.git

We need to update database properties file and nginx config file with below contents respectively:

$ vim ~/spring3hibernate-app/spring3hibernate/src/main/resources/database.properties

Replace below content over there:

database.driver=com.mysql.jdbc.Driver
database.url=jdbc:mysql://mysql:3306/employeedb
database.user=admin
database.password=password
hibernate.dialect=org.hibernate.dialect.MySQLDialect
hibernate.show_sql=true
hibernate.hbm2ddl.auto=update
upload.dir=c:/uploads

For nginx conf file:

$ vim ~/spring3hibernate-app/spring3hibernate/nginx/default.conf
server {
    listen       80;
    server_name  localhost;

    location / {
        stub_status on;
        proxy_pass http://springapp1:8080/;

    }
# redirect server error pages to the static page /50x.html
    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }

}

Move ‘default.conf’ to ~/spring3hibernate-app/spring3hibernate/nginx/conf/qa/ as we have different conf file for PROD which goes to ~/spring3hibernate-app/spring3hibernate/nginx/conf/prod/

upstream s3hbackend {
    server springapp1:8080;
    server springapp2:8080;
}
server {
       listen 80;
       location / {
           stub_status on;
           proxy_pass http://s3hbackend;
       }
  
       # redirect server error pages to the static page /50x.html
       error_page   500 502 503 504  /50x.html;
       location = /50x.html {
           root   /usr/share/nginx/html;
       }

}

This is the configuration for the nginx load balancer. Remember this, we’ll use it later. Let’s create our docker-app now, make sure you are in the app home directory
when executing this command:

$ docker-app init --single-file s3h

This will create a single file named s3h.dockerapp which will look like this: 

# This section contains your application metadata.
# Version of the application
version: 0.1.0
# Name of the application
name: s3h
# A short description of the application
description:
# List of application maintainers with name and email for each
maintainers:
  - name: ubuntu
    email:


---
# This section contains the Compose file that describes your application services.
version: "3.6"
services: {}


---
# This section contains the default values for your application parameters.

{}

As you can see this file is divided into three parts, metadata, compose, and parameters. They are all in one file because we used –single-file switch. We can divide them up in multiple files by using docker-app split command in app home directory, docker-app merge will put them back in one file. Now, for QA, we have the following configuration for s3h.dockerapp file:

version: 0.1.0
name: s3h
description:
maintainers:
  - name: atbk5
    email: adeel.ahmad@opstree.com


---
version: "3.7"
services:
  mysql:
    image: mysql:5.7
    container_name: mysql
    environment:
      MYSQL_ROOT_PASSWORD: ${mysql.env.rootpass}
      MYSQL_DATABASE: ${mysql.env.database}
      MYSQL_USER: ${mysql.env.user}
      MYSQL_PASSWORD: ${mysql.env.userpass}
    restart: always
    networks:
      - backend
    volumes:
      - db_data:/var/lib/mysql


  spring1:
    depends_on:
      - mysql
    build:
      context: ./spring3hibernate/
      dockerfile: Dockerfile
    container_name: springapp1
    restart: always
    networks:
      - backend
      - frontend


  spring2:
    depends_on:
      - mysql
    build:
      context: ./spring3hibernate/
      dockerfile: Dockerfile
    container_name: springapp2
    restart: always
    networks:
      - backend
      - frontend
    x-enabled: ${spring.app2}


  nginx:
    depends_on:
      - spring1
    image: nginx:alpine
    container_name: proxy
    restart: always
    networks:
      - frontend
    volumes:
      - ${nginx.conf}:/etc/nginx/conf.d
    ports:
      - ${nginx.port}:80
    x-enabled: ${nginx.status}


networks:
  frontend:
  backend:


volumes:
  db_data:


---
mysql:
  env:
    rootpass: password
    database: employeedb
    user: admin
    userpass: password
nginx:
  conf: /home/ubuntu/dockerApp/spring3hibernate/nginx/conf/qa
  port: 81
  status: true
spring:
  app2: false

As mentioned before, first part contains app metadata, second part contains actual compose file with lots of variables, and last part contains values of those variables. Special mention here is x-enabled variable, docker-app provides functionality to temporarily disable a service using this variable. Now, try a few commands:

$ docker-app inspect

It will produce summary of whole app.

$ docker-app render

It will replace all variables with their values and will produce a compose file

$ docker-app render --set nginx.status=”false”

It will remove nginx from docker-app compose as well as deploy

$ docker-app render | docker-compose -f - up

It will spin up all the containers according to rendered compose file. We can see the application running on port 81 of our machine.

$ docker-app --help

To check out more commands and play around a bit.
At this point, it will be better to create two directories in app home: qa and prod. Create a file in qa: qa-params.yml. Another file in prod: prod-params.yml. Copy all parameters from above s3h.dockerapp file to qa-params.yaml (or not). More importantly, copy below changes in parameters to prod-params.yml

mysql:
  env:
    rootpass: password
    database: employeedb
    user: admin
    userpass: password
nginx:
  conf: /home/ubuntu/dockerApp/spring3hibernate/nginx/conf/prod
  port: 80
  status: true
spring:
  app2: true

We are going to loadbalance springapp1 and springapp2 in PROD environment, since we have enabled springapp2 using x-enabled parameter. We have also changed nginx conf bind path to the new conf file and host port for nginx to 80 (for Production). All so easily. Run command:

$ docker-app render --parameters-file ./prod/prod-params.yaml

This command will produce a compose file ready for production deployment. Now run:

$ docker-app render --parameters-file ./prod/prod-params.yml | docker-compose -f - up

And production is deployed … Visit port 80 of your localhost to verify. What’s more exciting is that we can also share our docker-apps through docker hub, we can tag the app and push it to our remote registry as images after logging in:

$ docker login

Provide your username and password for docker hub, create an account if not yet created.

$ docker-app push --tag atbk5/s3h.dockerapp:latest

If we wish to upload additional files as well, we will have to split our project using docker-app split and put additional files in the directory before pushing. The additional files will go as attachments which can be accessed later.

Conclusion

With the arrival of docker app, our large, composite, and containerized applications can also be shipped and re-used as images. That is cool. But there’s something cooler which we haven’t explored yet. Deploying our docker-apps on kubernetes with the goal of exploring how far in management, and how optimal in delivery, we can go with our applications. Let’s keep this as a topic for the next blog. Until then, have a nice one. 🙂

Image Source: https://reflectoring.io/externalize-configuration/

Unix File Tree Part-1

Related image

Nature has its own way to reach out for perfection and the same should be our instinct to make our creations perfect.

Dennis Ritchie, father of Unix and an esteemed computer scientist might have implied the same approach for Unix directory structure.

Why?

Before getting into the hierarchy of Unix File Tree lets discuss why we need it. The need for a directory structure arises when multiple users are handling multiple software along with their dependent files. Let me explain this with a couple of scenarios.

Scenario-1:

Consider an ideal software or package which requires multiple files to function properly.

  • Binary files
  • Configuration files
  • Log files
  • Data files
  • Metadata files during execution
  • Libraries

 For now, let’s consider there is just one directory and I am keeping all of the dependent files in that directory. 

$ ls
package-1.binary  package-1.conf  package-1.data  package-1.lib  package-1.log  package-1.tmp

Another software comes in the picture which has its own dependent files.

$ ls
package-1.binary  package-1.data  package-1.log  package-2.binary  package-2.data  package-2.log
package-1.conf    package-1.lib   package-1.tmp  package-2.conf    package-2.lib   package-2.tmp

Things will get messy while dealing with various software since handling them won’t be easy and will lead to a chaotic situation.

Scenario-2:

Suppose I am a system admin and managing all of the software in the above scenario-1. To make things organized I created different directories to place the dependent files.
  • Binary files –> /dir-1
  • Configuration files –> /dir-2
  • Log files –> /dir-3
  • Data files –> /dir-4
  • Meta files –> /dir-5
  • Libraries –> /dir-6

As the work gets overloaded I need more admins to support they won’t be able to relate with the naming convention as I did.
To escape this situation the creator of Unix decided to follow a philosophy “Convention over Configuration”.
 As the name suggest giving priority to defined convention over individual’s configuration. So that everyone should be on the same page and keeping that in mind everyone else will follow.
And the simulation of the philosophy was like this

  • Binary files –> /bin
  • Configuration files –> /etc
  • Log files –> /log
  • Data files –> /var
  • Meta files –> /tmp
  • Libraries –> /lib

Which resulted in the Unix File Tree

$ tree -d -L 1
.
├── apps
├── bin
├── boot
├── dev
├── etc
├── home
├── lib
├── lib64
├── lost+found
├── media
├── mnt
├── opt
├── proc
├── root
├── run
├── sbin
├── snap
├── srv
├── sys
├── tmp
├── usr
└── var

22 directories

You might be thinking that how will Unix figure out where is the configuration file, where is the binary and rest of the stuff of the software.
Here comes the role of the PATH variable


$ echo $PATH
/home/dennis/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
These are environment variables specifying a set of directories where executable programs are located. In general, each executing process or user session has its own PATH setting.

So now we have a proper understanding of why do we need a File Tree.
For diving deep into the significance of each one of the directory stay tuned for Unix File Tree Part-2.

Cheers!

Speeding up Ansible Execution Part 2

MITOGEN


In the previous post, we discussed various ways to reduce the ansible-playbook execution time, those changes were mostly made in the ansible config file, by adding or adjusting certain parameters in the file. But as you may have noticed that those methods were not that effective in certain cases, while using those methods we have to be very cautious about the result as they may affect ansible performance in one way or the other.


Generally, for the slower ansible execution, the main culprit is the way ansible is executed on the hosts. It creates multiple SSH connections and does not fully utilize the available resources. To tackle this problem, MITOGEN came to rescue !!!

Mitogen is a distributed programming library for Python. The Mitogen extension is a set of plug-ins for Ansible that enable it to operate via Mitogen, vastly improving its performance and enhancing its functional capability.

We all know about the strategies in ansible – linear, free & debug., the mitogen is just defined in the strategy column of the config file, so it is just a strategy, we are not making any other changes in the config file of the ansible so it is not affecting any other parameter, it is just the way, playbooks will be executed on the hosts.

Now coming to the mitogen installation part, we just have to download this package at a particular location and make some changes in the ansible config file as shown below,

[defaults]

strategy_plugins = /path/to/mitogen/ansible_mitogen/plugins/strategystrategy = mitogen_linear

we have to define the path where we have stored our mitogen files, and mention the strategy as “mitogen_linear”, under the default section of the config file, and we are good to go.

Now, after the Mitogen installation part, when we run our playbook, we will notice a reasonable reduction in the execution time,

Mitogen is fast because of the following reasons,

  • One connection is created per target and system logs aren’t spammed with repeated authentication events.
  • A single network roundtrip is used to execute a step whose code already exists in RAM on the target.
  • Processes are aggressively reused, avoiding the cost of invoking Python and recompiling imports, saving 300-800 ms for every playbook step.
  • Code is cached in the RAM, which further increases the speed.
  • Generally, ansible repeatedly rewrites and extracts ZIP files to temporary directories in the target hosts, mitogen also reduces these rewrites.

      All the above-mentioned features make the ansible to run faster.

      Mitogen is another extension for ansible that provides a decrease in its execution time and it is very easy to use, I think MITOGEN is very underrated and one of its kind, and we should definitely give it a try.

      I hope I have explained everything well, any suggestion/queries are highly appreciated.


Thanks !!!

Source:

https://mitogen.networkgenomics.com/ansible_detailed.html

What Without Internet

What without Internet?

I had a dream a few days ago in which the existence of the internet was gone, When I woke up I thought about what would happen if there is no Internet for a day?


Sure, it would cause quite a bit of panic and uproar and it would be havoc for an organization to work without the internet, but if the internet resumed normally after 24 hours are over, things would return to normal pretty quickly.


Now, switch it off for a longer time, possibly a week or a month, that would have a more lasting impact, since, in that time, a significant number of people would find themselves unable to meet their obligations or do their business at all. This would be somewhat mitigated by the fact that the situation is a sort of a ‘natural disaster’, but still, those who really depend on the internet for their business would likely feel a lasting negative impact.
               
What if I say there are some organizations that work in a situation like there is no internet, yes it’s right due to some sort of security reasons they don’t prefer to use the public internet. Banks, space organizations, and many security agencies fall under this category.


Now, a question arises here how they manage to do regular updates and the installation of different packages in their different systems? The answer is quite simple: “the use of satellite server”.


Recently I got a task in relation to this context, in which:

1. A prerequisite here is that you don’t have internet connectivity in your system but one of the systems with which you can connect has internet connectivity.
2. Setup individual satellite server in your local network.
3. Install packages and regular updates.


To do so here I prefer to use the FTP satellite server

How to implement Ftp satellite server

Pre-requisites

An Ubuntu Server, and a non-root user with sudo privileges.
The system is configured with vsftpd

Suppose we are doing the installation of Jenkins

Make a directory pkg.jenkins.io in /var/www/html/



 Contents of pkg.jenkins.io


 

 

 

 

 

 

 

 

 

 

 

Paste the host link of debian file in /etc/apt/source.list.d

 

 Run the command sudo apt-get update

 Now run the command for installation of the package

 


“ The internet made fame wack and anonymity cool ”

So, far from the above context, we have learned about setting up FTP for users with a local account. If you need to use an external authentication source, you might want to look into vsftpd’s support of virtual users. This offers a rich set of options through the use of PAM, the Pluggable Authentication Modules, and is a good choice if you manage users in another system such as LDAP or Kerberos.

I hope I explained everything clearly enough to understand. If you have some better way of implementing a satellite server please help me to improve this blog.

Thanks for reading my writing. I’d really appreciate any kind of feedback in the comments.

Cheers till next time!!!!

Redis Zero Downtime Cluster Migration

A few days back I came across a problem of migrating a Redis Master-Slave setup to Redis Cluster. Initially, I thought it to be a piece of cake since I have been already working on Redis, but there was a hitch, “Zero Downtime Migration”. Also, the redis was getting used as a database, not as Caching Server. So I started to think of different ways for migrating Redis Master-Slave setup to Redis Cluster and finally, I came up with an idea of migration.
Before we jump to migration, I want to give an overview regarding when we can use Redis as a database, and how to choose which setup we should go with Master-Slave or Cluster mode.

Redis as a Database

Sometimes getting data from disks can be time-consuming. In order to increase the performance, we can put the requests those either need to be served first or rapidly in Redis memory and then the Redis service there will keep rest of the data in the main database. So the whole architecture will look like this:-

Image result for redis as database

Redis Master-Slave Replication

Beginning with the explanation about Redis Master-Slave. In this phenomenon, Redis can replicate data to any number of nodes. ie. it lets the slave have the exact copy of their master. This helps in performance optimizations.

I bet now you can use Redis as a Database.

Redis Cluster

A Redis cluster is simply a data sharding strategy. It automatically partitions data across multiple Redis nodes. It is an advanced feature of Redis which achieves distributed storage and prevents a single point of failure.

Replication vs Sharding

Replication is also known as mirroring of data. In replication, all the data get copied from the master node to the slave node.

Sharding is also known as partitioning. It splits up the data by the key to multiple nodes.

As shown in the above figure,  all keys 1, 2, 3, 4 are getting stored on both machine A and B.

In sharding, the keys are getting distributed across both machine A and B. That is, the machine A will hold the 1, 3 key and machine B will hold 2, 4 key.

I guess now everyone has a good idea about Redis working mechanism. So let’s start discussing the migration of Redis.

Migration

Unfortunately, redis doesn’t have a direct way of migrating data from Redis-Master Slave to Redis Cluster. Let me explain it to you why?

We can start Redis service in either cluster mode or standalone mode. Now your solution would be that we can change the Redis Configuration value on-fly(means without restarting the Redis Service) with redis-cli. Yes, you are absolutely correct we can change the Redis configuration on-fly but unfortunately, Redis Mode(cluster or standalone) can’t be decided on-fly, for that we have to restart the service.

I guess now you guys will understand my situation :).

For migration, there are multiple ways of doing it. However, we needed to migrate the data without downtime or any interruptions to the service.

We decided the best course of action was a steps process:-

  • Firstly we needed to create a different Redis Cluster environment. The architecture of the cluster environment was something like
  • The next step was to update all the services (application) to send all the write operations to both servers(cluster and master-slave). The read commands (GET) will still go to the old setup.
  • But still, we don’t have the guarantee that all non-expirable data would make it over. So we can run a step to iterate through all of the keys and DUMP/RESTORE them into the new setup. 
  • Once the new Redis Server looks good we could make the appropriate changes to the application to point solely to the new Redis Server.

I know the all steps are easy except the second step. Fortunately, redis provides a method of key scanning through which we can scan all the key and take a dump of it and then restore it in the new Redis Server.
To achieve this I have created a python utility in which you have to define the connection details of your old Redis Server and new Redis Server.

You can find the utility here.

https://github.com/opstree/redis-migration

I have provided the detail information on using this utility in the README file itself. I guess my experience will help you guys while redis migration.

Replication or Clustering?

I know most people have a query that when should we use replication and when clustering :).

If you have more data than RAM in a single machine, use Redis Cluster to shard the data across multiple databases.

If you have less data than RAM in a machine, set up a master-slave replication with sentinel in front to handle the fai-lover.

The main idea of writing this blog was to spread information about Replication and Sharding mechanism and how to choose the right one and if mistakenly you have chosen the wrong one, how to migrate it from :).

There are multiple factors yet to be explored to enhance the flow of migration if you find that before I do, please let me know to improve this blog.

I hope I explained everything and clear enough to understand.

Thanks for reading. I’d really appreciate any and all feedback, please leave your comment below if you guys have some feedbacks.

Happy Coding!!!!