Docker Inside Out – A Journey to the Running Container

Introduction: 

Necessity is the mother of invention, and the same holds for Docker. Under pressure to split monolithic applications into manageable pieces, we arrived at Docker, and it made our lives much simpler. We all drive Docker through the docker-cli command, but I wonder what it does behind the scenes to run a container. Let's dig deeper into it in this very blog.

There's a saying that "behind every successful man, there is a woman". I would love to give my perspective on this. One thing I have actively observed in the lives of successful people I know is that there is a lot of truth in this statement, but it varies with the situation: in most cases these women are not directly helping with the man's prime work, but taking care of other important work around it, so that he can concentrate on that prime work. Keeping this in mind, I expect that there are other components behind the docker-cli command that lead to the successful creation of containers. Yet whenever I talk about Docker containers with developers in my organization who are new to Docker, the only thing I hear from them is "the docker-cli command is used to invoke the Docker daemon to run a container".

But the Docker daemon is not the process that gets executed when a container is run – it delegates the action to containerd, which controls a set of runtimes (runc by default); the runtime in turn creates a new process (calling the defined runtime as specified in the configuration parameters) with the required isolation, and only then executes the entrypoint of that container.

Components involved

  • Docker-cli
  • Dockerd
  • Containerd
  • RunC
  • Containerd-shim

Docker-cli: Used to make Docker API calls.

Dockerd: dockerd listens for Docker Engine API requests – it can listen on three different types of socket: unix, tcp, and fd – and manages the host's container life-cycles with the help of containerd. The actual container life-cycle management is therefore outsourced to containerd.
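To make this concrete, here is a hedged sketch of both sides of that API: dockerd listening on a unix plus a tcp socket, and a raw Engine API call of the kind docker-cli issues under the hood (socket paths and the API version prefix are illustrative):

# start the daemon on the default unix socket plus an (unauthenticated!) tcp socket
dockerd -H unix:///var/run/docker.sock -H tcp://0.0.0.0:2375

# docker-cli is essentially a thin client over this API – curl works too
curl --unix-socket /var/run/docker.sock http://localhost/v1.40/containers/json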

Containerd: Actually manages the container life-cycle through the tasks mentioned below:

  • Image push and pull
  • Management of storage
  • Executing containers, by calling runc with the right parameters

Let's go through some subsystems of containerd:

Runc: Containerd uses runc to run containers according to the OCI specification.
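You can also drive runc by hand to see what containerd normally does for you. A minimal sketch, assuming root privileges and a local busybox image (the bundle and container names here are arbitrary):

# an OCI bundle is just a rootfs plus a config.json
mkdir -p mycontainer/rootfs
docker export $(docker create busybox) | tar -C mycontainer/rootfs -xf -
cd mycontainer
runc spec                # generates a default OCI config.json
sudo runc run demo-ctr   # creates the isolated process and runs its entrypoint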

Containerd-shim: This component was added in Docker 1.11. It is the parent process of every container started, and it is also what allows daemon-less containers: it lets the runtime, i.e. runc, exit once it has started the container. This way we don't need a long-running runtime process per container. When you start nginx, you should only see the nginx process and the shim.

Daemon-less: When I say daemon-less containers in the paragraph above, I mean this is the shim's big advantage. Before containerd-shim existed, upgrading the Docker daemon without restarting all your containers was a big pain; containerd-shim was introduced to solve exactly this problem.
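Docker exposes this decoupling as a daemon option as well: with live restore enabled in daemon.json, containers keep running even while the daemon itself is stopped or upgraded. A minimal sketch of /etc/docker/daemon.json:

{
  "live-restore": true
}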

The communication between dockerd and containerd

We can see how docker delegates all the work of setting up the container to containerd. As for the interactions between docker, containerd, and runc, we can understand them without even looking at the source code – plain strace and pstree can do the job.

Command:

When no containers are running:

ps fxa | grep docker -A 3 

Result:
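The screenshot is not reproduced here, but on an idle host the output looks roughly like this (a hedged sketch – PIDs, paths, and binary names vary across Docker versions):

 1034 ?  Ssl  0:05 /usr/bin/dockerd -H fd://
 1156 ?  Ssl  0:02  \_ docker-containerd --config /var/run/docker/containerd/containerd.toml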

Working of all components together

Well, to see how all these components work together, we need to initialize a container – nginx in our case. We will fire the same command after running an nginx container.

This shows us that we have two daemons running – the docker daemon and the docker-containerd daemon.

Given that dockerd interacts heavily with containerd all the time, and that the latter is never exposed to the internet, it is a safe bet that their interface is unix-socket based.
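One quick way to check this bet is to list the unix sockets containerd holds open (a hedged sketch; the socket path differs across Docker versions):

sudo ss -xlp | grep containerd
# u_str LISTEN ... /var/run/docker/containerd/docker-containerd.sock ...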

High-level overview of initializing a container

Initializing a container to see the involvement of all components

Command:

docker run --name docker-nginx -p 80:80 -d nginx
docker ps
pstree -pg | grep -e docker -e containerd
ps fxa | grep -i "docker" -A 3 | grep -v "java"
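With nginx up, the pstree output typically shows the whole chain – dockerd, containerd, the per-container shim, and finally nginx (a hedged sketch; PIDs and truncated process names vary by version):

dockerd(1034,1034)───docker-containe(1156,1156)───docker-containe(4321,4321)───nginx(4344,4344)───nginx(4360,4344)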

Summary

By now, it should be clear that dockerd is not the only component involved in running a container. We got to know which components back a running container besides dockerd, and how they work together to manage the life-cycle of a container.

I hope we now have a good understanding of the Docker components involved. It's time to see things practically on your own with the commands discussed in this blog, instead of just mugging up the theoretical concepts.

That's all till next time. Thanks for reading – I'd really appreciate your feedback, so please leave a comment below if you have any feedback or queries.

Happy Containerization !!

How to test an Ansible playbook/role using Molecule with Docker

Why Molecule?

Have you ever faced the issue where your Ansible code executes successfully but something still goes wrong – the service is not started, the configuration does not get changed, etc.?

There is another issue you might have faced: your code runs successfully on Red Hat 6 but fails on Red Hat 7. To make your code smart enough to run on every Linux flavour, Molecule came into the picture. Let's do some brainstorming on Molecule.

Molecule has the capability to execute a YAML linter and the custom test cases you have written for your Ansible code. We explain linters and test cases below.

Why is code testing required?

Sometimes a playbook executes fine but does not give us the desired result; to catch such cases we should use code testing in Ansible.

In general, code testing helps developers find bugs in code/applications and make sure the same bugs don't cause the application to break. It also helps us deliver applications/software that meet coding standards, and it increases code stability.

Introduction:

The whole idea is that with Molecule (a testing tool) you can test whether your Ansible code functions correctly on every Linux flavour, including all of its functionality.

Molecule includes a code linter that analyses your source code for potential errors. It can detect problems such as syntax errors and structural issues like the use of undefined variables.

Molecule can create a VM/container environment automatically, and on top of it execute your Ansible code to verify all of its functionality.

Molecule can also check syntax, idempotency, code quality, etc.

Molecule only supports Ansible 2.2 or later.

NOTE: To run an Ansible role with Molecule on different OS flavours, we can use cloud instances, Vagrant, or containerization (Docker). Here we will use Docker.

Let's start!

How Molecule works:

"When we set up Molecule, a directory named "molecule" is created inside the Ansible role directory. Molecule then reads its main configuration file, "molecule.yml", inside that directory. Molecule then creates the platforms (containers/instances/servers) on your local machine; once that completes, it executes the Ansible playbook/role inside the newly created platforms, and after successful execution it runs the test cases. Finally, Molecule destroys all the newly created platforms."

Installation of Molecule:

Installing Molecule is quite simple.

$ sudo apt-get update
$ sudo apt-get install -y python-pip libssl-dev
$ pip install molecule [ Install Molecule ]
$ pip install --upgrade --user setuptools [ do not run in case of VM ]

That's it.

Now it's time to set up an Ansible role with Molecule. We have two options for integrating Ansible with Molecule:

  1. With a new Ansible role
  2. With an existing Ansible role

1. Set up a new Ansible role with Molecule:

$ molecule init role --role-name ansible-role-nginx --driver-name docker

When we run the above command, a molecule directory is created inside the Ansible role directory, as sketched below.
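The generated layout looks roughly like this (a hedged sketch; the exact files vary across Molecule versions):

ansible-role-nginx/
├── defaults/
├── handlers/
├── meta/
├── molecule/
│   └── default/
│       ├── Dockerfile.j2
│       ├── molecule.yml
│       ├── playbook.yml
│       └── tests/
│           └── test_default.py
├── tasks/
└── vars/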

2. Set up an existing Ansible role with Molecule:

Go into the Ansible role directory and run the command below.

$ molecule init scenario --driver-name docker

When we run the above command, a molecule directory is created inside the existing Ansible role directory.

NOTE: Molecule internally uses the ansible-galaxy init command to create a role.

Below are the main configuration files of Molecule:

  • molecule.yml – contains the definition of the OS platforms, dependencies, container platform driver, testing tool, etc.
  • playbook.yml – the playbook Molecule executes to run the role inside Vagrant/Docker (a sample is sketched below)
  • tests/test_default.py – where we write our test cases
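For reference, a minimal playbook.yml usually just applies the role under test – a sketch, assuming the role is named ansible-role-nginx:

---
- name: Converge
  hosts: all
  roles:
    - role: ansible-role-nginx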

Content of molecule.yml

cat molecule/default/molecule.yml

---
molecule:
  ignore_paths:
    - venv

dependency:
  name: galaxy
driver:
  name: docker
lint:
  name: yamllint	
platforms:
  - name: centos7
    image: centos/systemd:latest
    privileged: True
  - name: ubuntu16
    image: ubuntu:16.04
provisioner:
  name: ansible
  lint:
    name: ansible-lint
#    enabled: False
verifier:
  name: testinfra
  lint:
    name: flake8
scenario:
  name: default  # optional
  create_sequence:
    - create
    - prepare
  check_sequence:
    - destroy
    - dependency
    - create

Explanation of the above contents:

Dependency:

Testing roles may rely upon additional dependencies. Molecule handles managing these dependencies by invoking configurable dependency managers.

“Ansible Galaxy” is the default dependency manager.

Linter:

A linter is a program which analyses our code for potential errors.

What can code linters do for you?

A code linter can:

  1. Catch syntax errors;
  2. Check for undefined variables;
  3. Enforce best practices or code style guidelines;
  4. Flag extra lines;
  5. Flag extra spaces; etc.

**We have linters for almost every programming language, e.g. yamllint for YAML.

yamllint: It checks for syntax validity, key repetition, lines length, trailing spaces, indentation, etc.
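You can run yamllint directly to see the kind of issues it reports (illustrative output; the file name and line numbers here are made up):

$ yamllint molecule/default/molecule.yml
molecule/default/molecule.yml
  9:18   error    trailing spaces  (trailing-spaces)
  23:81  warning  line too long (84 > 80 characters)  (line-length)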

provisioner: Ansible is the default provisioner. No other provisioner is supported.

Flake8 – the default verifier linter; it lints the Python test files.

platforms:

Defines which platforms (containers) will be created and have the Ansible code executed on them.

Driver:

The driver defines the platform where your Ansible code will be executed.

Molecule supports the drivers below:

  • Azure
  • Docker
  • EC2
  • GCE
  • Openstack
  • Vagrant

Scenario:

A scenario defines what will be performed when we run Molecule.

Below is the default scenario:

–> Test matrix

└── default
    ├── lint
    ├── destroy
    ├── dependency
    ├── syntax
    ├── create
    ├── prepare
    ├── converge
    ├── idempotence
    ├── side_effect
    ├── verify
    └── destroy

However, we can change this scenario and its sequences by editing the molecule.yml file:

scenario:
  name: default  # optional
  create_sequence:      # molecule create 
    - create
    - prepare
  check_sequence:       # molecule check 
    - destroy
    - dependency
    - create
    - prepare
    - converge
    - check
    - destroy
  converge_sequence:    # molecule converge 
    - dependency
    - create
    - prepare
    - converge
  destroy_sequence:     # molecule destroy 
    - cleanup
    - destroy
  test_sequence:        # molecule test 
#    - lint
    - cleanup
    - dependency
    - syntax
    - create
    - prepare
    - converge

NOTE: If any one scenario action fails, the others will not be executed. This is the default Molecule behaviour.

Here is what each action does:

lint: Checks all the YAML files with yamllint

destroy: If there is already a container running with the same name, destroy that container

dependency: This action allows you to pull dependencies from ansible-galaxy if your role requires them

syntax: Checks the role with ansible-lint

create: Creates the Docker image and uses that image to start the test containers.

prepare: This action executes the prepare playbook, which brings the host to a specific state before running converge. This is useful if your role requires a pre-configuration of the system before the role is executed.

Example: prepare.yml

---
- name: Prepare
  hosts: all
  gather_facts: true   # facts are needed for the ansible_os_family condition below
  tasks:
    - name: Install curl and net-tools
      apt:
        name: ['curl', 'net-tools']
        state: present
      when: ansible_os_family == "Debian"

NOTE: when we run "molecule converge", the tasks below are performed:

====> create – create.yml is called
====> prepare – prepare.yml is called
====> provisioning – playbook.yml is called

converge: Runs the role inside the test container.

idempotence: Molecule runs the playbook a second time and checks for idempotence, to make sure no unexpected changes are made across multiple runs.

side_effect: Intended to test HA failover scenarios or the like. See the Ansible provisioner documentation.

verify: Runs the tests we have written, inside the container.

destroy: Destroys the created container

NOTE: When we run Molecule commands, a Molecule-managed directory named molecule is created inside /tmp. It contains the Ansible configuration, a Dockerfile for each Linux flavour, and the Ansible inventory.

cd /tmp/molecule

tree
.
└── osm_nginx
    └── default
        ├── ansible.cfg
        ├── Dockerfile_centos_systemd_latest
        ├── Dockerfile_ubuntu_16_04
        ├── inventory
        │   └── ansible_inventory.yml
        └── state.yml

state.yml – maintains which scenario actions have been performed.

# Molecule managed

---
converged: true
created: true
driver: docker
prepared: true

Testing:

This is the most important part of Molecule, where we write some test cases.

Testinfra is the default test runner.

Install the testinfra module, then run the verifier:

  • $ pip install testinfra
  • $ molecule verify

Molecule calls the file below for unit tests, using the "testinfra" verifier:

molecule/default/tests/test_default.py

verifier:

The verifier is used for running your test cases.

Below are the three verifiers we can use in Molecule:

  • testinfra – uses the Python language for writing test cases.
  • goss – uses the YAML language for writing test cases.
  • serverspec – uses the Ruby language for writing test cases.

Here I am using testinfra as the verifier for writing test cases.

Molecule commands:

  • # molecule check [ run playbook.yml in check mode ]
  • # molecule create [ create the instance/platform ]
  • # molecule destroy [ destroy the instance/platform ]
  • # molecule verify [ perform the unit tests ]
  • # molecule test [ perform the default scenario sequence ]
  • # molecule prepare
  • # molecule converge

NOTE: To enable logs, run a command with the --debug flag:

$ molecule --debug test

Sample test cases:

cat molecule/default/tests/test_default.py

import os

import testinfra.utils.ansible_runner

testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
    os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')

def test_user(host):
    user = host.user("www-data")
    assert user.exists

def test_nginx_is_installed(host):
    nginx = host.package("nginx")
    assert nginx.is_installed


def test_nginx_running_and_enabled(host):
    distribution = host.system_info.distribution
    if distribution == 'debian':
        nginx = host.service("nginx")
        assert nginx.is_running
        assert nginx.is_enabled

def test_nginx_is_listening(host):
    assert host.socket('tcp://127.0.0.1:80').is_listening

That's all! We have covered all the required topics, which will help you create your own Molecule environment and test cases.

Thanks all!!! See you soon with a new and effective blog 🙂

Links you may refer to:

https://yamllint.readthedocs.io/en/stable/

One more reason to use Docker – part II

Hey guys, we are back with one more reason to use Docker, part II. I hope you have explored our previous blog in this series of one more reason to use Docker; if not, I would suggest you read that one too.

As we discussed in our previous blog, Docker can be used in multiple scenarios; it all depends on what you want to do with it. It's a shipping container that can run anything you want inside it.

It can be a database, Elasticsearch, a proxy, a scheduled job, or an application.

Running binaries or trying out new software

As a developer or DevOps engineer, you are always trying out some software or the other. Many a time you must have struggled installing a new software package onto your machine, or setting up a different environment for running an application.
Suppose I have written Terraform infra code for my infra setup; to test it I would need to install the terraform binary. On top of that, I would also need an SCM tool like Ansible on the same machine to test/run my Ansible role, which cannot run without a proper Python setup!

Oh God!! So much struggle to install these binaries on a system 😥. And if you are a developer, you have to go through a set of Jira processes 😜 and request the infra team for a new machine for the POC of a new piece of software, with all the required packages.

The problem doesn't end here: your system gets piled up with unused binaries once you are done with the POC. Cleaning that up is a whole different task.

It's not always a pleasant experience to set things up after downloading the software. Time is of the essence, and sometimes all we are looking for is to fire a few commands and be done. The Docker model is a super simplistic way of running software binaries: behind the scenes it takes care of getting the image and running it for you.
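And when the POC is over, the cleanup is just as simple – a hedged sketch (my-poc is whatever name you gave your container):

# stop and remove the POC container
docker rm -f my-poc
# reclaim space from unused images, networks, and build cache
docker system prune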

It's not just about new software. Consider that you want to spin up a database server (MySQL) quickly on your laptop, or set up a Redis server for your team. Docker makes this dead simple. Want to run a MySQL server? All you need to do is:

docker run -d -p 3306:3306 tutum/mysql

Now, in the case of Terraform, you simply need to pull the latest Docker image for Terraform and you are good to test/run your Terraform code:

docker pull hashicorp/terraform

# run the binary from a container, mounting your code so terraform can see it

docker run -i -t -v $(pwd):/workspace -w /workspace hashicorp/terraform plan

It's that simple – you could save hours of your time. And the use case is not limited to this; we can use it for demos too!

In our organisation we generally have weekend SHOA (Saturday Hands-On Activity) sessions, which are a great platform for learning and sharing knowledge in an interactive way.
We use Docker for showcasing demos, as Docker images are an ideal way to package and demo your tool or application. It is also a great way to conduct hands-on workshops. Normally participants get stuck setting up tools instead of following the real workshop agenda; using Docker saves that time, and participants can use it to learn what they actually came to learn.

That's the power of Docker: fire it up and you are on! I hope you guys liked it and will try Docker in your own use cases.

As we discussed in our last blog, there are many scenarios for using Docker that are yet to be explored.

Thanks for reading, I’d really appreciate your feedback. Cheers till the next time.


Docker Logging Driver

The docker logs command batch-retrieves the logs present at the time of execution, showing the information logged by a running container. The docker service logs command shows the information logged by all containers participating in a service. The information that is logged, and its format, depend almost entirely on the container's endpoint command.

These logs are stored by default under /var/lib/docker/containers/<container-id>/<container-id>-json.log. This file is not easy to consume with Filebeat, because a new file appears every time a new container comes up with a new container id.

So, how do we monitor logs that are spread across different files? This is what Docker logging drivers were introduced for.

Docker includes multiple logging mechanisms to help you get information from running containers and services. These mechanisms are called logging drivers, and they are configured on the Docker daemon.

To configure the Docker daemon to default to a specific logging driver, set the value of log-driver to the name of the logging driver in the daemon.json file, which is located in /etc/docker/ on Linux hosts or C:\ProgramData\docker\config\ on Windows server hosts.

The default logging driver is json-file. The following example explicitly sets the default logging driver to syslog:

{
  "log-driver": "syslog"
}

After configuring the log driver in the daemon.json file, you can define the log driver and the destination where you want to send the logs, for example Logstash or Fluentd. You can define it either at run time, e.g. "--log-driver=syslog --log-opt syslog-address=udp://logstash:5044", or, if you are using a docker-compose file, like this:

logging:
  driver: fluentd
  options:
    fluentd-address: "192.168.1.1:24224"
    tag: "{{.Name}}"

Once you have configured the log driver, it will send all the Docker logs to the configured destination. Now, if you try to see the logs on the terminal using the docker logs command, you will get this message:

Error response from daemon: configured logging driver does not support reading

That is because all the logs are being forwarded to the destination.

Let me give you an example of how I configured the fluentd logging driver, shipped the logs to Elasticsearch, and viewed them in Kibana. In this case I am configuring the logging driver at run time, installing the logging-driver plugin inside fluentd rather than setting it in daemon.json. So make sure your containers are created inside the same Docker network where you will be configuring the logging driver.

Step 1: Create a docker network.

docker network create docker-net

Step 2: Create a container for elasticsearch inside a docker network.

docker run -itd --name elasticsearch -p 9200:9200 --network=docker-net elasticsearch:6.4.1

Step 3: Create a fluentd configuration, fluent.conf, in which you configure the logging pipeline; this file is later copied into the fluentd Docker image.

fluent.conf

<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

<match *.*>
  @type copy
  <store>
    @type elasticsearch
    host elasticsearch
    port 9200
    logstash_format true
    logstash_prefix fluentd
    logstash_dateformat %Y%m%d
    include_tag_key true
    type_name access_log
    tag_key app
    flush_interval 1s
    index_name fluentd
    type_name fluentd
  </store>
  <store>
    @type stdout
  </store>
</match>

This also creates an index named with the fluentd prefix. The host is set to the name of the Elasticsearch container, which resolves because both containers are on the same Docker network.

Step 4: Build the fluentd image and create a docker container from that.

Dockerfile.fluent

FROM fluent/fluentd:latest
COPY fluent.conf /fluentd/etc/
RUN ["gem", "install", "fluent-plugin-elasticsearch", "--no-rdoc", "--no-ri", "--version", "1.9.5"]

Here the logging-driver plugin is installed and configured inside fluentd.

Now build the Docker image and create a container.

docker build -t fluent -f Dockerfile.fluent .
docker run -itd --name fluentd -p 24224:24224 --network=docker-net fluent

Step 5: Now create the container whose logs you want to see in Kibana, configuring the log driver at run time. In this example I am creating an nginx container and configuring the log driver on it.

# fluentd publishes port 24224 on the host, so the daemon can reach it on localhost
docker run -itd --name nginx -p 80:80 --network=docker-net --log-driver=fluentd --log-opt fluentd-address=localhost:24224 opstree/nginx:server

Step 6: Finally you need to create a docker container for kibana inside the same network.

docker run -itd --name kibana -p 5601:5601 --network=docker-net kibana

Now you will be able to check the logs of the nginx container in Kibana by creating an index pattern fluentd-*.
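Before creating the index pattern, you can confirm that fluentd is actually shipping logs by listing the indices in Elasticsearch (illustrative; the date suffix will differ):

curl -s 'http://localhost:9200/_cat/indices?v' | grep fluentd
# yellow open fluentd-20190101 ...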

Types of logging drivers that can be used:

  • none: No logs are available for the container, and docker logs does not return any output.
  • json-file: The logs are formatted as JSON. The default logging driver for Docker.
  • syslog: Writes logging messages to the syslog facility. The syslog daemon must be running on the host machine.
  • journald: Writes log messages to journald. The journald daemon must be running on the host machine.
  • gelf: Writes log messages to a Graylog Extended Log Format (GELF) endpoint such as Graylog or Logstash.
  • fluentd: Writes log messages to fluentd (forward input). The fluentd daemon must be running on the host machine.
  • awslogs: Writes log messages to Amazon CloudWatch Logs.
  • splunk: Writes log messages to Splunk using the HTTP Event Collector.
  • etwlogs: Writes log messages as Event Tracing for Windows (ETW) events. Only available on Windows platforms.
  • gcplogs: Writes log messages to Google Cloud Platform (GCP) Logging.
  • logentries: Writes log messages to Rapid7 Logentries.