Puppet module to set up nodejs deployment

I would like to share my Puppet module for setting up nodejs deployment infrastructure on a Linux box. This module performs the basic setup required to facilitate the automated deployment of a nodejs app. Very soon I'll be introducing another generic Puppet module that runs on top of this one and provides fully fledged automatic deployment of any node app. To view the source code of this module, you can refer to my GitHub repository.

Let's talk about what this module actually does. First of all, we create a nodejs user which is used for all deployment-related activities of all the node apps. As a convention, we create a folder /home/nodejs/nodeapps; this folder contains the code of all our node applications.
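For a sense of what that amounts to outside of Puppet, the equivalent manual steps are roughly the following (illustrative only; the module does this through Puppet resources):

```bash
# Rough manual equivalent of the user/directory conventions the module enforces (illustrative only).
sudo useradd -m -d /home/nodejs -s /bin/bash nodejs   # dedicated user for all node app deployments
sudo -u nodejs mkdir -p /home/nodejs/nodeapps         # all node application code lives under this folder
```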

This module also adds two scripts. The first one, deployNodeApp.sh, is a generic script that assumes the node app code is present in tar form at /home/nodejs; it cleans the existing code of the node app under /home/nodejs/nodeapps, untars the new code into the corresponding directory of the node app, and restarts the node app. As another convention, we use Upstart for managing the node app, i.e. starting and stopping it; I'll talk about the Upstart configuration in my next blog, where I'll cover the generic Puppet module for a node app. The second script, startNodeApp.sh, takes care of starting the node app after doing some pre-processing, such as loading environment-specific properties of the node app which we don't want to commit in the codebase (i.e. we want to keep them separate from the deployment process) and choosing a specific version of node.
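A minimal sketch of the deployNodeApp.sh idea, assuming the app name is passed as the first argument and the tarball is named after the app (the real script in the repository may differ in these details):

```bash
#!/bin/bash
# Hypothetical outline of deployNodeApp.sh; the APP argument and tarball naming are assumptions.
APP=$1
APP_DIR=/home/nodejs/nodeapps/$APP
TARBALL=/home/nodejs/$APP.tar.gz

rm -rf "$APP_DIR"                           # clean the existing code of the node app
mkdir -p "$APP_DIR"
tar -xzf "$TARBALL" -C "$APP_DIR"           # untar the new code into the app's directory
sudo restart "$APP" || sudo start "$APP"    # bounce the app through its Upstart job
```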

This module also takes care of installing nvm for the nodejs user, so that the nodejs version can be managed locally for this user or app.
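As a hedged illustration, an nvm setup for the nodejs user boils down to something like the following (the repository URL and node version are assumptions; the module wraps this in Puppet):

```bash
# Illustrative nvm installation for the nodejs user (assumed URL and version; the module automates this).
sudo -u nodejs -i bash -c '
  git clone https://github.com/creationix/nvm.git ~/.nvm
  echo "source ~/.nvm/nvm.sh" >> ~/.bashrc
  source ~/.nvm/nvm.sh
  nvm install 0.10          # pick whichever node version the app needs
'
```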

Though there is already a Puppet module for nodejs, I had some specific requirements that I wanted to handle, which is why I've created this module.

Let me know if you have any points of improvement for this module. One thing that I wanted to add is npm installation, but it had some other dependencies, and I also had some doubts about whether npm should be part of the nodejs module or not.

Automated DB Updater (ADU): First Release

Initial version of the Automated DB Updater (ADU)

With this blog I'm releasing the initial version of a Python utility that provides automated DB updates across various environments for different components.

The code for this utility is hosted on GitHub:
https://github.com/sandy724/ADU

You can clone a read-only copy of this codebase from the URL given below:
https://github.com/sandy724/ADU.git

To understand the basic idea behind this utility, go through this blog:
http://sandy4blogs.blogspot.in/2013/07/automated-db-updater.html

How to use this utility
Check out the code into a directory and add the path of that directory to the PYTHONPATH environment variable.
Create a database with a script metadata table using the DDL given below.

CREATE TABLE `script_metadata` (
  `name` varchar(100) NOT NULL,
  `version` int(11) NOT NULL,
  `executed` tinyint(1) NOT NULL DEFAULT '0',
  `env` varchar(30) NOT NULL,
  `releas` varchar(30) NOT NULL,
  `component` varchar(30) NOT NULL
);
Create a database.properties file containing the connection properties of each environment's database:

[common_db]
dbHost=localhost
dbPort=3306
dbUser=root
dbPwd=root
db=test
 
 
[env1]
dbHost=localhost
dbPort=3306
dbUser=root
dbPwd=root
db=test

Here common_db represents the connection to the database which will contain the metadata of the scripts for monitoring.

Now execute the Python utility.
Copy the client (updateDB.py) to a directory of your choice and make sure that the property configuration file is also in that directory.
python updateDB.py -f -r --env
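Purely as an illustration (the flag meanings here are my assumption from the context above; check the utility's own help for the authoritative usage), an invocation might look like:

```bash
# Hypothetical invocation; -f, -r and --env are assumed to be the scripts file, release and target environment.
python updateDB.py -f scripts.txt -r release1 --env env1
```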

Puppet module for setting up multiple mongo instances with replication

In this blog I'll be talking about a Puppet module that can be used to install multiple mongo instances with replication on a single machine. Since I'm very new to Puppet, you may find this module very crude, but it works :). There were a couple of Puppet modules already available, but most of them only install a single instance of mongo on a machine, and I had a specific requirement of installing multiple instances of mongo with master-slave replication between them. As I already said, this module may be quite crude or basic, so please bear with that; my approach may also seem a bit unconventional, so please let me know what can be improved in this module or how things could have been done in a better way.

So let's start with the actual details. First of all, this module is hosted on GitHub (https://github.com/sandy724/Mongo); if you want to look at the source code, you can clone it from there. For installing mongo you would execute a command of the following form (parameter values omitted here; see the master and slave examples below):
puppet apply -e "class {mongo: port => , replSet => , master => , master_port => ,}"

Command for installing master
puppet apply -e "class {mongo: port => 27017, replSet => sdrepsetcommon, master => master, master_port => 27017,}"

Command for installing slave
puppet apply -e "class {mongo: port => 27018, replSet => sdrepsetcommon, master => slave, master_port => 27017,}"

Before going into the details of what this module does, I will share some details about mongo:

  • You can start mongo by executing the mongod command
  • You can provide a configuration file which contains details such as:
    • log directory where mongo will generate its logs
    • port at which mongo will listen for requests
    • dbpath where mongo will store all its data
    • pidfilepath containing the process id of the mongo instance, which is used to check whether mongo is running or not
    • replSet, the name of the replica set
  • You need to have mongo installed as a service on your system to start an instance of mongo
  • For replication you need to execute the rs.initiate() command on the master mongo
  • For adding another instance into the replica set you need to execute the rs.add("<host>:<port>") command on the master mongo (see the sketch below)
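For illustration, with a master on port 27017 and a slave on port 27018 as in the examples above, the replication bootstrap on the master boils down to something like this (the host name is a placeholder):

```bash
# Hypothetical replication bootstrap run against the master instance; the host name is a placeholder.
mongo --port 27017 --eval 'rs.initiate()'                        # initialise the replica set on the master
mongo --port 27017 --eval 'rs.add("dbhost.example.com:27018")'   # add the slave instance to the replica set
```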
Now let's go into more detail about what this module does; I'll list down all the steps in bullet points:
  • As you can figure out, this module expects a few parameters:
    • port : port at which mongo will listen
    • replSet : name of the replica set which will be used for managing replication
    • master : a string parameter which signifies whether the mongo setup is for a master or a slave
    • master_port : port at which the master instance of mongo is listening
  • First of all we create a mongo user
  • The parent log directory for the mongo instance is created, if it doesn't already exist, with the mongo user as owner
  • The mongo db directory is created under /data/mongo with the naming convention replSet_port, i.e. if the replSet parameter is sdrepsetcommon and the port is 27017, then the data directory for this mongo instance will be /data/mongo/sdrepsetcommon_27017. This directory is owned by the mongo user
  • A mongo service is installed if it is not already there
  • A mongo restart shell script is also placed in the mongo db directory
  • A file is also placed under the mongo db directory that contains the mongo command to set up replication; this file is created conditionally depending on whether we are setting up a master or a slave instance
  • Finally the replication command is executed on the mongo server and the restart script is also executed
This concludes the setup of a mongo instance on a machine.

Just to add a bit more detail: to start mongo we use the mongod -f <config file> command. The configuration file is saved as a template, and the mongo module processes the template with the values passed in and creates the desired mongod.conf. In our case we are evaluating the following properties of mongod.conf: logpath, port, dbpath, pidfilepath, replSet.
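As an illustration only (not the actual template shipped with the module; the paths are assumptions), the rendered configuration for the master example above might look like this:

```bash
# Hypothetical rendered mongod.conf for the sdrepsetcommon_27017 master instance (illustrative paths).
cat > /data/mongo/sdrepsetcommon_27017/mongod.conf <<'EOF'
logpath=/var/log/mongo/sdrepsetcommon_27017.log
port=27017
dbpath=/data/mongo/sdrepsetcommon_27017
pidfilepath=/data/mongo/sdrepsetcommon_27017/mongod.pid
replSet=sdrepsetcommon
EOF
mongod -f /data/mongo/sdrepsetcommon_27017/mongod.conf   # start the instance with this configuration
```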

Automation tips and tricks

As promised, I'm back with a summary of the cool stuff that I've done with my team in the Build & Release domain to help us deal with day-to-day problems in an efficient and effective way. As I said, this month was about creating tools/utilities that sound very simple, but their overall impact on the productivity and agility of the build & release teams and tech verticals was awesome :).

Automated deployment of artifacts : If you have ever worked with a set of Maven-based projects that are interdependent on each other, one of the major problems you will face in such a setup is keeping the latest dependencies in your local system. Here I'm assuming two things: you are using a Maven repository to host the artifacts, and the dependencies are SNAPSHOT dependencies if there is active development going on in those dependencies as well. The manual way of making sure that the Maven repo always has the latest SNAPSHOT version is that every time somebody changes the codebase, he/she manually deploys that artifact to the Maven repo. What we have done is create, for each and every project, a Jenkins job that checks whether code has been checked in for a specific component and, if so, deploys that component's SNAPSHOT version to the Maven repo. The impact of these utility jobs was huge: developers no longer have to focus on deploying their code to the Maven repo, and keeping track of who last committed the code is also no longer needed.
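As a rough sketch, each such Jenkins job can be an SCM-polling job whose build step simply deploys the component (the component path here is a placeholder):

```bash
# Hypothetical Jenkins "Execute shell" build step for one component; the job is triggered by SCM polling,
# so it runs only when new code has been checked in for that component.
cd component-a                # placeholder for the component's workspace directory
mvn clean deploy -DskipTests  # pushes the latest SNAPSHOT to the Maven repo configured in distributionManagement
```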

Log parser utility : We have made further improvements to our event-based log analyzer utility. Now we also have a simple log parser utility through which we can parse the logs of a specific component and segregate the log lines as ERROR/WARN/INFO. Most importantly, it is integrated with Jenkins, so you can go to Jenkins and select a component whose logs need to be analyzed; once the analysis is finished, the logs are segregated as per our configuration (in our case ERROR/WARN/INFO). After that, these categories are shown in the left bar along with all the various instances of each category, and the user can click on those links to jump to exactly the location where that information is present in the logs.
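The core of the segregation can be pictured with a few lines of shell (this is only a sketch of the idea, not the actual utility; the log path is a placeholder):

```bash
# Minimal sketch: split a component's log into one file per severity level and show a summary.
LOG=/var/log/component-a/app.log          # placeholder path
for LEVEL in ERROR WARN INFO; do
  grep -n "$LEVEL" "$LOG" > "${LEVEL}.log"
done
wc -l ERROR.log WARN.log INFO.log         # how many lines fell into each category
```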

Auto code merge : As I already mentioned, we have a team of around 100+ developers and a sprint cycle of 10 days, and two sprints overlap each other for 5 days: the first 5 days are for development, after that a code freeze is enforced, and the next 5 days are for bug fixing. This means that at any particular point of time there are 3 parallel branches on which work is in progress: one branch which is currently deployed in production, a second branch on which testing is happening, and a third branch on which active development is happening. You can easily imagine that merging these branches is a task in itself. What we have done is create an automated code merge utility that tries to merge branches in a pre-defined sequence; if the automatic merge is successful, the merge proceeds to the next set of branches, otherwise a mail is sent to the respective developers whose files are in conflict.
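A minimal sketch of this merge-in-sequence idea, assuming a git setup with illustrative branch names and notification address (not the actual utility), could look like this:

```bash
#!/bin/bash
# Rough sketch of the merge-in-sequence idea; branch names and the notification address are placeholders.
BRANCHES=("production" "testing" "development")   # pre-defined merge order (assumed names)
for ((i = 0; i < ${#BRANCHES[@]} - 1; i++)); do
  SRC=${BRANCHES[$i]}
  DST=${BRANCHES[$i+1]}
  git checkout "$DST" || exit 1
  if git merge --no-edit "origin/$SRC"; then
    git push origin "$DST"                        # merge went through cleanly, move on to the next pair
  else
    # merge failed: mail the list of conflicting files to the developers and stop the chain
    git diff --name-only --diff-filter=U \
      | mail -s "Merge conflict: $SRC -> $DST" dev-team@example.com
    git merge --abort
    exit 1
  fi
done
```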

I hope you get motivated by this set of utilities and come up with new suggestions or points of improvement.

Automation tips and tricks January 2013

I'm starting a new blog series in which I'll be talking about the various cool things or automations that I, along with my team, have done in a month, and what my plans are for the next month.

Talking about January 2013, I've done the following things.

1.) Streamlining of environments : The big step in streamlining the environments is to change the owner of the applications from the root user to the tomcat user and to make the ports of all the applications consistent across environments, i.e. dev, qa, pt and staging. This will help me in my long-term goal of introducing a server configuration tool, most preferably Puppet.
2.) Log analyzer utility : One of the major challenges that teams face is getting real-time notifications of any exceptions that occur in the server logs. To overcome this problem, we have written a log analyzer utility that scans a log file backed by a meta file; this meta file has the information about who should be notified for an exception. This utility is written in shell script and integrated with the Jenkins CI server so that we can schedule its execution as per convenience; currently Jenkins executes this utility every 15 minutes.
3.) System monitor : Of late we were facing the challenge of servers running out of disk space, and only when the whole system went down were we able to figure out that the issue was a disk space outage caused by huge log files. To overcome this problem, we have built a small shell utility that scans a couple of folders recursively and provides a list of the top 10 files whose size is greater than a specified threshold. In our case we have set this threshold to 1 GB; all these variables can also be provided as input to this utility, such as the folders to scan, the regular expression of files which need to be considered, and the threshold value (see the sketch after this list).
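A minimal sketch of such a check, assuming GNU find and illustrative folders/threshold, could look like this:

```bash
# Rough sketch of the disk-usage monitor; folders and threshold are placeholders, not the actual utility.
THRESHOLD=$((1024*1024*1024))            # 1 GB in bytes
DIRS="/var/log /opt/apps"                # folders to scan recursively (assumed)
find $DIRS -type f -size +${THRESHOLD}c -printf '%s\t%p\n' 2>/dev/null \
  | sort -rn | head -10                  # top 10 offenders, largest first (size in bytes, then path)
```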

This is what we achieved in the month of January 2013. Although these utilities seem rather obvious and simple, the effect they have on the productivity of the team is considerable.

Now, plans for the month of February 2013. Usually I choose those things which we are currently doing manually; this month we will be working on the following:
1.) A utility which can perform automated merges where possible
2.) A utility that can automatically upload artifacts to a central server (Artifactory in our case)
3.) Integration of common git operations with Jenkins