Monday, January 21, 2019

Log Parsing of Windows Servers on Instance Termination

 
As we all know that how critical are Logs as a part of any system, they give you deep insights about your application, what your system is doing and what caused the error. Depending on how logging is configured logs may contain transaction history, timestamps and amounts debited/credited into client's account and a lot more.
 

On an enterprise level application, your system goes to multiple hosts, managing the logs across multiple hosts can be complicated. Debugging the error in the application across hundreds of log files on hundreds of servers can be very time consuming and complicated and not the right approach so it is always better to move the logs to a centralized location.

Lately in my company I faced a situation which I assume is a very commonly faced scenario in Amazon's Cloud where we might have to retain application logs from multiple instances behind an Auto Scaling group.  Let's assume an example for better understanding. 

Suppose your application is configured to be logging into C:\Source\Application\web\logs Directory. The Application running has variant incoming traffic, sometimes it receives requests which can be handled by 2 servers, other times it may require 20 servers to handle the traffic.

When there is a hike in traffic, Amazon Ec2's smart AutoScaling Group uses the configuration and scales from 2 server to many (According to ASG Policy) and during this phase, the application running in the newly launched Ec2's also log into C:\Source\Application\web\logs .... but when there's a drop in traffic, the ASG triggers a scale down policy, resulting to termination of instances, which also results in deletion of all the log files inside the instances launched via ASG during high traffic time.

Faced a similar situation ?  No worries, now in order to retain logs I figured out an absolute solution.

Here, in this blog, the motive is to sync the logs from dying instances at the time of their termination. This will be done using AWS Services, the goal is to trigger a Powershell Script in the instance using SSM which sync logs to S3 Bucket with sufficient information about the dying instances. For this we will require 2 things:

1) Configuring SSM agent to be able to talk to Ec2 Instances
2) Ec2 Instances being able to write to S3 Buckets

For the tutorial we will be using Microsoft Windows Server 2012 R2 Base with the AMI ID: ami-0f7af6e605e2d2db5

A Blueprint of the scenario to be understood below:

1) Configuring SSM agent to be able to talk to Ec2 Instances

SSM Agent is installed by default on Windows Server 2016 instances and instances created from Windows Server 2003-2012 R2 AMIs published in November 2016 or later. Windows AMIs published before November 2016 use the EC2Config service to process requests and configure instances.

If your instance is a Windows Server 2003-2012 R2 instance created before November 2016, then EC2Config must be upgraded on the existing instances to use the latest version of EC2Config. By using the latest EC2Config installer, you install SSM Agent side-by-side with EC2Config. This side-by-side version of SSM Agent is compatible with your instances created from earlier Windows AMIs and enables you to use SSM features published after November 2016.

This simple script can be used to update Ec2Config and then layer it with the latest version of SSM agent. This will always install AwsCli which is used to push logged archives to S3

#ScriptBlock
if(!(Test-Path -Path C:\Scripts )){
mkdir C:\Tmp
}
cd C:/Tmp
wget https://s3.ap-south-1.amazonaws.com/asg-termination-logs/Ec2Install.exe -OutFile Ec2Config.exe
wget https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/windows_amd64/AmazonSSMAgentSetup.exe -OutFile ssmagent.exe
wget https://s3.amazonaws.com/aws-cli/AWSCLI64PY3.msi -OutFile awscli.msi
wget https://s3.amazonaws.com/aws-cli/AWSCLISetup.exe -OutFile awscli.exe
Invoke-Command -ScriptBlock {C:\Tmp\Ec2Config.exe /Ec /S /v/qn }
sleep 20
Invoke-Command -ScriptBlock {C:\Tmp\awscli.exe /Ec /S /v/qn }
sleep 20
Invoke-Command -ScriptBlock {C:\Tmp\ssmagent.exe /Ec /S /v/qn }
sleep 10
Restart-Service AmazonSSMAgent
Remove-Item C:\Tmp

An IAM Role is Required for SSM to Ec2 Instance Conversation:
IAM instance role: Verify that the instance is configured with an AWS Identity and Access Management (IAM) role that enables the instance to communicate with the Systems Manager API.
Add instance profile permissions for Systems Manager managed instances to an existing role
  • Open the IAM console at https://console.aws.amazon.com/iam/.
  • In the navigation pane, choose Roles, and then choose the existing role you want to associate with an instance profile for Systems Manager operations.
  • On the Permissions tab, choose Attach policy.
  • On the Attach policy page, select the check box next to AmazonEC2RoleforSSM, and then choose Attach policy.
Now, Navigate to Roles > and select your role. 

That should look like:



2) Ec2 Instances being able to write to S3 Buckets

An IAM Role is Required for Ec2 to be able to write to S3:

IAM instance role: Verify that the instance is configured with an AWS Identity and Access Management (IAM) role that enables the instance to communicate with the S3 API.

Add instance profile permissions for Systems Manager managed instances to an existing role
  • Open the IAM console at https://console.aws.amazon.com/iam/.
  • In the navigation pane, choose Roles, and then choose the existing role you want to associate with an instance profile for Systems Manager operations.
  • On the Permissions tab, choose Attach policy.
  • On the Attach policy page, select the check box next to AmazonS3FullAccess, and then choose Attach policy. 

That should look like:











This Powershell script saved in C:/Scripts/termination.ps1 will pick up log files from: $SourcePathWeb:
and will output logs into:
$DestFileWeb
with a IP and date-stamp to recognize and identify the instances and where the logs originate from later.
Make sure that the s3 bucket name and –region and source of log files is changed according to the preferences. 



#ScriptBlock
$Date=Get-Date -Format yyyy-MM-dd
$InstanceName="TerminationEc2"
$LocalIP=curl http://169.254.169.254/latest/meta-data/local-ipv4 -UseBasicParsing

if((Test-Path -Path C:\Users\Administrator\workdir\$InstanceName-$LocalIP-$Date/$Date )){
Remove-Item "C:\Users\Administrator\workdir\$InstanceName-$LocalIP-$Date/$Date" -Force -Recurse
}

New-Item -path "C:\Users\Administrator\workdir\$InstanceName-$LocalIP-$Date/$Date" -type directory
$SourcePathWeb="C:\Source\Application\web\logs"
$DestFileWeb="C:\Users\Administrator\workdir\$InstanceName-$LocalIP-$Date/$Date/logs.zip"

Add-Type -assembly "system.io.compression.filesystem"
[io.compression.zipfile]::CreateFromDirectory($SourcePathWeb, $DestFileWeb)

C:\'Program Files'\Amazon\AWSCLI\bin\aws.cmd s3 cp C:\Users\Administrator\workdir s3://terminationec2 --recursive --exclude "*.ok" --include "*" --region us-east-1


If the above settings are done fine then manually the script should produce a success suggesting output:





Check your S3, Bucket for seeing if it has synced logs to there. Now, because the focus of this blog trigger a Powershell Script in the instance using SSM which syncs the logs to S3 Bucket so we will try running the script through SSM > Run Command.



Select and run of the instances having the above script and configuration. The output should be pleasing.








The AMI used by the ASG should have the above configuration (Can be archived via created a ami from ec2 having above config and then adding it into Launch Configuration of the ASG). The ASG we have here for the tutorial is named after my last name : "group_kaien".

Now, the last and the Most important step is configuration the
Cloudwatch > Event > Rules.

Navigating to Cloudwatch>Event>Rules: Create Rule.



This would return the following JSON config:

{
"source": [
"aws.autoscaling"
``
],
"detail-type": [
"EC2 Instance Terminate Successful",
"EC2 Instance-terminate Lifecycle Action"
],
"detail": {
"AutoScalingGroupName": [
"group_kaien"
]
}
}

On the right side of Targets:
Select

SSM Run Command:
  • Document: AwsRunPowerShellScript
  • Target key: “Instanceids or tag:<Ec2TagName>
  • Target Values: <Tag Value>

 Configure parameter
  • Commands: .\termination.ps1
  • WorkingDirectory: C:\Scripts.ps1
  • ExecutionTimeout: 3600 is default
Making sure that on termination event happening, the powershell script is run and it syncs logs to S3. This is what our configuration looks like:





 

For more on setting up Cloudwatch Events refer :
https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/CWE_GettingStarted.html

Wait for the AutoScaling Policies to run such that new instances are created and terminated, with above configuration. The terminating instances will sync their logs S3 before they are fully terminated. Here’s the output on S3 for me after a scale down activity was done.



 



































Conclusion



Now with this above, we have learned how to export logs to S3 automatically from a dying instance, with the correct date/time stamp as mentioned in the termination.ps1 script.
Hence, fulfilling the scope of the blog.
Stay tuned for more


3 comments:

What Without Internet

What without Internet? I had a dream a few days ago in which the existence of the internet was gone, When I woke up I though...