
5 posts tagged with "Observability"


Developing Real-time Log Monitoring and Email Alerting with Serverless Architecture using Terraform

· 6 min read

Why Log Monitoring?

Let's say you have built an app (here, one based on a microservice architecture) running as containers in EKS (Elastic Kubernetes Service), or as a standalone app on an EC2 (Elastic Compute Cloud) instance. To monitor this app, we send its application logs to CloudWatch Logs. But keeping a constant eye on this log group is tiresome and sometimes technically challenging, as there are a hundred other microservices sending their own logs to their own log groups. As the app scales up, we need to invest more human resources in mundane tasks such as monitoring these logs, effort that could be better spent developing new business frontiers.

What if we could build an automated solution, one that scales efficiently in terms of cost and performance, to monitor the logs and alert us when there are issues? We can build this tool in one of the two architecture styles mentioned below:

  1. Server-based architecture, or
  2. Serverless architecture.

Server-Centric or Serverless Architecture?

With the advent of cloud technologies, we have moved from server-centric, to on-demand servers, to serverless. Before choosing a server-centric, on-demand, or serverless architecture, we must ask ourselves a few questions:

  1. How am I going to serve the feature I am developing? (Is it an extension of an existing ecosystem or a standalone feature?)
  2. What are its availability and scalability requirements? What is its runtime requirement?
  3. Is the feature stateful or stateless?
  4. What is my budget for running this feature?

If your answers to the above questions are ambiguous, remember one thing: prefer serverless over server-centric whenever your solution can be built as serverless (your cloud architect can help with this decision).

In my case, the log monitoring system is:

  1. A standalone system.
  2. Event-based (the event here is a log), which needs to be highly available and scalable across logs from different services.
  3. Stateless.
  4. On a minimal budget.

Given the above answers, I have chosen serverless architecture.

Case Example

This system is better illustrated with an example. Let's say we have built our application in Java (running in Tomcat within a node in EKS) and deployed it within the EKS cluster.

Example Log -1

java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30000ms.
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:367)

Example Log -2

at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:118)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)

We would like to be notified every time the application log contains the keyword "ERROR" or "Connection Exception", as seen in the logs above. To achieve this, let's build our monitoring and alerting system.
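The detection we want is a simple OR-match over keywords; a minimal Python sketch of that matching logic (the keyword list is an assumption based on the examples above):

```python
# Sketch of the OR-match we expect from the monitoring system:
# a log line matches if it contains any of the keywords.
KEYWORDS = ("ERROR", "Connection Exception", "SQLTransientConnectionException")

def matches(line):
    """Return True when the line contains any alert keyword."""
    return any(keyword in line for keyword in KEYWORDS)

log = ("java.sql.SQLTransientConnectionException: HikariPool-1 - "
       "Connection is not available, request timed out after 30000ms.")
print(matches(log))                  # True
print(matches("INFO startup done"))  # False
```

In the actual system this matching is not done in our own code but by a CloudWatch subscription filter pattern, configured later with Terraform.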

Key components to build Log Monitoring and Alerting System

  1. AWS CloudWatch Logs
  2. CloudWatch log filter patterns
  3. AWS Lambda
  4. Simple Notification Service (SNS)
  5. Email subscription

We combine the above AWS resources, as shown in the architecture diagram, to create a real-time serverless log monitoring system.

Building Infrastructure and Working with Terraform

  1. Let's first create a log group, which will receive all the application logs:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }
}

# Configure the AWS Provider
provider "aws" {
  region = var.region
}

# Extract the current account details
data "aws_caller_identity" "current" {}

data "aws_region" "current" {}

# Create a log group to receive your application logs
resource "aws_cloudwatch_log_group" "logs" {
  name = "Feature_Logs"
}

Once this resource is created, we route all log traffic from the application layer in EKS to this log group. As the application runs, all its output and errors are sent as log streams to this log group.

  2. After the above step, we start receiving the logs. Every time the application layer throws an error or connection exception, we would like to be notified, so our desired keywords within the log streams of the CloudWatch log group are "Error" and "Connection Exception".

  3. We can do this using a CloudWatch log subscription filter, which parses all those logs and finds the ones containing the keyword "Error" or similar keywords.

resource "aws_cloudwatch_log_subscription_filter" "logs_lambdafunction_logfilter" {
  name = "logs_lambdafunction_logfilter"

  # role_arn = aws_iam_role.iam_for_moni_pre.arn
  log_group_name = aws_cloudwatch_log_group.logs.name

  filter_pattern = "?SQLTransientConnectionException ?Error" # Change the error patterns here

  destination_arn = aws_lambda_function.logmonitoring_lambda.arn
}
  4. When the CloudWatch log subscription filter sends logs to a receiving service such as AWS Lambda, they are base64 encoded and compressed in gzip format. To unzip and decode the logs and send them to SNS, we use AWS Lambda.

We create this Lambda function as a log-triggered event (thanks to CloudWatch Logs): it receives the log events from the log group, unzips them, base64-decodes them, and sends the log to the SNS topic, whose ARN is passed as an environment variable to the Lambda function.

resource "aws_lambda_function" "logmonitoring_lambda" {
  function_name    = "logmonitoring_lambda"
  filename         = data.archive_file.Resource_monitoring_lambda.output_path
  handler          = "lambda_function.lambda_handler"
  package_type     = "Zip"
  role             = aws_iam_role.iam_for_moni_pre.arn
  runtime          = "python3.9"
  source_code_hash = data.archive_file.Resource_monitoring_lambda.output_base64sha256

  tracing_config {
    mode = "PassThrough"
  }

  environment {
    variables = {
      sns_arn = aws_sns_topic.logsns.arn
    }
  }
}

resource "aws_lambda_permission" "allow_cloudwatch" {
  statement_id  = "AllowExecutionFromCloudWatch"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.logmonitoring_lambda.function_name
  principal     = "logs.${data.aws_region.current.name}.amazonaws.com"
  source_arn    = "arn:aws:logs:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:*"
}
  5. Having received the decoded logs from Lambda, the SNS (Simple Notification Service) topic sends the filtered log to its email subscription, and the subscribed email owner receives an email about it.
resource "aws_sns_topic" "logsns" {
  name = "logsns"
}

resource "aws_sns_topic_subscription" "snstoemail_email_target" {
  topic_arn = aws_sns_topic.logsns.arn
  protocol  = "email"
  endpoint  = var.email
}

Because this architecture is serverless, its resources are only invoked when such keywords appear in the logs. Hence this method is cost-optimized.

If you would like to connect with me, you can follow my blog here or on LinkedIn, and you can find all the code on my GitHub.

Here is the Lambda Python script:

import gzip
import json
import base64
import boto3
import os

def lambda_handler(event, context):
    # CloudWatch Logs delivers the payload base64-encoded and gzip-compressed
    log_data = str(gzip.decompress(base64.b64decode(event["awslogs"]["data"])), "utf-8")
    json_body = json.loads(log_data)
    print(json_body)

    sns = boto3.client('sns')
    print(os.environ['sns_arn'])
    response = sns.publish(
        TopicArn=str(os.environ['sns_arn']),
        Message=str(json_body)
    )
    print(response)
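The decode step can be sanity-checked locally by building a payload the same way CloudWatch Logs delivers it (gzip, then base64); the sample event below is an assumption for illustration:

```python
import base64
import gzip
import json

def decode_awslogs(event):
    """Reverse CloudWatch Logs' delivery encoding: base64 -> gzip -> JSON."""
    raw = gzip.decompress(base64.b64decode(event["awslogs"]["data"]))
    return json.loads(raw)

# Build a fake payload exactly the way CloudWatch Logs would deliver it.
sample = {"logGroup": "Feature_Logs",
          "logEvents": [{"message": "Error: HikariPool-1 timed out"}]}
payload = base64.b64encode(gzip.compress(json.dumps(sample).encode())).decode()
event = {"awslogs": {"data": payload}}

decoded = decode_awslogs(event)
print(decoded["logEvents"][0]["message"])  # Error: HikariPool-1 timed out
```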

InfraSecOps: Enable Monitoring and Automated Continuous Compliance of Security Groups using CloudWatch and Lambda

· 5 min read

As DevOps engineers, we use different compute resources in our cloud to make sure different workloads run efficiently. To restrict the traffic reaching our compute resources (EC2/ECS/EKS instances in the case of AWS), we create stateful firewalls (like security groups in AWS). As lead engineers, we often describe the best practices for configuring security groups. But in a large organization working on the cloud, monitoring and ensuring that each team follows these best practices is a tedious task that eats up a lot of productive hours. And we cannot simply ignore it: it causes security compliance issues.

For example, a security group might be configured as follows by a new developer (or some rogue engineer). If we observe the configuration below, the security group, which is supposed to restrict traffic to the AWS resources, is configured to allow all kinds of traffic on all protocols from the entire internet. This defeats the point of securing the resource with a security group; we might as well remove it.

{
  "version": "0",
  "detail-type": "AWS API Call via CloudTrail",
  "responseElements": {
    "securityGroupRuleSet": {
      "items": [
        {
          "groupOwnerId": "XXXXXXXXXXXXX",
          "groupId": "sg-0d5808ef8c4eh8bf5a",
          "securityGroupRuleId": "sgr-035hm856ly1e097d5",
          "isEgress": false,
          "ipProtocol": "-1",      --> allows traffic on all protocols
          "fromPort": -1,          --> on all ports
          "toPort": -1,
          "cidrIpv4": "0.0.0.0/0"  --> from the entire internet, which is a bad practice
        }
      ]
    }
  }
}

This kind of mistake can happen while building a proof of concept or testing a feature, and it can cost us a lot in terms of security. Monitoring these things manually takes a toll on cloud engineers and consumes a lot of time. What if we could automate this monitoring and create a self-healing mechanism that detects deviations from best practices and remediates them?

The solution I have built in AWS watches each security group ingress rule (it can be extended to egress rules too): the ports it allows, the protocol it uses, and the IP range it communicates with. These security group rules are compared with the baseline rules we define for our security compliance, and any deviations are automatically removed. The baseline rules are configured in the Python code (which can be modified based on your requirements).

Components used to build this system

  1. AWS CloudTrail

  2. AWS EventBridge rule

  3. AWS Lambda

  4. AWS SNS

  5. S3 bucket

How the system works:

  1. Whenever a new activity (creation, modification, or deletion of a rule) is performed on a security group, its event is not sent as an event log to CloudWatch but as an API call to CloudTrail. So to monitor these events, we first need to enable CloudTrail, which will record all API calls from the EC2 source and save them in a log file in an S3 bucket.

  2. Once these API calls are being recorded, we need to filter only those related to security group API calls. This can be done either by sending every API call directly to a Lambda or via an AWS EventBridge rule. The former is costly, as each API call would invoke the Lambda, so we create an EventBridge rule that only forwards the EC2 API calls.

  3. These filtered API events are sent to the Lambda, which checks the port, protocol, and traffic against what we previously configured in the Python code (in this example, I am checking for the wildcard IP, i.e. the entire internet, on all ports of the ingress rule; you can also filter by the protocols you don't want to allow. Refer to the code for details).

  4. The Python code filters all the security groups, finds the security group rules that violate the baseline, and deletes them.

Creating a rogue security group rule

The Lambda taking action and deleting the rogue rule

  5. Once the rules are deleted, SNS is used to email the event details, such as the ARN of the security group rule, the role ARN of the person who created it, and the violations of the baseline security compliance. This email alerting helps us understand which actors cause these deviations and give them proper training on security compliance. The details are also logged in the CloudWatch log groups created in this architecture.
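The core check in step 3 can be sketched in Python; the baseline below (no wide-open ingress on all protocols) and the rule-dict shape are assumptions based on the CloudTrail event shown earlier:

```python
# Sketch of the compliance check: flag ingress rules open to the whole internet.
OPEN_CIDR = "0.0.0.0/0"

def violates_baseline(rule):
    """Return True when an ingress rule allows all traffic from anywhere."""
    return (not rule.get("isEgress", False)
            and rule.get("cidrIpv4") == OPEN_CIDR
            and rule.get("ipProtocol") == "-1")

rules = [
    {"securityGroupRuleId": "sgr-open", "isEgress": False,
     "ipProtocol": "-1", "cidrIpv4": "0.0.0.0/0"},
    {"securityGroupRuleId": "sgr-ok", "isEgress": False,
     "ipProtocol": "tcp", "cidrIpv4": "10.0.0.0/16"},
]

to_delete = [r["securityGroupRuleId"] for r in rules if violates_baseline(r)]
print(to_delete)  # ['sgr-open']
```

In the real system, the matching rule IDs would then be passed to the EC2 API (for example boto3's revoke_security_group_rules) to remove the offending rules.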

For the entire Python code along with the Terraform code, please refer to the following GitHub repo. To replicate this system in your environment, change the baseline security rules you want to monitor in the Python code and type terraform apply in the terminal. Sit back and have a cup of coffee while Terraform builds this system in your AWS account.

Liked my content? Feel free to reach out on LinkedIn for interesting content and productive discussions.

Secure your data and internet traffic with your Personalized VPN in AWS

· 8 min read

Introduction

In today’s era, the internet has become embedded into the very fabric of our lives. It has revolutionized the way we communicate, work, shop, and entertain ourselves. With the increasing amount of personal information that we share online, data security has become a major concern. Cyber-criminals are constantly on the lookout for sensitive data such as credit card information, social security numbers, and login credentials to use it for identity theft, fraud, or other malicious activities.

Moreover, governments and companies also collect large amounts of data on individuals, including browsing history, location, and personal preferences, to model the behavior of the users using deep-learning clustering models. This data can be used to coerce users psychologically to buy their products or form an opinion that they want us to form.

To overcome this issue, we can use a VPN, which masks the user's identity and routes our traffic through a remote server. In addition, we can bypass internet censorship and access content that may be restricted in our region, giving us the freedom to consume the data we want rather than what governments or legal entities want us to consume. The VPNs we will discuss are of two types: public VPNs such as NordVPN, Proton VPN, etc., and private VPNs. Let's try to understand the differences between them.

Private vs Public VPN

Public VPNs are VPN services that are available to the general public for a fee or for free. These services typically have servers located all around the world, and users can connect to any of these servers to access the internet.

Private VPNs, on the other hand, are VPNs that are created and managed by individuals or organizations for their own use. Private VPNs are typically used by businesses to allow remote employees to securely access company resources, or by individuals to protect their online privacy and security.

Not all VPNs are created equal, and there are risks associated with using public VPN services rather than a private VPN:

Risks of Using Public VPNs

  1. Trustworthiness of the VPN Provider

When using a public VPN, you are essentially entrusting your online security to the VPN provider. If the provider is untrustworthy or has a history of privacy breaches, your data could be compromised. Unfortunately, many public VPN providers have been caught logging user data or even selling it to third-party companies for profit.

  2. Potential for Malware or Adware

Some public VPNs have been found to include malware or adware in their software. This can be particularly dangerous, as malware can be used to steal sensitive information, while adware can slow down your computer and make browsing the web more difficult.

  3. Unreliable Security of Your Data

Public VPNs may not always provide reliable security. As the service is managed by a third party, it is difficult to know how their system works behind closed doors. There may be logged data that can easily be used to identify the user, which immediately defeats the idea of anonymous use.

Benefits of Creating Your Own Personal VPN

  1. Complete Control Over Your Online Security

By creating your own personal VPN, you have complete control over your online security. You can choose the encryption protocol, server locations, and other security features, ensuring that your data is protected to the fullest extent possible.

  2. No Third-Party Involvement

When using a public VPN provider, you are relying on a third party to protect your online security. By creating your own personal VPN, you eliminate this risk entirely, as there are no third parties involved in your online security.

  3. Cost-Effective

While some public VPN providers charge high monthly fees for their services, creating your own personal VPN can be a cost-effective solution. By using open-source software and free server software, you can create a VPN that is just as secure as a commercial offering, without worrying about your browsing-history privacy or the excessive costs.

Setting up OpenVPN in AWS

OpenVPN is an open-source project that can be used to create your custom VPN using its community edition and setting things up on your own VPN server. Once the VPN server is set up, we use the OpenVPN client to connect to it and tunnel our traffic through the instance. To set up the OpenVPN server, we are going to need the following:

  1. An AWS account
  2. A little bit of curiosity..!

We are going to set up the VPN server on an AWS EC2 instance, which we will connect to with the OpenVPN client on all our devices. The OpenVPN company also provides a purpose-built OpenVPN Access Server as an EC2 AMI, which comes with AWS-friendly integration out of the box; we are going to use it in this blog.

Set up the OpenVPN server in AWS:

  1. Once you have set up AWS, log in to your AWS account and search for EC2.

  2. Once you are in the AWS EC2 console, switch to the region you want your VPN to be in, and then click the "Launch instances" button on the right side of the screen.

  3. In the EC2 creation console, search for an AMI named "openvpn". You will see a lot of AMI images. Select the AMI based on the number of VPN connections you require. For the current demonstration, I am choosing the AMI that serves two VPN connections.

  4. Choosing the above AMI sets up the security group by itself. Ensure that the EC2 instance is publicly accessible (either with an EIP or by placing it in a public subnet). Once done, press "Launch Instance".

  5. When we connect to the EC2 instance, we are greeted with the OpenVPN server agreement. Create the settings as shown below and, at the end, create a password.

  6. Once done, open https://<your-server-ip>:943/admin, where you will see a login page. Enter the username and password you set on the VPN server (in my case, the username is openvpn).

  7. You will land on the OpenVPN settings page. In Configuration > VPN Settings, scroll to the bottom and toggle "Have clients use specific DNS servers" to ON. Enter 1.1.1.1 as the primary DNS and 8.8.8.8 as the secondary DNS. After this, click save changes at the bottom of the screen.

  8. If you scroll to the top, you will see a banner with "Update Running Server"; click on it.

  9. You are all set on the OpenVPN server side!

Connecting to the OpenVPN server from our device:

  1. Once the server is configured, we require a client to connect to our OpenVPN server. For that, we need to install "OpenVPN Connect":
  • For Windows: download and install OpenVPN Connect from here
  • For mobile: search for "openvpn connect" in the Play Store (Android) or App Store (Apple)
  • For Linux:

First, ensure that your apt supports the HTTPS transport:

apt install apt-transport-https

Install the OpenVPN repository key used by the OpenVPN 3 Linux packages:

curl -fsSL https://swupdate.openvpn.net/repos/openvpn-repo-pkg-key.pub | gpg --dearmor > /etc/apt/trusted.gpg.d/openvpn-repo-pkg-keyring.gpg

Then install the proper repository. Replace $DISTRO with the release name of your Debian/Ubuntu distribution. For the distro list, refer here.

curl -fsSL https://swupdate.openvpn.net/community/openvpn3/repos/openvpn3-$DISTRO.list >/etc/apt/sources.list.d/openvpn3.list
apt update
apt install openvpn3
  2. Once installed, open OpenVPN Connect; we should see something like below.

  3. In the URL form, enter the IP of your EC2 instance and click NEXT. Accept the certificate pop-ups you get during this process.

  4. In the username form, enter the username you set on the server, and likewise for the password. Then click IMPORT.

  5. Once imported, click the radio button and enter your credentials again.

  6. Once connected, you should see a screen like this. Voila! Enjoy using your private VPN in EC2.

Liked my content? Feel free to reach out on LinkedIn for interesting content and productive discussions.

Developing Real-time resource monitoring via email on AWS using Terraform

· 4 min read

One of the main tasks of an SRE is to maintain the infrastructure developed for deploying the application. As each service exposes its logs in a different way, we need a plethora of SNS topics and Lambdas to monitor the infrastructure. This increases the cost of monitoring, which could compel management to drop the monitoring system.

But what if I told you we can build this monitoring system for less than 24 cents? And what if I told you that you can deploy the entire monitoring system with a single command, terraform apply? Sounds like something you would like to know? Hop on the Terraform ride!

Key components to build the infrastructure

To create a monitoring system that sends email alerts, we need three components:

  1. EventBridge
  2. SNS
  3. Email subscription

We can build a rudimentary monitoring system by combining these components. But the logs we would get by email look like the following:

{
  "version": "1.0",
  "timestamp": "2022-02-01T12:58:45.181Z",
  "requestContext": {
    "requestId": "a4ac706f-1aea-4b1d-a6d2-5e6bb58c4f3e",
    "functionArn": "arn:aws:lambda:ap-south-1:498830417177:function:gggg:$LATEST",
    "condition": "Success",
    "approximateInvokeCount": 1
  },
  "requestPayload": {
    "Records": [
      {
        "eventVersion": "2.1",
        "eventSource": "aws:s3",
        "awsRegion": "ap-south-1",
        "eventTime": "2022-02-01T12:58:43.330Z",
        "eventName": "ObjectCreated:Put",
        "userIdentity": {
          "principalId": "A341B33DQLH0UH"
        },
        "requestParameters": {
          "sourceIPAddress": "43.241.67.169"
        },
        "responseElements": {
          "x-amz-request-id": "GX86AGXCNXB5ZYVQ",
          "x-amz-id-2": "CPVpR8MNcPsNBzxcF8nOFqXbAIU60/zQlNC6njLp+wNFtC/ZnZF0SFhfMuhLOSpEqMFvvPqLA+tyvaXJSYMXAByR5EuDM0VF"
        },
        "s3": {
          "s3SchemaVersion": "1.0",
          "configurationId": "09dae0eb-9352-4d8a-964f-1026c76a5dcc",
          "bucket": {
            "name": "sddsdsbbb",
            "ownerIdentity": {
              "principalId": "A341B33DQLH0UH"
            },
            "arn": "arn:aws:s3:::sddsdsbbb"
          },
          "object": {
            "key": "variables.tf",
            "size": 402,
            "eTag": "09ba37f25be43729dc12f2b01a32b8e8",
            "sequencer": "0061F92E834A4ECD4B"
          }
        }
      }
    ]
  },
  "responseContext": {
    "statusCode": 200,
    "executedVersion": "$LATEST"
  },
  "responsePayload": "binary/octet-stream"
}

Not so easy to read, right? What if we could improve it, making it legible enough for anyone to understand what is happening?

To make it easy to read, we use an EventBridge feature called the input transformer, together with an input template. This feature helps us transform the log into our desired format without using any Lambda function.

Infrastructure Working

The way our infrastructure works is as follows:

  1. Our EventBridge rule collects the logs from all events in the AWS account, using an event filter.

  2. Once collected, these are sent to the input transformer to parse and read our desired components.

  3. After this, we use the parsed data to create our desired format using the input template.

Input transformer and input template for the EventBridge rule

  4. The transformed data is published to the SNS topic we have created.

  5. We create a subscription for this SNS topic via email, SMS, or HTTP.

And voila! Your infrastructure is ready to keep you updated on the changes!
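The input-paths/input-template pair used by the rule can be emulated in a few lines of Python; the sample event is an assumption, trimmed to the fields the transformer reads:

```python
import json

# Input paths as configured on the EventBridge rule: placeholder -> JSONPath.
input_paths = {
    "Source": "$.source",
    "detail-type": "$.detail-type",
    "resources": "$.resources",
    "state": "$.detail.state",
    "status": "$.detail.status",
}

template = ("Resource name : <Source> , Action name : <detail-type>, "
            "details : <status> <state>, Arn : <resources>")

def transform(event):
    """Apply the input paths to the event, then fill the template."""
    out = template
    for name, path in input_paths.items():
        node = event
        # Walk the simple dotted JSONPath (e.g. "$.detail.state").
        for key in path.lstrip("$.").split("."):
            node = node.get(key, "") if isinstance(node, dict) else ""
        out = out.replace(f"<{name}>",
                          json.dumps(node) if isinstance(node, list) else str(node))
    return out

event = {"source": "aws.ec2",
         "detail-type": "EC2 Instance State-change Notification",
         "resources": ["arn:aws:ec2:ap-south-1:123456789012:instance/i-0abc"],
         "detail": {"state": "stopped", "status": ""}}
print(transform(event))
```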

Here is the entire terraform code:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }
}

# Configure the AWS Provider
provider "aws" {
  region = "ap-south-1" # insert your region code
}

resource "aws_cloudwatch_event_rule" "eventtosns" {
  name = "eventtosns"
  event_pattern = jsonencode(
    {
      account = [
        var.account, # insert your account number
      ]
    }
  )
}

# arn of the target and rule id of the event rule
resource "aws_cloudwatch_event_target" "eventtosns" {
  arn  = aws_sns_topic.eventtosns.arn
  rule = aws_cloudwatch_event_rule.eventtosns.id

  input_transformer {
    input_paths = {
      "Source"      = "$.source"
      "detail-type" = "$.detail-type"
      "resources"   = "$.resources"
      "state"       = "$.detail.state"
      "status"      = "$.detail.status"
    }
    input_template = "\"Resource name : <Source> , Action name : <detail-type>, details : <status> <state>, Arn : <resources>\""
  }
}

resource "aws_sns_topic" "eventtosns" {
  name = "eventtosns"
}

resource "aws_sns_topic_subscription" "snstoemail_email-target" {
  topic_arn = aws_sns_topic.eventtosns.arn
  protocol  = "email"
  endpoint  = var.email
}

# aws_sns_topic_policy.eventtosns:
resource "aws_sns_topic_policy" "eventtosns" {
  arn = aws_sns_topic.eventtosns.arn
  policy = jsonencode(
    {
      Id = "default_policy_ID"
      Statement = [
        {
          Action = [
            "SNS:GetTopicAttributes",
            "SNS:SetTopicAttributes",
            "SNS:AddPermission",
            "SNS:RemovePermission",
            "SNS:DeleteTopic",
            "SNS:Subscribe",
            "SNS:ListSubscriptionsByTopic",
            "SNS:Publish",
            "SNS:Receive",
          ]
          Condition = {
            StringEquals = {
              "AWS:SourceOwner" = var.account
            }
          }
          Effect = "Allow"
          Principal = {
            AWS = "*"
          }
          Resource = aws_sns_topic.eventtosns.arn
          Sid      = "__default_statement_ID"
        },
        {
          Action = "sns:Publish"
          Effect = "Allow"
          Principal = {
            Service = "events.amazonaws.com"
          }
          Resource = aws_sns_topic.eventtosns.arn
          Sid      = "AWSEvents_lambdaless_Idcb618e86-b782-4e67-b507-8d10aaca5f09"
        },
      ]
      Version = "2008-10-17"
    }
  )
}

This entire infrastructure can be deployed by running terraform apply on the above code.

Liked my content? Feel free to reach out on LinkedIn for interesting content and productive discussions.

Developing Visual Documentation using Diagrams in Python: Diagrams as Code, a Novel Approach to Graphics

· 5 min read

As developers, we have all read the documentation for different frameworks and libraries while developing features. But when it comes to writing documentation for our own feature, we are usually in a hurry, as the sprint has ended or the project has been pushed well beyond its deadline.

In addition, when we write documentation in plain black ink (just reminiscing about the previous version of documentation, in literal ink!), it is sometimes very difficult to communicate complex cloud architecture or systems design via text alone. We overcome this problem with images, but we have to leave our beloved IDEs to create them in Illustrator or Photoshop. What if I told you we can develop awesome graphics right from our IDEs, using Python?

Introducing Diagrams, a Python library that lets you create cloud architecture diagrams using code!

Diagrams

Diagrams is a Python library built for Diagrams as Code (DaC). It helps us express the architecture of our system as code and track changes to the architecture. At present it covers major providers such as AWS, Azure, and GCP, plus others such as DigitalOcean and Alibaba Cloud. It also supports on-prem setups, covering services such as Apache, Docker, and Hadoop.

Advantages of using Diagrams

Still considering whether to use Diagrams? How about the following reasons:

  1. No additional software overhead: To create diagrams traditionally, we might use software such as Illustrator or Photoshop, which requires additional licenses. Even with open-source options such as Inkscape or GIMP, we still need to install them. With Diagrams there is no such thing; just pip install diagrams and you are good to go!
  2. No need to search for high-resolution images: When developing these diagrams, we want high-resolution images that can be exported to a screen of any size, and it is often a hassle to find them. Thanks to Diagrams' built-in repository of images, we can build high-resolution architecture diagrams with ease.
  3. Ease of editing: Let's say your architecture changes during the project timeline (hey, I know it happens). Changing each of these components manually takes a lot of time and effort. Thanks to the Diagrams-as-code approach, we do this work with a few lines of code.
  4. Reusability: Creating diagrams via code helps us replicate the product without additional effort. All we need to do is import the code and, lo and behold, our work is ready in front of us. Thanks to the power of code, we get replication and reusability for free.

Now that we have seen the reasons to use it, let's get our hands dirty working with Diagrams in a Python environment.

Diagrams Implementation example with custom node development and clustering:

Here I am going to create the diagram for the project Developing Real-time resource monitoring via email on AWS using Terraform. To brief the project: I developed a serverless architecture that sends notifications for any state or status change, in a clean readable format (rather than complicated JSON), in real time via email. The architecture is built on AWS and deployed using Terraform. For more details, read this article.

At a high level, the components involved are:

  1. Eventbridge
  2. SNS and
  3. Email

The email component is not available in the Diagrams library. To create it, we can build a custom email node using the custom node development method, where we pass a local image as a new node:

from diagrams.custom import Custom

email = Custom('Name that you want to see', 'path of the image')

Now that we have our components ready, let's code:

with Diagram("AWS resource monitoring via email notification") as diagram1:
    email = '/content/drive/MyDrive/gmail-new-icon-vector-34182308.jpg'
    emailicon = Custom('Email notification', email)
    Eventbridge("Event bridge rule") >> Lambda("Lambda") >> SNS("SNS") >> emailicon

By implementing the above code, we get the following:

As we have developed this in an AWS environment using Terraform, I would like to wrap the above code in clusters, using diagrams.Cluster.

with Diagram("AWS resource monitoring via email notification") as diag:
    email = '/content/drive/MyDrive/gmail-new-icon-vector-34182308.jpg'
    emailicon = Custom('Email notification', email)
    with Cluster("Terraform"):
        with Cluster("AWS"):
            Eventbridge("Event bridge rule") >> Lambda("Lambda") >> SNS("SNS") >> emailicon

After embedding it in the cluster, the final image looks like:

Final image for the entire architecture

Here is the Final code in totality:

from diagrams import Cluster, Diagram
from diagrams.aws.compute import Lambda
from diagrams.aws.integration import SNS
from diagrams.aws.integration import Eventbridge
from diagrams.custom import Custom

with Diagram("AWS resource monitoring via email notification") as diag:
    email = '/content/drive/MyDrive/gmail-new-icon-vector-34182308.jpg'
    emailicon = Custom('Email notification', email)
    with Cluster("Terraform"):
        with Cluster("AWS"):
            Eventbridge("Event bridge rule") >> Lambda("Lambda") >> SNS("SNS") >> emailicon

Follow me on Medium and GitHub for more cloud and DevOps related content.

Happy Learning and Good Day..!