
2 posts tagged with "Go"


Optimizing Golang Docker images with multi-stage build

· 5 min read

As the scale of development required to build a product grows, a large number of developers are needed to develop, share, and maintain the code. And since each developer’s environment differs from the others, it becomes quite a hassle to create similar environments with similar library versions. To solve this issue we use Docker, which gives all developers a consistent environment. When using Docker, though, we often face the problem of bloated images, sometimes racking up several GBs of space. This very much defeats the point of how Docker has evolved beyond classic VMs: “To create lightweight, resource-light images that developers can work with easily.”

To solve this problem of Docker image bloat, we have several options, such as using a .dockerignore file to exclude unnecessary files, using distroless/minimal base images, or minimizing the number of layers. But when we are building an application, we need various build tools, which rules out the possibility of using distroless images. And since building involves several steps, there is not much room to reduce the number of layers within the Dockerfile.

The tools we use to build an application are often not needed when running it. So what if we could somehow separate out these build tools and keep only what is required to run the application? Enter multi-stage builds in Docker.

Docker: Multi-Stage Builds

Multi-stage builds are an implementation of the builder pattern within a single Dockerfile, which helps minimize the size of the final container, improves runtime performance, and allows for better organization of Docker commands. A multi-stage build breaks the single-stage Dockerfile into different sections (you can think of them as different jobs, like build, stage, etc.) within the same Dockerfile, thereby creating a separation of environments. Since each stage uses a base image that is only needed for that stage, while passing its outputs on to the next stage, we can keep the final image lean. The same result can be achieved with several Dockerfiles in a CI (Continuous Integration) pipeline, passing the output of one stage to another, but Docker’s multi-stage feature removes the need to wire all of those steps into the pipeline and helps keep it clean.

Creating the Multi-Stage Dockerfile

To explain this, we will build and run a Movie application written in Go that performs basic CRUD operations. The code for the app can be found here.

As we know, a Go app must be compiled before it can run. Compilation produces an executable (specific to the target OS), and only this executable is required to run the application. To illustrate the power of multi-stage builds, let’s first build it with a single-stage Dockerfile.
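The original single-stage Dockerfile is shown as an image in the post; a minimal sketch of it might look like the following (the binary name, port, and Go version are placeholders, not taken from the actual repository):

```dockerfile
# Single-stage build: the full Go toolchain ships inside the final image
FROM golang:1.21
WORKDIR /app
# Cache dependency downloads separately from the source code
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# Compile the CRUD app into a single executable
RUN go build -o movie-crud .
EXPOSE 8080
CMD ["./movie-crud"]
```

Everything above the CMD line, including the compiler and module cache, stays in the image even though only the compiled binary is needed at runtime.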

Once we run docker build on the above file, we get an image that is around 350 MB.

Now let’s separate the build stage and the execution stage into two different environments. For the build-stage environment, let’s use the Alpine-based Golang image, which comes loaded with all the tools required to run, test, build, and validate Go code. We build our application using the tools in this environment. Once this is done, we pass the executable on to the execution/production environment, which will run it.
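The multi-stage Dockerfile from the post is also shown as an image; a hedged sketch of the two stages could look like this (again, the binary name and port are illustrative):

```dockerfile
# Stage 1: build the binary with the full Go toolchain
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# Disable cgo so the binary has no libc dependency and runs on plain Alpine
RUN CGO_ENABLED=0 go build -o movie-crud .

# Stage 2: copy only the executable into a minimal runtime image
FROM alpine:3.19
WORKDIR /app
COPY --from=builder /app/movie-crud .
EXPOSE 8080
CMD ["./movie-crud"]
```

Only the final stage ends up in the image we ship; the builder stage, with the compiler and caches, is discarded after the COPY --from=builder step.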

Since only the executable is carried over, we no longer need most of the build environment’s tools and can work with a plain Alpine base image. Once we run docker build on this file, we observe that the image is around 13 MB (named crud_multistage in the picture below), compared to 350 MB (named crud in the picture below) from the single-stage Dockerfile. The multi-stage build delivered roughly a 95% reduction in the total size of the Docker image.

Since this image is very small, it is easier to move around and to deploy to production. Although multi-stage builds sound like a fantastic idea, there are scenarios where they should be used and scenarios where they should be avoided.

When not to use multi-stage builds:

  • When the language you are working in does not package everything into a single file (as Go does) or at least a small group of files (as bundled JavaScript does).
  • If you are planning to run docker exec commands on the final artifact to explore the application code.
  • If you need the tools and files used in the build stage further down the line to debug the final artifact.

When to use multi-stage builds:

  • When you want to minimize the total size of the final Docker image that you deploy to production.
  • When you want to speed up your CI/CD process by running stages of the Dockerfile in parallel.
  • When the different layers in your Dockerfile are straightforward and standardized.
  • When you are fine with losing the build intermediaries and only want the final Docker artifact.

Tfblueprintgen: A Tool to Simplify Terraform Folder Setup and Provide Base Resource Modules

· 4 min read

Whether it’s a React front-end app, a Go CLI tool, or a UI design project, there is always some initial toil in figuring out the optimal folder structure. Since this initial decision heavily influences the flexibility, extensibility, self-explanatory nature, and maintainability of our projects, it is a key decision for ensuring a smooth developer experience.

When working with a new tool/technology/framework, our journey typically starts with reading the official “getting started” guide on its website, or perhaps some articles on the same topic. We use these resources to get our hands dirty, often reusing their structure as the foundation for more complex real-world projects. But these articles and tutorials often serve us well only in the initial phases of a project, when the complexity is low. When we are solving a complex problem involving multiple actors, legibility and maintainability take precedence, and it becomes a daunting task to refactor later, or sometimes to rewrite everything from scratch. To reduce this hassle and tackle the issue head-on, I’ve distilled my Terraform experience into a CLI tool. It generates a battle-tested folder structure along with basic modules, allowing us to hit the ground running.

Structuring Terraform Folders

Most companies and their ops teams find it cumbersome to manage multiple environments across multiple regions for their applications. We can structure our Terraform folders in several ways:

  • Folder structure organized by region
  • Folder structure organized by resource (like AWS EC2 or Azure Functions)
  • Folder structure organized by use case (like front-end app, networking, etc.)
  • Folder structure organized by account
  • Folder structure organized by environment
  • A hybrid of all the above

Given all these options, it quickly becomes confusing for teams starting out with Terraform to decide how to structure their projects. Based on my experience, here are my two cents on structuring a Terraform project:

  • Write modular code, with each module containing all the resources required for a given use case. These modules serve as base blueprints that can be reused across different environments.
  • For example, in the case of AWS, a front-end module might consist of a CloudFront distribution, an S3 bucket, a CloudFront origin access control, an S3 bucket policy, and an S3 bucket public access block.
  • Create a folder for each environment you deploy to. This works well as long as the architecture and the deployment strategy stay the same across all environments.
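As a sketch, such a front-end base blueprint could expose its resources like this (resource names, variables, and omitted arguments are illustrative, not the tool’s actual output):

```hcl
# modules/frontend/main.tf -- base blueprint for a static front-end app
resource "aws_s3_bucket" "site" {
  bucket = var.bucket_name
}

resource "aws_s3_bucket_public_access_block" "site" {
  bucket              = aws_s3_bucket.site.id
  block_public_acls   = true
  block_public_policy = true
}

resource "aws_cloudfront_origin_access_control" "site" {
  name                              = "${var.bucket_name}-oac"
  origin_access_control_origin_type = "s3"
  signing_behavior                  = "always"
  signing_protocol                  = "sigv4"
}

# The aws_s3_bucket_policy and aws_cloudfront_distribution resources
# are omitted here for brevity.
```

Because every resource the use case needs lives inside the module, each environment only has to call the module rather than repeat the resource definitions.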

Tfblueprintgen: A Terraform tool to generate folder structure and base blueprints

Based on the above postulates, I have created a CLI tool called Tfblueprintgen, which generates the folder structure along with modular working blocks for creating AWS building blocks. The generated folder structure looks something like the one below.

Image 1 : Generated Terraform folder structure with base modules
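In plain text, the generated layout is roughly the following sketch (environment and module names are illustrative; Image 1 shows the exact output):

```text
.
├── dev/
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
├── stage/
│   └── (same three files)
├── prod/
│   └── (same three files)
└── modules/
    ├── frontend/
    ├── networking/
    └── (one folder per base resource)
```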

To run the tool, download the Windows or Linux binary from here, or build your own binary from here. Then run the binary (on Windows, double-click Tfblueprintgen.exe; on Linux, run ./Tfblueprintgen).

Image 2 : Running the tfblueprintgen tool

As shown in Image 1, the tool generates two things:

  • A parent folder containing the main Terraform files (outputs.tf, variables.tf, and main.tf) for each environment, each environment in its own folder.
  • A modules folder containing all the different basic resources, segregated into their own separate folders.

These modules can then be used within each environment folder by calling them with a module block, and the configuration can be applied with “terraform apply”.
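Calling a generated module from an environment folder could look like this sketch (the module name, relative path, and variable are hypothetical):

```hcl
# dev/main.tf -- consume a base blueprint from the modules folder
module "frontend" {
  source      = "../modules/frontend"
  bucket_name = "my-app-dev-site" # hypothetical input variable
}
```

After that, running terraform init and terraform apply inside the environment folder provisions everything the blueprint defines.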

With this setup, you can hit the ground running in no time. Feel free to add more ideas as issues, and stay tuned to the project.

Liked my content? Feel free to reach out on LinkedIn for interesting content and productive discussions.