DSB-April-2022

Discoverix Sdn Bhd work logs for April 2022

22 min read

Author: Ahmad Afiq Azmi


What I Learn 📚

Docker

  1. Docker and Container 🐳

    • Install on different operating systems (Windows & Linux)
    • Understand what a container is
    • Differentiate between an image and a container
    • Install Docker and build a simple Node.js app
    • Understand the layers of a Docker build (a worked example follows this list)
      • docker build .
    • Run the container
      • docker run -p <our-port>:<app-port> <image-id>
    • List the containers
      • docker ps -a
    • Stop a container
      • docker stop <container-name>
  2. Docker 🐳

    • Stopping & Restarting Containers
      • docker ps --help
        • shows all options for listing containers
      • restart a stopped container
        • docker ps -a to find the name, then docker start <name>
    • Understanding Attached & Detached Containers
      • docker start <name> does NOT block the terminal (detached mode is the default)
      • docker run -p <our-port>:<app-port> <image-id> blocks the terminal (attached mode is the default)
        • attached means we're listening to the output of that container
    • By default, if you run a Container without -d, you run in "attached mode"
      • If you started a container in detached mode (i.e. with -d), you can still attach to it afterwards without restarting the Container with the following command:
        • docker attach CONTAINER attaches you to a running Container with an ID or name of CONTAINER.
    • docker run -i, interactive, keeps STDIN open even if not attached (especially useful for Python)
    • docker image inspect <id>
    • Copy Files into & from a container
      • docker cp <local-path> <container>:<path> or docker cp <container>:<path> <local-path>
      • we can give a specific file path or a whole directory to copy
    • Naming & Tagging Containers and Images
      • docker run -p <our-port>:<app-port> -d --rm --name <container-name> <image-id>
        • gives the container an easy-to-remember alias for stopping and starting it
      • Images
        • FROM name:tag
        • name: defines a group of possibly more specialized images (e.g. node)
        • tag: defines a specialized image within that group (e.g. 14)
        • when building, tag an image with docker build -t <name>:<tag> .
    • Sharing Images and Containers
      • you can share the Dockerfile
        • the recipient simply runs docker build .
      • or you can share the built Image
        • the recipient downloads the image and runs a container based on it.

Reminder ⚠: The Dockerfile instructions might need surrounding files/folders (e.g. source code)
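
A minimal end-to-end sketch of the commands above, assuming a Node.js app in the current folder that listens on port 3000 (the image name, container name and config.json file are made up for illustration):

docker build -t node-demo:1 .                                   # build the image layer by layer
docker run -p 3000:3000 -d --rm --name node_demo node-demo:1    # map host port 3000 to app port 3000
docker ps -a                                                    # list running and stopped containers
docker cp ./config.json node_demo:/app/config.json              # copy a local file into the container
docker stop node_demo                                           # stop it; --rm then removes it automatically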

  • Pushing Images to:
    • Docker Hub
      • the official Docker image registry
      • public, private & "official" images
    • Private Registry
      • any provider / registry we want to use
      • only our own (or team) Images
    • Share: docker push <image_name>
    • Use: docker pull <image_name>
    • Log in via the Docker CLI with docker login before pushing to a Docker Hub repo
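
A hedged sketch of that flow, assuming a Docker Hub account named myaccount (all names illustrative):

docker login                                   # authenticate against Docker Hub first
docker tag node-demo:1 myaccount/node-demo:1   # the pushed name must include the repository
docker push myaccount/node-demo:1
docker pull myaccount/node-demo:1              # e.g. on another machine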

Managing Data & Working with Volumes

  • We saw that anonymous volumes are removed automatically when a container is removed.

  • This happens when you start / run a container with the --rm option.

  • If you start a container without that option, the anonymous volume would NOT be removed, even if you remove the container (with docker rm ...).

  • Still, if you then re-create and re-run the container (i.e. you run docker run ... again), a new anonymous volume will be created. So even though the anonymous volume wasn't removed automatically, it'll also not be helpful because a different anonymous volume is attached the next time the container starts (i.e. you removed the old container and run a new one).

  • Now you just start piling up a bunch of unused anonymous volumes - you can clear them via docker volume rm VOL_NAME or docker volume prune.

  • Bind mounts shortcut

    • Just a quick note: If you don't always want to copy and use the full path, you can use these shortcuts:
      • macOS / Linux: -v $(pwd):/app
      • Windows: -v "%cd%":/app
  • Adding more to the .dockerignore File

    • You can add more "to-be-ignored" files and folders to your .dockerignore file.

    • For example, consider adding the following two entries:

    • Dockerfile
      .git
      
    • This would ignore the Dockerfile itself as well as a potentially existing .git folder (if you are using Git in your project).

    • In general, you want to add anything which isn't required by your application to execute correctly.

  • Environment Variables & Security

    • One important note about environment variables and security: Depending on which kind of data you're storing in your environment variables, you might not want to include the secure data directly in your Dockerfile.

    • Instead, go for a separate environment variables file which is then only used at runtime (i.e. when you run your container with docker run).

    • Otherwise, the values are "baked into the image" and everyone can read these values via docker history <image>.

    • For some values, this might not matter but for credentials, private keys etc. you definitely want to avoid that!

    • If you use a separate file, the values are not part of the image since you point at that file when you run docker run. But make sure you don't commit that separate file as part of your source control repository, if you're using source control.
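
A minimal sketch of the runtime approach, using docker run's --env-file flag (the .env file and its contents here are purely illustrative):

echo "DB_PASSWORD=supersecret" > .env     # illustrative only - keep this file out of source control
docker run -d --env-file ./.env node-demo:1
docker history node-demo:1                # the value does not appear, since it was never baked into the image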

Cheat Sheet Images & Containers - Docker 🐳

Images

  • Images are one of the two core building blocks Docker is all about (the other one is "Containers").

  • Images are blueprints / templates for containers. They are read-only and contain the application as well as the necessary application environment (operating system, runtimes, tools, ...).

  • Images do not run themselves, instead, they can be executed as containers.

  • Images are either pre-built (e.g. official Images you find on DockerHub) or you build your own Images by defining a Dockerfile.

  • Dockerfiles contain instructions which are executed when an image is built (docker build .), every instruction then creates a layer in the image. Layers are used to efficiently rebuild and share images.

  • The CMD instruction is special: It's not executed when the image is built but when a container is created and started based on that image.
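
To make the layer / CMD distinction concrete, here is a minimal illustrative Dockerfile for a Node.js app (file names assumed for the sketch):

# base image: every instruction below adds a layer on top of it
FROM node:14
WORKDIR /app
# copy package.json first so the npm install layer can be cached
COPY package.json .
RUN npm install
COPY . .
EXPOSE 3000
# CMD is NOT executed at build time - it runs when a container starts
CMD ["node", "app.js"]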

Containers

  • Containers are the other key building block Docker is all about.

  • Containers are running instances of Images. When you create a container (via docker run), a thin read-write layer is added on top of the Image.

  • Multiple Containers can therefore be started based on one and the same Image. All Containers run in isolation, i.e. they don't share any application state or written data.

  • You need to create and start a Container to start the application which is inside of a Container. So it's Containers which are in the end executed - both in development and production.

Key Docker Commands

For a full list of all commands, add --help after a command - e.g. docker --help, docker run --help etc.


  • docker build . : Build a Dockerfile and create your own Image based on the file

    • -t NAME:TAG : Assign a NAME and a TAG to an image
  • docker run IMAGE_NAME : Create and start a new container based on image IMAGE_NAME (or use the image id)

    • --name NAME : Assign a NAME to the container. The name can be used for stopping and removing etc.
    • -d : Run the container in detached mode - i.e. output printed by the container is not visible, the command prompt / terminal does NOT wait for the container to stop
    • -it : Run the container in "interactive" mode - the container / application is then prepared to receive input via the command prompt / terminal. You can stop the container with CTRL + C when using the -it flag
    • --rm : Automatically remove the container when it's stopped
  • docker ps : List all running containers

    • -a : List all containers - including stopped ones
  • docker images : List all locally stored images

  • docker rm CONTAINER : Remove a container with name CONTAINER (you can also use the container id)

  • docker rmi IMAGE : Remove an image by name / id

  • docker container prune : Remove all stopped containers

  • docker image prune : Remove all dangling images (untagged images)

    • -a : Remove all locally stored images
  • docker push IMAGE : Push an image to DockerHub (or another registry) - the image name/tag must include the repository name/ url

  • docker pull IMAGE : Pull (download) an image from DockerHub (or another registry) - this is done automatically if you just docker run IMAGE and the image wasn't pulled before.
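
A short cleanup session tying these commands together (names illustrative):

docker ps -a              # find stopped containers
docker rm node_demo       # remove one container by name
docker container prune    # or remove all stopped containers at once
docker image prune -a     # remove all local images not used by any container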

Cheat Sheet Data Volumes


Data & Volumes

Images are read-only - once they're created, they can't change (you have to rebuild them to update them).

Containers on the other hand can read and write - they add a thin "read-write layer" on top of the image. That means that they can make changes to the files and folders in the image without actually changing the image.

But even with read-write Containers, two big problems occur in many applications using Docker:

  1. Data written in a Container doesn't persist: If the Container is stopped and removed, all data written in the Container is lost
  2. The Container can't interact with the host filesystem: If you change something in your host project folder, those changes are not reflected in the running container. You need to rebuild the image (which copies the folders) and start a new container

Problem 1 can be solved with a Docker feature called "Volumes".

Problem 2 can be solved by using "Bind Mounts".


Volumes

Volumes are folders (and files) managed on your host machine which are connected to folders / files inside of a container.

There are two types of Volumes:

  • Anonymous Volumes: Created via -v /some/path/in/container and removed automatically when a container is removed because of --rm added on the docker run command
  • Named Volumes: Created via -v some-name:/some/path/in/container and NOT removed automatically

With Volumes, data can be passed into a container (if the folder on the host machine is not empty) and it can be saved when written by a container (changes made by the container are reflected on your host machine).

Volumes are created and managed by Docker - as a developer, you don't necessarily know where exactly the folders are stored on your host machine, because the data stored there is not meant to be viewed or edited by you - use "Bind Mounts" if you need to do that!

Instead, especially Named Volumes can help you with persisting data.

Since data is not just written in the container but also on your host machine, the data survives even if a container is removed (because the Named Volume isn't removed in that case). Hence you can use Named Volumes to persist container data (e.g. log files, uploaded files, database files etc.).

Anonymous Volumes can be useful for ensuring that some Container-internal folder is not overwritten by a "Bind Mount" for example.

By default, Anonymous Volumes are removed if the Container was started with the --rm option and was stopped thereafter. They are not removed if a Container was started (and then removed) without that option.

Named Volumes are never removed automatically; you need to do that manually (via docker volume rm VOL_NAME, see reference below)
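
A minimal persistence sketch with a Named Volume (image name and paths illustrative):

docker run -d --rm --name app1 -v logs:/app/logs node-demo:1   # the 'logs' volume is created if it doesn't exist
docker stop app1                                               # --rm removes the container ...
docker volume ls                                               # ... but the 'logs' volume survives
docker run -d --rm --name app2 -v logs:/app/logs node-demo:1   # the new container sees the old data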


Bind Mounts

Bind Mounts are very similar to Volumes - the key difference is that you, the developer, set the path on your host machine that should be connected to some path inside of a Container.

You do that via
-v /absolute/path/on/your/host/machine:/some/path/inside/of/container .

The path in front of the : (i.e. the path on your host machine, to the folder that should be shared with the container) has to be an absolute path when using -v on the docker run command.

Bind Mounts are very useful for sharing data with a Container which might change whilst the container is running - e.g. your source code that you want to share with the Container running your development environment.

Don't use Bind Mounts if you just want to persist data - Named Volumes should be used for that (exception: You want to be able to inspect the data written during development).

In general, Bind Mounts are a great tool during development - they're not meant to be used in production (since your container should run isolated from its host machine).
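
A common development sketch combining both ideas (paths and image name illustrative): the Bind Mount shares live source code with the container, while an anonymous volume stops the mount from hiding the container's installed /app/node_modules folder.

# bind mount the project folder; the anonymous volume protects /app/node_modules
docker run -d -p 3000:3000 -v /home/user/my-project:/app -v /app/node_modules node-demo:1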


Key Docker Commands

  • docker run -v /path/in/container IMAGE : Create an Anonymous Volume inside a Container

  • docker run -v some-name:/path/in/container IMAGE : Create a Named Volume (named some-name ) inside a Container

  • docker run -v /path/on/your/host/machine:/path/in/container IMAGE : Create a Bind Mount and connect a local path on your host machine to some path in the Container

  • docker volume ls : List all currently active / stored Volumes (by all Containers)

  • docker volume create VOL_NAME : Create a new (Named) Volume named VOL_NAME . You typically don't need to do that, since Docker creates them automatically for you if they don't exist when running a container

  • docker volume rm VOL_NAME : Remove a Volume by its name (or ID)

  • docker volume prune : Remove all unused Volumes (i.e. not connected to a currently running or stopped container)




Networking: (Cross-)Container Communication

  • Docker Network Drivers

    • Docker Networks actually support different kinds of "Drivers" which influence the behavior of the Network.

    • The default driver is the "bridge" driver - it provides the behavior shown in this module (i.e. Containers can find each other by name if they are in the same Network).

    • The driver can be set when a Network is created, simply by adding the --driver option.

      • docker network create --driver bridge my-net
        
    • Of course, if you want to use the "bridge" driver, you can simply omit the entire option since "bridge" is the default anyways.

    • Docker also supports these alternative drivers - though you will use the "bridge" driver in most cases:

      • host: For standalone containers, isolation between container and host system is removed (i.e. they share localhost as a network)

      • overlay: Multiple Docker daemons (i.e. Docker running on different machines) are able to connect with each other. Only works in "Swarm" mode which is a dated / almost deprecated way of connecting multiple containers

      • macvlan: You can set a custom MAC address to a container - this address can then be used for communication with that container

      • none: All networking is disabled.

      • Third-party plugins: You can install third-party plugins which then may add all kinds of behaviors and functionalities

    As mentioned, the "bridge" driver makes most sense in the vast majority of scenarios.


Cheat Sheet Networks & Requests


Network / Requests

In many applications, you'll need more than one container - for 2 main reasons:

  1. It's considered a good practice to focus each container on one main task (e.g. run a web server, run a database, ...)
  2. It's very hard to configure a Container that does more than one "main thing" (e.g. run a web server AND a database)

Multi-Container apps are quite common, especially if you're working on "real applications".

Often, some of these need to communicate though:

  • either with each other
  • or with the host machine
  • or with the world wide web


Communicating with the World Wide Web (WWW)

Communicating with the WWW (i.e. sending HTTP requests or other kinds of requests to other servers) is thankfully very easy.

Consider this JavaScript example - the same applies no matter which technology you're using:

fetch('https://some-api.com/my-data').then(...)

This very basic code snippet tries to send a GET request to some-api.com/my-data.

This will work out of the box, no extra configuration is required! The application, running in a Container, will have no problems sending this request.


Communicating with Host Machine

Communicating with the Host Machine (e.g. because you have a database running on the Host Machine) is also quite simple, though it doesn't work without any changes.

One important note: If you deploy a Container onto a server (i.e. another machine), it's very unlikely that you'll need to communicate with that machine. Communicating to the Host Machine typically is a requirement during development - for example because you're running some development database on your machine.

Again, consider this JS example:

fetch('localhost:3000/demo').then(...)

This code snippet tries to send a GET request to some web server running on the local host machine (i.e. outside of the Container but not the WWW).

On your local machine, this would work - inside of a Container, it will fail. Because localhost inside of the Container refers to the Container environment, not to your local host machine which is running the Container / Docker!

But Docker has got you covered!

You just need to change this snippet like this:

fetch('host.docker.internal:3000/demo').then(...)

host.docker.internal is a special address / identifier which is translated to the IP address of the machine hosting the Container by Docker.

Important: "Translated" does not mean that Docker goes ahead and changes the source code. Instead, it simply detects the outgoing request and is able to resolve the IP address for that request.
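
A hedged side note: on Linux hosts, host.docker.internal may not resolve out of the box. Assuming Docker 20.10 or newer, it can be mapped explicitly when starting the container (image name illustrative):

docker run --add-host=host.docker.internal:host-gateway node-demo:1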


Communicating with Other Containers

Communicating with other Containers is also quite straightforward. You have two main options:

  1. Manually find out the IP of the other Container (it may change though)
  2. Use Docker Networks and put the communicating Containers into the same Network

Option 1 is not great since you need to search for the IP on your own and it might change over time.

Option 2 is perfect though. With Docker, you can create Networks via docker network create SOME_NAME and you can then attach multiple Containers to one and the same Network.

Like this:

docker run --network my-network --name cont1 my-image
docker run --network my-network --name cont2 my-other-image

Both cont1 and cont2 will be in the same Network.

Now, you can simply use the Container names to let them communicate with each other - again, Docker will resolve the IP for you (see above):

fetch('http://cont1/my-data').then(...)
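
A fuller sketch of Option 2, assuming the official mongo image and a made-up application image that connects to it:

docker network create my-network
docker run -d --network my-network --name mongodb mongo
docker run -d --network my-network --name app my-app-image
# inside the app, the container name acts as the hostname, e.g.:
# mongodb://mongodb:27017/my-database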


Docker Core Concepts

Containers:

  • Read-write layer on top of an image
  • Isolated
  • Single-task-focused
  • Shareable, reproducible
  • Stateless (+ volumes)

Images:

  • Created with instructions (layers)
  • Blueprints for Containers
  • Code + environment
  • Read-only / does not run
  • Can be built + shared

Key Commands

Build an image based on a Dockerfile

docker build -t NAME:TAG .

-t NAME:TAG : Name and tag (version) of the image
. : Build context

Run a container based on a remote or local Image

docker run --name NAME --rm -d IMAGE

--name NAME : Container name
--rm : Remove once stopped
-d : Detached mode

Share (push) an Image to a Registry (default: DockerHub)

docker push REPOSITORY/NAME:TAG

Fetch (pull) an Image from a Registry (default: DockerHub)

docker pull REPOSITORY/NAME:TAG

Docker Containers & Data

Containers are isolated and stateless.

Isolated

  • Containers have their own data and filesystem, detached from the host machine filesystem

    • Use Bind Mounts to connect host machine folders
    • -v /local/path:/container/path

Stateless

  • They can store data internally, but data will be lost if the container is removed and replaced by a new one

    • Use Volumes to persist data
    • -v NAME:/container/path

Docker vs Docker Compose

Repeating long docker build and docker run commands gets annoying - especially when working with multiple containers.

Docker Compose allows you to pre-define build and run configurations in a .yaml file.

docker-compose up : Build missing images and start all containers
docker-compose down : Stop and remove all started containers
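
A minimal illustrative docker-compose.yaml for a two-container setup (service names, image and paths all made up for the sketch):

version: "3.8"
services:
  backend:
    build: ./backend        # build from a local Dockerfile
    ports:
      - "3000:3000"
  mongodb:
    image: mongo            # official image from Docker Hub
    volumes:
      - data:/data/db       # named volume so database files persist
volumes:
  data:

With this file in place, docker-compose up -d builds anything missing and starts both containers; docker-compose down stops and removes them again.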

Local Host (Development) vs Remote Host (Production)

Local Host / Development:

  • Isolated, encapsulated, reproducible development environments
  • No dependency or software clashes

Remote Host / Production:

  • Isolated, encapsulated, reproducible environments
  • Easy updates: simply replace a running container with an updated one

Develop your application in the same environment you'll run it in after deployment

Deployment is Optional

It's perfectly fine to use Docker (and Docker Compose) for local development!

  • Encapsulated environments for different projects
  • No global installation of tools
  • Easy to share and reproduce

Deployment Considerations

  1. Replace Bind Mounts with Volumes or COPY
  2. Multiple containers might need multiple hosts - but they can also run on the same host (depends on application)
  3. Multi-stage builds help with apps that need a build step (see the sketch below)
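
A hedged multi-stage sketch, assuming a Node.js app whose build step emits static files that nginx can serve (all names and paths illustrative):

# stage 1: build environment (discarded from the final image)
FROM node:16 AS build
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
RUN npm run build

# stage 2: slim runtime image that only keeps the build output
FROM nginx:stable
COPY --from=build /app/build /usr/share/nginx/html
EXPOSE 80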

Control vs Ease-of-use

Control:

  • You can launch a remote server, install Docker and run your containers
  • Full control, but you also need to manage everything

Ease-of-use:

  • You can use a managed service instead
  • Less control and extra knowledge required, but easier to use, less responsibility

Kubernetes


What is Kubernetes?

According to their own website, Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.

This means that Kubernetes is a collection of tools / services and concepts that help you deploy containerized applications - typically across multiple hosts (i.e. multiple remote machines).

Kubernetes simplifies the deployment and configuration of complex containerized applications and it helps with topics like scaling and load balancing.

Services like AWS ECS also help with that but of course you have to follow the AWS-specific rules, syntax and logic / concepts for that. To a certain extent, you are "locked in" - if you want to switch to a different provider / host, you have to "translate" all your deployment configs etc.

With Kubernetes, you can set up a configuration (following the Kubernetes rules and concepts) which will work on any host that supports Kubernetes - no matter if it's a cloud provider or your own, Kubernetes-configured data center.


How does Kubernetes work?

We'll dive into specific and concrete examples later (in the next section), but generally, you can run a couple of commands against a Kubernetes Cluster (a network of machines which are configured to support Kubernetes) to then deploy and start Containers.

Typically, you will write down Kubernetes configuration files which describe your target state - with a couple of Kubernetes commands, you can then bring that state to life on your cluster.

Here's an example configuration file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: users-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: users
  template:
    metadata:
      labels:
        app: users
    spec:
      containers:
        - name: users
          image: my-repo/users-application
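
Assuming the file is saved as deployment.yaml and kubectl is configured against a cluster, the target state is applied like this:

kubectl apply -f deployment.yaml   # send the desired state to the cluster
kubectl get deployments            # verify that the 2 replicas come up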


Why Kubernetes

If your projects and hence your deployments become more complex (multiple Containers, scaling, load balancing), you probably don't want to run and monitor all your containers on your remote machines manually.

You might not want to do this for various reasons - also see the "Deployment" section, earlier in the course.

Kubernetes makes setting up and configuring complex deployments easy. It will also automatically monitor your containers and restart them if they go down. It makes scaling and load balancing easy (as it's built-in).

Managing data in volumes across multiple machines is also easy to set up. And much more. For all these reasons, you might want to use Kubernetes.

As mentioned above already, you could also use any other managed Container service (e.g. AWS ECS) but you would then be kind of "locked in". With Kubernetes, you indeed use an open-source "tool" and you can use your Kubernetes configuration on any machine and any provider which supports Kubernetes.

What Kubernetes is NOT!

It's important to understand that Kubernetes is NOT one of the following things:

  • a cloud provider
  • a cloud provider service (though cloud providers might offer Kubernetes-specific services)
  • a tool or service that manages infrastructure - Kubernetes will NOT create and launch any machines or do anything like that (managed Kubernetes services by cloud providers might do that)
  • a single tool or software - Kubernetes is a collection of concepts and tools (see below)

Core Concepts

Kubernetes is a collection of concepts and tools.

Specifically, a Kubernetes Cluster is required to run your Containers on. A Kubernetes Cluster is simply a network of machines.

These machines are called "Nodes" in the Kubernetes world - and there are two kinds of Nodes:

  • The Master Node: Hosts the "Control Plane" - i.e. it's the control center which manages your deployed resources
  • Worker Nodes: Machines on which the actual Containers run

The Master Node hosts a couple of "tools" / processes:

  • An API Server: Responsible for communicating with the Worker Nodes (e.g. to launch a new Container)
  • A Scheduler: Responsible for managing the Containers, e.g. determining on which Node to launch a new Container

CHMOD 644

Permissions of 644 mean that the

  • owner of the file has READ and WRITE access.
  • group members and other users on the system only have READ access.

Difference between 744 and 644

chmod 744:

  • Owner can READ, WRITE, EXECUTE
  • Group & others can READ only

chmod 644:

  • Owner can READ, WRITE
  • Group & others can READ only
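
A quick illustrative check of the difference (file name made up):

chmod 644 notes.txt
ls -l notes.txt    # -rw-r--r--  owner: read+write, group/others: read
chmod 744 notes.txt
ls -l notes.txt    # -rwxr--r--  owner: read+write+execute, group/others: read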

TAR

~$ sudo tar -zcvf /home/<username>/temp/<timestamp>_access_log.tar.gz -C /home/appuser/docker/sg_ai_app/volumes/sg_ai_fe/logs/ $(cd /home/appuser/docker/sg_ai_app/volumes/sg_ai_fe/logs/ && ls <timestamp>*_access.log)
~$ sudo tar -zcvf <directory_tar_output/filename.tar.gz> -C <working_directory> <list_of_file_we_want_to_compile>

Note:

directory_tar_output/filename.tar.gz : Location to write the resulting tar file
<working_directory> : Directory containing the files we want to compile (-C changes into it first)
<list_of_file_we_want_to_compile> : We can use command substitution to get the list

It turns out, $() is called a command substitution. The command in between $() or backticks (`) is run and the output replaces $(). It can also be described as executing a command inside of another command.

For example: $(cd /home/appuser/docker/sg_ai_app/volumes/sg_ai_fe/logs/ && ls <timestamp>*_access.log)
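
To verify the result, the archive's contents can be listed without extracting it (generic tar flags; the path matches the example above):

~$ tar -tzf /home/<username>/temp/<timestamp>_access_log.tar.gz    # -t lists, -z handles gzip, -f names the archive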




What I Do 💻

  1. Create commands for maintaining the Nginx server running in Docker
  2. Learn pipe, tee, xargs
  3. Apply command substitution $()

Info 📝: It turns out, $() is called a command substitution. The command in between $() or backticks (`) is run and the output replaces $(). It can also be described as executing a command inside of another command.

  4. Apply commands to:
    • empty out files: echo -n "" > <file>
      • i.e. truncate the file to zero length
    • change file permissions to mode 755: chmod -R 755 <dir>
    • archive all files whose names match a timestamp, using a glob with ls or find
      • sudo tar -cvf <todaytimestamp-month>_backup.tar $(ls 202202*_access.log)
      • find . -name "202202*_access.log" | xargs tar -cv | xz -9 -T0 > archive.tar.xz
    • to extract the tar file
      • tar -xf <filename>.tar.xz
    • remove archived tar files except the N latest "timestamp_filename" ones: ls -F *<filename>.tar | head -n -2 | xargs -r rm
    • remove files older than N days
      • find /home/appuser/db_dump/ -mtime +7 -name '*.tar.bz2' -ls -exec rm {} \;
      • find /home/appuser/db_dump/ -mtime +7 -name '*.tar.bz2' -ls

Reminder ⚠: Be careful removing files with find. Run the command with -ls (and without -exec rm) first to check what you are removing.

  5. Identify Heroku Dashboard error with GitHub

  6. Identify React.js error when running the frontend application inside a Docker container.

    • This happened when I tried to run multiple containers (database, backend and frontend) together at the same time.

    • At first glance, I thought my Dockerfile or my docker-compose.yaml configuration was the problem, but in the end it was just version compatibility: I needed to downgrade from Node.js 17+ to Node.js 16.

Error message: "error:0308010C:digital envelope routines::unsupported"

  7. Create 2 commands to delete the MongoDB logs

Removing the mongod.log.<swap_date_time_utc> files: one command filters by year and month, the other by year only:

  • Year and month
~$ sudo find . -name "mongod.log.<YYYY-MM>-*" | xargs sudo rm
  • Year
~$ sudo find . -name "mongod.log.<YYYY>-*" | xargs sudo rm

Reminder: Before piping into xargs sudo rm, run the find command on its own and make sure its output lists exactly the files you want to delete.

Note: Replace <YYYY-MM> with the year and month of your choice (e.g. 2022-01) and <YYYY> with the year of your choice (e.g. 2022).


References 🌐

  1. How to extract tar.xz files in Linux and unzip all files
  2. How to change permissions for a folder and its subfolders/files in one step
  3. How to tar a file in Linux using command line
  4. How to Extract or Unzip tar.gz Files from Linux Command Line
  5. 3 Ways to Delete All Files in a Directory Except One or Few Files with Extensions
  6. Delete all files older than 30 days, based on file name as date
  7. 5 Ways to Empty or Delete a Large File Content in Linux
  8. How to Remove Files and Directories Using Linux Command Line
  9. Linux / UNIX: How To Empty Directory
  10. List the contents of a tar or tar.gz file
  11. Docker & Kubernetes: The Practical Guide [2022 Edition]
  12. Heroku and Github : Items could not be retrieved, Internal server error
  13. Docker | How to Dockerize a Django application (Beginners Guide)
  14. Introduction to Microservices, Docker, and Kubernetes
  15. Error message "error:0308010C:digital envelope routines::unsupported"
  16. DOCKER - How To Resolve "react exited with code 0"
  17. Official Website & Docs - Kubernetes
  18. Head command in Linux with examples
  19. What is $() in Linux?