These are my notes from containerizing a legacy application with Docker Compose. We have to run multiple instances of the application on a single VM, because we are unable to secure additional VMs for this education environment. The application is the target of containerization because running multiple instances natively would require mass reconfiguration (mostly around TCP ports). With containers, we can use the same application configuration file for every instance and map the application's TCP port to different groups of ports on the host, leveraging Docker's port mapping. The auxiliary services, such as the Cassandra database and Elasticsearch, are not containerized because they can be shared by multiple application instances. In other words, we use Docker to isolate processes of the same application.
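For illustration, a minimal compose sketch of the idea (the service names, config path and port numbers below are hypothetical placeholders, not the actual setup):

services:
  dhunch1:
    image: docker.digihunch.com/dhunch
    volumes:
      - ./app-config.yml:/opt/dhunch/etc/app-config.yml:ro   # same config file for all instances
    ports:
      - "18301:8301"     # host port group 1
  dhunch2:
    image: docker.digihunch.com/dhunch
    volumes:
      - ./app-config.yml:/opt/dhunch/etc/app-config.yml:ro
    ports:
      - "28301:8301"     # host port group 2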
Prepare environment
The CentOS server needs to have docker-ce (through YUM) as well as docker-compose (direct download). They can be installed this way:
$ sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
$ sudo yum install docker-ce docker-ce-cli containerd.io
$ sudo curl -L "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
$ sudo chmod +x /usr/local/bin/docker-compose
$ sudo systemctl start docker
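Optionally, enable Docker at boot and verify both tools are in place:

$ sudo systemctl enable docker
$ docker --version
$ docker-compose --version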
Our Docker registry is not publicly available, so we need to transfer the Docker image to the remote server and load it into the local image store there. We first examine the registry from a machine that can reach it:
$ curl -XGET https://admin:<password>@docker.digihunch.com/v2/dhunch/tags/list | python -m json.tool
Once we identify the image, we export it to a tar file:
$ docker save docker.digihunch.com/dhunch > dhunch_image.tar
SCP the file to the remote server and load it there:
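For example, from the local machine (the remote host name below is a hypothetical placeholder):

$ scp dhunch_image.tar dhunch@remote-server:/home/dhunch/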
$ docker load -i /home/dhunch/dhunch_image.tar
$ docker image ls
We need to distinguish these commands:
- docker save: saves a (non-running) image, with all its layers, to a file
- docker export: saves a container's filesystem (running or paused) to a file
- docker import: imports the contents of a tarball to create a filesystem image; typically used with docker export
- docker load: loads an image from a tar archive or STDIN; typically used with docker save
Build docker-compose file
I need to cater to the customer environment with a newly created docker-compose file. The customer environment includes specific storage and networking configurations. Docker Compose's official documentation is here. We repeat the following commands during troubleshooting:
$ docker-compose up -d
$ docker-compose exec dhunch1 bash
$ docker container ls
Once we start the container, its status might turn unhealthy. The documentation explains two reasons for seeing an unhealthy container:
- a single run of the health check command takes longer than the specified timeout
- the health check fails; the health check command is retried a number of times before the container is declared unhealthy
In our case, it is most likely because the application does not pass a health check built into the image. We need to understand where the health check was defined. There are four ways to enable a health check:
- Dockerfile instruction when building the image
- Docker run command
- Docker-compose or docker stack yaml file
- Docker service
With #1, unfortunately, you can't reverse engineer an image to view the Dockerfile that was used to build it and review the health check statement. What you can do is check docker events, or inspect the container: the health check definition appears under the Config section, the current health status under State.Health, and further detail can be found in the log file given by LogPath in the inspection result. Once we determined this was the case, we could disable, or override, the built-in healthcheck command from the image with a statement in docker-compose.
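For example, we can confirm where the health check is defined by inspecting the container (use the container name shown by docker container ls), then disable or override it in the compose file. The test command below is a hypothetical placeholder:

$ docker inspect <container_name> --format '{{json .Config.Healthcheck}}'
$ docker inspect <container_name> --format '{{json .State.Health}}'

services:
  dhunch1:
    healthcheck:
      disable: true
      # or override instead of disabling:
      # test: ["CMD", "curl", "-f", "http://localhost:8301/status"]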
For the network interface, docker-compose also allows us to specify a MAC address for each container with the mac_address keyword (needed for the license key). MAC address generators are available on the internet.
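For example (the value below is an arbitrary locally-administered address, not the real one):

services:
  dhunch1:
    mac_address: 02:42:c0:a8:01:10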
ENTRYPOINT vs CMD
The difference between ENTRYPOINT and CMD is very important when launching a container. Some literature also mentions RUN, which is only used when building a new image layer, so it is not relevant here (in the context of launching a container from an image). ENTRYPOINT and CMD have similar functionality: both let you specify a command to run. The difference is whether they can be overridden by command-line arguments that the user provides to docker-compose or docker run in an ad-hoc manner. As the name suggests, whatever is specified under ENTRYPOINT must be executed when the container launches, regardless of any ad-hoc commands. CMD, on the other hand, is just a default that saves users from typing a command every time they run docker-compose or docker run. Should the user prefer a different command, it can be provided as an explicit argument, and it will override the pre-defined CMD entry in the Dockerfile or the command entry in docker-compose.yml.
Both CMD and ENTRYPOINT support shell and exec forms. More details here.
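A quick sketch of how the two interact, using a hypothetical image:

# Dockerfile (exec form for both)
FROM alpine
ENTRYPOINT ["ping"]
CMD ["-c", "3", "localhost"]

# docker run myping                -> runs: ping -c 3 localhost
# docker run myping -c 1 8.8.8.8   -> runs: ping -c 1 8.8.8.8 (CMD overridden, ENTRYPOINT kept)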
Choice of Networking
With a single-host deployment, the containerized application needs to communicate with existing, non-containerized services on the host, such as the database or Elasticsearch. If the container uses the host network, it shares the network interfaces with the host and does not have its own IP address. The host network removes the isolation between container and host, which allows the container to run the application that was licensed to the host based on MAC address. There is also no port mapping from container to host: the container simply uses ports on the host, and is subject to the availability of TCP/UDP ports there.
We will have to use the bridge network here. We can force the MAC address of the app container and pre-generate the license. For the container to communicate with a service on the host through the bridge network, there are two problems to address:
- the container needs to know the IP of the host (layer-3 connectivity, ping);
- the host service needs to be reachable from the container (layer-4 connectivity, telnet).
Docker creates its own interface for each bridge network. If it's an unnamed network, i.e. not explicitly declared under the networks section in docker-compose, then the interface docker0 is used. If it's a named network, then an interface whose name starts with br- is used.
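For a named network, the interface name is br- followed by the first 12 characters of the network ID, which we can look up (the network name below is hypothetical):

$ docker network ls
$ docker network inspect dhunch_appnet --format '{{.Id}}' | cut -c1-12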
The first problem is easier to address: we simply need the IP address of the host on that interface. We can validate by pinging the host from the container. Docker can also use host.docker.internal to reference the host; unfortunately, this stopped working on Linux since 18.09.3.
It is reportedly fixed in 20.10, and until that is available we may add the entry manually. The following command outputs the entry to add to /etc/hosts in the container:
# ip -4 addr show $(basename -a /sys/class/net/* | grep ^br-) | grep -Po 'inet \K[\d.]+' | awk '{print $1 " host.docker.internal"}'
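Once Docker Engine 20.10 is available, the same result should be achievable with the host-gateway alias in the compose file instead of the manual entry (assuming Engine 20.10 or later on the host):

services:
  dhunch1:
    extra_hosts:
      - "host.docker.internal:host-gateway"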
To do this automatically in docker-compose, we need a few tricks:
- store the host IP in a host environment variable (using an export command)
- use compose to pass the host environment variable to a container environment variable
- have the container write its environment variable to /etc/hosts
The compose file will contain a section like this (the alpine image does not include bash, so the final shell is sh):

services:
  myenv1:
    image: alpine
    command: >
      sh -c "apk update &&
             echo $$HostDNSLine >> /etc/hosts &&
             sh"
    #network_mode: bridge
    environment:
      - HostDNSLine=${HOSTDNSREC}
Then we run it with the following:
# export HOSTDNSREC=$(echo 1.2.3.4 host.docker.internal) && docker-compose up
The second problem is harder to address because the service on the host may not bind to Docker's interface. Some services, such as ssh, bind to all interfaces on the host, and you can telnet to port 22 using any IP address associated with the host. This is not the case for most other services, such as Cassandra or Elasticsearch: they typically bind only to the main interface, such as ens192 or eth0, and not to the Docker interface. To make such a service available to the container, we either need to bind it to the Docker interface, or use iptables rules as an alternative.
Suppose it is a named network, Docker's interface name is br-90ae024d5324, and the service on the host listens on port 9042; we need the following two commands on the host:
# sysctl -w net.ipv4.conf.br-90ae024d5324.route_localnet=1
# iptables -t nat -A PREROUTING -p tcp -i br-90ae024d5324 --dport 9042 -j DNAT --to-destination 127.0.0.1:9042
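We can verify the NAT rule on the host, then test layer-4 connectivity from inside the container (assuming nc is available in the container image):

# iptables -t nat -L PREROUTING -n --line-numbers
$ docker-compose exec dhunch1 nc -zv host.docker.internal 9042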
Note that docker-compose can be configured to run sysctl inside the container, but not on the host. If there are multiple ports, we can turn this into a shell script:
#!/bin/bash
tcp_port_list="9200 9042 8302 8303 8304 8305 8306"
if_name=$(basename -a /sys/class/net/* | grep ^br- | head -1)
echo "enable route localnet on interface $if_name"
sysctl -w net.ipv4.conf.$if_name.route_localnet=1
for tcp_port in $tcp_port_list; do
  echo "open host tcp port $tcp_port to interface $if_name"
  iptables -t nat -A PREROUTING -p tcp -i $if_name --dport $tcp_port -j DNAT --to-destination 127.0.0.1:$tcp_port
done
ip -4 addr show $if_name | grep -Po 'inet \K[\d.]+' | awk '{print $1 " host.docker.internal"}'
On the other hand, binding a service to multiple interfaces usually requires some re-configuration of the service itself. For example, for Elasticsearch we need to update the network.host entry in elasticsearch.yml to include multiple IP addresses. For Cassandra, we need to update rpc_address to 0.0.0.0, or set rpc_interface, in cassandra.yaml.
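As a sketch of those changes (the IP addresses are placeholders; note that Cassandra requires broadcast_rpc_address to be set when rpc_address is 0.0.0.0):

# elasticsearch.yml
network.host: ["192.168.1.10", "172.18.0.1"]

# cassandra.yaml
rpc_address: 0.0.0.0
broadcast_rpc_address: 192.168.1.10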
Integration with storage
The application in the container needs to store files on storage available to the host, whether it is an NFS share or a block disk. We can use volume mapping in docker-compose to map a path in the container to a path presented to the host as a persistent volume.
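A minimal mapping in the compose file looks like this (host and container paths are hypothetical):

services:
  dhunch1:
    volumes:
      - /mnt/dhunch_data:/opt/dhunch/var/data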
At this step, we might run into permission issues. By default, a container initializes as root (uid=0), and the entrypoint script launches the application as root. When the application writes to the persistent volume, files are written as the root user. In the legacy non-container setup, we expect the application to write files as the dhunch user. Moreover, an NFS volume will not allow writing files as root if the server has root squash configured. To address this, there are two approaches:
- launch container as a regular user
- launch container as root user, then have the entrypoint script launch application as regular user (dhunch)
For approach 1, we need to tell Docker to launch the container as a regular user by specifying the uid and gid under which the container runs the application. We can set the user key in the compose YAML from an environment variable:
user: ${CURRENT_UID}
Then we assign the environment variable before running docker-compose:
# export CURRENT_UID=$(id -u dhunch):$(id -g dhunch) && docker-compose up
This allows the container to initialize as the regular user. However, if the entrypoint script needs to perform activities that require root permission within the container, it will fail. For example, a regular user in the container will not be able to update /etc/hosts.
With approach 2, we do not specify user in docker-compose, so the container initializes as root. Then the entrypoint script launches the application as a regular user, for example with the su command before launching Java:
su dhunch -c "
  exec java \
    -Xms512M -Xmx8192M \
    -Djava.io.tmpdir=$APP_HOME/var/tmp \
    -server \
    -XX:CompileCommandFile=$APP_HOME/etc/hotspot_compiler \
    -jar $APP_HOME/lib/jar/jruby-complete-*.jar \
    --1.9 \
    $APP_HOME/lib/rubybin/runapp.rb
"
Before doing this, we need to first create the dhunch user within the container, and its uid and gid must match those on the host, so that when the container writes files as the dhunch user, they map to the correct uid on the host.
groupadd -g 1011 dhunch
useradd -m -c 'regular user' -u 1011 -g 1011 dhunch
To further understand how uid and gid work, here are two posts with more information.
This user ownership setup will also work for NFS. To configure NFS, we need some extra client-side configurations in the container, as well as a special volume driver for NFS. Refer to this post.
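As a sketch of one common option, the built-in local driver can mount NFS directly from the compose file (the server address and export path below are hypothetical):

volumes:
  dhunch_nfs:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=192.168.1.20,rw,nfsvers=4"
      device: ":/export/dhunch"

services:
  dhunch1:
    volumes:
      - dhunch_nfs:/opt/dhunch/var/data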