Michaël Gallego


Using Elastic Beanstalk multi-container with PHP

I’m a big fan of Elastic Beanstalk on AWS, as it allows me to deploy applications quite easily, while still benefiting from all AWS features (load balancer, auto scaling…). The recent addition of API Gateway to the AWS toolbox will likely make using Elastic Beanstalk even more convenient.

Old-school Elastic Beanstalk

Until now, my usual workflow when using Elastic Beanstalk was to use the pre-built AMIs. Those pre-built AMIs come in different languages (from NodeJS to Java, and of course PHP). The Beanstalk team has been quite fast at upgrading PHP versions (although the image for PHP 5.6 was released nearly 1 year after the PHP release), and the images usually come with what answers most typical needs: Apache 2.4, the main PHP extensions already installed (Intl, Opcache…).

This is definitely the fastest way to deploy an app on Beanstalk, but unfortunately, after using them for nearly 2 years, some problems have arisen:

  • You are dependent on AWS for upgrading to newer PHP versions. As I said, Amazon took nearly 1 year to release an AMI for PHP 5.6.
  • If you want to install custom extensions, you’ll need to use the infamous .ebextensions config files. This is always a very painful experience.
  • From my experience, there were some compatibility breaks between AMIs. While this can be expected, it is always frustrating that updating to a newer AMI (even with the same PHP version) can break your app.
  • Those pre-built AMIs are not your local machine: something that works on your machine may not work once deployed to Elastic Beanstalk.

Docker to the rescue!

In 2014, Elastic Beanstalk launched support for single-container Docker. However, I’m not a very technical guy when it comes to servers, and using Docker on OS X used to be very painful. As a consequence, I never managed to make this work.

More recently, Elastic Beanstalk launched multi-container Docker environments. This feature is actually based on another AWS service called Amazon ECS. Under the hood, it uses Docker Compose. I’ve finally decided it’s time to use Docker!

Docker provides some major advantages:

  • You are no longer restricted to Amazon images. Actually, I’ve been able to deploy a demo app using PHP 7 beta!
  • You can finely configure the image, and install the extensions you actually need.
  • The underlying architecture you deploy to Elastic Beanstalk will be the same as what you have on your computer.

Unfortunately, the documentation is really lacking. AWS only has a very short guide with a single index.php file. There are still a lot of grey areas for me, and I’d be happy if you could share your thoughts to make this workflow even easier!

Pre-requisites

For this tutorial, I assume that you have Docker and AWS EB CLI tools properly installed. If you are using OS X, you will also need boot2docker.
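For the OS X case, here is a quick setup sketch using the boot2docker CLI (exact output varies, and the `eval` line assumes a Bash-compatible shell):

```shell
# Create and start the small VM that hosts the Docker daemon on OS X
boot2docker init
boot2docker up

# Point the local Docker client at that VM (exports DOCKER_HOST & friends)
eval "$(boot2docker shellinit)"
```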

Basic application structure

Let’s first create a new folder, with this structure:

php-app
    composer.json
    public
        index.php
    vendor
        ...
Dockerrun.aws.json

As you can see, the root folder contains a php-app folder that will contain the typical PHP application. At the root, we also have a special file called Dockerrun.aws.json. This file tells Elastic Beanstalk which Docker images to use and what the relationships between them are (remember that we are using multi-container Docker, so we could have one PHP container as well as one MySQL container).

First, you will need to initialize your Elastic Beanstalk application using the eb init command (be sure to select the “Docker Multi-Container” environment type).
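For reference, a minimal sketch of that step (application name and region are placeholders, and the exact prompts depend on your EB CLI version):

```shell
# Run at the project root and answer the prompts interactively;
# when asked for a platform, pick "Docker Multi-Container"
eb init my-php-app -r eu-west-1
```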

Here is our index.php file:

<?php

chdir(dirname(__DIR__));    
include 'vendor/autoload.php';

phpinfo();

Nothing fancy here. Then, our Dockerrun.aws.json:

{
  "AWSEBDockerrunVersion": 2,
  "volumes": [
    {
      "name": "php-app",
      "host": {
        "sourcePath": "/var/app/current/php-app"
      }
    }
  ],
  "containerDefinitions": [
    {
      "name": "php-apache",
      "image": "php:7.0-apache",
      "essential": true,
      "memory": 128,
      "mountPoints": [
        {
          "sourceVolume": "php-app",
          "containerPath": "/var/www/html",
          "readOnly": true
        }
      ],
      "portMappings": [
        {
          "hostPort": 80,
          "containerPort": 80
        }
      ]
    }
  ]
}

This file actually follows the Amazon ECS service syntax, as explained here.

The volumes key defines where the files you want to use for your application live. /var/app/current refers to the root of your application. In our case, we are using /var/app/current/php-app, as this is where our actual PHP code lies.

Then, the containerDefinitions key configures each container: which image to use, and so on. I won’t go into details here, but here are some important parameters:

  • image: a Docker image is simply a pre-built system that contains all the software you need to launch a container that will execute your code. There are A LOT of existing images, especially for PHP. The official PHP Docker image comes with a lot of pre-built variants, for each PHP version, with or without Apache. For instance, in this tutorial, we are using the PHP 7.0 variant with Apache.
  • mountPoints: this mounts the source volume at a path inside the container. Here, we tell Elastic Beanstalk to mount the volume identified by the php-app name (the one we’ve defined in volumes) at the /var/www/html folder.
  • portMappings: this exposes a port for Apache. This is needed; otherwise Elastic Beanstalk will refuse to create the container.

Creating a custom Docker image

Unfortunately, the official Docker image for PHP is deliberately lightweight. For instance, major PHP extensions (Opcache, Intl, PDO MySQL…) and Apache modules (Rewrite…) are not installed by default. The problem is that the multi-container environment type in Elastic Beanstalk does not allow us to customize a Docker image: it can only pull pre-built images.

Therefore, we will need to create our own image and publish it to Docker Hub, in order to be able to use it in our application.

I’m not sure yet if it’s better to create the customized Dockerfile as part of your application or as part of another repository. Either way, I’ve created a small Dockerfile, built upon the official PHP 7 + Apache image, that automatically installs the Opcache, Intl, and PDO MySQL/PostgreSQL extensions, as well as enabling the Rewrite module for Apache.

Here is the final Dockerfile:

FROM php:7.0-apache

# System libraries needed to build the Intl and PostgreSQL extensions
RUN apt-get update && apt-get install -y zlib1g-dev libicu-dev libpq-dev

# Install the PHP extensions we need
RUN docker-php-ext-install opcache intl pdo_mysql pdo_pgsql

# Enable the Apache Rewrite module
RUN a2enmod rewrite
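If you prefer not to rely on automated builds, the image can also be built and pushed by hand — a sketch, assuming you are in the folder containing the Dockerfile and own the corresponding Docker Hub repository:

```shell
# Build the image locally, then publish it so Beanstalk can pull it
docker build -t maestrooo/eb-docker-php7 .
docker login
docker push maestrooo/eb-docker-php7
```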

I’ve pushed this code to this demo repository, and configured Docker Hub to automatically rebuild the image whenever the repo is updated. You can see the image on its Docker Hub page.

Using the image in Elastic Beanstalk

Now that we have a more usable image, we will need to modify our Dockerrun.aws.json. But first, since our Apache now has the Rewrite module, let’s add a new config folder at the root of our project, with a simple vhost.conf file, so that our structure now looks like this:

config
    vhost.conf
php-app
    composer.json
    public
        index.php
    vendor
        ...
Dockerrun.aws.json

The vhost.conf file sets up a simple Virtual Host that works well with Zend Framework 2, for instance:

<VirtualHost *:80>
 ServerName default
 DocumentRoot "/var/www/html/public"

 <Directory "/var/www/html/public">
   DirectoryIndex index.php
   AllowOverride All
   Order allow,deny
   Allow from all

   RewriteEngine On
   RewriteCond %{REQUEST_FILENAME} -s [OR]
   RewriteCond %{REQUEST_FILENAME} -l [OR]
   RewriteCond %{REQUEST_FILENAME} -d
   RewriteRule ^.*$ - [NC,L]
   RewriteRule ^.*$ index.php [NC,L]
 </Directory>
</VirtualHost>

Let’s now update our Dockerrun.aws.json:

{
  "AWSEBDockerrunVersion": 2,
  "volumes": [
    {
      "name": "php-app",
      "host": {
        "sourcePath": "/var/app/current/php-app"
      }
    },
    {
      "name": "apache",
      "host": {
        "sourcePath": "/var/app/current/config"
      }
    }
  ],
  "containerDefinitions": [
    {
      "name": "php-apache",
      "image": "maestrooo/eb-docker-php7",
      "essential": true,
      "memory": 128,
      "mountPoints": [
        {
          "sourceVolume": "php-app",
          "containerPath": "/var/www/html",
          "readOnly": true
        },
        {
          "sourceVolume": "apache",
          "containerPath": "/etc/apache2/sites-enabled",
          "readOnly": true
        }
      ],
      "portMappings": [
        {
          "hostPort": 80,
          "containerPort": 80
        }
      ]
    }
  ]
}

As you can see, we have added a few things.

First, because we have a new folder (config), we are going to create a new volume (called “apache” for simplicity, as it contains Apache config files). The name is not really important; the source path, however, is.

Next, we make a few small changes to the container definition:

  • First, we have changed the image to use our custom Docker image.
  • Then, we have added a new mount point. Basically, this says: “hey, mount the source volume called apache at the location indicated by the container path” (here, this is simply the config folder for Apache).

Installing other dependencies (like a database)

Okay, we now have a functional PHP + Apache setup… but let’s admit that this is pretty limited!

We’d like, at least, a database (MySQL for instance, but feel free to choose the one of your choice).

One issue I had initially was that I actually wanted a different setup for development and production. For instance, to save on costs and to be able to work without Internet access, I’d prefer to have my database installed locally. On the other hand, once deployed, I’d prefer to use a hosted service like Amazon RDS instead of hosting the database on my own instances. This led me to the question: considering that Dockerrun.aws.json is used to install software both locally and in production, how could I manage dependencies that would install only locally?

Thankfully, Nick Humrich and his tweet gave me the solution. Dockerrun.aws.json supports an additional key called localContainerDefinitions. This one works exactly the same as containerDefinitions, but the associated images are only pulled when using the eb local command. Exactly what we wanted!

As a consequence, let’s add these lines to our Dockerrun.aws.json file:

"localContainerDefinitions": [
  {
    "name": "mysql",
    "image": "mysql:5.6",
    "essential": true,
    "portMappings": [
      {
        "hostPort": 3306,
        "containerPort": 3306
      }
    ],
    "environment": [
      {
        "name": "MYSQL_ROOT_PASSWORD",
        "value": "password"
      },
      {
        "name": "MYSQL_DATABASE",
        "value": "my_db"
      }
    ]
  }
]

This code creates a new container (hence the benefits of multi-container Docker) that I’ve named mysql. It pulls the official MySQL image, version 5.6.

I’ve specified it as essential. This means that if, for any reason, Docker is unable to create this container, the whole deployment will fail. I’ve then mapped port 3306 to this container.

Finally, the environment key passes additional environment variables that are used to automatically create a new database. There are other pre-defined variables that you can find in the documentation.
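To make the link with the application side concrete, here is a small sketch of how a PDO-style DSN for this local container could be assembled from those settings (the host value and variable names are assumptions; with boot2docker, the VM’s IP replaces 127.0.0.1):

```shell
# Values mirror the localContainerDefinitions above
DB_HOST=127.0.0.1
DB_PORT=3306
DB_NAME=my_db

# DSN string the PHP application could hand to PDO
DSN="mysql:host=${DB_HOST};port=${DB_PORT};dbname=${DB_NAME}"
echo "$DSN"
```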

For now, I’ve been unable to figure out how to make the data persistent within the Dockerrun.aws.json syntax. I’ve tried to create a new volume and mount it, but unfortunately the data is removed whenever I kill the container.

Testing

The nice thing about Docker and Elastic Beanstalk is that we can test our application locally. Because we are using Docker, this means that what we will push to the server will be exactly the same thing. In other words: if it works locally, it will work once deployed.

Elastic Beanstalk comes with a nice tool that lets you run the containers locally. To do that, just run eb local run at the root. The first time, Docker will fetch the images from the Internet; they will be cached for subsequent runs.

Once it’s done, run the eb local open command, and it will open the application in your browser. If everything is going well, you should see the phpinfo() output!

Connecting to the instance

You may want to connect to your local container, in order to run commands or see what is happening. Fortunately, this is quite easy to do with Docker and boot2docker. Just follow these steps:

  1. Get the container ID with the docker ps command.
  2. Type docker exec -it <container-id> /bin/bash
  3. You are now connected to the container!
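The two steps can even be collapsed into one, assuming the container was started from our custom image (the ancestor filter is an assumption; it requires a reasonably recent Docker):

```shell
# Find the container running our image and open a shell inside it
docker exec -it "$(docker ps -q --filter ancestor=maestrooo/eb-docker-php7)" /bin/bash
```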

Deploying

Now that we have everything working locally, we can deploy. This is actually easy: at the root, use the command eb deploy (you’ll need to create an environment on Beanstalk first).

Common problems

I’ve been able to make this work nicely. However, a few problems can arise. I’ve found a solution for some of them; for others, I still have no idea how to deal with them. Do not hesitate to write a comment if you have the answer!

Environment variables

You will need environment variables (to hold various API keys, some useful environment data…). To that end, the Elastic Beanstalk CLI tool offers a nice command: eb local setenv. For instance, to set the DB host: eb local setenv DB_HOST=foo.

The nice thing is that it saves this locally, inside the .elasticbeanstalk folder (it won’t be committed, which is nice!). The next question is: how can we add the variable to our Beanstalk environment?

To do that, add a file inside your .ebextensions folder. This folder is specific to Elastic Beanstalk, and is read when you deploy. Call this file 01_env_variables.config, for instance (the .config extension is required). The file is simple YAML that must follow this format.

To add environment variable, copy-paste this code:

option_settings:
  - namespace: aws:elasticbeanstalk:application:environment
    option_name: DB_HOST
    value: default_value

Once deployed, you will find this value in your environment configuration, and you will be able to fill the value right into the Beanstalk interface.

Then, you can retrieve the value in your PHP code using the getenv() function.

To push vendor or not to push?

Most of the time, we do not want to commit the vendor folder when pulling dependencies from Composer. Often, people let Elastic Beanstalk fetch the dependencies using the composer.json file.

However, I highly discourage you from doing that, for these reasons:

  • What if a new version of a dependency is released that breaks compatibility (this can happen, even for minor releases)? This means you may not have the same version locally and on AWS.
  • What if GitHub is down when Beanstalk is trying to fetch dependencies?
  • Small instances, like t1.micro, may not have enough memory to resolve a complex composer.json file.

Fortunately, there is a clean solution to that: Elastic Beanstalk allows you to have a specific file called .ebignore. This file works exactly like .gitignore, but is used only by Elastic Beanstalk when it creates the zip for deployment.

This way, you can create this file and copy-paste the whole content of your .gitignore, BUT remove the vendor/* part. As a consequence, the vendor folder will be part of the deployed artifact.
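As a sketch, that derivation can even be automated (the sample .gitignore content below is purely illustrative; yours will differ):

```shell
# Illustrative .gitignore, for the sake of the example
printf 'vendor/*\nnode_modules/*\n.env\n' > .gitignore

# Keep every rule except the vendor exclusion, so Composer
# dependencies end up in the deployment zip
grep -v '^vendor' .gitignore > .ebignore
cat .ebignore
```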

How to use local tools?

Another problem I’ve failed to find an answer to is how to connect to a local tool, like a local MySQL or a local Elasticsearch. I’ve tried several things but, unfortunately, I’ve been unable to make my Docker containers connect to a MySQL DB installed locally on my Mac. If anyone has an answer, I’d be happy to hear about it!

Having said that, I consider it better practice to add other containers as part of the localContainerDefinitions key.

Conclusion

Multi-container Docker in Beanstalk is really exciting. It opens a lot of possibilities for people like me who hate managing servers. It really helps to mimic a production environment locally.

Also, because everything is embedded into a Docker image, other developers can get up and running more quickly, without having to care about manually installing the right PHP version, its dependencies…

However, some workflows are still very obscure to me, and prevent me from actually using this more seriously for now.