AWS Lambda – running python bundles and arbitrary executables

In a previous post, I mentioned using an Amazon Linux EC2 instance to create AWS Lambda compatible packages. While this works, another way to create packages that can run on AWS Lambda is to build them locally in an Amazon Linux Docker image. One downside I’ve found to this method is that these images are sometimes incompatible with some of the system files in the Lambda runtime. At the time of writing, though, I found that the docker-lambda project both creates compatible Lambda Linux images and is a great way to shorten Lambda development cycles, since it emulates a Lambda environment you can invoke locally.

To start, here are the instructions to build a Python 3.6 docker lambda image (of course, make sure you have Docker installed):

git clone https://github.com/lambci/docker-lambda.git   # clone project from git
cd docker-lambda/                                       # go to project directory
npm install                                             # install project node.js dependencies
cd python3.6/build                                      # go to the python Dockerfile build
docker build .                                          # build the image as per instructions in 
                                                        # the Dockerfile (takes time...)
docker images                                           # show docker images, note the id of the built image
docker tag 32e7f5244861 lambci/python3.6:build          # name and tag the built docker image using its id 
docker run -it lambci/python3.6:build /bin/bash         # create a new container based on new image and
                                                        # run it interactively (/bin/bash command is needed
                                                        # because CMD ["/bin/bash"] is not included as the
                                                        # last line in the Dockerfile)
exit                                                    # leave docker container
docker ps -a                                            # locate the newly created container from the
                                                        # above command, and note the name given to it
docker start -i vibrant_heyrovsky                       # resume interactive session with the container
                                                        # using the container name found above

So, now you have a console to a compatible Amazon Linux shell. To create Lambda functions, you basically zip all the relevant files and upload the zip to AWS Lambda; after that, you can remotely invoke the required function on Lambda.

My current method is to keep two console windows open – one is the above console to the Docker bash, and the other is a console of the host operating system (whatever OS you are running Docker on). This way, you can easily zip the Lambda packages in the Docker console, and then copy them from your OS console (and from there upload them to AWS Lambda).

Setting up an AWS lambda user

Now that we have a local Lambda-compatible environment, let’s create an actual AWS user that will be used to upload and run the packages that we’ll create in our local Lambda-compatible Docker container.

To run the following, make sure you first have the AWS CLI installed on your OS.

Let’s create our lambda user using the above CLI. Of course, the assumption is that you already have a credentials file in your .aws directory which enables you to do the next part. If not, you’ll need to create a user with the appropriate privileges from the AWS IAM console, get that user’s AWS key id and AWS secret, then locally run aws configure and follow the instructions. This will create your initial credentials file.

We’ll now create a user that we’ll use for AWS lambda. The information here is based on this excellent simple tutorial with some minor changes to suit this one.

# Create a user group 'lambda_group'
$ aws iam create-group --group-name lambda_group

# Create a user 'lambda_user'
$ aws iam create-user --user-name lambda_user

# Add our user to the group
$ aws iam add-user-to-group --user-name lambda_user --group-name lambda_group

# Create a password for this user
$ aws iam create-login-profile --user-name lambda_user --password _your_password_here_

# Create a CLI access key for this user
$ aws iam create-access-key --user-name lambda_user

# Save user's Secret and Access Keys somewhere safe - we'll need them later

Now that we have a user, let’s authorise this user to run lambda functions, copy s3 files etc. To do this, we create a policy and grant that policy to the user we just created.

For that, create a file with the following json, and name it lambda_policy.json

{
   "Version": "2012-10-17",
   "Statement": [{
       "Effect": "Allow",
       "Action": [
          "iam:*",
          "lambda:*",
          "s3:*"
       ],
       "Resource": "*"
   }]
}

Now grant the above policy to our lambda user:

aws iam put-user-policy --user-name lambda_user --policy-name lambda_all --policy-document file://lambda_policy.json
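If you’d rather generate the policy file from a script (handy if you later want to narrow those wildcard actions), here is a minimal Python sketch. The helper name build_policy is my own invention and not part of any AWS tool; it only builds the same JSON structure shown above.

```python
import json


def build_policy(actions, resource="*"):
    """Return an IAM policy document allowing `actions` on `resource`."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": list(actions),
            "Resource": resource,
        }],
    }


# Same permissions as the lambda_policy.json shown above
policy = build_policy(["iam:*", "lambda:*", "s3:*"])
policy_json = json.dumps(policy, indent=3)
print(policy_json)
```

Redirect the printed output to lambda_policy.json and pass that file to put-user-policy as before.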

Now, let’s configure our AWS CLI so that we can perform actions as lambda_user

$ aws configure --profile lambda_user

> AWS Access Key ID [None]: <your key from the above create-access-key command>
> AWS Secret Access Key [None]: <your secret from the above create-access-key command>
> Default region name [None]: us-east-1 (or whatever region you use)
> Default output format [None]: json 

# AWS stores the keys under [lambda_user] in the ~/.aws/credentials file
# (the region and output format go to ~/.aws/config)
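To double check that the profile really landed in the credentials file, something like the following Python sketch could be used. It parses a sample string rather than your real file, and both the helper name profile_keys and the key values in the sample are made up for illustration.

```python
import configparser


def profile_keys(credentials_text, profile):
    """Return the sorted setting names stored for `profile`, or None if absent."""
    parser = configparser.ConfigParser()
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        return None
    return sorted(parser[profile].keys())


# Made-up sample in the same INI layout as ~/.aws/credentials
sample = """
[default]
aws_access_key_id = AKIAEXAMPLEDEFAULT
aws_secret_access_key = example-default-secret

[lambda_user]
aws_access_key_id = AKIAEXAMPLELAMBDA
aws_secret_access_key = example-lambda-secret
"""

print(profile_keys(sample, "lambda_user"))
print(profile_keys(sample, "no_such_profile"))
```

To check the real file, read its contents from your home directory and pass the text to profile_keys instead of the sample.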

Finally, we need to create a role which is needed when creating a lambda function and determines what actions the lambda function is permitted to perform.

To create the role, create a file named basic_lambda_role.json with the following json text:

{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": { "Service" : "lambda.amazonaws.com" },
        "Action": "sts:AssumeRole"
    }]
}

Then create the role via the CLI:

$ aws iam create-role --role-name basic_lambda_role --assume-role-policy-document file://basic_lambda_role.json

The above will return the role identifier as an Amazon Resource Name (ARN), for example: arn:aws:iam::716980512849:role/basic_lambda_role. You’ll need this ARN whenever you create a new lambda function, so hold on to it.
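An ARN is just a colon-separated string (arn:partition:service:region:account-id:resource), so it is easy to pull apart if a script needs the account id or role name. A small sketch, where parse_arn is my own helper name:

```python
def parse_arn(arn):
    """Split an ARN into its named components."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError("not an ARN: %r" % arn)
    keys = ("partition", "service", "region", "account_id", "resource")
    return dict(zip(keys, parts[1:]))


info = parse_arn("arn:aws:iam::716980512849:role/basic_lambda_role")
print(info)
```

Note that the region field is empty for IAM resources, since IAM is a global service.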

We now have all the ingredients to create, update and invoke AWS Lambda functions. We’ll do that later, but first – let’s get back to creating the code package that is required when creating a lambda function. The code package is just a zip file which contains all your code and its dependencies that are uploaded to lambda when you create or update your lambda function. The next section will explain how to do this.

Creating an AWS Lambda code package

We’ll start with creating and invoking a Python package that has some dependencies, and then show how to create a package that can run arbitrary executables on AWS Lambda.

Creating a local Python 3.6 package

So now, let’s make a package example that will return the current time in Seoul. To do this, we’ll install a python module named arrow, but we’ll install it in a local directory since we need to package our code with this python module. To do this, open your docker console that is running the lambda compatible environment and:

cd /var/task             # move to the base lambda directory in the docker image
mkdir arrowtest          # Create a directory for the lambda package we're going to make
cd arrowtest             # move in to the directory
pip install arrow -t ./  # install the arrow python library in this directory
ls                       # take a look at what has been added

Next, we’ll create our lambda function which we’ll later invoke. (You might want to install an editor of your choice on the Docker console using yum, for example via yum install vim.)

So, let’s create arrowtest.py :

import sys
import arrow

def lambdafunc(event, context):
    utc = arrow.utcnow()
    SeoulTime = utc.to('Asia/Seoul')
    return "The time in Seoul is: %s" % (SeoulTime.format())

#just for local testing
if len(sys.argv)==2 and sys.argv[1]=='test':
    print(lambdafunc(None, None))

and test that it works locally in the docker shell:

python arrowtest.py test

Ok, so we have the Python file with the lambda function and we have the dependencies, so now all we need to do is zip the contents of the entire directory and pass this zip file as a parameter when creating the lambda function.

This would work. However, with larger Python libraries, you might want to remove certain files that aren’t being used by your Python code and would just waste space on Lambda. My rather primitive but effective method for doing this is to clone the complete directory and start removing files that seem pointless until something breaks, then put them back and try other things until I’m happy with the size reduction. In the cloned directory, I actually rename directories before removing them, as it’s easier to run the script after renaming and rename them back if it turns out that the directory is needed by the script.

Let’s do it for this example:

cd ..
pwd  # should be /var/task
cp -r arrowtest arrowtest_clone
cd arrowtest_clone
ls      # let's see what's in here
du -hd1 # note how much space each directory takes (1.2MB)

Installed python libraries can contain many directories and files of different types. There are python files, binary dynamic libraries (usually with .so extensions) and others. Knowing what these are can help decide what can be removed to make the zipped package more lean. In this example, the directory sizes are a non issue, but other python libraries can get much larger.
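If you prefer to see the sizes from Python rather than du, a rough sketch follows. The helper names entry_sizes and largest_first are mine, and the numbers are plain file sizes in bytes, so they will not match du’s disk-usage figures exactly. The demo builds a throwaway directory so that it can run anywhere.

```python
import os
import tempfile


def entry_sizes(root):
    """Return {top-level entry name: total file size in bytes} for `root`."""
    sizes = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    total += os.lstat(os.path.join(dirpath, name)).st_size
            sizes[entry.name] = total
        else:
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes


def largest_first(sizes):
    """Sort (name, size) pairs so the biggest space users come first."""
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)


# Throwaway demo directory standing in for a package directory
with tempfile.TemporaryDirectory() as demo:
    os.mkdir(os.path.join(demo, "pkg"))
    with open(os.path.join(demo, "pkg", "a.py"), "w") as handle:
        handle.write("x" * 100)
    with open(os.path.join(demo, "six.py"), "w") as handle:
        handle.write("y" * 10)
    demo_sizes = entry_sizes(demo)

print(largest_first(demo_sizes))
```

Pointing entry_sizes at "." from inside the cloned directory lists the real candidates for removal.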

An example of some stuff I deleted:

rm -rf *.dist-info
rm -rf *.egg-info
rm six.py
rm -rf dateutil # we're not making use of this - it's just wasting space
# test that the script is still working after all we've deleted
python arrowtest.py test 
du -hd1 # we're down to 332K from 1.2MB and the script still works.

Now, let’s package this directory in a zip file. If you don’t have zip installed on your Docker container yet, then:

yum install zip

And now, after removing unneeded files and dependencies, let’s pack our directory:

zip -r package.zip .
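If zip isn’t available (or you want the packaging step in a script), Python’s zipfile module can do roughly the same job. This is a sketch under my own naming (zip_directory), not what the zip command does internally: it stores files only, relative to the directory root, so the handler file ends up at the top level of the archive. The demo uses a temporary directory with placeholder files.

```python
import os
import tempfile
import zipfile


def zip_directory(src_dir, zip_path):
    """Zip the contents of src_dir (not the directory itself) into zip_path."""
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for dirpath, _dirnames, filenames in os.walk(src_dir):
            for name in filenames:
                full = os.path.join(dirpath, name)
                if os.path.abspath(full) == zip_abs:
                    continue  # don't add the archive to itself
                archive.write(full, os.path.relpath(full, src_dir))
    return zip_path


# Placeholder files standing in for the real package directory
with tempfile.TemporaryDirectory() as demo:
    os.mkdir(os.path.join(demo, "lib"))
    for rel in ("arrowtest.py", os.path.join("lib", "mod.py")):
        with open(os.path.join(demo, rel), "w") as handle:
            handle.write("# placeholder\n")
    package = zip_directory(demo, os.path.join(demo, "package.zip"))
    with zipfile.ZipFile(package) as archive:
        names = sorted(archive.namelist())

print(names)
```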

Now that we have the package on the Docker container, let’s copy it to our OS from our OS console:

docker cp vibrant_heyrovsky:/var/task/arrowtest_clone/package.zip .

(replace vibrant_heyrovsky with the name of your Docker container).

So we have a zipped package that we tested on docker – let’s create a lambda function from this package and invoke it (replace arn:aws:iam::716980512849:role/basic_lambda_role with your own ARN):

aws lambda create-function --region us-east-1 --function-name lambdafunc --zip-file fileb://package.zip --role arn:aws:iam::716980512849:role/basic_lambda_role --handler arrowtest.lambdafunc --runtime python3.6 --profile lambda_user
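The --handler value has the form module.function: arrowtest.lambdafunc means the function lambdafunc inside arrowtest.py. The following sketch only mimics that naming convention for local experiments; it is not the actual Lambda bootstrap code, and resolve_handler is a name I made up. It uses standard library modules for the demo so it runs without the package.

```python
import importlib


def resolve_handler(handler):
    """Turn a 'module.function' string into the callable it names."""
    module_name, _, function_name = handler.rpartition(".")
    if not module_name:
        raise ValueError("expected 'module.function', got %r" % handler)
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# Demo with a standard library function instead of arrowtest.lambdafunc
dumps = resolve_handler("json.dumps")
print(dumps({"ok": True}))
```

Inside the package directory, resolve_handler("arrowtest.lambdafunc")(None, None) would call the function the same way the local test does.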

And finally, let’s see if we can get AWS Lambda to tell us the current time in Seoul:

aws lambda invoke --invocation-type RequestResponse --function-name lambdafunc --region us-east-1 --log-type Tail  --profile lambda_user out.txt  #invoke the function
cat out.txt  # check the result

The file out.txt contains the return value of the called lambda function. Next we’ll see how to update to a new package and how to pass parameters to the lambda function.
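Two details are worth knowing when reading the results. The return value in out.txt is JSON-encoded (so a returned string is wrapped in double quotes), and with --log-type Tail the JSON that the CLI prints to the console includes a base64-encoded LogResult field holding the tail of the execution log. A sketch for decoding both, using made-up sample data in place of a real invocation (decode_invoke_output is my own helper name):

```python
import base64
import json


def decode_invoke_output(cli_json, payload_text):
    """Return (status code, decoded log tail, handler return value)."""
    response = json.loads(cli_json)
    log_tail = base64.b64decode(response.get("LogResult", "")).decode("utf-8")
    return response.get("StatusCode"), log_tail, json.loads(payload_text)


# Made-up sample data standing in for a real invocation
sample_log = "START RequestId: example\nEND RequestId: example\n"
sample_cli_json = json.dumps({
    "StatusCode": 200,
    "LogResult": base64.b64encode(sample_log.encode("utf-8")).decode("ascii"),
})
sample_payload = '"The time in Seoul is: 2018-03-20 21:00:00+09:00"'

status, log_tail, value = decode_invoke_output(sample_cli_json, sample_payload)
print(status)
print(log_tail)
print(value)
```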

To be continued…

Quick file search for Windows

Can’t believe I haven’t heard of this utility before (in my defense, I haven’t been a heavy Windows user for a few years now):

Anyway: the software’s name is “Everything” – get it here

Amazing indexing speed of all your drives and immediate lookup of any filename, including parts of a name, regex searches etc. Perfect for when you know that file you’re looking for exists somewhere in that almost infinite maze of folders and files but using an exhaustive search would take hours or more.

Compiling binaries for AWS Lambda

To use binaries from within AWS Lambda functions, you need to have the compiled binaries available within the Lambda package used, and for that you sometimes need to compile them yourself. To do that, you can either compile on an Amazon Linux Docker image (more information on how to do this here and from the AWS docs) or compile the binary on an EC2 instance that uses the same environment as the underlying AWS Lambda execution environment. The latter is what’s described here.

You can launch this EC2 by the AMI mentioned here:

Note that the EC2 instance that will be launched is an Amazon Linux EC2 (not Debian/Ubuntu), which means that to log in you’ll need to use:

ssh -i my-private-key.pem ec2-user@the.ipa.ddr.ess

The following table summarizes various user names for different EC2 platforms (taken from here):

AMI type            ssh username to use
Amazon Linux AMI    ec2-user
Centos AMI          centos
Debian AMI          admin or root
Fedora AMI          ec2-user or fedora
RHEL AMI            ec2-user or root
SUSE AMI            ec2-user or root
Ubuntu AMI          ubuntu or root

Of course, don’t forget to chmod 600 your private key or you’ll get a complaint from ssh…

Once logged in to the Amazon Linux EC2 instance, you’ll want to install the development tools needed to compile whatever it is you’re compiling. The Debian family of Linux distributions (Debian, Ubuntu, Mint etc.) uses apt/apt-get/dpkg to install packages, and there you’d run sudo apt-get install build-essential to get gcc and the other necessary development tools. Amazon Linux, like RedHat and Fedora, uses the yum package manager, so to achieve more or less the same, enter:

sudo yum groupinstall "Development Tools"

Now you’re ready to start compiling. Hopefully, the next post will explain how to package and launch the compiled binaries and their dependencies. I expect a lot of environment/path definitions, dependency tracking, and other gotchas along the way, but one step at a time…

AWS Lambda or EC2

Everything is a trade-off (of course, that’s actually another Maslow-type hammer, but nevertheless a useful one) and many times (for me at least) a lot of time is wasted on agonizing over what the trade-offs are before committing to an implementation. Of course, part of the reason is that I never want to commit to anything, and want to have the maximum flexibility at all times, which can lead to analysis paralysis – so the first tip from this post is “It doesn’t really matter”. The reason it doesn’t really matter boils down to the wise phrase “it’s far more important to get it going than to get it right”. Of course, one could counter with “there’s never time to do it right but always time to do it over”, and the funny thing is that the conflict between these two phrases is yet another trade-off (this is so meta…) but the bottom line is that while it’s always nice if you choose the correct path from the start, you’ll never be able to avoid mistakes indefinitely, and getting into “deer in the headlights” mode from fear of making a mistake is a far worse choice than any other option.

Still, the ability to make a good choice from the start is nice, and therefore I wanted to share an article that helps decide between using AWS Lambda or EC2 for a service you wish to implement.

https://www.trek10.com/blog/lambda-cost/

Running Python Script from PHP as www-data

The problem
A Python script invoked from PHP via shell_exec runs fine when PHP is invoked from the command line, but fails when PHP is triggered by browser access.

Reason
PHP, when triggered by browser access, is invoked by the web server as the user www-data, while from the command line it is run as the user ubuntu.

Seeing what happens when PHP is run from the command line as user www-data helps to understand why running the script fails.

One method to run php as www-data from the command line is to enable a shell for www-data user. This is done by modifying /etc/passwd so that user www-data has a shell (change the existing /usr/sbin/nologin or whatever to /bin/bash or something similar) and then sudo su www-data and try to run the python script again (see this reply for details).

Doing the above quickly showed that one of the imports in the Python script fails when running it under www-data.
Comparing python3 -m site when running under user www-data vs. when running under ubuntu showed there is a difference in the module search paths.

Adding the missing path found for user ubuntu to user www-data via sys.path.insert was neither scalable nor possible (since the ubuntu user path is inaccessible to the www-data user), so the best way was to install the Python modules (in my case, imagehash) in a way that makes them accessible to the www-data user.
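To spot the difference between the two users’ search paths without eyeballing the python3 -m site output, the comparison can be sketched in Python. The paths below are hypothetical examples of what each user might see, and missing_paths is my own helper name; in practice you would dump sys.path once as each user and feed in the two lists.

```python
import sys


def missing_paths(working_paths, failing_paths):
    """Return entries of working_paths that are absent from failing_paths."""
    failing = set(failing_paths)
    return [path for path in working_paths if path not in failing]


# Hypothetical search paths, as each user might see them
ubuntu_paths = [
    "/usr/lib/python3/dist-packages",
    "/home/ubuntu/.local/lib/python3.5/site-packages",
]
www_data_paths = ["/usr/lib/python3/dist-packages"]

print(missing_paths(ubuntu_paths, www_data_paths))

# sys.path holds the real list for whichever user runs this script
print(len(sys.path), "entries on this interpreter's search path")
```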

The solution, found here, illustrates how this is done:

sudo mkdir /var/www/.local
sudo mkdir /var/www/.cache
sudo chown www-data:www-data /var/www/.local
sudo chown www-data:www-data /var/www/.cache
sudo -H -u www-data pip install imagehash

Method #2

Of course, a simpler alternative to this is to run apache as ubuntu, which will make all the above unnecessary assuming the situation/security requirements enable it, in which case you might want to also change the htdocs directory:

sudo vim /etc/apache2/envvars # change APACHE_RUN_USER and APACHE_RUN_GROUP to ubuntu
cd /etc/apache2/sites-available/
sudo cp 000-default.conf 000-ubuntu.conf
sudo vim 000-ubuntu.conf   # change the path for DocumentRoot
sudo a2dissite 000-default.conf
sudo a2ensite 000-ubuntu.conf

Also,
sudo vim /etc/apache2/apache2.conf

and add the following:

<Directory /home/ubuntu/>
        Options Indexes FollowSymLinks
        AllowOverride None
        Require all granted
</Directory>

Don’t forget to

sudo systemctl restart apache2

or

sudo service apache2 reload

Quick, fresh Ubuntu 16.04 image in VirtualBox

Download VirtualBox from https://www.virtualbox.org/wiki/Downloads

Download the 16.04 Ubuntu image for VirtualBox from www.osboxes.org (the credential details for logging in are here: https://www.osboxes.org/faqs/)

Open VirtualBox, click the New icon, select ‘Linux’ for type, ‘Ubuntu 64-bit’ for version (assuming you downloaded the VirtualBox 64bit vdi), click Next a couple of times until you reach the ‘Hard Disk’ section and then click the ‘use an existing virtual hard disk file’ radio button and browse for the vdi file you downloaded.

Log in using the credentials given (osboxes.org as password). After logging in, it is recommended to install the guest additions (to be able to have things like shared folders and a shared clipboard). Do this by selecting Devices->Insert Guest Additions CD image… from the VirtualBox menu. This will add a CD icon to the Unity bar on the left, meaning the guest additions CD has been mounted. Open a terminal window and type the following at the command prompt:
sudo /media/osboxes/VBox_GAs_5.2.8/VBoxLinuxAdditions.run
This will install the guest additions on the Ubuntu guest system. Ignore the ‘modprobe vboxsf failed’ error at the end and restart the system by typing sudo reboot at the command line.
After successfully installing the guest additions you can eject the CD (right-click the CD icon and select eject).

After logging in again, open the terminal window again and do:
sudo apt-get update
sudo apt-get upgrade

The above update and upgrade will take a few minutes, so after they are done, it is recommended to export your image in case you want to run the updated VirtualBox image on another machine without going through the entire process above yet again. Do this by closing the window and selecting ‘Power off the machine’. Next, open the Oracle VirtualBox manager, select File->Export Appliance from the menu, select the name of the machine and click Next a couple of times. The resulting file/files can then be imported into a different VirtualBox installation.

Selecting columns in Sublime Text turning your screen upside down?

The way to select a column in Sublime Text on Windows is by holding Ctrl+Alt and then pressing the up or down arrow keys. You might get a surprise though if you try it, in the form of your entire screen display flipping upside down. Some people don’t even know this is possible, so they get extra surprised.

This is because certain Intel graphics cards have hotkeys that trigger functions of the card, for example flipping and rotating the screen (this is handy, for instance, if you want to work with your screen physically rotated to portrait rather than landscape mode, and you’d obviously want the desktop rendered accordingly). The keys for selecting a text column in Sublime Text happen to be taken by the graphics card’s hotkey for flipping the screen.

To disable this key combination being hijacked by the graphic card, you can either customize the graphic card to use a different key combination (Ctrl+Alt+F12 > Options) or disable the hotkeys for graphic functions altogether (Ctrl+Alt+F12 > Options > Uncheck “Enable Hot Keys”)

Found the above thanks to a comment here (by L_7337)

Importing an .ics file to Google calendar

Someone sent you an invitation to an event as an .ics file, and you wish to add it to your Google calendar

The first part is to import the file. To do this, click the + icon to the right of the “Add a friends calendar” located on the left side of the page. After clicking the icon, from the available options select ‘Import’

Now, the obvious thing to do would be to just click the button that says “Select file from your computer”, select the .ics file and click import, but if you do that, chances are that you’ll see the dreaded:

‘Failed to import events: Could not upload your events because you do not have sufficient access on the target calendar..’

The solution for this is to manually edit the .ics file prior to importing it and replace all occurrences of “UID:” with “UID:X” (without the quotes). After doing this and saving the file, proceed with the import and all should be fine.
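If the file holds many events, the edit can be scripted. Here is a minimal Python sketch, with two caveats: prefix_uids is my own helper name, and it only touches lines that start with UID: (where the property normally lives) rather than literally every occurrence in the file. The sample calendar text is made up.

```python
def prefix_uids(ics_text, prefix="X"):
    """Insert `prefix` right after 'UID:' on every UID line, keeping line endings."""
    fixed = []
    for line in ics_text.splitlines(keepends=True):
        if line.startswith("UID:"):
            line = "UID:" + prefix + line[len("UID:"):]
        fixed.append(line)
    return "".join(fixed)


# Made-up sample event
sample = (
    "BEGIN:VCALENDAR\r\n"
    "BEGIN:VEVENT\r\n"
    "UID:abc123@example.com\r\n"
    "SUMMARY:Team lunch\r\n"
    "END:VEVENT\r\n"
    "END:VCALENDAR\r\n"
)

result = prefix_uids(sample)
print(result)
```

Read the .ics file as text, pass it through prefix_uids, write the result to a new file and import that one.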

Learned this from here

Notepad tricks in Google Docs

Many people are not aware that since the early days of the simple notepad app that comes bundled with Windows, it had the following undocumented feature: If you enter the text .LOG as the first line of the file, then every time you open the file with notepad, it will append the current date and time to the end of the document and scroll there. This is quite handy when you want a file that keeps track of the time when you added new entries.
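To make the Notepad behaviour concrete, here is a small Python sketch of the same idea applied to a text string. The function name open_log_text and the timestamp format are my own choices (Notepad formats the date according to the system locale), so treat it as an illustration rather than an exact emulation.

```python
from datetime import datetime


def open_log_text(text, now=None):
    """If the first line is .LOG, return the text with a timestamp line appended."""
    now = now or datetime.now()
    first_line = text.split("\n", 1)[0].rstrip("\r")
    if first_line != ".LOG":
        return text  # ordinary file: leave untouched
    if not text.endswith("\n"):
        text += "\n"
    return text + now.strftime("%Y-%m-%d %I:%M %p") + "\n"


stamped = open_log_text(".LOG\nfirst note", datetime(2018, 3, 20, 14, 5))
print(stamped)
```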

I wanted to have the same functionality in Google Docs (with the added benefit of not needing to write .LOG at the beginning of the file). The following script (built by shamelessly plagiarizing various sources) enables that functionality:

function onOpen() {
  var ui = DocumentApp.getUi();
  // Or FormApp or SpreadsheetApp.
  ui.createMenu('Custom Menu')
      .addItem('Insert Date', 'insertDate')
      .addToUi();

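  // Jump to the end of the document, stamp the date there,
  // then leave the cursor on a fresh line ready for typing
  setCursorToEnd();
  insertDate();
  setCursorToEnd();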
}

function setCursorToEnd()
{
  var doc = DocumentApp.getActiveDocument();
  var paragraph = doc.getBody().appendParagraph('');
  var position = doc.newPosition(paragraph, 0);
  doc.setCursor(position);
}

function formatDate()
{
  var date = new Date();
  var datestr = date.getFullYear() + '-'+  
    ('0' + (date.getMonth()+1)).slice(-2) + '-' + 
    ('0' + date.getDate()).slice(-2);  
  var hours = date.getHours();
  var minutes = date.getMinutes();
  var ampm = hours >= 12 ? 'pm' : 'am';
  hours = hours % 12;
  hours = hours ? hours : 12; // the hour '0' should be '12'
  minutes = minutes < 10 ? '0'+minutes : minutes;
  var strTime = hours + ':' + minutes + ' ' + ampm;
  return datestr + "  " + strTime;
}

function insertDate()
{
  var activeDoc = DocumentApp.getActiveDocument();
  var cursor = activeDoc.getCursor();
  if (cursor) {
      var date = formatDate();    
      var element = cursor.insertText(date);
  } else {
      DocumentApp.getUi().alert('Cannot find a cursor in the document.');
  }
}

To enable this for a document, in Google Docs go to Tools -> Script Editor, enter and save the above script, and then accept the permissions.