Seems like a comprehensive set of algorithms in one library, based on a similar Java project: https://pypi.org/project/strsim/
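To get a sense of what such a library packages up, here is a hand-rolled sketch of one classic algorithm in that family, Levenshtein edit distance. This is just an illustration, not strsim's actual API:

```python
def levenshtein(a, b):
    """Minimum number of single-character edits (insert/delete/substitute) turning a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]                  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                   # deletion
                            curr[j - 1] + 1,               # insertion
                            prev[j - 1] + (ca != cb)))     # substitution (free if equal)
        prev = curr
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # → 3
```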
Compiling PHP 7.3 with MySQL support for AWS Lambda
This is probably easier to do for PHP 7.4, but I wanted PHP 7.3 to make sure everything is compatible with existing code. The code below assumes you are working in an interactive shell in a Docker container built for creating binaries for AWS Lambda. Doing this is explained here.
First, install the necessary tools to build PHP
$ yum update -y
$ yum install autoconf bison gcc gcc-c++ libcurl-devel libxml2-devel -y
Next, build the OpenSSL version required for the PHP 7.3 build:
$ curl -sL http://www.openssl.org/source/openssl-1.0.1k.tar.gz | tar -xvz
$ cd openssl-1.0.1k
$ ./config && make && make install
Then download and build PHP 7.3, and configure with all the support for accessing MySQL databases
$ cd ..
$ curl -sL https://github.com/php/php-src/archive/php-7.3.0.tar.gz | tar -xvz
$ cd php-src-php-7.3.0
$ ./buildconf --force
$ ./configure --prefix=/home/ec2-user/php-7-bin/ --with-openssl=/usr/local/ssl --with-curl --without-libzip --with-zlib --enable-zip --with-pdo-mysql --with-mysqli=mysqlnd
$ make install
Finally, check that the MySQL modules are there:
$ /home/ec2-user/php-7-bin/bin/php -m
Backing up AWS lambda functions
There might be a built-in method to do so, but I haven’t found one. I wanted to download packages of all the lambda files I have in AWS and ended up creating a script to do it.
Of course, you should replace the --region parameter with the AWS region where you run your lambda functions, and the --profile parameter with the appropriate profile from your ~/.aws/credentials file (the correct way would have been to pass those as parameters to the bash script, but that is left as an exercise to the reader 😉).
Feel free to copy. Don’t forget to chmod u+x …
#!/bin/bash
# Get all the function names
list=`aws lambda list-functions --region us-west-2 | jq -r .Functions[].FunctionName | perl -pe 's/\n/ /g'`
# For each lambda function, get the function's download url and download it
for val in $list; do
    url=`aws lambda get-function --region us-west-2 --function-name $val --profile amnon | jq -r .Code.Location`
    shortname=`echo $url | perl -pe's/^.+\/(.+?)-[\w]{8}.+/\1.zip/g'`
    echo $shortname
    wget -nv $url -O $shortname
done
Duplicated log lines using Python in AWS Lambda
The short version:
logname = 'processing'
formatter = logging.Formatter(fmt='%(asctime)s %(levelname)-8s %(message)s',
                              datefmt='%Y-%m-%d %H:%M:%S')
handler = logging.StreamHandler()
handler.setFormatter(formatter)
logger = logging.getLogger(logname)
logger.setLevel(logging.DEBUG)
logger.handlers = []        # === Make sure to add this line
logger.addHandler(handler)
logger.propagate = False    # === and this line
For details, check out this link
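To see why those two lines matter, here is a self-contained demo of the mechanism: once the root logger has its own handler (as the Lambda Python runtime sets up before your code runs), a record logged through a child logger that also has a handler gets emitted twice, unless propagation is stopped. This sketch simulates the situation with StringIO streams:

```python
import io
import logging

root_stream, child_stream = io.StringIO(), io.StringIO()

# Simulate the handler that the Lambda runtime installs on the root logger
logging.getLogger().addHandler(logging.StreamHandler(root_stream))

logger = logging.getLogger('processing')
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler(child_stream))

logger.warning('hello')   # emitted by the child handler AND, via propagation, the root handler
assert child_stream.getvalue() == 'hello\n'
assert root_stream.getvalue() == 'hello\n'   # the duplicate line

logger.propagate = False  # stop records from reaching the root handler
logger.warning('again')
assert child_stream.getvalue() == 'hello\nagain\n'
assert root_stream.getvalue() == 'hello\n'   # root saw nothing new - no more duplicates
```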
Standard C++ libraries on AWS Lambda
While attempting to compile a test sample together with a library I needed, I received the following error:
/usr/bin/ld: cannot find -lstdc++
Installing the following solved the issue. I didn’t even check whether both installations are necessary – if you have the curiosity to dig further, send me your conclusions; all I wanted was a solution to the problem at hand, and here it is:
yum install libstdc++-devel
yum install libstdc++-static
lupa (LuaJIT/Lua-Python bridge) on AWS Lambda
Last post discussed launching LuaJIT from Python on AWS Lambda.
Suppose you want to receive events from the Lua code in the Python code that invoked it.
If you don’t have a Python-Lua API, you are left with sockets, files, or reading the realtime output from the process stdout object (which is not that trivial; see this implementation).
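For reference, a minimal version of that stdout-polling approach — reading a child process's output line by line as it is produced. This is a sketch, not the linked implementation; the child here is just a stand-in script that prints a few lines:

```python
import subprocess
import sys

# A stand-in child process that emits a few "events" on stdout
child_code = "for i in range(3): print('event', i)"

proc = subprocess.Popen([sys.executable, '-u', '-c', child_code],
                        stdout=subprocess.PIPE, text=True)

lines = []
for line in proc.stdout:      # blocks until each line arrives, so events are seen as they happen
    lines.append(line.rstrip())
proc.wait()
print(lines)  # → ['event 0', 'event 1', 'event 2']
```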
A more elegant way would be if there was a Python API for accessing the Lua code (and vice versa). This is what lupa does, and it works with both Lua and LuaJIT.
I’ll explain how to install lupa so it can be used with Python on the amazon linux docker container, and thereby on AWS Lambda.
So, connect to your amazon linux docker container and:
mkdir -p /root/lambdalua
cd /root/lambdalua
# Download the lupa source code
# link for latest source: https://pypi.org/project/lupa/#files - look for the tar.gz file
wget https://files.pythonhosted.org/packages/f2/b2/8295ebabb417173eaae05adc63353e4da3eb8909e1b636b17ff9f8e1cdce/lupa-1.7.tar.gz
# extract the source in /root/lambdalua
tar xzfv lupa-1.7.tar.gz
# enter the lupa source directory
cd lupa-1.7
wget http://luajit.org/download/LuaJIT-2.0.5.zip  # download latest source of LuaJIT
unzip LuaJIT-2.0.5.zip   # unzip it in /root/lambdalua/lupa-1.7
cd LuaJIT-2.0.5          # enter the LuaJIT source directory
make CFLAGS=-fPIC        # Make LuaJIT with the -fPIC compile flag
cd ..                    # back to the lupa source dir
python setup.py install  # Build and install the lupa Python module
cd /root/lambdalua
mkdir lupa_package       # create a directory for a lupa test aws package
# copy the lupa module that we previously compiled to the lupa_package directory
# so that "import lupa" will work from the python code on AWS Lambda
cp -r ./lupa-1.7/build/lib.linux-x86_64-3.6/lupa lupa_package/
Now create /root/lambdalua/lupa_package/lambdalua.py so that it makes use of Lupa:
import lupa  # use the local lupa module
from lupa import LuaRuntime

def lambda_luajit_func(event, context):
    def readfile(filename):
        with open(filename, 'r') as myfile:
            return myfile.read()

    lua = LuaRuntime(unpack_returned_tuples=True)

    # define the Python function that will be called from LuaJIT
    def add_one(num):
        return num + 1

    # Load the Lua code
    lua_func = lua.eval(readfile('./test.lua'))
    params = {"add_one_func": add_one, "num": 42}
    # call the Lua function defined in test.lua with the above parameters
    res = lua_func(params)
    return res

if __name__ == "__main__":
    print(lambda_luajit_func(None, None))
and also create /root/lambdalua/lupa_package/test.lua, which the above Python file will load:
function(params_table)
    -- get the number passed from python
    num = params_table.num
    -- get the python function to invoke
    py_func = params_table.add_one_func
    -- return the result of invoking the Python function on the number
    return "The result is:"..py_func(num)
end
To see that this actually works in our amazon linux docker, just type:
python lambdalua.py
# output is: The result is:43
To make this run on AWS lambda, we pack the contents of /root/lambdalua/lupa_package in a zip file.
zip -r package.zip .
If the AWS CLI and credentials are set up in your host OS shell, copy the zip file from the docker container to your OS from an OS terminal, e.g. (you can skip this step if you copied the AWS credentials to ~/.aws in the docker container):
docker cp lucid_poincare:/root/lambdalua/lupa_package/package.zip .
Now create the function (if we haven’t done so already) or update it (if we want to update the function from the previous post), and invoke it. The details of creating the lambda user, profile, function etc. from the aws command line are covered in a previous post. Assuming you already have a lambda user named lambda_user, either create a new function:
aws lambda create-function --region us-east-1 --function-name lambda_luajit_func --zip-file fileb://package.zip --role arn:aws:iam::123456789012:role/basic_lambda_role --handler lambdalua.lambda_luajit_func --runtime python3.6 --profile lambda_user
or, if you already created the function lambdalua.lambda_luajit_func from the previous post, you can update it:
# the following assumes myzippackage is a bucket you own
aws s3 rm s3://myzippackage/package.zip   # remove previous package if it exists
aws s3 cp package.zip s3://myzippackage/package.zip   # copy new package
# update lambda_luajit_func function with the contents of the new package
aws lambda update-function-code --region us-east-1 --function-name lambda_luajit_func --s3-bucket myzippackage --s3-key package.zip --profile lambda_user
Finally, invoke the Lupa test on AWS Lambda:
aws lambda invoke --invocation-type RequestResponse --function-name lambda_luajit_func --region us-east-1 --log-type Tail --profile lambda_user out.txt
cat out.txt
# output should be: "The result is:43"
AWS Lambda – running python bundles and arbitrary executables – Part 2
In the previous post I explained how to create your AWS lambda environment using Docker, and how to package a python bundle and launch it on AWS Lambda.
In this post I’ll show how you can launch arbitrary executables from an AWS Lambda function.
To make this tutorial even more useful, the example of an arbitrary executable I’ll be using is LuaJIT – an incredibly fast Lua implementation created by Mike Pall. After this you should be able to write blazing fast Lua code and run it on AWS Lambda.
I assume you already have a Docker container that emulates AWS Lambda Linux – if not, check the previous post
So, the first thing is to install LuaJIT on the Docker amazon lambda container. Start the container for the amazon linux (use docker ps -a or docker container list to find the container, and docker start -i <name> to connect to it).
Once in the container, make sure you have wget and unzip installed. If not then:
yum install wget
yum install unzip
Next, download the latest version of LuaJIT (in my case this was 2.0.5) from here
wget http://luajit.org/download/LuaJIT-2.0.5.zip  # download latest source of LuaJIT
unzip LuaJIT-2.0.5.zip   # unzip it
cd LuaJIT-2.0.5          # go to the source directory
make                     # build LuaJIT
make install             # install LuaJIT
To run an arbitrary binary from AWS Lambda, we’ll first include it and any dependencies it might have in the zip package that we’ll upload to Lambda.
So let’s create the ingredients of this package. For starters we’ll create a directory to place all the relevant files:
mkdir lambdalua
cd lambdalua
mkdir lib   # we'll place any luajit dependencies here
Since we compiled and installed luajit, let’s check where it was placed:
which luajit
in my case, the result is: /usr/local/bin/luajit
Now, we’ll copy luajit to the directory we are in so it will be part of the package
cp /usr/local/bin/luajit .
Next, let’s check whether there are any dynamic linked libraries that luajit depends on, as they’ll need to exist on AWS Lambda too in order for luajit to successfully run:
ldd /usr/local/bin/luajit # find the shared libraries required by luajit
The result:
linux-vdso.so.1 =>  (0x00007ffdb75a7000)
liblua-5.1.so => /usr/lib64/liblua-5.1.so (0x00007f6deea61000)
libreadline.so.6 => /lib64/libreadline.so.6 (0x00007f6dee81c000)
libncurses.so.5 => /lib64/libncurses.so.5 (0x00007f6dee5f6000)
libtinfo.so.5 => /lib64/libtinfo.so.5 (0x00007f6dee3d5000)
libm.so.6 => /lib64/libm.so.6 (0x00007f6dee0d3000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f6dedecf000)
libc.so.6 => /lib64/libc.so.6 (0x00007f6dedb0b000)
/lib64/ld-linux-x86-64.so.2 (0x00007f6deec8d000)
Above, we can see that most of the shared libraries luajit depends on (those starting with /lib64) are part of linux (and hopefully they are the same version as those on AWS Lambda amazon linux).
However, one file is not part of lambda linux, and that is /usr/lib64/liblua-5.1.so (this was added as part of installing luajit).
We’ll need to make this file available to luajit on lambda so let’s copy it to the lib/ directory we created.
cp /usr/lib64/liblua-5.1.so lib/
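The manual inspection above can also be scripted; here is a sketch that filters ldd output for resolved libraries living outside the base /lib64 tree, mirroring the reasoning we just did by eye (extra_libs is a hypothetical helper name, and it assumes ldd's usual `name => path (addr)` output format):

```python
import re

# the ldd output for luajit, abbreviated
ldd_output = """\
linux-vdso.so.1 =>  (0x00007ffdb75a7000)
liblua-5.1.so => /usr/lib64/liblua-5.1.so (0x00007f6deea61000)
libm.so.6 => /lib64/libm.so.6 (0x00007f6dee0d3000)
libc.so.6 => /lib64/libc.so.6 (0x00007f6dedb0b000)
/lib64/ld-linux-x86-64.so.2 (0x00007f6deec8d000)
"""

def extra_libs(ldd_text, system_prefix='/lib64/'):
    """Paths of resolved shared libraries outside the base system tree."""
    paths = re.findall(r'=>\s*(\S+)\s*\(', ldd_text)       # grab the resolved path after '=>'
    return [p for p in paths if not p.startswith(system_prefix)]

print(extra_libs(ldd_output))  # → ['/usr/lib64/liblua-5.1.so']
```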
Create the following hello.lua file in the directory we’re in:
local str = "hello from LuaJIT - "
for i=1,10 do
    str = str .. i .. " "
end
print(str)
Now we create the Python file that will launch the above Lua script using LuaJIT. We’ll name this file lambdalua.py. Note the explanations in the comments within the code:
import subprocess
import os

def lambda_luajit_func(event, context):
    lpath = os.path.dirname(os.path.realpath(__file__))  # the path where this file resides
    llib = lpath + '/lib/'                               # the path for the luajit shared library

    # Since we can't execute or modify execution attributes for luajit in the directory
    # we run in on aws lambda, we'll copy luajit to the /tmp directory where we'll be able
    # to change its attributes
    os.system("cp -n %s/luajit /tmp/luajit" % (lpath))   # copy luajit to /tmp
    os.system("chmod u+x /tmp/luajit")                   # and make it executable

    # Since we don't have permission to copy luajit's shared library to the path
    # where it looks for it (the one shown by the ldd command), we'll add the
    # path where liblua-5.1.so is located to LD_LIBRARY_PATH, which lets
    # Linux search for the shared library elsewhere
    # (use .get() so this also works when LD_LIBRARY_PATH isn't already set)
    os.environ["LD_LIBRARY_PATH"] = os.environ.get("LD_LIBRARY_PATH", "") + (":%s" % (llib))

    # prepare a subprocess to run luajit with the hello.lua script path as a parameter
    command = "/tmp/luajit %s/hello.lua" % (lpath)
    p = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # launch the process and read the result from stdout
    stdout, stderr = p.communicate()

    # Make the return value of the lambda function the output of the Lua script
    return stdout.decode("utf-8")

if __name__ == "__main__":
    print(lambda_luajit_func(None, None))
At this point, we can create a package to upload as a lambda function. In the directory where we’re in run:
zip -r package.zip .
and then copy the zip file to your host OS terminal, e.g:
docker cp lucid_poincare:/root/lambdalua/package.zip .
Now we’ll upload the package and create the lambda function, using the same user and role created in the previous post (replace the role ARN with your own):
aws lambda create-function --region us-east-1 --function-name lambda_luajit_func --zip-file fileb://package.zip --role arn:aws:iam::123456789012:role/basic_lambda_role --handler lambdalua.lambda_luajit_func --runtime python3.6 --profile lambda_user
You should get a JSON reply confirming that the function has been created. Finally, we can invoke the lambda function as follows:
aws lambda invoke --invocation-type RequestResponse --function-name lambda_luajit_func --region us-east-1 --log-type Tail --profile lambda_user out.txt
When we check the out.txt file:

$ cat out.txt
"hello from LuaJIT - 1 2 3 4 5 6 7 8 9 10 \n"
AWS Lambda – running python bundles and arbitrary executables
In a previous post, I mentioned using an Amazon Linux EC2 instance to create AWS Lambda compatible packages. While this works, another way is to create the packages locally via a Docker Amazon Linux image. One downside I’ve found to this method is that sometimes these images are incompatible with some of the system files in the Lambda runtime, but at the time of writing, I found that the docker-lambda project both creates compatible lambda linux images and offers a great way to shorten lambda development cycles by emulating a lambda environment you can invoke locally.
To start, here are the instructions to build a Python 3.6 docker lambda image (of course, make sure you have Docker installed):
git clone https://github.com/lambci/docker-lambda.git  # clone project from git
cd docker-lambda/      # go to project directory
npm install            # install project node.js dependencies
cd python3.6/build     # go to the python Dockerfile build directory
docker build .         # build the image as per the instructions in the Dockerfile (takes time...)
docker images          # show docker images, note the id of the built image
docker tag 32e7f5244861 lambci/python3.6:build   # name and tag the built docker image using its id
docker run -it lambci/python3.6:build /bin/bash  # create a new container based on the new image and
                                                 # run it interactively (/bin/bash is needed because
                                                 # CMD ["/bin/bash"] is not included as the last line
                                                 # in the Dockerfile)
exit                   # leave the docker container
docker ps -a           # locate the newly created container and note the name given to it
docker start -i vibrant_heyrovsky  # resume an interactive session with the container,
                                   # using the container name found above
So, now you have a console to a compatible Amazon Linux shell. To create lambda functions, you basically zip all the relevant files, upload the archive to AWS Lambda, and after that you can remotely invoke the required function on Lambda.
My current method will be to have two console windows – one is the above console to the docker bash, and another is a console of the host operating system (whatever OS you are running Docker on). This way, you can easily zip the lambda packages in the Docker console, and then copy them from your OS console (and from there upload them to AWS Lambda)
Setting up an AWS lambda user
Now that we have a local Lambda-compatible environment, let’s create an actual AWS user that will be used to upload and run the packages that we’ll create in our local Lambda-compatible Docker container.
To run the following, make sure you first have the AWS CLI installed on your OS.
Let’s create our lambda user using the above CLI. Of course, the assumption is that you already have a credentials file in your ~/.aws directory which enables you to do the next part. If not, you’ll need to create a user with the appropriate privileges from the AWS IAM console, get that user’s aws key id and aws secret, then locally run aws configure and follow the instructions. This will create your initial credentials file.
We’ll now create a user that we’ll use for AWS lambda. The information here is based on this excellent simple tutorial with some minor changes to suit this one.
# Create a user group 'lambda_group'
$ aws iam create-group --group-name lambda_group

# Create a user 'lambda_user'
$ aws iam create-user --user-name lambda_user

# Add our user to the group
$ aws iam add-user-to-group --user-name lambda_user --group-name lambda_group

# Create a password for this user
$ aws iam create-login-profile --user-name lambda_user --password _your_password_here_

# Create a CLI access key for this user
$ aws iam create-access-key --user-name lambda_user

# Save the user's Secret and Access Keys somewhere safe - we'll need them later
Now that we have a user, let’s authorise this user to run lambda functions, copy s3 files etc. To do this, we create a policy and grant that policy to the user we just created.
For that, create a file with the following json, and name it lambda_policy.json
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "iam:*",
            "lambda:*",
            "s3:*"
        ],
        "Resource": "*"
    }]
}
Now grant the above policy to our lambda user:
aws iam put-user-policy --user-name lambda_user --policy-name lambda_all --policy-document file://lambda_policy.json
Now, let’s configure our AWS CLI so that we can perform actions as lambda_user
$ aws configure --profile lambda_user
> AWS Access Key ID [None]: <your key from the above create-access-key command>
> AWS Secret Access Key [None]: <your secret from the above create-access-key command>
> Default region name [None]: us-east-1 (or whatever region you use)
> Default output format [None]: json
# AWS stores this information under [lambda_user] in the ~/.aws/credentials file
Finally, we need to create a role which is needed when creating a lambda function and determines what actions the lambda function is permitted to perform.
To create the role, create a file named basic_lambda_role.json with the following json text:
{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": { "AWS" : "*" },
        "Action": "sts:AssumeRole"
    }]
}
Then create the role via the CLI:
$ aws iam create-role --role-name basic_lambda_role --assume-role-policy-document file://basic_lambda_role.json
The above will return the role identifier as an Amazon Resource Name (ARN), for example: arn:aws:iam::716980512849:role/basic_lambda_role . You’ll need this ARN whenever you create a new lambda function so hold on to it.
We now have all the ingredients to create, update and invoke AWS Lambda functions. We’ll do that later, but first – let’s get back to creating the code package that is required when creating a lambda function. The code package is just a zip file which contains all your code and its dependencies that are uploaded to lambda when you create or update your lambda function. The next section will explain how to do this.
Creating an AWS Lambda code package
We’ll start with creating and invoking a python package that has some dependencies, and then show how to create a package that can run arbitrary executables on AWS Lambda
Creating a local Python 3.6 package
So now, let’s make a package example that will return the current time in Seoul. To do this, we’ll install a python module named arrow, but we’ll install it in a local directory since we need to package our code with this python module. To do this, open your docker console that is running the lambda compatible environment and:
cd /var/task     # move to the base lambda directory in the docker image
mkdir arrowtest  # create a directory for the lambda package we're going to make
cd arrowtest     # move in to the directory
pip install arrow -t ./   # install the arrow python library in this directory
ls               # take a look at what has been added
Next, we’ll create our lambda function, which we’ll later invoke (you might want to install an editor of your choice in the docker console using yum, for example via yum install vim).
So, let’s create arrowtest.py:
import arrow

def lambdafunc(event, context):
    utc = arrow.utcnow()
    SeoulTime = utc.to('Asia/Seoul')
    return "The time in Seoul is: %s" % (SeoulTime.format())

# just for local testing
if __name__ == "__main__":
    print(lambdafunc(None, None))
and test that it works locally in the docker shell:
python arrowtest.py
Ok, so we have the python file with the lambda function and we have the dependencies; now all we need to do is zip the contents of the entire directory and pass this zip file as a parameter to the lambda function creation.
This would work; however, with larger Python libraries you might want to remove files that aren’t used by your code and would just waste space on lambda. My rather primitive but effective method is to clone the complete directory and remove files that seem pointless until something breaks, then put them back and try other things until I’m happy with the size reduction. In the cloned directory I actually rename directories before removing them, as it’s easier to rename, run the script, and rename back if a directory turns out to be needed.
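The size-survey part of this loop can be scripted too; here is a small Python sketch of the `du -hd1` step, reporting the total bytes under each top-level directory so you can see where the space goes (dir_sizes is a hypothetical helper; on the real package you would point it at the cloned directory):

```python
import os
import tempfile

def dir_sizes(root):
    """Total bytes of files under each immediate subdirectory of root."""
    sizes = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            sizes[entry] = sum(
                os.path.getsize(os.path.join(r, f))
                for r, _, files in os.walk(path) for f in files)
    return sizes

# tiny demo on a throwaway tree (for the real thing: dir_sizes('/var/task/arrowtest_clone'))
demo = tempfile.mkdtemp()
os.mkdir(os.path.join(demo, 'arrow'))
with open(os.path.join(demo, 'arrow', 'big.bin'), 'wb') as f:
    f.write(b'\0' * 2048)
print(dir_sizes(demo))  # → {'arrow': 2048}
```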
Let’s do it for this example:
cd ..
pwd      # should be /var/task
cp -r arrowtest arrowtest_clone
cd arrowtest_clone
ls       # let's see what's in here
du -hd1  # note how much space each directory takes (1.2MB)
Installed python libraries can contain many directories and files of different types: python files, binary dynamic libraries (usually with .so extensions), and others. Knowing what these are can help you decide what can be removed to make the zipped package leaner. In this example the directory sizes are a non-issue, but other python libraries can get much larger.
An example of some of the stuff I deleted:
rm -rf *.dist-info
rm -rf *.egg-info
rm six.py
rm -rf dateutil   # we're not making use of this - it's just wasting space

# test that the script still works after all we've deleted
python arrowtest.py
du -hd1   # we're down to 332K from 1.2MB and the script still works
Now, let’s package this directory in a zip file. If you don’t have zip installed on your docker container yet, then:
yum install zip
And now, after removing unneeded files and dependencies, let’s pack our directory:
zip -r package.zip .
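As an aside, the same archive can be produced from Python with the standard zipfile module — useful if you want packaging as part of a build script. A sketch (build_package is a hypothetical helper; Lambda expects the entries to be relative to the package root, which is what `zip -r package.zip .` produces):

```python
import os
import tempfile
import zipfile

def build_package(src_dir, zip_path):
    """Zip src_dir's contents with paths relative to its root, as Lambda expects."""
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for root, _, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                zf.write(full, os.path.relpath(full, src_dir))

# tiny demo (for the real thing: build_package('/var/task/arrowtest_clone', 'package.zip'))
src = tempfile.mkdtemp()
with open(os.path.join(src, 'handler.py'), 'w') as f:
    f.write('def lambdafunc(event, context): return "ok"\n')
out = os.path.join(tempfile.mkdtemp(), 'package.zip')
build_package(src, out)
print(zipfile.ZipFile(out).namelist())  # → ['handler.py']
```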
Now that we have the package on the docker container, let’s copy it to our OS from our OS console:
docker cp vibrant_heyrovsky:/var/task/arrowtest_clone/package.zip .
(replace vibrant_heyrovsky with the name of your docker container).
So we have a zipped package that we tested on docker – let’s create a lambda function from this package and invoke it (replace arn:aws:iam::716980512849:role/basic_lambda_role with your own ARN):
aws lambda create-function --region us-east-1 --function-name lambdafunc --zip-file fileb://package.zip --role arn:aws:iam::716980512849:role/basic_lambda_role --handler arrowtest.lambdafunc --runtime python3.6 --profile lambda_user
and finally, let’s see if we can get AWS lambda to tell us the current time in Seoul:
aws lambda invoke --invocation-type RequestResponse --function-name lambdafunc --region us-east-1 --log-type Tail --profile lambda_user out.txt   # invoke the function
cat out.txt   # check the result
The file out.txt contains the return value of the called lambda function. Next we’ll see how to update to a new package and how to pass parameters to the lambda function.
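Parameter passing comes down to the `event` argument the handler already receives: whatever JSON payload you send at invoke time arrives there as a dict. A hedged sketch of how the handler could use it (the `tz` key is a hypothetical parameter name, not something the arrowtest code above defines):

```python
import json

def lambdafunc(event, context):
    # event holds the parsed JSON payload passed via "aws lambda invoke ... --payload"
    tz = (event or {}).get('tz', 'Asia/Seoul')   # 'tz' is a hypothetical parameter
    return "Requested timezone: %s" % tz

# local test, mimicking: aws lambda invoke ... --payload '{"tz": "Europe/Paris"}' out.txt
print(lambdafunc(json.loads('{"tz": "Europe/Paris"}'), None))  # → Requested timezone: Europe/Paris
print(lambdafunc(None, None))                                  # → Requested timezone: Asia/Seoul
```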
To be continued…
Quick file search for Windows
Can’t believe I haven’t heard of this utility before (in my defense, I haven’t been a heavy Windows user for a few years now):
Anyway: the software’s name is “Everything” – get it here
Amazing indexing speed of all your drives and immediate lookup of any filename, including parts of a name, regex searches etc. Perfect for when you know that file you’re looking for exists somewhere in that almost infinite maze of folders and files but using an exhaustive search would take hours or more.
Intuitive explanation of confusion matrix
I took a couple of hours to create an interactive explanation for those who want to get an intuitive grip on the confusion matrix – precision, recall, F1, accuracy etc.
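As a non-interactive companion, every quantity the explanation covers reduces to the four cells of the matrix: true positives (tp), false positives (fp), false negatives (fn) and true negatives (tn). A minimal sketch with made-up counts:

```python
def metrics(tp, fp, fn, tn):
    precision = tp / (tp + fp)                   # of predicted positives, how many are real
    recall    = tp / (tp + fn)                   # of actual positives, how many were found
    accuracy  = (tp + tn) / (tp + fp + fn + tn)  # fraction of all predictions that are right
    f1        = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return precision, recall, accuracy, f1

p, r, a, f = metrics(tp=8, fp=2, fn=4, tn=16)
print(round(p, 3), round(r, 3), round(a, 3), round(f, 3))  # → 0.8 0.667 0.8 0.727
```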
Still need to improve and polish it, but you can already play with it: