Aws-python function dependencies load

Don't give up, @mpradilla!

This is completely achievable on Lambda, and you don't need to zip your package manually first. Let me describe how I use Lambda to run Python code with external Python modules that rely on precompiled libraries:


External Modules

Firstly, when Serverless zips your Lambda function, it follows symlinks. As such, you can symlink the virtual environment's site-packages into the function directory. That way it doesn't clutter the lib/ directory, which can be used solely for your precompiled libraries (in your case, numpy's multiarray.so).

File structure

▾ lib/
    some_lib.so
  event.json
  handler.py
  serverless.yml
  site-packages -> /Users/jimbo/miniconda2/envs/testenv/lib/python2.7/site-packages
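If you would rather script the symlink than create it by hand, here is a minimal sketch. The function name link_site_packages and its arguments are my own invention, and it assumes you run it with the virtual environment's interpreter so that sysconfig reports that environment's site-packages:

```python
import os
import sysconfig


def link_site_packages(function_dir, site_packages=None, link_name="site-packages"):
    """Symlink a site-packages directory into function_dir and return the link path.

    Hypothetical helper for illustration only.
    """
    # Default to the site-packages of the interpreter running this script,
    # so run it with the virtual environment activated.
    target = site_packages or sysconfig.get_paths()["purelib"]
    link_path = os.path.join(function_dir, link_name)
    if os.path.islink(link_path):
        # Replace a stale link left over from a previous environment.
        os.remove(link_path)
    os.symlink(target, link_path)
    return link_path
```

Calling link_site_packages(".") from the service directory should leave you with the site-packages entry shown in the listing above.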

For local testing, you run the code OUTSIDE your virtual environment to ensure the manual imports (below) are working:

import sys
import os
here = os.path.dirname(os.path.realpath(__file__))

# Import installed packages (in site-packages)
site_pkgs = os.path.join(here, "site-packages")
sys.path.append(site_pkgs)

# Now Import your environment packages
import numpy as np
...

Compiled code

A simple case

As you have found, the Lambda execution environment is a 64-bit Linux architecture. You are not developing on that platform, so your multiarray.so was compiled for a different one, hence your 'invalid ELF header' error.

Your problem is solved by developing on a linux-64 machine. When you install scikit-learn and its dependency numpy there, you get shared object files that Lambda can load.
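To catch a wrong-platform binary before deploying, you can inspect the first bytes of the .so yourself. The following is only a sketch: the function names are made up, and it checks just the ELF magic, the 64-bit and little-endian flags, and the x86-64 machine code from the ELF header layout. It does not prove the library will actually load on Lambda.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EM_X86_64 = 0x3E  # e_machine value for x86-64 in the ELF header


def is_x86_64_elf(header):
    """Return True if the bytes look like a 64-bit little-endian x86-64 ELF header."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        # For example, a library built on a Mac is Mach-O and has different magic bytes.
        return False
    ei_class, ei_data = struct.unpack("BB", header[4:6])
    (e_machine,) = struct.unpack("<H", header[18:20])
    # EI_CLASS 2 means 64-bit, EI_DATA 1 means little-endian.
    return ei_class == 2 and ei_data == 1 and e_machine == EM_X86_64


def check_shared_object(path):
    """Read the first 20 bytes of the file at path and test them."""
    with open(path, "rb") as f:
        return is_x86_64_elf(f.read(20))
```

Running check_shared_object on the multiarray.so inside your site-packages should return False for a copy built on a Mac and True for one installed on a linux-64 machine.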

It is worth noting for other readers that this error may show up in the CloudWatch logs as just an ImportError:

ImportError: No module named numpy.

So far you know all this; however, at some point you may run into the following case.

Non site-package Libraries

Sometimes you may require libraries from other directories that are on your local LD_LIBRARY_PATH but don’t get zipped up with the lambda function.

Example

I have Lambda code that requires the Intel Math Kernel Library (MKL). Here are my problems:

  1. Libraries are not in site-packages
  2. LD_LIBRARY_PATH is read when the Python interpreter starts, so changing it with os.environ in code is too late
  3. Some library files are large. AWS Lambda zip files have size limits, and they run faster (and cheaper) when they are small.

The solution

I need to include only the libraries I need, not entire lib directories. The directory where I put the libraries must already be on the LD_LIBRARY_PATH when my lambda starts. This restricts the directory to the lib/ beside my handler.
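If you want to confirm from inside the function that lib/ really is on the search path, something like this sketch can be called from the handler. Both function names are hypothetical, and it only compares path strings; it does not ask the dynamic loader anything.

```python
import os


def dir_on_library_path(directory, ld_library_path):
    """Return True if directory is one of the colon-separated entries."""
    wanted = os.path.normpath(directory)
    entries = [p for p in ld_library_path.split(":") if p]
    return any(os.path.normpath(p) == wanted for p in entries)


def handler_lib_is_searchable(handler_file):
    """Check whether the lib/ directory beside handler_file is on LD_LIBRARY_PATH."""
    here = os.path.dirname(os.path.realpath(handler_file))
    lib_dir = os.path.join(here, "lib")
    return dir_on_library_path(lib_dir, os.environ.get("LD_LIBRARY_PATH", ""))
```

Logging handler_lib_is_searchable(__file__) at the top of handler.py shows up in the function's logs, so you can see what the execution environment actually reports instead of guessing.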

I run the function in the Lambda execution environment, note each library it fails to load, and add just that library to lib/. Repeating this until the function runs leaves me with the bare minimum set of libraries.
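The cherry-picking loop can be partly scripted. This sketch assumes the error text follows the usual Linux loader wording ("<name>: cannot open shared object file"); the helper names and the library name in the usage note are only illustrative, and you still have to find the source file on your linux-64 machine yourself.

```python
import os
import re
import shutil

# Assumed wording of the loader error; other platforms may phrase it differently.
_MISSING_LIB = re.compile(r"([^\s:]+): cannot open shared object file")


def missing_library(error_message):
    """Return the library file name mentioned in a loader error, or None."""
    match = _MISSING_LIB.search(error_message)
    return match.group(1) if match else None


def copy_into_lib(source_path, lib_dir="lib"):
    """Copy one shared object into lib_dir, creating the directory if needed."""
    if not os.path.isdir(lib_dir):
        os.makedirs(lib_dir)
    return shutil.copy2(source_path, lib_dir)
```

For example, given a log line such as "ImportError: libmkl_rt.so: cannot open shared object file: No such file or directory", missing_library returns "libmkl_rt.so", which you would then locate and pass to copy_into_lib before redeploying.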

Now that these libraries are in the service repository, I can continue to develop on my Mac, because the linux-64 binaries are what get deployed.
