2020-05-22 Why Do I Move Away From Zappa Serverless?

2020-05-22

There has been a lot of discussion about serverless, and about Zappa serverless in particular. Based on my own experience, this post summarizes why I moved completely away from Zappa serverless after two years of using it for the Labii Electronic Lab Notebook (ELN) and Laboratory Information Management System (LIMS).

It is slow

Generally, with a Lambda memory size of 1024 MB, the API served through Zappa is at least 10 times slower than the same API on a very basic EC2 instance such as a t3a.large. Even a t3a.small is faster than Zappa.
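
A comparison like this can be reproduced with a small timing harness. The following is a minimal sketch, not the benchmark I originally ran: `measure_latency`, `summarize`, and `slowdown` are hypothetical helper names, and the URLs in the comment are placeholders.

```python
import statistics
import time
from typing import Callable, List


def measure_latency(call: Callable[[], object], runs: int = 20) -> List[float]:
    """Return the wall-clock duration in seconds of each call."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> dict:
    """Median and worst case; the worst case tends to expose cold starts."""
    return {"median": statistics.median(durations), "max": max(durations)}


def slowdown(candidate: List[float], baseline: List[float]) -> float:
    """How many times slower the candidate's median is than the baseline's."""
    return statistics.median(candidate) / statistics.median(baseline)


# Example (hypothetical URLs, replace with your own endpoints):
#   from urllib.request import urlopen
#   zappa = measure_latency(lambda: urlopen("https://zappa.example.com/api/").read())
#   ec2 = measure_latency(lambda: urlopen("https://ec2.example.com/api/").read())
#   print(summarize(zappa), summarize(ec2), slowdown(zappa, ec2))
```

Comparing medians rather than means keeps a single cold start from dominating the result, while the maximum still shows how bad the worst request was.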

It is difficult to work with SSO

If you want to integrate SSO with Zappa, you are out of luck. It takes a lot of tedious configuration, and there is no guarantee that it will work. On an EC2 instance, by contrast, SSO works out of the box.

Here are the steps I used to configure Zappa to work with SSO:

  1. build a binary build from AMI

  2. copy xmlsec1, libxmlsec1.so.1, libxmlsec1.so.1.2.20 to /site-packages/lib

  3. install libxmlsec1-openssl and copy libxmlsec1-openssl.so, libxmlsec1-openssl.so.1, libxmlsec1-openssl.so.1.20 to /site-packages/lib

  4. set xmlsec_binary in accounts/views.py to /var/task/lib/xmlsec1

  5. add if modname != "saml2.extension.__pycache__": to line 90 of the /site-packages/saml2/mdstore.py

  6. if the metadata cannot be downloaded, remove the public subnet from zappa_settings

  7. if you use slim_handler=true, add this code to zappa/core.py at line 408 to copy the `lib` folder. This needs to be done again whenever Zappa is updated.

# code for step 5
def load_extensions():
    from saml2 import extension
    import pkgutil
    package = extension
    prefix = package.__name__ + "."
    ext_map = {}
    for importer, modname, ispkg in pkgutil.iter_modules(package.__path__,
                                                         prefix):
        module = __import__(modname, fromlist="dummy")
        if modname != "saml2.extension.__pycache__":
            ext_map[module.NAMESPACE] = module

# code for step 7
copytree(os.path.join(current_site_packages_dir, "lib"), os.path.join(venv_site_packages_dir, "lib"))

def create_handler_venv(self):
    """
    Takes the installed zappa and brings it into a fresh virtualenv-like folder. All dependencies are then downloaded.
    """
    import subprocess
    # We will need the currenv venv to pull Zappa from
    current_venv = self.get_current_venv()
    # Make a new folder for the handler packages
    ve_path = os.path.join(os.getcwd(), 'handler_venv')
    if os.sys.platform == 'win32':
        current_site_packages_dir = os.path.join(current_venv, 'Lib', 'site-packages')
        venv_site_packages_dir = os.path.join(ve_path, 'Lib', 'site-packages')
    else:
        current_site_packages_dir = os.path.join(current_venv, 'lib', get_venv_from_python_version(), 'site-packages')
        venv_site_packages_dir = os.path.join(ve_path, 'lib', get_venv_from_python_version(), 'site-packages')
        copytree(os.path.join( current_site_packages_dir, "lib"), os.path.join(venv_site_packages_dir, "lib"))
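
Step 4 hardcodes the Lambda path, which breaks local development. A minimal sketch of one way to choose the path per environment follows. It assumes that the presence of the `AWS_LAMBDA_FUNCTION_NAME` environment variable, which the Lambda runtime sets, is a good enough signal for "running under Zappa". `resolve_xmlsec_binary` is a hypothetical helper, not part of Zappa or pysaml2.

```python
import os
import shutil
from typing import Mapping, Optional

# Path from step 4: Lambda unpacks the deployment package under /var/task.
LAMBDA_XMLSEC_PATH = "/var/task/lib/xmlsec1"


def resolve_xmlsec_binary(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Pick the xmlsec1 path for the current runtime.

    Assumption: AWS_LAMBDA_FUNCTION_NAME being present means the code is
    running inside Lambda, so the bundled binary should be used.
    """
    env = os.environ if environ is None else environ
    if "AWS_LAMBDA_FUNCTION_NAME" in env:
        return LAMBDA_XMLSEC_PATH
    # On EC2 or a laptop, fall back to whatever is on PATH (None if missing).
    return shutil.which("xmlsec1")
```

The result can then be assigned to the `xmlsec_binary` setting mentioned in step 4, so the same code runs unchanged on Lambda and on a regular server.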

It cannot load large data

There is a limit on the size of the JSON that the API can return with Zappa serverless. This limit might be related to the Lambda memory size you define, but I have not tested that. If you need your API to work 100% of the time, even when querying a lot of data, Zappa serverless is not for you.
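
For reference, AWS documents a 6 MB cap on the request and response payload of a synchronous Lambda invocation, which is one plausible source of this limit, though I have not confirmed it is the one hit here. A common workaround is to split results into pages whose serialized size stays under a byte budget. The sketch below illustrates the idea; `paginate_by_size` and the 5 MB default are illustrative assumptions, not anything Zappa provides.

```python
import json
from typing import Any, Iterator, List


def paginate_by_size(records: List[Any],
                     max_bytes: int = 5 * 1024 * 1024) -> Iterator[List[Any]]:
    """Yield lists of records whose JSON encoding fits within max_bytes.

    A single record larger than max_bytes is still yielded, alone in its
    own page, so the caller can decide how to handle it.
    """
    page: List[Any] = []
    page_bytes = 2  # the enclosing "[" and "]"
    for record in records:
        encoded = len(json.dumps(record).encode("utf-8"))
        # json.dumps separates list items with ", " (two bytes) by default.
        extra = encoded + (2 if page else 0)
        if page and page_bytes + extra > max_bytes:
            yield page
            page = []
            page_bytes = 2
            extra = encoded
        page.append(record)
        page_bytes += extra
    if page:
        yield page
```

Each page can be returned by a separate API call, at the cost of the client having to request and reassemble the pages.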

It is difficult to debug

Recently I had a problem reading from SSM with Zappa. It works right after a deployment, but it fails about 4 minutes later, when a new session starts. There appears to be some consistency problem between sessions.
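
One way to test the suspicion that failures line up with new sessions is to tag every log line with an identifier for the running container. This is only a diagnostic sketch: it relies on the fact that module-level state in Python is kept between invocations served by the same container and recreated when a new one starts. `invocation_context` is a hypothetical helper name.

```python
import time
import uuid

# Module-level values are computed once per container, so they change
# only when Lambda starts a new one.
CONTAINER_ID = uuid.uuid4().hex[:8]
CONTAINER_STARTED = time.time()
_invocations = 0


def invocation_context() -> dict:
    """Describe which container is serving the current invocation."""
    global _invocations
    _invocations += 1
    return {
        "container_id": CONTAINER_ID,
        "cold_start": _invocations == 1,
        "container_age_s": round(time.time() - CONTAINER_STARTED, 1),
        "invocation": _invocations,
    }
```

Logging this dictionary next to every SSM error makes it possible to see whether the failures come only from freshly started containers or also from ones that worked earlier.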

It is not cheap

Based on my calculation, the cost of Zappa with 1024 MB of Lambda memory is similar to that of a t3a.large EC2 instance running 24 hours a day at reserved instance (RI) pricing. Almost no money is saved.
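
The comparison can be sketched as simple arithmetic. The numbers below are assumptions for illustration, not the figures from my own calculation: the default Lambda rates are list prices as I understand them and should be checked against current AWS pricing, the EC2 hourly rate in the usage comment is a placeholder, and the free tier is ignored.

```python
def lambda_monthly_cost(requests: int,
                        avg_duration_s: float,
                        memory_mb: int,
                        price_per_gb_second: float = 0.0000166667,
                        price_per_million_requests: float = 0.20) -> float:
    """Compute charge (GB-seconds) plus request charge for one month."""
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    request_charge = requests / 1_000_000 * price_per_million_requests
    return gb_seconds * price_per_gb_second + request_charge


def ec2_monthly_cost(hourly_rate: float, hours: float = 730) -> float:
    """Cost of one instance running around the clock for an average month."""
    return hourly_rate * hours


def break_even_requests(ec2_cost: float,
                        avg_duration_s: float,
                        memory_mb: int,
                        price_per_gb_second: float = 0.0000166667,
                        price_per_million_requests: float = 0.20) -> float:
    """Monthly request count at which Lambda costs as much as the instance."""
    per_request = (avg_duration_s * (memory_mb / 1024) * price_per_gb_second
                   + price_per_million_requests / 1_000_000)
    return ec2_cost / per_request


# Example with a placeholder EC2 rate of 0.05 USD per hour:
#   ec2 = ec2_monthly_cost(0.05)
#   print(break_even_requests(ec2, avg_duration_s=1.0, memory_mb=1024))
```

Below the break-even request count Lambda is cheaper, above it the instance is. Slow responses lower the break-even point, because Lambda bills for every second a request runs.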

Yonggan Wu
