To install non-standard Python packages, most developers use the standard package manager pip. For example, a popular HTTP library (Requests) can be installed like this:

pip install requests

But package installation can get complex. Celery, an asynchronous task queue implementation, uses bundles (what pip and setuptools call "extras") to allow developers to choose their preferred tech stack for many of its features. For example, a developer may choose to use Redis or Amazon SQS as their underlying message broker, in which case they install Celery using one of these commands:

pip install celery[redis]
pip install celery[sqs]

The bundles are specified within the square brackets and help developers limit the installation of unused dependencies, saving space and preventing potential dependency hell. Multiple bundles can be installed as long as they're separated by commas (with no spaces between):

pip install celery[redis,auth]
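
If you're curious which bundles an installed package declares, the standard library can tell you. The sketch below assumes Python 3.8 or newer, where importlib.metadata is available. The helper name declared_bundles is made up for this post and isn't part of pip or Celery.

```python
from importlib import metadata


def declared_bundles(distribution_name):
    """Return the bundle (extra) names an installed distribution declares."""
    try:
        info = metadata.metadata(distribution_name)
    except metadata.PackageNotFoundError:
        # The distribution isn't installed in this environment.
        return []
    # Each declared extra shows up as a "Provides-Extra" metadata field;
    # get_all() returns None when the field is absent.
    return sorted(info.get_all("Provides-Extra") or [])


# The result depends on what is installed locally, so no output is shown.
print(declared_bundles("celery"))
```

An empty list means either that the package isn't installed or that it declares no bundles at all.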

But how did Celery achieve this?

They used the extras_require option in their setup.py, of course!

Writing your own bundles

When writing your own Python package, you can use setuptools and create a setup.py file to define the package. The setup() function called in that file accepts several keyword arguments, including install_requires and extras_require. (And yes, I often put the "s" in the wrong spot when typing out these keyword arguments.)

The install_requires argument expects a list of the minimal dependencies that a package needs to properly function. Dependencies can be specified with or without their versions:

install_requires = [
    "numpy",
    "pandas>=1",
    "scipy>=1,<2",
]

The extras_require argument expects a dictionary that maps each bundle name to the list of dependencies (names and versions, if desired) that the bundle requires:

extras_require = {
    "bundle_a": ["mysqlclient"],
    "bundle_b": ["psycopg2"],
}

And that's it. The [bundle_a] and [bundle_b] bundles are now available.
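
A common convention is to add a catch-all bundle that pulls in every optional dependency. Because extras_require is a plain dictionary, it can be built in setup.py with ordinary Python. Here's a minimal sketch; the bundle name "all" is my own choice, not something setuptools requires.

```python
extras_require = {
    "bundle_a": ["mysqlclient"],
    "bundle_b": ["psycopg2"],
}

# Collect every optional dependency once, dropping duplicates, and sort
# the result so the generated metadata is stable between builds.
extras_require["all"] = sorted(
    {dependency for bundle in extras_require.values() for dependency in bundle}
)

print(extras_require["all"])  # ['mysqlclient', 'psycopg2']
```

With that in place, a user could install both database drivers at once by asking for the [all] bundle.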

Informing users about missing dependencies

If the consumer of your package doesn't know to install a bundle that they inadvertently depend on, they can become frustrated by cryptic messages about import errors when they run their code. Save them some trouble by writing user-friendly error messages.

Celery has a great example of how to do this in their redis.py module. The relevant code blocks are extracted below:

try:
    import redis.connection
    from kombu.transport.redis import get_redis_error_classes
except ImportError:  # pragma: no cover
    redis = None
    get_redis_error_classes = None

.
.
.

E_REDIS_MISSING = """
You need to install the redis library in order to use \
the Redis result store backend.
"""

.
.
.

class RedisBackend(BaseKeyValueStoreBackend, AsyncBackendMixin):
    """Redis task result store."""
    redis = redis

    .
    .
    .

    def __init__(self, host=None, port=None, db=None, password=None,
                 max_connections=None, url=None,
                 connection_pool=None, **kwargs):

        .
        .
        .

        if self.redis is None:
            raise ImproperlyConfigured(E_REDIS_MISSING.strip())

There are three key points to note here:

  1. The error is raised during initialization of a RedisBackend and not when the celery library is imported. This is preferable because the error is only raised for developers who actually use the Redis feature in their code.
  2. The ImproperlyConfigured error provides a clear message that instructs the user on how to fix their missing dependency problem. This is just good practice!
  3. The redis dependency check is performed only once, when the module is imported, instead of on every RedisBackend initialization. That's the job of the try/except block at the very top of the module. There's no need to run this check multiple times; if the redis dependency isn't available when the program starts, it probably won't be available later either.
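
The same three ideas can be applied to the bundle_a example from earlier. This is only a sketch under a few assumptions. The module name hypothetical_mysql_driver stands in for a real optional dependency and is deliberately not installable. MissingDependencyError, MySQLStore, and yourpackage are names invented for illustration.

```python
try:
    # Stand-in for an optional dependency. This module name is hypothetical
    # and not installed, so the except branch runs.
    import hypothetical_mysql_driver
except ImportError:
    hypothetical_mysql_driver = None

E_DRIVER_MISSING = """
You need to install the bundle_a bundle in order to use \
the MySQL store: pip install yourpackage[bundle_a]
"""


class MissingDependencyError(RuntimeError):
    """Raised when a feature is used without its optional dependency."""


class MySQLStore:
    """Hypothetical feature that needs the optional driver."""

    # The import check ran once, above; the class just keeps the result.
    driver = hypothetical_mysql_driver

    def __init__(self):
        # Checked on use, not on import, so only developers who rely on
        # this feature ever see the error.
        if self.driver is None:
            raise MissingDependencyError(E_DRIVER_MISSING.strip())


try:
    MySQLStore()
except MissingDependencyError as error:
    print(error)
```

Importing the module always succeeds; only constructing a MySQLStore without the driver fails, and the failure tells the user which bundle to install.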

This isn't the only way to create bundles and provide developer-friendly error messages, but it's a great blueprint to follow. Now you're thinking with bundles!