The VGER Blog

Python Dependency Checking: Taming the untamable

Written by Patrick O'Leary | Jan 4, 2023 5:00:00 AM

This is a work in progress — mileage will vary

https://imgs.xkcd.com/comics/python_environment.png

Why?

I was working on rebuilding a tech stack where I needed to:

  1. Take demo/vaporware and make it real
  2. Stabilize it
  3. Figure out the right architecture, migrate customer[s] to it with minimal resources
  4. Deliver continual value while doing so

A huge part of it was unwinding a data store from NoSQL to a relational data structure. The product I took over was feature rich, but not scalable or functional.

My work was definitely cut out for me. As you can imagine, the process was to break it down into phases: prioritize, migrate, and iterate. For each iteration I had to determine whether it made business sense to support it, and keep everything live throughout while being able to stop at any time if budget or other issues came up, and they did.

This is a classic strangler process; anyone who has ever done one will tell you it is a hell of a lot harder to pull off than you can imagine.

Python was part of the new toolkit I brought in house to make data management viable. Towards the end, however, one thing became apparent: massive projects and ever-changing tech stacks are not Python friendly.

What do I mean? Well, moving from MongoDB to MySQL + Elasticsearch, from Lambda to Docker/Celery, building a SaaS BI solution and a recommendation engine, moving user RBAC out of authentication storage, and building out a basic admin tool to manage it all meant different tech was used at different times.

Each iteration went through the following questions:

  • Could it put out today's fire?
  • Does it generate tech debt and is it fixable?
  • Does it support the long term vision — am I painting myself into a corner?

These are decisions you make day in, day out.

Something I noticed, though, as the project came to an end: Phase 1 dependencies were still present in the environment.

Code clean up does not mean environment clean up.

What AI thinks “Bambi eating in the middle of a dump” looks like

Not exactly a house-on-fire issue, but as you refactor, modularize for reuse, change runtime environments, change repo layout, and do code cleanups, you suddenly get hit with a tidal wave of incompatibility.

This is a weakness of Python's requirements.txt: it is usually generated from the environment (for example with pip freeze) and not from what the code actually imports. One reason is the decision to make executable code, e.g. setup.py, part of Python's install / dependency process.
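To make the gap concrete, here is a minimal sketch of the other direction: deriving the list of imported modules from the code itself rather than from the environment. It uses only the standard library `ast` module. The function name `imported_modules` is made up for illustration, and this is not how depend-py is implemented.

```python
import ast
from typing import Set


def imported_modules(source: str) -> Set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b.c" -> record "a"
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 is a relative import ("from . import x"); those point
            # back into the project itself, so this sketch skips them
            if node.level == 0 and node.module:
                found.add(node.module.split(".")[0])
    return found


sample = (
    "import os\n"
    "from flask import request\n"
    "from marshmallow.decorators import post_dump\n"
)
print(sorted(imported_modules(sample)))  # ['flask', 'marshmallow', 'os']
```

Note that an import name is not always the name of the package that provides it (`dateutil` comes from `python-dateutil`, for example), so a list like this still has to be mapped back to installed distributions.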

Dependency Hell

What AI thinks is Satan with a laptop

pip, Conda, virtualenv, Poetry, etc. each solve a different piece of the puzzle; none fully solves it, because dependency management is hard.

The problems I kept running into were:

  • Am I still using this dependency?
  • What are the other projects using?
  • Can I convert this to a module and reuse it without blowing everything up?
  • What damn version do I need?

I recently came up with an idea for solving this problem. For the moment it’s called depend-py (expect it to be renamed).

The objective was two-fold:

  1. Discover what was used actively in a python project.
  2. Don’t pollute the project itself.

Warning

This is very early on in the project. I just pushed it to GitHub the other day, after the Christmas break.

Starting to solve the problem

Depend-py can be installed directly from Git and run with Python, without additional dependencies.

What AI thinks is “The Thinker with a Laptop”

Here is the output of running depend-py against another Python project:


$ python depend.py --path ../reconciliation/
{'active': {'project_pkgs': {'reconciliation': [('reconciliation', '0.3', ['Flask', 'Flask-Jsonpify', 'marshmallow'])]},
'vendor_pkgs': {'flask': [('Flask', '2.0.2', ['Werkzeug', 'Jinja2', 'itsdangerous', 'click', 'asgiref', 'python-dotenv'])],
'flask_jsonpify': [('Flask-Jsonpify', '1.5.0', ['Flask'])],
'marshmallow': [('marshmallow', '3.14.1', ['pytest', 'pytz', 'simplejson', 'mypy', 'flake8', 'flake8-bugbear', 'pre-commit', 'tox', 'sphinx', 'sphinx-issues', 'alabaster', 'sphinx-version-warning', 'autodocsumm', 'mypy', 'flake8', 'flake8-bugbear', 'pre-commit', 'pytest', 'pytz', 'simplejson'])],
'pandas': [('pandas', '1.3.4', ['python-dateutil', 'pytz', 'numpy', 'numpy', 'numpy', 'numpy', 'hypothesis', 'pytest', 'pytest-xdist'])],
'setuptools': [('setuptools', '58.3.0', ['sphinx', 'jaraco.packaging', 'rst.linker', 'jaraco.tidelift', 'pygments-github-lexers', 'sphinx-inline-tabs', 'sphinxcontrib-towncrier', 'furo', 'pytest', 'pytest-checkdocs', 'pytest-flake8', 'pytest-cov', 'pytest-enabler', 'mock', 'flake8-2020', 'virtualenv', 'pytest-virtualenv', 'wheel', 'paver', 'pip', 'jaraco.envs', 'pytest-xdist', 'sphinx', 'jaraco.path', 'pytest-black', 'pytest-mypy'])]}},
'installed': {'_distutils_hack': [('setuptools', '58.3.0', ['sphinx', 'jaraco.packaging', 'rst.linker', 'jaraco.tidelift', 'pygments-github-lexers', 'sphinx-inline-tabs', 'sphinxcontrib-towncrier', 'furo', 'pytest', 'pytest-checkdocs', 'pytest-flake8', 'pytest-cov', 'pytest-enabler', 'mock', 'flake8-2020', 'virtualenv', 'pytest-virtualenv', 'wheel', 'paver', 'pip', 'jaraco.envs', 'pytest-xdist', 'sphinx', 'jaraco.path', 'pytest-black', 'pytest-mypy'])],
'autopep8': [('autopep8', '1.6.0', ['pycodestyle', 'toml'])],
'bleach': [('bleach', '4.1.0', ['packaging', 'six', 'webencodings'])],
'certifi': [('certifi', '2021.10.8', [])],
'charset_normalizer': [('charset-normalizer', '2.0.9', ['unicodedata2'])],
'click': [('click', '8.0.3', ['colorama', 'importlib-metadata'])],
'colorama': [('colorama', '0.4.4', [])],
'dateutil': [('python-dateutil', '2.8.2', ['six'])],
'docutils': [('docutils', '0.18.1', [])],
'et_xmlfile': [('et-xmlfile', '1.1.0', [])],
'flask': [('Flask', '2.0.2', ['Werkzeug', 'Jinja2', 'itsdangerous', 'click', 'asgiref', 'python-dotenv'])],
'flask_jsonpify': [('Flask-Jsonpify', '1.5.0', ['Flask'])],
'idna': [('idna', '3.3', [])],
'importlib_metadata': [('importlib-metadata', '4.8.2', ['zipp', 'typing-extensions', 'sphinx', 'jaraco.packaging', 'rst.linker', 'ipython', 'pytest', 'pytest-checkdocs', 'pytest-flake8', 'pytest-cov', 'pytest-enabler', 'packaging', 'pep517', 'pyfakefs', 'flufl.flake8', 'pytest-perf', 'pytest-black', 'pytest-mypy', 'importlib-resources'])],
'itsdangerous': [('itsdangerous', '2.0.1', [])],
'jinja2': [('Jinja2', '3.0.3', ['MarkupSafe', 'Babel'])],
'keyring': [('keyring', '23.4.0', ['importlib-metadata', 'SecretStorage', 'jeepney', 'pywin32-ctypes', 'sphinx', 'jaraco.packaging', 'rst.linker', 'jaraco.tidelift', 'pytest', 'pytest-checkdocs', 'pytest-flake8', 'pytest-cov', 'pytest-enabler', 'pytest-black', 'pytest-mypy'])],
'markupsafe': [('MarkupSafe', '2.0.1', [])],
'marshmallow': [('marshmallow', '3.14.1', ['pytest', 'pytz', 'simplejson', 'mypy', 'flake8', 'flake8-bugbear', 'pre-commit', 'tox', 'sphinx', 'sphinx-issues', 'alabaster', 'sphinx-version-warning', 'autodocsumm', 'mypy', 'flake8', 'flake8-bugbear', 'pre-commit', 'pytest', 'pytz', 'simplejson'])],
'numpy': [('numpy', '1.21.4', [])],
'openpyxl': [('openpyxl', '3.0.9', ['et-xmlfile'])],
'packaging': [('packaging', '21.3', ['pyparsing'])],
'pandas': [('pandas', '1.3.4', ['python-dateutil', 'pytz', 'numpy', 'numpy', 'numpy', 'numpy', 'hypothesis', 'pytest', 'pytest-xdist'])],
'pip': [('pip', '21.3.1', [])],
'pkg_resources': [('setuptools', '58.3.0', ['sphinx', 'jaraco.packaging', 'rst.linker', 'jaraco.tidelift', 'pygments-github-lexers', 'sphinx-inline-tabs', 'sphinxcontrib-towncrier', 'furo', 'pytest', 'pytest-checkdocs', 'pytest-flake8', 'pytest-cov', 'pytest-enabler', 'mock', 'flake8-2020', 'virtualenv', 'pytest-virtualenv', 'wheel', 'paver', 'pip', 'jaraco.envs', 'pytest-xdist', 'sphinx', 'jaraco.path', 'pytest-black', 'pytest-mypy'])],
'pkginfo': [('pkginfo', '1.8.2', ['coverage', 'nose'])],
'pycodestyle': [('pycodestyle', '2.8.0', [])],
'pygments': [('Pygments', '2.10.0', [])],
'pyparsing': [('pyparsing', '3.0.6', ['jinja2', 'railroad-diagrams'])],
'pytz': [('pytz', '2021.3', [])],
'readme_renderer': [('readme-renderer', '30.0', ['bleach', 'docutils', 'Pygments', 'cmarkgfm'])],
'requests': [('requests', '2.26.0', ['urllib3', 'certifi', 'chardet', 'idna', 'charset-normalizer', 'idna', 'PySocks', 'win-inet-pton', 'chardet'])],
'requests_toolbelt': [('requests-toolbelt', '0.9.1', ['requests'])],
'rfc3986': [('rfc3986', '1.5.0', ['idna'])],
'setuptools': [('setuptools', '58.3.0', ['sphinx', 'jaraco.packaging', 'rst.linker', 'jaraco.tidelift', 'pygments-github-lexers', 'sphinx-inline-tabs', 'sphinxcontrib-towncrier', 'furo', 'pytest', 'pytest-checkdocs', 'pytest-flake8', 'pytest-cov', 'pytest-enabler', 'mock', 'flake8-2020', 'virtualenv', 'pytest-virtualenv', 'wheel', 'paver', 'pip', 'jaraco.envs', 'pytest-xdist', 'sphinx', 'jaraco.path', 'pytest-black', 'pytest-mypy'])],
'six': [('six', '1.16.0', [])],
'toml': [('toml', '0.10.2', [])],
'tqdm': [('tqdm', '4.62.3', ['colorama', 'py-make', 'twine', 'wheel', 'ipywidgets', 'requests'])],
'twine': [('twine', '3.7.1', ['pkginfo', 'readme-renderer', 'requests', 'requests-toolbelt', 'tqdm', 'importlib-metadata', 'keyring', 'rfc3986', 'colorama'])],
'urllib3': [('urllib3', '1.26.7', ['brotlipy', 'pyOpenSSL', 'cryptography', 'idna', 'certifi', 'ipaddress', 'PySocks'])],
'webencodings': [('webencodings', '0.5.1', [])],
'werkzeug': [('Werkzeug', '2.0.2', ['dataclasses', 'watchdog'])],
'wheel': [('wheel', '0.37.0', ['pytest', 'pytest-cov'])],
'zipp': [('zipp', '3.6.0', ['sphinx', 'jaraco.packaging', 'rst.linker', 'pytest', 'pytest-checkdocs', 'pytest-flake8', 'pytest-cov', 'pytest-enabler', 'jaraco.itertools', 'func-timeout', 'pytest-black', 'pytest-mypy'])]},
'missing': [],
'path': '../reconciliation/',
'python_paths': ['../reconciliation/.env/lib/python3.8/site-packages'],
'source_deps': {'flask': ['request'],
'flask_jsonpify': ['jsonpify'],
'json': [],
'marshmallow': ['Schema', 'fields'],
'marshmallow.decorators': ['post_dump', 'post_load'],
'os': ['name'],
'pandas': [],
'pprint': ['pprint'],
'reconciliation.reconcile': ['EntityType',
'InvalidUsage',
'Property',
'ReconcileRequest',
'ReconcileService'],
'setuptools': ['setup'],
'typing': []},
'source_pkgs': {'reconciliation': [('reconciliation', '0.3', ['Flask', 'Flask-Jsonpify', 'marshmallow'])]},
'system': ['json', 'os', 'pprint', 'typing']}

You can see what’s active, what’s installed, what comes from third-party vendors, what is provided by Python itself, and what is provided by the project.
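For a feel of how module names might be sorted into buckets like these, here is a rough sketch using only the standard library. It is an illustration and not depend-py's own logic: the `classify` function and its three bucket names are assumptions, and it inspects the interpreter it runs in rather than a project's `.env` directory. `sys.stdlib_module_names` only exists on Python 3.10+, so on older versions standard library modules will land in the "installed" bucket.

```python
import sys
from importlib import util


def classify(modules):
    """Sort top-level module names into system / installed / missing."""
    # Python 3.10+ exposes the standard library module names; fall back to empty
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    result = {"system": [], "installed": [], "missing": []}
    for name in sorted(modules):
        if name in stdlib or name in sys.builtin_module_names:
            result["system"].append(name)
        elif util.find_spec(name) is not None:
            # importable from the current interpreter's search path
            result["installed"].append(name)
        else:
            result["missing"].append(name)
    return result


# "surely_not_a_real_module_xyz" is a made-up name used to show the last bucket
print(classify({"json", "sys", "surely_not_a_real_module_xyz"}))
```

The "missing" bucket is the interesting one in practice: it holds names the code imports that nothing in the environment provides.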

There’s even code in there to help you trace back a dependency to what required it.

$ python deep_resolve.py --path ../reconciliation/ --depends-on Flask
Flask is required by ['Flask-Jsonpify', 'reconciliation']
List of ['Flask-Jsonpify', 'reconciliation'] has a source package
Flask is NOT a source dependency
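The reverse lookup shown above can be approximated with `importlib.metadata` from the standard library (Python 3.8+). This is a hedged sketch and not the code in deep_resolve.py: the helper names `normalize`, `requirement_name`, `required_by` and `installed_requires` are hypothetical, and the requirement parsing is a simple regex rather than a full parser. The example data reuses package names from the output above; the `extra == 'tests'` marker is only there to illustrate what a requirement string can look like.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a project name so 'Flask_Jsonpify' and 'flask-jsonpify' match."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Pull the project name out of a requirement string such as 'Flask (>=1.0)'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalize(match.group(1)) if match else ""


def required_by(target, requires_map):
    """Return the packages in requires_map that list target as a requirement."""
    wanted = normalize(target)
    return sorted(
        name
        for name, reqs in requires_map.items()
        if wanted in {requirement_name(r) for r in reqs}
    )


def installed_requires():
    """Build {distribution name: [requirement strings]} for the current environment."""
    return {
        dist.metadata["Name"]: list(dist.requires or [])
        for dist in metadata.distributions()
        if dist.metadata["Name"]
    }


example = {
    "Flask-Jsonpify": ["Flask"],
    "reconciliation": ["Flask", "Flask-Jsonpify", "marshmallow"],
    "marshmallow": ["pytest; extra == 'tests'"],  # illustrative marker
}
print(required_by("Flask", example))  # ['Flask-Jsonpify', 'reconciliation']
```

Calling `required_by("Flask", installed_requires())` runs the same lookup against whatever is installed in the interpreter executing the script.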

There’s obviously a lot more that can be done with this, but it’s a start towards getting things under control.

Top image from XKCD

All other images from the mind of a GPU