Welcome to cachepath’s documentation!

https://img.shields.io/pypi/v/cachepath.svg https://img.shields.io/travis/haydenflinner/cachepath.svg Documentation Status

A small package for pythonic parameterized cache paths.

Getting Started

Install: pip install cachepath

Import: from cachepath import CachePath, Path

Docs: https://cachepath.readthedocs.io

Why?
  1. Integrates pathlib with tempfile and shutil
  2. Wraps pathlib import for Py2/3 compat. (not in six)

Why, but longer:

Do you need a temp path to pass to some random tool for its logfile? Behold, a gaping hole in pathlib:

def easy_get_tempfile():
    return TempPath()  # Path('/tmp/213kjdsrandom')
def hard_get_tempfile():
    # Deprecated, plus I forgot to call Path the first time
    return Path(tempfile.mktemp())

Now, suppose I’m running this tool multiple times, and I’d like to skip running the tool if I already have results. How do I attach some info to the filename?

def easy_get_tempfile(param):
    return CachePath(param, suffix='.txt')  # Path('/tmp/param.txt')
def hard_get_tempfile(param):
    return (Path(tempfile.gettempdir())/param).with_suffix('txt')

Ew. Now, I’m running this tool a lot, maybe even over a tree of data that looks like this!

2018-12-23
    person1
    person2
2018-12-24
    person1
2018-12-25
    person1

I want my logs to be structured the same way. How hard can it be?

2018-12-23/
    person1_output.txt
    person2_output.txt
2018-12-24/
    person1_output.txt
2018-12-25/
    person1_output.txt

Let’s find out:

def easy_get_path(date, person):
    return CachePath(date, person, suffix='_output.txt')
def hard_get_path(date, person):
    personfilename = '{}_output.txt'.format(person)
    returning = Path(tempfile.gettempdir())/date/personfilename
    return returning

Actually, we made a mistake. These aren’t equivalent. We may find out when we pass our path to another tool that it refuses to create the date folder if it doesn’t already exist. This issue can show itself as a Permission Denied error on Unix systems rather than the “File/Folder not found” you might think you would get. Regardless, we figured it out, let’s try again:

def hard_get_path():
    personfilename = '{}_output.txt'.format(person)
    returning = Path(tempfile.gettempdir())/date/personfilename
    # Does this mkdir update the modified timestamp of the folders we're in?
    # Might matter if we're part of a larger toolset...
    returning.parent.mkdir(exist_ok=True, parents=True)
    return returning

Now, how do we clear out some day’s results so that we can be sure we’re looking at fresh output of the tool?

def easy_clear_date(date):
    CachePath(date).clear()  # rm -r /tmp/date/*
def hard_clear_date(date):
    # We happen to know that date is a folder and not a file (at least in our
    # current design), so we know we need some form of .remove rather than
    # .unlink(). Unfortunately, pathlib doesn't offer one for folders with
    # files still in them. If you google how to do it, you will find plenty of
    # answers, one of which is a pure pathlib recursive solution! But we're lazy:
    p = Path(tempfile.gettempdir(), date)
    import shutil
    if p.exists():
        shutil.rmtree(p)
    p.mkdir(exist_ok=True, parents=True)
    # This still isn't exactly equivalent, because we've lost whatever
    # permissions were set on the date folder, or if it were actually a symlink
    # to somewhere else, that's gone now.

And all of this is ignoring the hacky imports you have to do to get pathlib in Py3 and pathlib2 in Py2:

from cachepath import Path  # py2/3
# or
try:
    from pathlib import Path
except:
    from pathlib2 import Path

Convinced yet? pip install cachepath or copy the source into your local utils.py (you know you have one.)

API doc is here.

Shameless Promo

Find yourself working with paths a lot in cmd-line tools? You might like invoke and/or magicinvoke!

[*]The source for CachePath can be downloaded from the Github repo.
[†]This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.