llama.dev.upload package

Tools for uploading data files to DigitalOcean Spaces/Amazon S3 object storage and generating manifests that describe them (for later user installation using llama install).

llama.dev.upload.upload_and_get_manifest(root: str = '.', glob: str = '**/*', key_prefix: str = 'objects/', key_uses_relpath: bool = False, bucket: str = 'llama', public: bool = True, **kwargs)

Upload files from the specified path to DigitalOcean Spaces/AWS S3 and return a manifest mapping local relative paths to the stored object URLs and sha256 sums. Each uploaded object is named by the sha256 sum of its file, with key_prefix prepended to aid in organization. Naming objects by content allows for versioning of files and avoids redundant file uploads and downloads. Use this to offload large data files onto separate file storage and generate the MANIFEST constant (and related constants) for installation.

Parameters:
  • root (str, optional) – The path to the root directory that should be uploaded to cloud storage. All local paths in the returned manifest will be relative to this path as well. Can be relative or absolute.
  • glob (str, optional) – The glob specifying which files to match from the provided root. By default, recursively matches all files in all subdirectories.
  • key_prefix (str, optional) – A prefix to prepend to the uploaded files’ sha256 sums in order to create their object keys (i.e. remote filenames). Note that this is just a prefix, so if you want it to act/look like a containing directory for uploaded files, you will need to make sure it ends with /.
  • key_uses_relpath (bool, optional) – If True, put the relative filepath from root of each file as a prefix in front of the sha256 sum when generating the key. In the filesystem analogy, this would put your remote files (on DigitalOcean, at least) at /<bucket>/<key_prefix>/<relative-path>/<sha256sum>. Use this if you want to find files at a glance or organize things by filename on the object store (e.g. for one-off uploads); don’t use this if you’re planning on organizing things with the returned manifest.
  • bucket (str, optional) – The DigitalOcean Spaces/AWS S3 bucket to upload files to. For DigitalOcean this is just the name of the directory in your root Spaces directory.
  • public (bool, optional) – Whether to make files public. If you specify public=False, the uploaded files will have None as their remote URLs in the returned manifest (which should not be surprising, since the returned manifest is intended for unauthenticated downloads). Set this to True if you are uploading files for public distribution.
  • **kwargs – Keyword arguments to pass to llama.com.s3.get_client that set authentication parameters and choose the target space for uploads; see documentation for that function for details.
Returns:

manifest – A dictionary whose keys are local paths of uploaded files relative to the root argument and whose values are tuples of the remote upload URL and sha256 sum of the file described by the key. Use this manifest to later download and install the correct versions of the uploaded files with the correct directory structure. Looks like {filename: (url, sha256sum)}.

Return type:

Dict[str, Tuple[str, str]]
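
The way glob, key_prefix, and key_uses_relpath combine into object keys can be previewed locally without uploading anything. The following is a minimal sketch, not the library's implementation: preview_object_keys and sha256_of_file are hypothetical helpers, and the key layout (in particular the "/" joining the relative path and the sha256 sum) is an assumption drawn from the parameter descriptions above.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as infile:
        for chunk in iter(lambda: infile.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def preview_object_keys(root=".", glob="**/*", key_prefix="objects/",
                        key_uses_relpath=False):
    """Map each matched file's path relative to root to (object key, sha256).

    Hypothetical helper: the key layout is assumed from the documented
    parameters and may differ from what upload_and_get_manifest does.
    """
    root = Path(root)
    preview = {}
    for path in sorted(root.glob(glob)):
        if not path.is_file():  # '**/*' matches directories as well as files
            continue
        relpath = path.relative_to(root).as_posix()
        checksum = sha256_of_file(path)
        if key_uses_relpath:
            # Assumed layout: <key_prefix><relative-path>/<sha256sum>
            key = f"{key_prefix}{relpath}/{checksum}"
        else:
            # Assumed layout: <key_prefix><sha256sum>
            key = f"{key_prefix}{checksum}"
        preview[relpath] = (key, checksum)
    return preview
```

Because the key is built by plain string concatenation, a key_prefix that does not end in "/" runs straight into the next component, which is why the trailing slash matters if the prefix is meant to look like a directory.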

Examples

Try uploading some dummy files with known contents to a remote test directory to confirm that you have access rights.

>>> # coding: utf-8
>>> import os
>>> from llama.dev.upload import upload_and_get_manifest
>>> from tempfile import TemporaryDirectory
>>> from pathlib import Path
>>> from requests import get
>>> from hashlib import sha256
>>> with TemporaryDirectory() as tmpdirpath:
...     tmpdir = Path(tmpdirpath)
...     with open(tmpdir/'foo', 'w') as foo:
...         _ = foo.write('bar')
...     with open(tmpdir/'baz', 'w') as baz:
...         _ = baz.write('quux')
...     manifest = upload_and_get_manifest(root=tmpdirpath, bucket='test',
...                                        key_prefix='llama/dev/upload/',
...                                        public=True)
>>> sha256(get(manifest['foo'][0]).content).hexdigest()
'fcde2b2edba56bf408601fb721fe9b5c338d10ee429ea04fae5511b68fbf8fb9'
>>> sha256(get(manifest['baz'][0]).content).hexdigest()
'053057fda9a935f2d4fa8c7bc62a411a26926e00b491c07c1b2ec1909078a0a2'
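
A returned manifest of the form {filename: (url, sha256sum)} carries enough information to rebuild the original directory structure and verify each file. The sketch below shows one way a consumer might do that; install_from_manifest and fetch_bytes are hypothetical names introduced for illustration and are not the llama install implementation. Entries whose URL is None (uploaded with public=False) are skipped, since they cannot be fetched without authentication.

```python
import hashlib
from pathlib import Path
from urllib.request import urlopen


def fetch_bytes(url):
    """Download a URL and return its body as bytes."""
    with urlopen(url) as response:
        return response.read()


def install_from_manifest(manifest, dest, fetch=fetch_bytes):
    """Download manifest entries into dest, verifying each sha256 sum.

    Hypothetical helper, not part of llama. Returns the relative paths
    that were written; files already present with the right checksum
    are left alone and not re-downloaded.
    """
    dest = Path(dest)
    written = []
    for relpath, (url, expected) in manifest.items():
        if url is None:  # private upload: no public URL to fetch
            continue
        target = dest / relpath
        if target.is_file():
            existing = hashlib.sha256(target.read_bytes()).hexdigest()
            if existing == expected:
                continue  # correct version already installed
        data = fetch(url)
        actual = hashlib.sha256(data).hexdigest()
        if actual != expected:
            raise ValueError(
                f"checksum mismatch for {relpath}: "
                f"expected {expected}, got {actual}"
            )
        # Recreate the directory structure recorded in the manifest keys.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        written.append(relpath)
    return written
```

The fetch argument is injectable so the function can be exercised offline, for example by passing a function that looks URLs up in a dictionary of known contents.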