Django Snippets: Managing Container Files
A common practice for cloud-native applications is to keep their files in an external storage system such as Amazon S3. Offloading storage to an external service makes it much easier to scale the application and provide high availability. Unfortunately, many cloud storage platforms do not provide convenient tools for managing files in bulk. This can make it difficult to migrate from one storage provider to another, to rename buckets/containers, or to remove files in bulk.
If you are working with Python and Django, however, there is an often overlooked option for performing these kinds of management operations: the Django file storage API. It is the abstraction through which all Django components persist file data, and it presents a common interface for retrieving, opening, reading, and saving files. More importantly, there is a broad set of drivers that work with nearly any type of object/blob storage, including S3, Windows Azure, Dropbox, and Google Cloud Storage (amongst others).
The snippets below work with storage instances to migrate files from one storage to another, or to remove every file from a storage in preparation for deleting the container.
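As a rough sketch of that interface, the storage methods most relevant to bulk file management can be written down as a `typing.Protocol`. The name `StorageLike` is made up for illustration and is not part of Django; the method names and argument order follow Django's `Storage` base class. `NullStorage` is a stand-in that exists only to demonstrate the structural check.

```python
from typing import Any, List, Protocol, Tuple, runtime_checkable


@runtime_checkable
class StorageLike(Protocol):
    """Hypothetical name: a subset of the methods a Django storage offers."""

    def listdir(self, path: str) -> Tuple[List[str], List[str]]:
        """Return (folders, files) found directly under path."""
        ...

    def open(self, name: str, mode: str = "rb") -> Any:
        """Return a file-like object for the stored file."""
        ...

    def save(self, name: str, content: Any, max_length: Any = None) -> str:
        """Store content and return the name that was actually used."""
        ...

    def delete(self, name: str) -> None:
        ...

    def exists(self, name: str) -> bool:
        ...


class NullStorage:
    """Stand-in for demonstration only; it stores nothing."""

    def listdir(self, path):
        return [], []

    def open(self, name, mode="rb"):
        raise FileNotFoundError(name)

    def save(self, name, content, max_length=None):
        return name

    def delete(self, name):
        pass

    def exists(self, name):
        return False


print(isinstance(NullStorage(), StorageLike))  # True
print(isinstance(object(), StorageLike))       # False
```

The runtime check only confirms that the method names are present, not that their signatures match, so it is a convenience rather than a guarantee.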
Transferring Content
Migrating content from one storage to another can be difficult in the cloud. For example, maybe you've stored data for a site in one Amazon S3 bucket and need to move it to another, differently named bucket. S3 buckets cannot be renamed, so the content has to be copied across. The snippet below shows how to copy files from one Django storage to another. Both src and dest are instances of a Django storage, but they do not have to be the same type of storage: the same snippet works for transferring content from a local file storage to a cloud storage, or vice versa.
```python
import posixpath


def transfer_content(src, dest, basefolder=''):
    '''
    Transfer content from the source storage to the destination.
    Works recursively.
    '''
    # Default to '' (the storage root): FileSystemStorage rejects
    # absolute paths such as '/'.
    # storage.listdir returns two lists: folders and files
    sfolders, sfiles = src.listdir(basefolder)

    # Iterate through the files and copy each one to the destination
    for fpath in sfiles:
        src_fpath = posixpath.join(basefolder, fpath)
        # Use a context manager so the source file is closed after the copy
        with src.open(src_fpath) as f:
            dest.save(src_fpath, f)

    # Recurse into each folder to copy its content
    for dpath in sfolders:
        transfer_content(src, dest, posixpath.join(basefolder, dpath))
```
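After a transfer it can be useful to check that nothing was skipped. The sketch below is not from the original snippet: `iter_files` and `missing_files` are hypothetical helper names, and `DictStorage` is a tiny in-memory stand-in (not a Django class) that exists only so the example runs without Django or a cloud account. The helpers rely only on `listdir` and `exists`, both of which Django storages provide.

```python
import posixpath


def iter_files(storage, basefolder=''):
    """Yield the path of every file under basefolder, recursively."""
    folders, files = storage.listdir(basefolder)
    for name in files:
        yield posixpath.join(basefolder, name)
    for name in folders:
        yield from iter_files(storage, posixpath.join(basefolder, name))


def missing_files(src, dest, basefolder=''):
    """Return the source paths that do not exist in the destination."""
    return [p for p in iter_files(src, basefolder) if not dest.exists(p)]


class DictStorage:
    """Stand-in for demonstration only: maps 'a/b.txt' style paths to bytes."""

    def __init__(self, files=None):
        self.files = dict(files or {})

    def listdir(self, path):
        # Build the key prefix for this folder ('' for the root)
        prefix = path.strip('/') + '/' if path.strip('/') else ''
        folders, files = set(), set()
        for key in self.files:
            if key.startswith(prefix):
                head, sep, _ = key[len(prefix):].partition('/')
                # A remaining '/' means head is a sub-folder, not a file
                (folders if sep else files).add(head)
        return sorted(folders), sorted(files)

    def exists(self, name):
        return name in self.files


src = DictStorage({'a.txt': b'1', 'img/b.png': b'2', 'img/raw/c.png': b'3'})
dest = DictStorage({'a.txt': b'1', 'img/b.png': b'2'})
print(missing_files(src, dest))  # ['img/raw/c.png']
```

With real storages, an empty list from `missing_files(src, dest)` suggests every source path has a counterpart in the destination. It does not compare file contents.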
Removing Files
Many cloud storage platforms lack a "bulk delete" tool, so removing a large number of files means finding and deleting individual keys by hand. The snippet below walks a storage recursively and deletes every file it finds.
```python
import posixpath


def remove_content(s, basefolder=''):
    '''
    Remove all content from the provided storage. Works recursively.
    '''
    # Default to '' (the storage root): FileSystemStorage rejects
    # absolute paths such as '/'.
    sfolders, sfiles = s.listdir(basefolder)

    # Iterate through all file paths and remove each file
    for fpath in sfiles:
        s.delete(posixpath.join(basefolder, fpath))

    # Recurse into each folder to remove its content
    for dpath in sfolders:
        remove_content(s, basefolder=posixpath.join(basefolder, dpath))
```
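Because deletion cannot be undone, a dry run is a reasonable precaution before emptying a container. The variant below is a sketch rather than part of the original snippet: the function name `remove_content_checked`, its `dry_run` flag, and the returned list are assumptions added for illustration, and `DictStorage` is an in-memory stand-in (not a Django class) used only to make the example runnable.

```python
import posixpath


def remove_content_checked(s, basefolder='', dry_run=True):
    """Return every file path under basefolder; delete them unless dry_run."""
    removed = []
    folders, files = s.listdir(basefolder)
    for name in files:
        path = posixpath.join(basefolder, name)
        if not dry_run:
            s.delete(path)
        removed.append(path)
    for name in folders:
        removed += remove_content_checked(
            s, posixpath.join(basefolder, name), dry_run=dry_run)
    return removed


class DictStorage:
    """Stand-in for demonstration only: maps 'a/b.txt' style paths to bytes."""

    def __init__(self, files=None):
        self.files = dict(files or {})

    def listdir(self, path):
        # Build the key prefix for this folder ('' for the root)
        prefix = path.strip('/') + '/' if path.strip('/') else ''
        folders, files = set(), set()
        for key in self.files:
            if key.startswith(prefix):
                head, sep, _ = key[len(prefix):].partition('/')
                # A remaining '/' means head is a sub-folder, not a file
                (folders if sep else files).add(head)
        return sorted(folders), sorted(files)

    def delete(self, name):
        self.files.pop(name, None)


store = DictStorage({'a.txt': b'1', 'img/b.png': b'2'})
print(remove_content_checked(store))  # ['a.txt', 'img/b.png']
print(len(store.files))               # 2 (the dry run deleted nothing)
remove_content_checked(store, dry_run=False)
print(len(store.files))               # 0
```

Defaulting `dry_run` to `True` means a deletion has to be requested explicitly, which seems the safer choice for a destructive helper.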