So Flickr announced 1TB (yes, one _terabyte!_) of free storage for all users.
Which made me (and a whole bunch of other people) joke about writing a filesystem to use it as cloud storage.
Then, of course, someone _did_ it, and someone else did too. (Personally I think “encoding” your data into text chunks in the PNG file is cheating – surely storing it as image data, either data-as-images-looking-like-noise or data-steganographically-embedded, is more “in keeping with the spirit” of using Flickr’s storage?)
Then @emailpdeluca asked “so is raid supported with multiple accounts”?
But I was busy at work, so of course my brain decided to think about this and nothing else, because productivity! So…
A new plan.
Presenting
Washoe LAPSSFS
Washoe LAPSSFS is a distributed, fault-tolerant, secure, highly available, photo-storage-provider-independent cloud storage system. It stores data in an “album grid” consisting of a collection of albums stored on free or inexpensive photo hosting sites. Each file you wish to store is first encrypted, then broken up into chunks, and each chunk is stored in a photo on any of your configured photo storage sites/accounts. The splitting into chunks is done using erasure coding, which means you can reconstruct the data even if one or many of the storage services become unavailable. It can also ensure that no single storage provider holds enough encoded chunks to decrypt the file – even if they acquire the decryption key.
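To make the erasure-coding bit concrete, here is a minimal sketch of the simplest possible scheme: split the (already encrypted) data into two halves and add an XOR parity share, so any two of the three shares rebuild the original. This is only an illustration. Tahoe-LAFS uses a more general k-of-n code, and the function names `encode_2_of_3` and `decode_2_of_3` are made up for this example.

```python
from typing import Dict


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_2_of_3(data: bytes) -> Dict[int, bytes]:
    """Split data into 3 shares; any 2 are enough to recover it.

    Assumption: `data` is ciphertext, since share 0 and share 1
    are plain halves of the input.
    """
    # Prefix the length so the padding byte can be stripped on decode.
    framed = len(data).to_bytes(8, "big") + data
    if len(framed) % 2:
        framed += b"\x00"
    half = len(framed) // 2
    a, b = framed[:half], framed[half:]
    return {0: a, 1: b, 2: xor_bytes(a, b)}


def decode_2_of_3(shares: Dict[int, bytes]) -> bytes:
    """Rebuild the original data from any two (or all three) shares."""
    if len(shares) < 2:
        raise ValueError("need at least two of the three shares")
    if 0 in shares and 1 in shares:
        a, b = shares[0], shares[1]
    elif 0 in shares:
        # Share 1 is missing: b = a XOR parity.
        a = shares[0]
        b = xor_bytes(a, shares[2])
    else:
        # Share 0 is missing: a = b XOR parity.
        b = shares[1]
        a = xor_bytes(b, shares[2])
    framed = a + b
    length = int.from_bytes(framed[:8], "big")
    return framed[8:8 + length]
```

Put each share on a different photo host and you can lose any one host without losing the file, at a storage cost of 1.5x the original size.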
How it (ought to) work:
- Embed (or fork) Tahoe LAFS
- Modify/configure Tahoe to create chunks small enough to be encoded/hidden in photos (ballpark guesses – if we steganographically encode our encrypted data into the 2 least significant bits of a 16-bit 2 megapixel image, we’ll be able to store a block of 256KB in each photo. This is a profligate waste of storage resources, but hey – Marissa just paid $1.1 billion for Tumblr, looks to me like they can afford it…)
- Embed those chunks of data into individual image files (either from a user-supplied image library, or by searching online image sites for CC licensed photos).
- Modify Tahoe to store URLs/credentials of hosting sites and photos as the location of each chunk – and to use those URLs/credentials to retrieve specific chunks as required.
- Upload resulting image files to the available set of hosting sites/accounts.
- Profit
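The hide-chunks-in-photos step above could be sketched roughly like this. It is a toy under stated assumptions: the image is treated as a flat list of integer samples (one sample per pixel, which is just one reading of “16-bit 2 megapixel”), the host has to serve the original file back bit-for-bit, and the helper names `capacity_bytes`, `embed` and `extract` are hypothetical rather than anything Tahoe or an imaging library provides.

```python
from typing import List

BITS_PER_SAMPLE = 2  # the "2 least significant bits" from the list above


def capacity_bytes(num_samples: int, bits: int = BITS_PER_SAMPLE) -> int:
    """Raw payload capacity when `bits` low bits of each sample are used."""
    return num_samples * bits // 8


def embed(samples: List[int], payload: bytes,
          bits: int = BITS_PER_SAMPLE) -> List[int]:
    """Return a copy of `samples` with `payload` written into the low bits."""
    if bits not in (1, 2, 4, 8):
        raise ValueError("bits must divide 8")
    if len(payload) > capacity_bytes(len(samples), bits):
        raise ValueError("payload too large for this image")
    mask = (1 << bits) - 1
    out = list(samples)
    idx = 0
    for byte in payload:
        # Walk the byte from its high bits down, `bits` at a time.
        for shift in range(8 - bits, -1, -bits):
            out[idx] = (out[idx] & ~mask) | ((byte >> shift) & mask)
            idx += 1
    return out


def extract(samples: List[int], length: int,
            bits: int = BITS_PER_SAMPLE) -> bytes:
    """Read `length` payload bytes back out of the low bits."""
    if bits not in (1, 2, 4, 8):
        raise ValueError("bits must divide 8")
    mask = (1 << bits) - 1
    per_byte = 8 // bits
    out = bytearray()
    for i in range(length):
        byte = 0
        for sample in samples[i * per_byte:(i + 1) * per_byte]:
            byte = (byte << bits) | (sample & mask)
        out.append(byte)
    return bytes(out)


# 2,000,000 samples * 2 bits / 8 = 500,000 bytes of raw capacity
# under the one-sample-per-pixel assumption.
print(capacity_bytes(2_000_000))
```

Under that reading the raw capacity comes out at about twice the 256KB ballpark, which leaves headroom for a length header or for using only some of the pixels. The chunk length still has to be recorded somewhere (for example alongside the photo's URL), since `extract` needs it.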
Future plans (for someone else to implement):
- Add the ability to proxy photo storage/retrieval requests for friends/colleagues/anonymous-random-people. This adds “noise” to any storage account traceable back to you – as well as hiding critical chunks of your encrypted data in storage accounts not related to you.
Development roadmap:
- I plan to post this rant to Twitter, and fully expect to see that someone on Hacker News has it up, running, and forkable on GitHub by the end of the weekend.