RAID-1 Sneakernet Part II
16 October 2012
2 Minute Read
So. The disks got shipped off to AWS after having been formatted and having the SIGNATURE files put on them. Each disk was sent as a separate job (hence the RAID-1).
I received an email from AWS for each job stating “In the processing of XXXXX, we discovered that your device does not contain a valid SIGNATURE file. A valid SIGNATURE file is required to authenticate your device.”
I’m pretty sure I double and triple checked before packing them up, but anyway I responded to each notification with the requested info, which was the output of the Import/Export jar. I would really like to see Import/Export brought into the main AWS console.
A few days later AWS told me the jobs were complete and that the log files of the export job were available in S3.
While waiting for the disks to arrive I thought it might be worth looking through these logs. Lo and behold, about 600,000 out of 700,000 total files had been renamed... Yay.
It turns out that AWS will not put more than 100,000 files in a single directory. Any other files above this limit are treated as if they have an invalid filename and are put in the recovery path as something like this: /EXPORT-RECOVERY/NNNN/NNNN/NNNN
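If you want to know in advance whether an export will hit this limit, you can count files per directory from a listing of your bucket's keys. The following is only a minimal sketch: it assumes you already have the keys as plain strings (how you list them is up to you), and the function name crowded_directories is made up for illustration.

```python
from collections import Counter
import posixpath

# Per-directory limit described in this post.
MAX_FILES_PER_DIR = 100_000


def crowded_directories(keys, limit=MAX_FILES_PER_DIR):
    """Return {directory: file_count} for directories holding more than `limit` files.

    `keys` is any iterable of S3-style keys such as "photos/2012/img001.jpg".
    Keys at the top level of the bucket are counted under the directory "".
    """
    counts = Counter(posixpath.dirname(key) for key in keys)
    return {directory: n for directory, n in counts.items() if n > limit}
```

Any directory this reports is a candidate for being split into smaller sub folders before you create the export job.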
Luckily each of these occurrences is logged, and there was enough information in there to be able to reconstruct it all.
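The reconstruction can be sketched roughly as below. Note the assumptions: I am not reproducing the real AWS log format here. The sketch assumes you have first boiled the log down to a simple CSV with two columns, original_key and recovery_path (both names are hypothetical), and that the disk is mounted at a path you pass in as disk_root.

```python
import csv
import io
import os
import shutil


def plan_restores(log_text, disk_root):
    """Return a list of (current_path, intended_path) pairs.

    `log_text` is the simplified CSV described above (hypothetical format):
    a header row, then one row per renamed file.
    """
    plan = []
    for row in csv.DictReader(io.StringIO(log_text)):
        # Strip leading slashes so the paths are joined under disk_root.
        src = os.path.join(disk_root, row["recovery_path"].lstrip("/"))
        dst = os.path.join(disk_root, row["original_key"].lstrip("/"))
        plan.append((src, dst))
    return plan


def apply_restores(plan):
    """Move each recovered file back to its intended location."""
    for src, dst in plan:
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.move(src, dst)
```

Building the plan separately from applying it means you can eyeball (or count) the planned moves before touching 600,000 files.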
This ‘feature’ is unfortunately not listed in the Import/Export FAQ page at https://aws.amazon.com/importexport/faqs/ - but I did find a reference to this issue in the AWS forums: https://forums.aws.amazon.com/message.jspa?messageID=238051
One other odd thing that happened is that the entire S3 bucket was exported, and not just the sub folder I’d specified. I was lucky enough to have sent big enough disks for it.
This process is OK on the whole, but if you are trying it I would recommend that you allow a lot of margin for error, especially time.