Using perl to get, put and delete on Amazon S3 storage
6 August, 2007 – 5:36 pm
The Amazon Web Services developer site has published some neat Perl code (s3curl.pl) that lets you put, get or delete objects inside an Amazon S3 storage bucket with a command like this:
./s3curl.pl --id=[aws-access-key-id] --key=[aws-secret-access-key] -- http://s3.amazonaws.com/[bucket-name]/[key-name]
I modified the author's code to make it a bit more win32 friendly, since DPHOTO uses Perl on a win32 platform to launch its tertiary backups.
What I needed was to convert the author's code into a couple of succinct, win32-friendly functions that can be called from your own Perl code.
The first function works out the resource path that gets signed, based on the host name in the request (plain endpoint, vanity subdomain or CNAME). You can use the following function to achieve this.
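# Hosts that are treated as S3 itself; assumed here to be just the default endpoint.
my @endpoints = ('s3.amazonaws.com');

sub getResourceToSign
{
    my ($host, $resourceToSignRef) = @_;
    for my $ep (@endpoints) {
        # Anchor the match and escape the endpoint so the whole bucket name is captured
        if ($host =~ /^(.+)\.\Q$ep\E$/) { # vanity subdomain case
            my $vanityBucket = $1;
            $$resourceToSignRef = "/$vanityBucket".$$resourceToSignRef;
            return;
        }
        elsif ($host eq $ep) { # plain endpoint: the bucket is already in the path
            return;
        }
    }
    # cname case
    $$resourceToSignRef = "/$host".$$resourceToSignRef;
}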
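As a quick check of the three cases, here is a small sketch that repeats the function (with the host pattern anchored and the dot escaped) and calls it with a plain endpoint, a vanity subdomain and a CNAME host. The single-entry @endpoints list, the resourceFor helper and the bucket and host names are all assumptions made up for illustration.

```perl
use strict;
use warnings;

# Assumption: only the default S3 endpoint is listed.
my @endpoints = ('s3.amazonaws.com');

sub getResourceToSign {
    my ($host, $resourceToSignRef) = @_;
    for my $ep (@endpoints) {
        if ($host =~ /^(.+)\.\Q$ep\E$/) {    # vanity subdomain case
            $$resourceToSignRef = "/$1" . $$resourceToSignRef;
            return;
        }
        elsif ($host eq $ep) {               # plain endpoint: bucket is already in the path
            return;
        }
    }
    $$resourceToSignRef = "/$host" . $$resourceToSignRef;    # cname case
}

# Hypothetical helper: returns the resource to sign for a host and request path.
sub resourceFor {
    my ($host, $path) = @_;
    my $resource = $path;
    getResourceToSign($host, \$resource);
    return $resource;
}

print resourceFor('s3.amazonaws.com',           '/my-bucket/photo.jpg'), "\n";
print resourceFor('my-bucket.s3.amazonaws.com', '/photo.jpg'),           "\n";
print resourceFor('backups.example.com',        '/photo.jpg'),           "\n";
```

The first two calls should both give /my-bucket/photo.jpg, while the CNAME host ends up as the leading path component.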
The second function, which is the guts of each individual file backup, looks like this:
sub doBackup
{
    my $errors = "";
    my $host = "s3.amazonaws.com";
    my $resource = $requestURI;
    getResourceToSign($host, \$resource);
    my $httpDate = POSIX::strftime("%a, %d %b %Y %H:%M:%S +0000", gmtime);

    # Optional fields that are not set still appear as empty lines in the string to sign
    my $md5ToSign = defined $contentMD5 ? $contentMD5 : "";
    my $typeToSign = defined $contentType ? $contentType : "";
    my $aclHeaderToSign = defined $acl ? "x-amz-acl:$acl\n" : "";
    my $stringToSign = "$method\n$md5ToSign\n$typeToSign\n$httpDate\n$aclHeaderToSign$resource";

    # Sign with HMAC-SHA1 and base64-encode the result (no trailing newline)
    my $hmac = Digest::HMAC_SHA1->new($secretKey);
    $hmac->add($stringToSign);
    my $signature = encode_base64($hmac->digest, "");

    # Build the curl command line; values are double-quoted for the win32 shell
    my @args = ();
    push @args, ("-X", $method); # without this, curl only ever sends GET (or PUT with -T)
    push @args, ("-H", "\"Date: $httpDate\"");
    push @args, ("-H", "\"Authorization: AWS $keyId:$signature\"");
    push @args, ("-H", "\"x-amz-acl: $acl\"") if (defined $acl);
    push @args, ("-H", "\"content-type: $contentType\"") if (defined $contentType);
    push @args, ("-T", "\"$sourceFile\"") if (defined $sourceFile);
    push @args, "http://$host$requestURI";
    my $cmd = "$curl -s @args";

    # Run curl and scan its output for an S3 error code
    local $SIG{PIPE} = 'IGNORE';
    open(CURL, "$cmd 2>&1 |") || warn "can't open: $!";
    while(<CURL>)
    {
        chomp($_);
        #$errors.= "$_ ";
        if($_ =~ m/<Code>(.*)<\/Code>/) {
            $errors = "Amazon Error Code $1";
        }
    }
    close(CURL);

    if($errors) {
        $backup_retries += 1;
        $backup_comments = "Amazon errors: see log file";
        $log->error(1, $logDir, ($SpawnID, $file_id, $user_id, $backup_id, "$errors") );
    } else {
        $continue = "true";
    }
}
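The signing step in the middle of doBackup can be tried on its own, without curl or a network connection. The sketch below is a minimal standalone version: signRequest and its named parameters are hypothetical, the key, date and resource are made-up values, and it swaps in hmac_sha1 from the core Digest::SHA module in place of Digest::HMAC_SHA1 so that it runs on a stock Perl (5.10 or later).

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha1);
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: builds the same string to sign as doBackup and
# returns its base64-encoded HMAC-SHA1 signature.
sub signRequest {
    my (%req) = @_;
    my $aclHeader = defined $req{acl} ? "x-amz-acl:$req{acl}\n" : "";
    my $stringToSign = join("\n",
        $req{method},
        $req{contentMD5}  // "",    # empty line when no Content-MD5 is sent
        $req{contentType} // "",    # empty line when no Content-Type is sent
        $req{date},
    ) . "\n" . $aclHeader . $req{resource};
    # Digest::SHA takes the data first and the key second
    return encode_base64(hmac_sha1($stringToSign, $req{secretKey}), "");
}

# Made-up values, for illustration only.
my $signature = signRequest(
    method      => "PUT",
    contentType => "image/jpeg",
    date        => "Mon, 06 Aug 2007 17:36:00 +0000",
    acl         => "public-read",
    resource    => "/my-bucket/photos/example.jpg",
    secretKey   => "YOURSECRETKEY",
);
print "Authorization: AWS YOURKEYID:$signature\n";
```

Because the date is part of the signed string, the Date header sent to S3 has to be exactly the value that was signed.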
I have some custom log objects in there that you will probably want to remove in your own application. To make all this work on a win32 platform you will need Perl with the following use statements:
use POSIX;
use Digest::HMAC_SHA1;
use MIME::Base64 qw(encode_base64);
use Getopt::Long qw(GetOptions);
You will also require a curl executable (full path should be in $curl). I’m currently using the version available here.
Then, from the main body of your code, you can call the doBackup function with something like this:
my $curl = "D:\\services\\scripts\\bin\\curl.exe";
my $keyId = "YOURKEYID";
my $secretKey = "YOURSECRETKEY";
my $acl = "public-read";
$sourceFile = "\\\\$server_name\\$path\\$filename.$file_ext";
$targetFile = "/$bucket_name/$path/$filename.$file_ext";
$contentType = "image/jpeg";
$requestURI = $targetFile;
$method = "PUT"; #Can also use GET or DELETE
&doBackup();
I have all of the code in a while loop that parses and processes individual files on the win32 system, along with some basic error checking and a retry system. I catch $errors in the doBackup function because you inevitably get timeouts and so forth on the S3 side. I also make the corresponding code multithreaded so that several file backups to S3 can run simultaneously (I use up to 10 threads at a time, depending on how busy DPHOTO is).
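The retry code itself is not shown above, so the following is only a rough sketch of how such a wrapper might look, not the code DPHOTO runs. The names extractErrorCode and withRetries are hypothetical, and the upload is simulated with canned responses so that the example runs without curl or an S3 account.

```perl
use strict;
use warnings;

# Pull the S3 error code out of a response; returns undef if there is none.
sub extractErrorCode {
    my ($output) = @_;
    return $output =~ m{<Code>(.*?)</Code>} ? $1 : undef;
}

# Call $attempt (a code ref returning the raw response text) up to $maxTries times.
# Returns (1, undef) on success or (0, $lastErrorCode) once every try has failed.
sub withRetries {
    my ($attempt, $maxTries, $delaySeconds) = @_;
    my $lastError;
    for my $try (1 .. $maxTries) {
        $lastError = extractErrorCode($attempt->());
        return (1, undef) unless defined $lastError;
        sleep $delaySeconds if $delaySeconds && $try < $maxTries;
    }
    return (0, $lastError);
}

# Simulated upload that fails twice and then succeeds.
my @responses = (
    '<Error><Code>RequestTimeout</Code></Error>',
    '<Error><Code>InternalError</Code></Error>',
    '',
);
my $calls = 0;
my ($ok, $err) = withRetries(sub { $responses[$calls++] }, 5, 0);
print $ok ? "uploaded after $calls attempts\n" : "gave up: $err\n";
```

In a real run the code ref would launch curl and return its output, and the delay would be set to a second or more so that a struggling endpoint gets a chance to recover between tries.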
Hope someone else finds this useful!








