Upload Files (Objects) to AWS S3 without timeouts

It's always best to use a user account authenticated via SSO (so you get additional levels of security, e.g. MFA) and to access AWS resources using a temporary Role (to provide the permissions). For most use cases, e.g. interacting with the Web Console, using the CLI or performing development tasks via CloudFormation templates or Terraform, these temporary roles are fine: you don't need access for long periods of time, and if a timeout were to occur you'd simply reauthenticate.

However, when performing long-duration activities, for example uploading a large volume of data, you may find that a temporary Role assignment authenticated via SSO won't give you a sufficient session duration. You don't want a file upload to start and then fail because the session times out. There are a few ways around this, such as wrapper scripts, checkpointing, or even AWS DataSync or the AWS Transfer tools; here, though, we're looking at what you can do to provide (as safely as possible) a method to upload files to an AWS S3 bucket.

The option we used here was to create an IAM User account with an Access Key, then apply very specific Permissions Policies to allow only the least-privilege access needed to perform the task, which in this case is uploading a large amount of data to an S3 Bucket.

I'm not going into detail on each step, just giving you the basic overview and the permissions policies to achieve this; note that we'll be using Inline Permissions Policies. I've also assumed the bucket has already been created and that you've got its ARN available, e.g.

arn:aws:s3:::aws-my-big-input-bucket

Step 1 – IAM User

Firstly, create the IAM User account; we called ours s3-input-bucket-uploader.
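If you'd rather do this from the command line than the Web Console, something like the following should work (the user name is just the one we chose, so substitute your own):

# create the upload-only IAM user
aws iam create-user --user-name s3-input-bucket-uploader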

Step 2 – Create Access Key

Now create an Access Key for the user account and save the Access Key ID and Secret Access Key, as we'll need these shortly. Also make sure you remember to remove this account once the upload is done, so it is not available for use by anyone else after you've finished with it.
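Again, this can be done from the CLI if you prefer; the key material is only shown once in the response, so capture it at this point. The clean-up commands for later are included as comments (the user name is our example):

# create the access key and note the AccessKeyId and SecretAccessKey in the output
aws iam create-access-key --user-name s3-input-bucket-uploader

# once the upload is finished, delete the key and then the user
# (any inline policies added in the later steps need to be deleted before the user can be removed)
aws iam delete-access-key --user-name s3-input-bucket-uploader --access-key-id <Access Key ID Here>
aws iam delete-user --user-name s3-input-bucket-uploader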

Step 3 – Permission Policy 1 (AWS S3 Bucket Permissions)

We've got two permission policies to add. Strictly we only need the first one, which provides the basics to upload/manage the objects/files; the second provides the ability to list all the buckets. That isn't strictly essential, but it means you can see the buckets if you were to run an "aws s3 ls" command.

Add this as an "inline policy", so that it's attached directly to the user permanently. This isn't best practice, as Roles are the preferred way of providing permissions, but in our case the use case is specific, access is restricted and access is required for an extended period.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucket",
                "s3:DeleteObject",
                "s3:ListAllMyBuckets"
            ],
            "Resource": [
                "arn:aws:s3:::aws-my-big-input-bucket/*",
                "arn:aws:s3:::aws-my-big-input-bucket"
            ]
        }
    ]
}
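To attach this as an inline policy from the CLI, one option is to save the JSON above to a file and use put-user-policy; the policy and file names here are just our examples:

# attach the bucket/object permissions as an inline policy on the user
aws iam put-user-policy --user-name s3-input-bucket-uploader --policy-name s3-input-bucket-objects --policy-document file://bucket-policy.json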

Step 4 – Permission Policy 2 (AWS S3 List Permissions)

Now for the list permissions.

As with the first policy, add this as an inline policy attached directly to the user.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Action": [
                "s3:ListAllMyBuckets"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
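The same approach works for this one; afterwards you can confirm both inline policies are in place (again, the policy and file names are just our examples):

# attach the list-buckets permissions as a second inline policy
aws iam put-user-policy --user-name s3-input-bucket-uploader --policy-name s3-list-all-buckets --policy-document file://list-policy.json

# check both inline policies are now attached to the user
aws iam list-user-policies --user-name s3-input-bucket-uploader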

Step 5 – Upload the Files!

Lovely, we're now all set, so from a workstation with the AWS CLI installed you can perform the upload.

Firstly, log in to your workstation and start a session with something like "screen", so that if your connection were to drop all would not be lost; we don't want to start the upload and then lose our connection to the machine. If you're working locally, however, you don't necessarily need to worry about this.
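For example, assuming screen is installed, something like this keeps the upload running even if your SSH session drops (the session name is arbitrary):

# start a named screen session to run the upload in
screen -S s3-upload

# if you get disconnected, reattach to it later with
screen -r s3-upload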

Then get authenticated by adding the Access Key and Secret Key to your current environment:

export AWS_ACCESS_KEY_ID=<Access Key Here>
export AWS_SECRET_ACCESS_KEY=<Secret Key Here>

That's it, we're ready to upload. You may want to verify the credentials are working by running a cheeky "aws s3 ls" to ensure you can see the bucket as expected.
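A quick sanity check that the exported keys are being picked up might look something like this; get-caller-identity needs no permissions, and the ls should show the bucket thanks to the second policy:

# confirm the CLI is using the new IAM user's credentials
aws sts get-caller-identity

# confirm the bucket is visible
aws s3 ls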

Now start your upload with something like:

aws s3 cp . s3://aws-my-big-input-bucket/ --recursive

And kick back and wait!
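If the upload does get interrupted part-way through, it's worth knowing that "aws s3 sync" only copies files that are missing or have changed at the destination, so re-running it from the same directory effectively resumes the transfer:

# re-runnable: only uploads files not already present (or changed) in the bucket
aws s3 sync . s3://aws-my-big-input-bucket/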

It's recommended to try this out on some sample data first to verify it really does just keep going, in particular past the 12 hour mark, which appears to be the maximum session duration you'd get from a temporary role.
