Research: The Dangers of Proxying S3 Content

Background

It is common for organizations to use Amazon’s S3 service as a place to host static assets and other content. The content within Amazon S3 is organized in “buckets”. Amazon also provides the ability to point custom domains at S3 buckets, through virtual hosting or the static website endpoints. In both cases, a CNAME mapping is created from the custom domain to an Amazon domain name.

However, SSL support is not available via the custom domain name; SSL works only if the “s3.amazonaws.com/&lt;bucket-name&gt;” URL or the “&lt;bucket-name&gt;.s3.amazonaws.com” domain name is used directly (and, in the latter case, only if there are no periods in the bucket name). The reason SSL fails for the custom domain names is that Amazon cannot issue certificates for domains it does not own, and current S3 functionality does not allow custom certificates to be loaded. For “s3.amazonaws.com” domains, on the other hand, Amazon provides a wildcard certificate which works just fine. If you try to access the custom domain names over SSL anyway, you will be served content with that same wildcard certificate, which of course does not match the domain names.

Possible SSL solution – CloudFront or Another CDN

One possible solution offered by AWS is to use their CDN offering, CloudFront. In that case, you set up CloudFront distributions that sit in front of the S3 buckets and CNAME your domain names to them. This of course comes at a higher price and with a confusing set of SSL options: you can use the cloudfront.net subdomains directly, use a free SNI-based certificate for your own domain (not compatible with older browsers), or pay a steep fee ($600/month per domain) to upload your own SSL certificate. The data would then flow as follows:

[S3] ----internal AWS network----> [CloudFront] ----SSL----> [users]

Another set of solutions is to use a non-AWS CDN like CloudFlare and have the CDN proxy the content with SSL. The setup would be similar to Amazon’s, with SNI and non-SNI SSL options available. The data flow would then look like this (for CloudFlare):

[S3] ----HTTP----> [CloudFlare] ----SSL----> [users]

What You Should Not Do – Proxy S3 Content Yourself

Of course, many developers will immediately react to this particular problem in the same way: I can do it better myself! The usual solution is a script or a webserver rule that automatically retrieves the content from S3 and serves it to the user.

Say everything after “/static/” in the request URL is retrieved from a single S3 bucket, let’s call it “marketing.example”.
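A minimal sketch of such a proxy, assuming the “/static/” prefix and the “marketing.example” bucket name from above (the function name is illustrative; a real deployment would fetch and stream the resulting URL rather than just build it):

```python
# Hypothetical naive single-bucket proxy: everything after /static/ is
# fetched from one fixed S3 bucket.
S3_BASE = "https://s3.amazonaws.com"
BUCKET = "marketing.example"  # illustrative bucket name

def s3_url_for(request_path):
    """Map a request path like /static/img/logo.png to the backing S3 URL
    https://s3.amazonaws.com/marketing.example/img/logo.png."""
    if not request_path.startswith("/static/"):
        raise ValueError("not a proxied path")
    key = request_path[len("/static/"):]
    return f"{S3_BASE}/{BUCKET}/{key}"
```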

Of course, this only works as long as there is a single bucket. Suppose another bucket called “support.example” is now needed, so the script is changed to take the bucket name from the URL.
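The multi-bucket version might look something like this (again a hypothetical sketch, with the first path segment after “/static/” naming the bucket):

```python
# Hypothetical multi-bucket proxy: the first path segment after /static/
# names the bucket, e.g. /static/support.example/faq.html.
S3_BASE = "https://s3.amazonaws.com"

def s3_url_for(request_path):
    """Map /static/<bucket>/<key> to https://s3.amazonaws.com/<bucket>/<key>.
    Note: <bucket> is taken straight from the URL with no validation."""
    if not request_path.startswith("/static/"):
        raise ValueError("not a proxied path")
    bucket, _, key = request_path[len("/static/"):].partition("/")
    return f"{S3_BASE}/{bucket}/{key}"
```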

What often happens at this point is that the developer does not realize that the bucket names need to be validated against a whitelist of valid buckets. Because S3 bucket names are not unique to one AWS user but share a global namespace across all S3 users, such a script can retrieve data from any other S3 bucket, as long as it is public. This cannot happen when using CloudFront or other CDNs, because each distribution is mapped 1-to-1 to a specific S3 bucket.

What does this look like in practice? If an attacker figures out that the script takes arbitrary bucket names, they can simply create a new bucket of their own, say “evil.example”, and retrieve content from it through the target’s proxy.
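Assuming the vulnerable proxy runs at a hypothetical target site www.example.com, the attack URLs could look something like:

```
https://www.example.com/static/evil.example/fake-login.html
https://www.example.com/static/evil.example/payload.js
```

Both would be served under the target’s own domain and SSL certificate.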

What can this be leveraged for? Some examples:

  • Serving malware, since the content will be served under the target’s domain and the target’s SSL certificate
  • Facilitating phishing attacks
  • XSS, since HTML/JS content served from the target’s own domain bypasses the same-origin policy
  • Stealing session cookies, since the injected code runs in the same domain and has access to its cookies
  • If the content is retrieved using the S3 APIs, an attacker could set up a “Requester Pays” bucket and make money off the target (although Amazon would probably catch this eventually)
  • [insert your exploit here]

Recommendations

  • Don’t reinvent the wheel; use an existing solution like CloudFront or some other CDN
  • If you must proxy content yourself, validate bucket names against a whitelist of known buckets, and use additional measures such as separate subdomains, HttpOnly cookies, CSP headers, etc. to segregate the S3 content from the rest of the site
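The whitelist check can be sketched as follows, continuing the hypothetical proxy script and bucket names used earlier:

```python
# Hypothetical hardened proxy: bucket names are checked against an explicit
# whitelist before any request is made to S3.
S3_BASE = "https://s3.amazonaws.com"
ALLOWED_BUCKETS = {"marketing.example", "support.example"}  # illustrative names

def s3_url_for(request_path):
    """Map /static/<bucket>/<key> to an S3 URL, but only for known buckets."""
    if not request_path.startswith("/static/"):
        raise ValueError("not a proxied path")
    bucket, _, key = request_path[len("/static/"):].partition("/")
    if bucket not in ALLOWED_BUCKETS:
        raise PermissionError(f"bucket {bucket!r} is not whitelisted")
    return f"{S3_BASE}/{bucket}/{key}"
```

With this change, a request for “/static/evil.example/…” is rejected before the proxy ever touches S3.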
