https://mastodon.social/@mcc/110278058804094562 [https://mastodon.social/@mcc/110278058804094562]
Everything is so broken I can't even get the post embed explaining how broken things are to work.
I have this site, dryad.technology. It's my "static content" site. It has always been HTTP-only, no TLS. For reasons, I have to fix that now. It has to go HTTPS.
When I set this site up 7 years ago I decided I would finally use ~cloud technology~ (I self-host my other sites and was tired of it). The architecture is:
* Content hosted in a secret S3 bucket
* Cloudflare caches/proxies the S3 bucket and serves at the domain.
I pay almost nothing because Cloudflare is free, and Amazon charges me only for the traffic between Amazon and Cloudflare, which comes out to sub-penny levels. But I can't figure out how to move this setup to HTTPS. I tripped over this 7 years ago and I'm tripping over it now.
Can anyone help. Me. With this. Here is a list of things I tried that don't work.
Imagine my secret S3 bucket is named BEEFCAFE.
1. Serve site from http://BEEFCAFE.s3-website-datacentername-1.amazonaws.com/
Cloudflare proxies, https-encodes, adds a Cloudflare-signed https certificate
Why this doesn't work: This is fake https. It's HTTPS signed from Cloudflare to the user, but it passes unencrypted over the open Internet between Amazon and Cloudflare. This is lying. It is giving a false sense of security to my users. I won't do it.
The domain above does not respond to requests on port 443.
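A quick way to check the port 443 claim is a plain TCP probe. This is only a sketch: `port_open` is a made-up helper name, and the hostname you pass in would be the real website endpoint rather than the BEEFCAFE placeholder.

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False
```

Calling `port_open(endpoint, 80)` and `port_open(endpoint, 443)` against the website endpoint would show whether only the plain-HTTP port answers. A successful TCP connect says nothing about certificates, so this only rules HTTPS out, never in.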
2. Serve site from https://s3.amazonaws.com/BEEFCAFE/index.html
Cloudflare designates site DNS as s3.amazonaws.com, and adds a rewrite rule that tacks on BEEFCAFE/ to all inner-website requests.
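To make the intent of that rewrite concrete, here is the mapping modeled in Python. This is not Cloudflare rule syntax, and `rewrite_to_origin` is a hypothetical name; it only illustrates which origin URL each public URL is supposed to end up at.

```python
from urllib.parse import urlsplit, urlunsplit

BUCKET = "BEEFCAFE"  # placeholder bucket name from the post

def rewrite_to_origin(url: str, bucket: str = BUCKET) -> str:
    """Map a public site URL onto the path-style S3 HTTPS URL for the same object."""
    parts = urlsplit(url)
    path = parts.path or "/"  # treat a bare domain as a request for "/"
    return urlunsplit(("https", "s3.amazonaws.com", f"/{bucket}{path}", parts.query, ""))

print(rewrite_to_origin("http://dryad.technology/index.html"))
# prints https://s3.amazonaws.com/BEEFCAFE/index.html
```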
Why this doesn't work: Although I can go to https://s3.amazonaws.com/BEEFCAFE/index.html in a web browser and see my site, Amazon appears to be specifically denylisting Cloudflare. When accessing dryad.technology/BEEFCAFE/index.html after making the change I get:
<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
<RequestId>R7QMB7JF9RKMTJ12</RequestId>
<HostId>CCQMYXOnUVx1OTNPYl0W/GYL/xHLBm2kZyXq2aBls4YFiHSvNgUTRJfKD9J/znX05MkmYbngAc0=</HostId>
</Error>
… or similar nonsense, on every pageload.
I cannot find any Amazon documentation explaining why requests from CloudFlare specifically would result in this AccessDenied response. However, there is a Stack Overflow answer [https://stackoverflow.com/a/39769368/6582253] that purports to address this exact problem, for this exact scenario, by setting a JSON "bucket policy". Following this page's instructions had literally no effect. It did not change the error when accessed through CloudFlare, nor did it block me from accessing through non-CloudFlare routes. That is weird enough that I wonder if I did something wrong.
3. Follow the instructions
There is a CloudFlare support page [https://developers.cloudflare.com/support/third-party-software/others/configuring-an-amazon-web-services-static-site-to-use-cloudflare/] which purports to explain how to do exactly the thing I want to do [use CloudFlare to serve pages stored in an S3 bucket].
The page is a bit oblique. It explains you need to go into the bucket policy editor and then "use this sample to fill out the needed JSON code". The referent of "this sample" is not in any way indicated, it is as if a sentence is missing. However I think they mean the IP based bucket policy [https://docs.aws.amazon.com/AmazonS3/latest/userguide/example-bucket-policies.html#example-bucket-policies-IP]. The CloudFlare support page then directs you to fill in the list of known CloudFlare IP ranges. This produces a JSON very similar to the one from the stackoverflow page listed above.
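For reference, the JSON this step produces has roughly the shape below. The sketch follows Amazon's IP-based example as best I understand it (one Deny statement with a NotIpAddress condition). The CIDR ranges are documentation placeholders, not CloudFlare's real ranges, and `ip_restricted_policy` is a name invented here.

```python
import ipaddress
import json

def ip_restricted_policy(bucket: str, cidrs: list[str]) -> dict:
    """Build a bucket policy shaped like AWS's IP-based example:
    deny every S3 action unless the request comes from one of `cidrs`."""
    # Fail early on typos; ip_network raises ValueError for malformed CIDRs.
    ranges = [str(ipaddress.ip_network(c)) for c in cidrs]
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "IPAllow",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            "Condition": {"NotIpAddress": {"aws:SourceIp": ranges}},
        }],
    }

# Placeholder ranges (reserved for documentation); the real list would come
# from CloudFlare's published IP ranges.
PLACEHOLDER_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]
policy = ip_restricted_policy("BEEFCAFE", PLACEHOLDER_RANGES)
print(json.dumps(policy, indent=2))
```

Note that the statement covers `s3:*` for every principal, which matters for the warning discussed next.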
I'm not sure whether this support page gives me what I want. For one thing, it has a step directing me to "redirect requests from this bucket’s URL to the subdomain bucket URL you created". This suggests I'm going through my BEEFCAFE.s3-website-datacentername-1.amazonaws.com domain from approach 1 above, which implies I'll founder on the same http/https problem as before. However they also make oblique reference to an "endpoint", and there is an "access points" menu in Amazon's interface I have not fully explored. So maybe this will solve one of my problems (it is theoretically possible someone could guess my bucket name and access it directly instead of going through dryad.technology, mostly harmless but still worth avoiding) and I can solve the https problem in a following step.
Why this doesn't work: The Amazon examples page contains an interesting statement: "Warning: Before using this policy, replace the 192.0.2.0/24 IP address range in this example with an appropriate value for your use case. Otherwise, you will lose the ability to access your bucket." Editing the JSON, I think: Wait, does that mean I won't be able to read the bucket, or does it mean I won't be able to administrate it? Hopefully it's just reading the bucket!
No, it was total. After applying the change, my S3 bucket was suddenly only accessible from CloudFlare IPs. Meaning the site still loaded (because it was loaded from a CloudFlare site) but all of the AWS administration pages for the bucket now showed big red error messages, and I assume if I had tried to edit it that wouldn't have worked either. Hilariously this meant I was also blocked from undoing the configuration change I had just made. (Fortunately Amazon foresaw someone might make this mistake, and if I logged in from my superuser account it unlocked the bucket policy JSON form only, allowing me to reverse the change.)
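The lockout makes sense once the Deny statement is read literally. Below is a deliberately simplified model of a Deny + NotIpAddress rule, not AWS's actual evaluator; `denied_by_ip_policy` is a hypothetical helper and the addresses are documentation placeholders.

```python
import ipaddress

def denied_by_ip_policy(source_ip: str, allowed_cidrs: list[str]) -> bool:
    """Simplified model of a Deny + NotIpAddress statement: the deny applies
    whenever the caller's IP falls outside every allowed range.
    Credentials are deliberately not a parameter, because an explicit deny
    overrides any allow granted elsewhere."""
    ip = ipaddress.ip_address(source_ip)
    return not any(ip in ipaddress.ip_network(c) for c in allowed_cidrs)

ALLOWED = ["192.0.2.0/24"]  # stand-in for the proxy's ranges
print(denied_by_ip_policy("192.0.2.17", ALLOWED))   # proxy inside the range: False
print(denied_by_ip_policy("203.0.113.5", ALLOWED))  # admin's home connection: True
```

Since the statement covers every S3 action, the second case includes the administration pages and the policy editor itself.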
4. CloudFront?
Amazon's help pages do contain a page on how to serve S3 as HTTPS. What they recommend is using a different Amazon service called CloudFront [https://repost.aws/knowledge-center/cloudfront-https-requests-s3]. Not CloudFlare. CloudFront. So in this model S3 funnels to CloudFront which funnels to CloudFlare which funnels to the user. The problem here is that CloudFront is not a straightforward web endpoint but a full-featured CDN, which isn't exactly what I want. This would mean two layers of caching which would be odd and could even be glitchy; it also implies I'll be getting charged twice per page update, which could potentially increase my AWS costs from one-quarter of a cent per year to as high as one-half of a cent per year. Given that this very almost works— I can access AWS via http, and I can access it (at the s3 site) via https as long as I don't do so via CloudFlare— without CloudFront, I'd prefer not to jump into that particular pool of frigid water unless I'm assured by someone who has done this before that it really is the only way.
A second problem is that it's not clear to me from these docs how CloudFront accesses S3. Is it http? Is it https? Is it… whatever Amazon-internal communication method "aws://" is, which I assume, but have not specifically seen documented, is secure?
My "win condition" is that I have an S3 bucket, access to which is unrestricted with appropriate AWS credentials, and access to which is possible via public HTTPS but only when accessed by CloudFlare. Data should be encrypted (to ensure integrity) via some key or other at each link between the S3 bucket and the user's browser. I don't pay for CloudFlare and the S3 charges for updating pages and transmitting them to the CloudFlare cache come out to a few pennies per decade, so I'm basically hosting high-availability static content for free. Can I make this work? Is there something I need to be reading? I am flailing.
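Written down as a bucket policy, the win condition might look something like the sketch below. It is untested and rests on assumptions: that an Allow scoped to `s3:GetObject` with an `IpAddress` condition leaves credentialed access alone, and that a Deny keyed on `aws:SecureTransport` is the right way to refuse plain HTTP. The function name and the CIDR range are placeholders.

```python
import json

def win_condition_policy(bucket: str, proxy_cidrs: list[str]) -> dict:
    """Sketch: anonymous object reads only from the proxy's IPs, and only over TLS."""
    objects = f"arn:aws:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Anonymous reads, but only from the proxy's address ranges.
                "Sid": "ReadFromProxyOnly",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": objects,
                "Condition": {"IpAddress": {"aws:SourceIp": proxy_cidrs}},
            },
            {   # Refuse object reads that arrive over plain HTTP.
                "Sid": "RequireTLS",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": objects,
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }

# 192.0.2.0/24 is a documentation placeholder, not a real CloudFlare range.
print(json.dumps(win_condition_policy("BEEFCAFE", ["192.0.2.0/24"]), indent=2))
```

Unlike the Deny-everything example, nothing here touches administrative actions, so a mistake in the IP list should only break reads through the proxy. Whether S3 and CloudFlare actually cooperate with this arrangement is exactly the open question.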