Blog hosting
For a while now I've wanted to move my blog to a real host. Until now I've hosted it on a server in my apartment. This has been fine and in theory would be fine to continue doing, but it has a few issues:
- If I ever somehow get a surge of traffic I won't be able to keep up.
- It is not the most stable connection and I don't want to be prevented from running updates.
- There are security and privacy issues having my home IP exposed and accepting remote traffic.
Since this is something I want to make more public, I think it makes sense to put it up on a more powerful host.
Another reason I've been thinking about doing this is to give me an excuse to play around with "cloud" technology, specifically AWS and GCP. There is no real need for this blog to be using these, but I think it would be good to have some experience using them.
AWS or GCP
I've read some articles comparing GCP favourably to AWS for various reasons: per-minute billing and automatic discounts, ease of use, and flexibility, among others.
I also like the idea of using the less popular option as hopefully this will spur competition.
That said, GCP being from Google gives me pause. Google seems to throw a lot of products at the wall and kill them quickly if they don't meet expectations. This makes GCP possibly riskier long term than AWS.
Other options
There are other ways I could host the site:
- A regular old VPS. I already have one running other things. The main issue is it is not very resistant to downtime, being only one server. I also have some issues with its performance.
- GitHub Pages. Since my site is static HTML, I could put it on GitHub's free static hosting. However, I discovered that if I want to use my own domain, I can't protect it with TLS.
- Self host but add a CDN. What I could do is continue to host it myself (or possibly on my VPS), but put a CDN in front of it. This would provide speed and in theory stability.
But none of these let me scratch my GCP/AWS itch.
Trying out GCP
I set out by signing up to GCP. They gave me $300 to use for 60 days so I could try out the tools without putting in any money.
I narrowed my choices of service inside GCP to either Compute (VM) or App Engine. I started with App Engine, primarily because it appears I won't have to worry much (if at all) about scaling or managing individual servers.
App Engine works by having you upload a version of your application. This could be a Go executable, a PHP program, or a number of other options. In my case there is no application as such, only static files. It is possible to include static files in your application of course, so I wrote an App Engine app.yaml file (configuration file) that included an entire directory of static files, and set the request for / to go to an index.html. It looks like this:
runtime: go
api_version: go1
version: my-version
application: my-project-name
handlers:
- url: /
  static_files: static/index.html
  upload: static/index.html
- url: /*
  static_dir: static
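To make the intent of the two handlers concrete, here is a minimal Python sketch of the mapping I wanted from URL path to file. It is only an illustration of the intended behaviour, not how App Engine actually matches URL patterns, and the function name resolve_static_path is made up for this example.

```python
import posixpath

# Directory named in static_files / static_dir in the app.yaml above.
STATIC_DIR = "static"


def resolve_static_path(request_path: str) -> str:
    """Return the file the two handlers are meant to serve for a URL path.

    Hypothetical helper: it mirrors the intent of the config, not
    App Engine's real pattern matching.
    """
    # First handler: the bare root URL maps to the index page.
    if request_path == "/":
        return posixpath.join(STATIC_DIR, "index.html")
    # Second handler: everything else is looked up under the static directory.
    relative = request_path.lstrip("/")
    return posixpath.join(STATIC_DIR, relative)


print(resolve_static_path("/"))
print(resolve_static_path("/posts/hello.html"))
```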
I found that depending on the App Engine tool I used, I needed to package an application/executable file as well (Standard Environment yes, Flexible Environment no). Because of this requirement I included a "hello world" Go program. In the configuration for the app, I set all requests to go to the static files, so this executable is never used (except to satisfy the requirement that there be one).
I was confused about which program I should be using to interact with GCP. I started using gcloud (Google Cloud SDK) as that is what I came across first in their documentation. Elsewhere in the App Engine documentation I read that I could use something called appcfg.py (App Engine SDK). One difference is that the Google Cloud SDK uses App Engine's Flexible Environment, which is in beta, whereas appcfg.py uses the App Engine Standard Environment, which is the currently supported version. Using the Google Cloud SDK I could see compute instances starting up, but with the App Engine SDK this was not the case (I take it this is one of the features of the Flexible Environment). I understand that in either case GCP takes care of stuff like OS updates.
I went with appcfg.py due to the other being in beta.
Making updates to an App Engine site
I tested how making updates to the site would work.
I found I can specify a particular version each time I push an update (or you can specify it in app.yaml, as above). After pushing an update with appcfg.py, I didn't see the changes on my site. Looking in the GCP control panel, the original version was set to accept all requests. There is a subcommand to change the version accepting requests, set_default_version.
This means pushing an update looks something like this:
appcfg.py -A project-name -V vX update path/to/site/files
appcfg.py -A project-name -V vX set_default_version path/to/site/files
Where vX is the version and project-name is the name in GCP's control panel.
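Since I'll be typing these two commands a lot, a small wrapper could build them for me. This is a rough sketch that only uses the flags and subcommands shown above; the function name build_deploy_commands and the example values are made up, and it only prints the commands rather than running them.

```python
import shlex


def build_deploy_commands(project, version, site_dir):
    """Build the update and set_default_version command lines.

    Hypothetical helper: returns argument lists, it does not run anything.
    """
    base = ["appcfg.py", "-A", project, "-V", version]
    return [
        base + ["update", site_dir],
        base + ["set_default_version", site_dir],
    ]


# Example values are placeholders, as in the commands above.
for cmd in build_deploy_commands("project-name", "v2", "path/to/site/files"):
    print(" ".join(shlex.quote(part) for part in cmd))
```

To actually deploy, each list could be passed to subprocess.run(cmd, check=True), so that a failed update stops before the default version is switched.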
Note you don't have to bump the version number to make an update! If you use the same version that is currently servicing requests, then you don't need to use the set_default_version command.
Update: I found that sometimes set_default_version does not migrate traffic to the new version. You can see which version is servicing requests by going to the Versions tab in App Engine. You can change this from here as well: Check the version you want to serve traffic, then click the arrow in the top right. It will ask you to confirm you want to migrate the traffic.
Problem with GCP: Caching
I made a handful of small updates testing the above process. I ran into strange behaviour: Sometimes I would see my update appear instantly live, but other times it would be minutes before I saw it.
I found that this was due to caching. Here's the header of a response:
HTTP/2 200
date: Mon, 12 Sep 2016 05:32:18 GMT
expires: Mon, 12 Sep 2016 05:42:18 GMT
cache-control: public, max-age=600
etag: "E3uZXA"
x-cloud-trace-context: 8693ff4fef5595d93e794ff7aa4aa94c
content-type: text/html
server: Google Frontend
alt-svc: quic=":443"; ma=2592000; v="36,35,34,33,32"
The headers say this can be cached for 10 minutes.
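As a sanity check, the 10 minutes can be read off the headers in two ways: from max-age in cache-control, and from the gap between date and expires. Here is a small Python sketch that does both using the values from the response above. The helper names are my own, and the parsing is deliberately simple rather than a full Cache-Control parser.

```python
from email.utils import parsedate_to_datetime

# Header values copied from the response above.
HEADERS = {
    "date": "Mon, 12 Sep 2016 05:32:18 GMT",
    "expires": "Mon, 12 Sep 2016 05:42:18 GMT",
    "cache-control": "public, max-age=600",
}


def max_age_seconds(cache_control):
    """Return the max-age directive in seconds, or None if it is absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def expires_window_seconds(headers):
    """Return the number of seconds between the date and expires headers."""
    delta = parsedate_to_datetime(headers["expires"]) - parsedate_to_datetime(headers["date"])
    return int(delta.total_seconds())


print(max_age_seconds(HEADERS["cache-control"]))  # 600
print(expires_window_seconds(HEADERS))  # 600
```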
I found that the servers servicing my request would cache and serve my site's HTML for this period of time. I could not see a way to invalidate their cache. Indeed, the documentation says this is not possible.
I found in the documentation for app.yaml that I can set default_expiration to control this. It defaults to 10 minutes. The documentation describes it as "a global default cache period for all static file handlers". This explains the behaviour.
It makes sense to cache these pages since they are primarily static after all. It would be nice if I could invalidate the cache when I do make updates. But knowing about the behaviour, it's something I can work around and live with I think. I don't plan to change the default.
Problem with GCP: TLS Certificate
I wanted my site to be TLS-only. Not that there is any specific security concern, but it is a good practice these days.
I've been using a certificate from Let's Encrypt and found it very easy to use. I thought I could keep using this.
However if I wanted to use it with App Engine then there is a problem: As far as I can tell, there's no way to set the certificate to use except through uploading it into a GCP console web form. Even if I had somewhere to run a tool to keep the certificate from Let's Encrypt up to date, I don't think there's a good way I could automate uploading the certificate into GCP's control panel (I don't like the idea of automating interaction with GCP's form).
Having the certificate renewal process be automatic is important as the certificates from Let's Encrypt only last 90 days. I could set a reminder and upload a new certificate every ~2 months or so, but that is a waste of time for something that has until now been automatic.
Solutions for TLS certificates with App Engine
- One workaround would be to put Cloudflare's CDN in front of my site hosted on GCP. Cloudflare can provide the TLS. I remembered this article about how Cloudflare works especially well with GCP. But this is introducing an extra level and seems needlessly complicated. Though now that I think about it, it may provide performance benefits too (being a CDN and all).
- I wondered if the gcloud or appcfg.py tools had a way to upload a certificate for use with App Engine. But I do not see a way. (You can with compute instances it seems).
- Use a VM (Compute instance) on GCP. This is not much different from sticking with what I already have, a VPS. Except it would be running on GCP. I'd still have a server to manage and keep up to date.
- Buy a certificate that lasts 1-2 years. They're not expensive and it would save me the hassle of uploading one every couple months. But it irks me to be forced to do so out of a limitation in a service I'm thinking of switching to.
Gotchas with App Engine
- I found that every file in my app's directory is uploaded when I run update, even if it is not mentioned in app.yaml. I mention this because I don't want things like my .git files to go into GCP! (You can verify this by using the download_app command in appcfg.py). There is a skip_files app.yaml directive you can use to exclude files.
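Before trusting a skip_files list, it might be worth previewing what it would exclude. The Python sketch below assumes the patterns are regular expressions matched against each file's path relative to the app directory; the exact matching rules in appcfg.py may differ. The patterns, file list, and function name are all made up for illustration.

```python
import re

# Hypothetical patterns to keep git metadata out of the upload.
SKIP_PATTERNS = [r"^\.git/.*$", r"^\.gitignore$"]


def files_to_upload(paths, patterns):
    """Return the paths that do not match any skip pattern.

    Assumption: each pattern is matched from the start of the relative path.
    """
    compiled = [re.compile(pattern) for pattern in patterns]
    return [path for path in paths if not any(rx.match(path) for rx in compiled)]


# Example directory listing (made up).
paths = [
    "app.yaml",
    "hello.go",
    "static/index.html",
    ".git/config",
    ".git/HEAD",
    ".gitignore",
]
print(files_to_upload(paths, SKIP_PATTERNS))
```

Running download_app afterwards, as mentioned above, is still the real test of what ended up in GCP.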
My decision
I decided to go with App Engine. To solve the TLS issue, I decided that putting Cloudflare in front is acceptable. I didn't like the idea of another layer of hosting for my site, but once I realized I would also gain from it being a CDN, I decided it was worth doing.