A couple years ago, I started deploying little Linux
web and utility servers to Amazon Web Services EC2. Like many providers
today, EC2 lets me instantly turn on “virtual instances” in the cloud.
Usually
when I deploy a new server, I bind a public IP address to it and then
it provides some kind of cloud-based service or website from there.
I
ran into a situation where I had to build a pair of servers in the
cloud that needed to be networked together. Only one of them needed to be
exposed to the Internet, and I also had to build VPN tunnels from
several geographically separate physical sites directly
to this group of servers.
Being
most familiar with VPNs, my first thought was “this is going to be a
breeze.” Build a couple of IPsec tunnels back to my sites' routers, and I'd be
able to connect right up.
Upon
researching what is involved in building a VPN from my site into a pool
of Amazon EC2 servers, I found it looked a bit more complex than I
originally thought.
There
is great documentation from Amazon on the entire process, and if you're
working with this subject, I'd highly suggest checking this out:
The
purpose of a Virtual Private Cloud (VPC) is to isolate and secure a group or pool of
servers. This pool can then have point-to-point VPN connections built
between it and a business's end-points, such as its physical locations.
Once
I had read all the documentation, I deployed my VPC and configured it.
There are a few steps, but they're fairly simple. I deployed an instance,
bound it to my new VPC security group, opened some ports, created an
Internet Gateway (IGW) for my VPC, and created a Virtual Private Gateway (VGW). I also
specified my VPN end-point information. While doing that last step,
I was able to download a device-specific config file containing
all the critical IPsec configuration settings that my device would
need.
When
I read the documentation, I noticed that there seemed to be somewhat
limited device support for VPC. The choices were Cisco, Juniper,
Yamaha, and Generic.
I
decided to start out by attempting the Generic device config. The site in
question had Netgear RT314 VPN routers that were already running IPsec
tunnels between each other. I decided to attempt to build an IPsec
connection with them because that would be a cost-effective way to
establish a tunnel.
That
didn't work. Period. No way possible. Why? Because the Amazon VPC
absolutely requires your VPN endpoints to support BGP. It also requires
the use of dual redundant VPN tunnels...
Next
up: forget the Netgear RT314, I'll just build Linux boxes as
firewalls for the sites, and then use strongSwan or Openswan. There are
some excellent documents and discussions online on that topic that we
found, so it seemed like a viable thing to try.
The
theory is that you deploy your instances to the VPC, get them talking
locally (in the cloud), and then deploy a Linux instance such as Ubuntu,
install strongSwan, and configure it to act as your Virtual Gateway (VGW) in the
cloud. Then build another strongSwan box (or technically any other
standard IPsec box could work) as your VPN endpoint at the site. Simply
configure a tunnel between them and you're done. No BGP required, because
you're not actually leveraging the VPC as your VPN end-point; you are
simply forwarding IPsec traffic from the outside in to your Micro
Ubuntu strongSwan VPC server, which then routes your traffic into your
VPC pool.
Problem
is, we couldn't get it to work. After much digging and researching, we
finally found a blog that changed my mind about strongSwan. To sum it
up, you can get strongSwan to work, but it will be unstable in a way
that makes it virtually unusable except for the least critical
utilitarian tasks.
There
is a big advantage to using systems the way they are designed to be
used. The Amazon VPC is very strange in the way it's designed to
establish VPN tunnels. When you build an IPsec tunnel to a VPC Virtual
Gateway, BGP is involved of course, but the other weird thing is that
TWO individual and distinct tunnels are created. This complicates things
considerably because most devices aren't designed for IPsec tunnels to
operate that way.
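To make the two-tunnel idea concrete, here is a rough Python sketch of the decision a router has to make. It is only a model: the `Tunnel` class, its fields, and `pick_tunnel` are hypothetical names invented for illustration, and real devices track far more state than two booleans.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Tunnel:
    """One of the two IPsec tunnels in a VPN connection (hypothetical model)."""
    name: str
    ipsec_up: bool         # IPsec negotiation completed
    bgp_established: bool  # BGP session over this tunnel is up


def usable(tunnel: Tunnel) -> bool:
    # A tunnel only helps if IPsec is up AND routes were exchanged over BGP.
    return tunnel.ipsec_up and tunnel.bgp_established


def pick_tunnel(tunnels: Sequence[Tunnel]) -> Optional[Tunnel]:
    """Return the first usable tunnel, or None if the site is cut off."""
    for tunnel in tunnels:
        if usable(tunnel):
            return tunnel
    return None


both_up = [Tunnel("tunnel1", True, True), Tunnel("tunnel2", True, True)]
one_down = [Tunnel("tunnel1", False, False), Tunnel("tunnel2", True, True)]
no_bgp = [Tunnel("tunnel1", True, False), Tunnel("tunnel2", True, False)]

print(pick_tunnel(both_up).name)   # tunnel1
print(pick_tunnel(one_down).name)  # tunnel2
print(pick_tunnel(no_bgp))         # None
```

The last case is the interesting one: in this model, both tunnels can be "up" at the IPsec level and still carry nothing, because no routes were exchanged.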
I
finally decided there was no other way. I had to deploy it on Cisco. My
budget was really tight though, so naturally I was concerned about
costs.
I
had a Cisco 871 handy. These are not considered very expensive units;
they're a few hundred dollars, well under a thousand each. They also have
all the feature support I needed, BGP especially.
The
problem was, BGP is considered an “advanced IP feature”. The 871 I had
only came with the standard feature set, so I had to purchase an upgrade. It
cost about $100 to add, but I figured it would be money well spent if it solved the
problem.
After
flashing the unit and upgrading the features, we configured it onsite
as a regular internet gateway and connected it up with a public IP and a
LAN with an Ubuntu desktop behind it.
Next we configured the Amazon VPC side (VGW), and it gave us a perfect config file that we simply added to our router.
It didn't take any tweaking. Both VPN tunnels lit up green when I checked the VPC tab of AWS.
The Cisco also showed that everything came up. Wow, that was easy, I thought.
First thing I tried to do? Well, I connected by SSH to the Cisco and tried to ping the inside gateway of the VPC. No dice.
I
had deployed a couple of Microsoft Server instances inside my VPC. I
needed to RDP directly into one of them, so I bound a public IP to
it for access through the Internet Gateway (IGW), then tried to connect. No go. I checked the
settings and found I needed to add a route to my VPC. A default route (0.0.0.0/0)
had to be added manually to my route table in the VPC tab before I
could RDP into my instances in the VPC pool.
Once I was in my instance, I tried to ping my Cisco 871's inside interface over the VPN. No go.
We
put an Ubuntu desktop behind the Cisco 871 and tried to ping the inside
interface of our Server 2008 box inside the AWS VPC. Still no go. We
ran a traceroute from our Ubuntu desktop to our VPC instance, and it
returned the Amazon-side BGP address... It was a 169.254 address. That
told us that traffic was actually routing through the Cisco 871 and
going all the way to the VPC, but was being rejected.
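For anyone wondering why a 169.254 address shows up in a traceroute: 169.254.0.0/16 is the link-local range, and the tunnel's inside addresses are carved out of it. The sketch below uses Python's standard `ipaddress` module to show the idea. The specific /30 is made up, and the assumption that the Amazon side takes the first usable address should be checked against your own downloaded config file.

```python
import ipaddress

# Hypothetical inside CIDR for one tunnel; the real value comes from the
# config file downloaded from the VPC console.
TUNNEL_INSIDE = ipaddress.ip_network("169.254.255.0/30")

# A /30 has exactly two usable addresses, one per tunnel end.
# Assumption: the Amazon side takes the first, the customer router the second.
amazon_side, customer_side = TUNNEL_INSIDE.hosts()


def describe_hop(hop: str) -> str:
    """Classify a traceroute hop relative to the tunnel (illustrative only)."""
    addr = ipaddress.ip_address(hop)
    if addr == amazon_side:
        return "far end of the tunnel (Amazon side)"
    if addr == customer_side:
        return "near end of the tunnel (our router)"
    if addr.is_link_local:
        return "some other link-local address"
    return "ordinary routed address"


print(amazon_side, customer_side)
print(describe_hop(str(amazon_side)))
print(describe_hop("10.0.0.5"))
```

Seeing the far-end address as a hop means the packet made it across the tunnel, which is why the traceroute pointed at the VPC side rather than the Cisco.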
We
found that another route had to be added, and that route was simply the
IP network range of the remote side. All the routes that we added went
in the Route Table in the VPC tab of AWS. None of them needed to be
added to the Cisco, and when we tried to create them there manually, it seemed
to cause things to drop.
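One plausible reading of why the extra route mattered is longest-prefix matching: without an entry for the office network, replies from the instance would follow the default route out the Internet Gateway instead of back through the tunnel. The Python sketch below models that with made-up address ranges (a 10.0.0.0/16 VPC and a 192.168.1.0/24 office LAN) and made-up target labels; it is not how AWS implements routing, only an illustration of the lookup.

```python
import ipaddress
from typing import List, Optional, Tuple

Route = Tuple[str, str]  # (destination CIDR, target label)

# Hypothetical route table after adding only the default route.
ROUTES_BEFORE: List[Route] = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "igw"),
]

# Same table with the remote (office) network pointed at the VPN gateway.
ROUTES_AFTER: List[Route] = ROUTES_BEFORE + [("192.168.1.0/24", "vgw")]


def next_hop(dest: str, routes: List[Route]) -> Optional[str]:
    """Return the target of the most specific route that contains dest."""
    addr = ipaddress.ip_address(dest)
    best = None
    for cidr, target in routes:
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, target)
    return best[1] if best else None


office_pc = "192.168.1.50"  # hypothetical Ubuntu desktop behind the Cisco
print(next_hop(office_pc, ROUTES_BEFORE))  # igw
print(next_hop(office_pc, ROUTES_AFTER))   # vgw
print(next_hop("10.0.1.10", ROUTES_AFTER))  # local
```

With the office range in the table, the more specific /24 wins over 0.0.0.0/0, so return traffic heads for the tunnel.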
We then had to properly configure our security group, and then traffic flowed freely over the tunnel.
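As a rough illustration of what "properly configure" means here: inbound traffic is denied unless a rule allows it, so ping and RDP from the office both need explicit entries. The sketch below is a toy model in Python; the `Rule` class, the `allowed` function, and the 192.168.1.0/24 office range are all hypothetical and only mimic the general shape of security group rules.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    """A simplified inbound rule (hypothetical model, not the AWS API)."""
    protocol: str         # "tcp", "udp", or "icmp"
    port: Optional[int]   # None means "any port" (and is used for ICMP)
    source: str           # CIDR allowed to connect


def allowed(rules: Sequence[Rule], protocol: str,
            port: Optional[int], src_ip: str) -> bool:
    """Default deny: traffic passes only if some rule matches it."""
    src = ipaddress.ip_address(src_ip)
    for rule in rules:
        if rule.protocol != protocol:
            continue
        if rule.port is not None and rule.port != port:
            continue
        if src in ipaddress.ip_network(rule.source):
            return True
    return False


OFFICE = "192.168.1.0/24"  # hypothetical remote LAN behind the Cisco
rules = [
    Rule("icmp", None, OFFICE),  # let the office ping the instance
    Rule("tcp", 3389, OFFICE),   # let the office RDP to the instance
]

print(allowed(rules, "icmp", None, "192.168.1.50"))  # True
print(allowed(rules, "tcp", 3389, "192.168.1.50"))   # True
print(allowed(rules, "tcp", 22, "192.168.1.50"))     # False
print(allowed(rules, "tcp", 3389, "203.0.113.9"))    # False
```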
We
never could ping from our Cisco 871 to the VPC IGW, but since we could
ping from our physical Ubuntu desktop to our Microsoft Server instance,
and we could RDP into the server, the problem was solved.
Overall,
deploying the AWS VPC using Cisco Integrated Services
Routers (ISRs) is the best way to tap into this stuff. Just remember you
can't use ASA or PIX firewalls as your VPN endpoints, only Integrated Services
Routers. You also cannot NAT your Cisco router behind your firewall.