TurnKey Linux Virtual Appliance Library

Mapping AWS data centers for fastest connection

Yes, that's 'fastest', not closest.

Background

A while back I published a blog post entitled Finding the closest data center using GeoIP and indexing, which described how we automatically determine the AWS regional data center to be used for storing encrypted server backups.

We used the same solution to determine the preferred region when launching cloud servers in the Hub, as well as selecting the closest APT package archive for all TurnKey deployments.

New and improved

Since the original publication, Amazon built new regional data centers in Oregon, Sao Paulo and Tokyo, so the indexes needed to be updated.

While adding support for the new regions I decided to take it a step further and add some improvements.

Improvement #1: Automatic association (distance)

The method originally used to automatically associate countries/states with data centers was somewhat lacking and needed to be improved.

We are now using the Haversine formula, which is used to determine great-circle distances between two points on a sphere from their longitudes and latitudes.
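As a rough illustration of the idea (a minimal sketch, not the project's actual code), the Haversine distance and a nearest-data-center lookup could look like the following. The function names, the Earth radius constant and the data center coordinates are assumptions made for this example; the coordinates are approximate and are not taken from the published indexes.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Approximate, illustrative coordinates (hypothetical data for this sketch).
DATACENTERS = {
    "Virginia": (38.9, -77.4),
    "California": (37.4, -122.0),
    "Singapore": (1.35, 103.8),
    "Tokyo": (35.7, 139.7),
}


def closest_datacenter(lat, lon, datacenters=DATACENTERS):
    """Return the name of the data center with the smallest great-circle distance."""
    return min(datacenters,
               key=lambda name: haversine_km(lat, lon, *datacenters[name]))


# Example: a point near Sydney, Australia.
print(closest_datacenter(-33.87, 151.21))
```

Note that this picks the closest data center by distance only, which is not necessarily the fastest one.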

[Images: the Haversine formula (haversine1, haversine2)]


Improvement #2: Incorporated worldwide underwater cables (latency)

Originally we relied on user feedback of connection latency to tweak the indexes. This didn't scale very well, so we needed a way to make it easier.

Using Greg's Cable Map, we could mash up the automatic associations with the cable routes and tweak the index overrides based on expected latency.

It turns out that this was a crucial part of the equation: a user might be physically closer to data center X, while in reality the connection to data center Y is faster. For example, Australia was previously allocated to Singapore but has been moved to California, as the pipe is much fatter (see the visual map below).
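One way to picture the override step is as a second mapping layered on top of the distance-based associations. The sketch below is only an assumption about how such a layering could be expressed; the dictionary names, the country codes used as keys and the default fallback region are hypothetical and do not reflect the project's actual index format.

```python
# Hypothetical distance-based associations (country code -> data center).
AUTO_INDEX = {"AU": "Singapore", "JP": "Tokyo"}

# Hypothetical overrides based on expected latency along the cable routes.
LATENCY_OVERRIDES = {"AU": "California"}


def build_index(auto_index, overrides):
    """Return a new index where overrides take precedence over automatic entries."""
    index = dict(auto_index)  # copy, so the automatic index stays untouched
    index.update(overrides)
    return index


def lookup(index, country_code, default="Virginia"):
    """Look up a country's data center; the default fallback is an assumption."""
    return index.get(country_code.upper(), default)


index = build_index(AUTO_INDEX, LATENCY_OVERRIDES)
print(lookup(index, "au"))
```

Keeping the overrides separate from the automatic associations means the distance-based index can be regenerated when new regions are added without losing the manual latency tweaks.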

Improvement #3: Open source

We originally published only the indexes, but have now open sourced the whole project on GitHub in the hope that others will find it useful and that it will make collaboration easier.

Putting it all together

The screenshot below maps countries/states to their associated AWS regional data centers, and overlays the worldwide underwater cables for reference:

[Image: aws-datacenters-mashup]

Want to zoom in? Toggle active and future cables? Check out the live mashup.


Comments

So what's the practical use?

Why not just use measured latency & throughput?

We all know actual geography means nothing in cyberspace...?

For example: I see a lot of lines from Africa pointing to India. Yet India has the highest latency and lowest bandwidth to most African countries because everything is routed either through London or New York.

Liraz Siri

How do you measure latency to different AWS datacenters?

I agree that it would be better to just go ahead and measure latency and bandwidth directly. The trick is how do you do that from all over the world. Hmmm... maybe if the Hub doesn't already have your network's routing information it asks the appliance to run a few tests and measure latency directly, then caches that information for future reference.

Let alone the fact that everything is highly ISP-related in the first place...

You should try to get a BGP feed and adjust the routing based on information from that instead, IMHO.

Liraz Siri

Global routing expert needed

Good thing Alon open sourced the code. Now let's hope a global routing expert runs into it and decides it would be a fun way to spend some time off on the weekend.

Chris Musty

Interesting

Above comments aside, I actually had no idea Virginia would be faster than Singapore (I am in Sydney Australia). I have never had need to test for speed issues because Singapore is fast enough for the db frontend apps I am developing. If I can get it faster in Virginia - SWEET! The only issue with that is when the client asks "where is the cloud", sometimes they get turned off when you say "another country".

So as far as practical use goes it may help with data heavy apps in the future, at least for my company.

I am just about to start testing cloud servers and backups with my apps after spending months testing on site installations with TKL (which all came up trumps incidentally) so this information comes with good timing.

Thanks guys!

Chris Musty

Director

Specialised Technologies

Not only quicker...

Hi Chris,

I haven't checked recently, but a while ago it was actually cheaper to use US East AWS than Singapore AWS as well, so it may not just be quicker from Australia but less costly as well.

Chris Musty

Gotta love it!

Just how I like my networking: cheap and fast!

Chris Musty

Director

Specialised Technologies

Australia->Virginia

Could that Australia->Virginia (US-East) part be a typo? On the map (http://turnkeylinux.github.com/aws-datacenters/) I see a pink line from Australia to California (US-West)

Alon Swartz

Good catch

Good catch Jack - thanks, it was a typo - should have been California. I've updated the blog post.

Hey from DZone!

Mr. Swartz,

Would you be interested in having this republished on DZone.com in our Big Data portal?  I think our readers would appreciate your work on mapping AWS data centers.  Let me know what you think!

Eric Genesky
Community Curator
DZone, Inc.

Alon Swartz

Sure Eric, go ahead. All I ask is that you link back to the original article.

