Continuing the series on Internet centrality, an important element of the argument that the web is greatly centralised rests on the fact that the Web’s architecture is becoming less distributed with time. The push towards the cloud has translated into fewer name servers and fewer hosting choices.

Pingdom, a company dedicated to monitoring server uptime, has conducted an interesting study of the hosting locations of the top one million websites (as ranked by Alexa). While some people have expressed serious concerns about Alexa rankings, they provide a useful sample of highly visible websites that generate considerable traffic, and this gives us a good snapshot of what the Internet hosting situation looks like today. The results are predictably depressing for someone who would like to see more diversity. At first glance, it is noteworthy that the top million websites are hosted in 7,936 cities around the world, which seems quite diverse. Looking at the distribution of hosting, however, it is clear that content is heavily concentrated: the top 3 cities account for 10% of all hosting, while the top 10 account for a staggering 22% of all content.
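The concentration figures can be checked against the city counts in the table at the end of this post. Here is a minimal sketch in Python, assuming the sample is exactly one million sites; the variable and function names are my own and are not part of Pingdom's data.

```python
# Site counts for the ten largest hosting cities, copied from the table in this post.
top_cities = [
    ("Houston", 50_598),
    ("Mountain View", 29_594),
    ("Dallas", 24_822),
    ("Scottsdale", 23_210),
    ("San Antonio", 21_808),
    ("Provo", 20_691),
    ("Ashburn", 14_871),
    ("San Francisco", 13_214),
    ("Chicago", 13_125),
    ("Beijing", 11_273),
]

# Assumption: the sample is exactly the Alexa top one million sites.
TOTAL_SITES = 1_000_000


def share_of_top(n: int) -> float:
    """Return the percentage of all sites hosted in the n largest cities."""
    hosted = sum(count for _, count in top_cities[:n])
    return 100 * hosted / TOTAL_SITES


print(f"Top 3 cities:  {share_of_top(3):.1f}%")   # 10.5%
print(f"Top 10 cities: {share_of_top(10):.1f}%")  # 22.3%
```

The two shares round to the 10% and 22% quoted above.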

A power law in hosting.

The full map shows just how skewed hosting has become, and it helps us get an idea of the issues ahead.

Looking at the data in detail, several issues emerge. Firstly, it would seem that we are presented with a power law, or at least with a clearly skewed, long-tailed distribution (more details about those concepts in my book). In fact, I have taken the liberty of playing a bit with the data provided, and have come up with some interesting results. Here is the graph of the top 100 cities, which starkly shows the huge imbalance in hosting.

[Figure: websites hosted in each of the top 100 cities]
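One rough way to probe the power-law suggestion is to fit a straight line to the logarithm of the site count against the logarithm of the rank. The sketch below does that for the 20 largest cities in the table at the end of this post. It is only an illustration: a roughly straight line on a log-log plot is consistent with a power law but does not prove one, and a least-squares fit of this kind is a crude estimator. The function name `loglog_slope` is my own.

```python
import math

# Site counts for ranks 1-20, copied from the table in this post.
counts = [50598, 29594, 24822, 23210, 21808, 20691, 14871, 13214, 13125, 11273,
          10006, 9412, 8170, 7588, 7538, 7380, 7349, 6605, 6296, 6063]


def loglog_slope(values):
    """Least-squares slope of log(value) against log(rank), with ranks starting at 1."""
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


slope = loglog_slope(counts)
# A negative slope means counts fall as rank increases; its magnitude is the fitted exponent.
print(f"Estimated rank-size exponent: {-slope:.2f}")
```

A steeper slope would mean hosting falls away faster after the largest cities, which is to say it is more concentrated at the top.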

Geographically, the data shows that the United States is considerably ahead when it comes to hosting content. Of the top 10 hosting cities, there is only one outside of the US. This becomes more evident when one looks at the data as a pie chart:

[Figure: pie chart of the hosting data]
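The geographic split can also be tallied directly from the table at the end of this post. The sketch below groups the 20 largest hosting cities by country. It covers only those 20 cities, so the percentages are shares of that subset and will differ from figures for the full million sites. The variable names are my own.

```python
from collections import Counter

# (city, country, number of websites) for ranks 1-20, copied from the table in this post.
rows = [
    ("Houston", "United States", 50598),
    ("Mountain View", "United States", 29594),
    ("Dallas", "United States", 24822),
    ("Scottsdale", "United States", 23210),
    ("San Antonio", "United States", 21808),
    ("Provo", "United States", 20691),
    ("Ashburn", "United States", 14871),
    ("San Francisco", "United States", 13214),
    ("Chicago", "United States", 13125),
    ("Beijing", "China", 11273),
    ("New York", "United States", 10006),
    ("Los Angeles", "United States", 9412),
    ("Lansing", "United States", 8170),
    ("Tokyo", "Japan", 7588),
    ("Montreal", "Canada", 7538),
    ("Culver City", "United States", 7380),
    ("Brea", "United States", 7349),
    ("Osaka", "Japan", 6605),
    ("Atlanta", "United States", 6296),
    ("Amsterdam", "The Netherlands", 6063),
]

# Sum the site counts per country.
sites_by_country = Counter()
for _city, country, count in rows:
    sites_by_country[country] += count

# Shares are relative to the 20 cities listed here, not to the full dataset.
total = sum(sites_by_country.values())
for country, count in sites_by_country.most_common():
    print(f"{country:<16}{count:>8,}  {100 * count / total:5.1f}%")
```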

Why is this important? From a legal standpoint, this has huge implications for liability and enforcement. The most obvious one is that by being hosted in a country, you might be subject to enforcement by its authorities if you commit an offence, a lesson that has been in evidence in the Megaupload case. This could of course include civil enforcement of legal concepts that do not exist in your own country, such as patent infringement suits arising from software patents. For a while, I have also been concerned about the international effect of bad legislation. As CISPA is supposed to be resurrected, a lot more providers might be covered by the legislation, even if they are not based in the United States.

In the end, we are faced with a question of governance. The Web’s architecture is highly US-centric, which might explain the fact that its governance is as well. Until the percentages seen above change, along with similar figures in domain name registration, we can expect the world to be subject to US jurisdiction for quite a while.

Table: Top 50 Hosting Cities

Rank Number of websites City Country
1 50,598 Houston United States
2 29,594 Mountain View United States
3 24,822 Dallas United States
4 23,210 Scottsdale United States
5 21,808 San Antonio United States
6 20,691 Provo United States
7 14,871 Ashburn United States
8 13,214 San Francisco United States
9 13,125 Chicago United States
10 11,273 Beijing China
11 10,006 New York United States
12 9,412 Los Angeles United States
13 8,170 Lansing United States
14 7,588 Tokyo Japan
15 7,538 Montreal Canada
16 7,380 Culver City United States
17 7,349 Brea United States
18 6,605 Osaka Japan
19 6,296 Atlanta United States
20 6,063 Amsterdam The Netherlands
21 4,546 Moscow Russia
22 4,250 Burlington United States
23 3,823 Wayne United States
24 3,627 Absecon United States
25 3,491 Austin United States
26 3,451 Seattle United States
27 3,449 Orlando United States
28 3,361 Berlin Germany
29 3,176 Columbus United States
30 3,146 Saint Louis United States
31 3,088 Shanghai China
32 2,907 Paris France
33 2,819 Denver United States
34 2,804 Sunnyvale United States
35 2,659 Bangkok Thailand
36 2,619 Englewood United States
37 2,607 Providence United States
38 2,601 Toronto Canada
39 2,579 San Jose United States
40 2,525 Guangzhou China
41 2,366 Tampa United States
42 2,293 London United Kingdom
43 2,283 Phoenix United States
44 2,248 Hangzhou China
45 2,196 Tempe United States
46 2,189 San Diego United States
47 2,181 Fremont United States
48 2,176 McLean United States
49 2,097 Pittsburgh United States
50 1,957 Nanjing China

2 Comments

mlinksva · March 9, 2013 at 1:40 pm

"an important element of the argument that the web is greatly centralised rests on the fact that the Web’s architecture is becoming less distributed with time"

I do not follow.

This weird joining of point in time and rate of change is the thing that confuses me about previous posts in this series as well.

Regarding this one, should I find it staggering that 22% of the top million sites are hosted in 10 cities? I have no idea. What would a properly distributed web look like in terms of the share of top cities? What did the distribution look like at any point in time in the past? Which way is the delta? It is not clear to me that either direction should be assumed.

83.9% of the top million sites hosted in the US is pretty unambiguously depressing and important for the reasons you state. But what's the direction?

La concentration des données du Web » DICE, Data on the Internet at the Core of the Economy · March 10, 2013 at 11:28 am

[…] the information taking shape in these centres, and as the Technollama blog has rightly pointed out, the dominant legal framework […]
