Wednesday, January 23, 2013

GitHub blocked in China - how it happened, how to get around it, and where it will take us

What happened?

Update: On January 23, https://github.com was unblocked again.

On January 18, or possibly the day before (though our test data doesn’t cover this), the Great Firewall began to reset connections containing “*.github.com”. As a result, code sharing projects hosted on a subdomain of GitHub, such as aoxu.github.com, were blocked in China. The main GitHub website was mostly unaffected, for two reasons. Firstly, it’s hosted on github.com, without a subdomain. Secondly, it serves encrypted content only, thus preventing the Great Firewall from resetting connections based on keywords.
A day later, the block was extended through the inclusion of github.com, without subdomains, in the list of keywords causing connections to be reset. Chinese users could still access GitHub as long as they manually typed https://github.com into their browser (notice the https). Strangely, the www.github.com host was DNS poisoned, but no other hosts were. The www subdomain is not used by GitHub.
On January 21, DNS poisoning was extended to all github.com hosts including the root domain as well as all its subdomains. In effect, all of GitHub was blocked in China.
Interestingly, the blocking of GitHub has seemingly not been censored on social media. The keyword “github” has not been blocked on Sina Weibo, and we have not detected any deleted posts containing “github” on FreeWeibo.
For further information on how the blocking was introduced, including data references, see the Timeline at the end of this article.
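For readers who want to check for DNS poisoning themselves, here is a minimal Python sketch. It assumes that the address 207.97.227.239, which this article lists for github.com further down, is still a genuine GitHub address; that may no longer hold, so a mismatch is only a hint, not proof. The names KNOWN_GOOD, looks_poisoned and resolve are made up for this illustration.

```python
import socket

# Address this article lists for github.com (January 2013); it may be outdated.
KNOWN_GOOD = {"207.97.227.239"}


def looks_poisoned(resolved, known_good=KNOWN_GOOD):
    """Return True when none of the resolved addresses is a known-good one."""
    return not (set(resolved) & set(known_good))


def resolve(hostname):
    """Ask the system resolver for the IPv4 addresses of hostname."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses
```

Calling looks_poisoned(resolve("github.com")) from inside and outside China and comparing the answers is one rough way to spot a poisoned lookup. An empty answer is also treated as suspicious here.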

Why oh why?

As always when online censorship in China changes, the first question asked is why. While we cannot be certain, it doesn’t stop us, or anyone else, from speculating.
Some have suggested that it may be because of the Mongol project, hosted on GitHub. Mongol is an open-source tool used to detect routers that block certain connections going out of China - in essence tracking where the Great Firewall is located. While such a tool may seem threatening from the point of view of the Chinese authorities, there are a few facts that make Mongol an unlikely cause of the block: the tool was released a full month ago, the working principle of the software was published back in 2011, and the paper describing it is still not blocked.
Another theory is that the government jumped on the opportunity to block an all-encrypted file-sharing service which, though intended for code sharing, can also be used to share politically sensitive material. Other file sharing services have faced similar dilemmas in China, including Dropbox which was blocked in 2010. Was GitHub being used by activists to share information?

The train ticket theory

The most gripping tale, though, ties this story in with China’s annual mass migration during the new year holiday. Each year tens of millions of Chinese scramble to purchase a limited and insufficient number of train tickets so they can make the journey home to spend the holiday with their families. Train tickets in China can only be bought 18 days ahead of the planned journey. With tens of millions of people traveling home for the Spring Festival, getting hold of the right ticket is a real challenge. Failure can mean missing out on what is often the only chance of the year to see the family.
With the increased use of the internet, however, a lot of ticket sales are done online via the government-run website 12306.cn. While waiting for the right ticket to go on sale, users will often reload a web page continuously. This is of course a problem easily solved by creative software developers. Several Chinese web browser providers rolled out add-ons that automatically reload the government website and book the ticket as soon as it's available.
A particularly interesting add-on was called 12306_ticket_helper (https://github.com/iccfish/12306_ticket_helper, now deleted). The software embedded files hosted on GitHub. Its sudden popularity caused such a traffic load that GitHub temporarily went offline, and an employee sent an abuse complaint to 12306.cn. GitHub didn't know that it was actually the browser add-on that embedded the file, and not the 12306.cn website itself.
On January 18, at the same time that the GitHub block was introduced, the Ministry of Railways was said to be asking Kingsoft, one of the other browser providers, to disable their ticket-buying add-on. On the same day, the Ministry of Industry ordered all browser providers to remove similar add-ons.
Is the GitHub block just a matter of the site being in the wrong place at the wrong time? It’s not inconceivable that when the Ministers of Railways and Industry say “dance”, everyone dances. After all of the accomplices who were involved in the ticket scandal made amends, the ministries likely looked further to see who else was involved, and GitHub may have just found itself caught in that net.
If this is true, then this episode does reveal something about the Chinese censorship mechanism. One of two things would have had to occur for GitHub to have been blocked. One explanation is that the person who has his finger on the censorship button had free rein to censor whatever he thought needed to be censored (in relation to the ticket scandal), which would indicate that this civil servant does not have to jump through a lot of hoops when he thinks a site should be blocked. Another explanation is that the powers-that-be in the censorship bureau who gave the go-ahead to block GitHub are so incompetent that they could not comprehend the fallout related to closing down the site. They were either too lazy to investigate, too distracted to care or just plain oblivious to the role that GitHub plays for many developers across China.
Our tests indicate that the likely answer is a combination of the two explanations above. At first the censors started resetting *.github.com but found that this was ineffective, so they moved to a more comprehensive block. This would mean that the powers-that-be had no understanding of how GitHub works, and that the civil servant with his finger on the button can push it whenever he wants.

The HTTPS theory (true either way)

Because GitHub is HTTPS-only, the Great Firewall cannot block individual pages. Regardless of the specific project the authorities wanted to block access to, the only way they could do it was to block GitHub altogether. This could have severe implications for other websites as well. As more and more of the Internet switches to encrypted connections, the ability of online censorship authorities to selectively block content decreases. If, or perhaps when, Google Search, Wikipedia and CNN switch to HTTPS-only, will the Chinese authorities decide to block them altogether as well?
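To make the point concrete, here is a simplified Python sketch of what a filter sitting on the network path can read from a request. It is a model under stated assumptions, not a description of how the Great Firewall is implemented: we assume the hostname stays visible in both cases (through the DNS lookup or the TLS handshake), while the page path is readable only over plain HTTP. The function name visible_to_observer is made up for this illustration.

```python
from urllib.parse import urlsplit


def visible_to_observer(url):
    """List the URL parts an on-path observer can read in this simplified model."""
    parts = urlsplit(url)
    # Assumption: the hostname leaks either way (DNS lookup or TLS handshake).
    visible = {"host": parts.hostname}
    if parts.scheme == "http":
        # Plain HTTP: the request line, including the path, travels unencrypted.
        visible["path"] = parts.path or "/"
    return visible
```

In this model, a request for http://github.com/iccfish/12306_ticket_helper exposes both the host and the project path, so a single project could be targeted. The https version of the same URL exposes only github.com, which leaves a censor with an all-or-nothing choice.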

What will the knock-on effects be?

According to Alexa, GitHub is the 276th most popular website in China. Globally, GitHub is ranked 209th. Since it’s targeting a very specific audience (software developers), that’s not a bad ranking. GitHub themselves told Techinasia that China ranks fourth in terms of visits to the site. The only foreign-hosted websites ranked higher than GitHub in China are Google, Bing (and Live.com, Microsoft.com, Msn.com), Amazon, Yahoo, Wikipedia, Apple, eBay and Adobe.
While GitHub is popular, there are many other code-sharing services offering alternatives. Google Code is not blocked, though the HTTPS version sometimes is, and if or when it switches to HTTPS-only it may well face the same dilemma as GitHub. SourceForge is not blocked either, nor are many smaller providers.
Software developers often have to work with whatever code sharing service their project is already using. Switching from one to another is somewhat complicated. Many Chinese developers, especially the ones that work with customers abroad, will now have to use circumvention tools to stay in business. With such tools being actively targeted, some of them may not be able to continue their work at all.
China has been successful in attracting a lot of foreign developer houses to the country due to lower costs and access to plenty of developer talent. Foreign investors in this area may now start to question if it is a wise decision to place so many human resources in a country that may prevent or limit access to key technical resources without warning. Companies who run Gmail for their enterprises have learned the hard way that their communications can be turned off on a whim. Most who experienced outages when China completely blocked Google last November have probably found enterprise alternatives to Gmail already. Companies will now likely consider more stable alternatives to China.
The most devastating impact could come in an attitude shift amongst young Chinese. China’s censors have effectively just pissed off a whole nation of developers. It is likely they knew how to get around the firewall anyway but when developers have to turn on VPNs or fiddle with proxies in order to do their jobs, they will get upset. Does China really want to create a generation of would-be hackers? Especially within her borders? Could this signal the birth of a Chinese Anonymous? Perhaps an end to online censorship in China is now closer than we think?

How to get around it?

If the Great Firewall has not fallen by the time you read this, then you can follow these instructions to circumvent the blocking of GitHub.
If you are using a VPN, all your traffic is rerouted through a foreign server and GitHub will work as usual. A simpler alternative, which works unless the Great Firewall also blocks the IP address of GitHub, is to manually edit the so-called hosts file, adding the following entry:
207.97.227.239 github.com
With such an entry in place, connections to https://github.com will work from inside the Great Firewall. The unencrypted http://github.com will not work, so remember to add the “https” manually.
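The edit can be done by hand in any text editor. For those who prefer to script it, here is a minimal Python sketch that works on the text of a hosts file rather than on the file itself. The function name add_hosts_entry is made up for this illustration, and the sketch assumes the usual "address name [name ...]" line format with "#" starting a comment. The hosts file normally lives at /etc/hosts on Linux and macOS and at C:\Windows\System32\drivers\etc\hosts on Windows, and writing to it requires administrator rights.

```python
def add_hosts_entry(hosts_text, ip, hostname):
    """Return hosts_text with `ip hostname` appended, replacing any older mapping."""
    kept = []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and hostname in fields[1:]:
            # This line already maps the hostname: keep only its other names.
            others = [name for name in fields[1:] if name != hostname]
            if others:
                kept.append(" ".join([fields[0]] + others))
            continue
        kept.append(line)
    kept.append(f"{ip} {hostname}")
    return "\n".join(kept) + "\n"
```

Running the function twice with the same arguments leaves the text unchanged, so it is safe to call again when the IP address of GitHub changes: pass the new address and the old mapping is replaced.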
The IP address of GitHub may change at any time, of course. A more stable solution is to use an encrypted DNS lookup service such as DNSCrypt, which effectively bypasses DNS poisoning. Ironically, the download for the Mac version links to GitHub, which of course is blocked. But the final download link is not blocked: http://download.dnscrypt.org/guis/opendns/osx/dnscrypt-osx-client-0.19.dmg.

Timeline

Date    Event
Jan 18  Connection reset of *.github.com, including www.github.com (not DNS poisoned)
Jan 19  Connection reset of github.com
Jan 19  DNS poisoning of www.github.com
Jan 19  The www.github.com keyword causes connection reset on Google Search
Jan 20  Connection reset of *.github.com (still not DNS poisoned)
Jan 21  DNS poisoning of github.com root domain (as well as *.github.com)
