When we started our update project, we proposed a fairly simple design. It worked like this:
Our staff used a tool that uploaded games to the CDN over FTP, and cafes downloaded games over HTTP, file by file.
It was very simple, and of course it didn’t stay simple in practice. The major problems were:
1. FTP uploads of large files were hard to resume when the connection broke.
2. Once a file had been uploaded to the CDN, a later update might not become visible for a long time. For example, when a file was updated from v1 to v2 at the same URL, the CDN kept returning v1 even after v2 had been uploaded.
3. Similar to point 2: a file may be cached by any node between the downloader and the CDN, such as an ISP cache. These nodes behave like the CDN, so downloaders got stale copies.
We halted this project and resumed it a few months later. This time we dropped HTTP entirely:
In the new version, we used P2P to connect to and discover more nodes. The protocol is customised from BitTorrent: it transfers files piece by piece and verifies every piece on both send and receive.
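The per-piece verification can be sketched roughly like this (a minimal illustration, assuming SHA-1 piece hashes as in standard BitTorrent; the piece size and function names are hypothetical, not our real implementation):

```python
import hashlib

PIECE_SIZE = 256 * 1024  # hypothetical fixed piece size

def split_pieces(data: bytes, piece_size: int = PIECE_SIZE):
    """Split a file into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]

def piece_hashes(pieces):
    """One SHA-1 hash per piece, carried in the torrent-style metadata."""
    return [hashlib.sha1(p).digest() for p in pieces]

def verify_piece(piece: bytes, expected: bytes) -> bool:
    """Run on both send and receive: accept a piece only if its hash matches."""
    return hashlib.sha1(piece).digest() == expected

# A corrupted piece is rejected before it is stored or forwarded.
data = b"x" * (PIECE_SIZE + 100)
pieces = split_pieces(data)
hashes = piece_hashes(pieces)
assert verify_piece(pieces[0], hashes[0])
assert not verify_piece(b"corrupted" + pieces[0][9:], hashes[0])
```

Because verification happens at piece granularity, a broken connection only costs the piece in flight, which is also what makes resuming cheap compared to the old FTP uploads.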
Thanks to the nature of P2P, the uploader can stop as soon as any file server (fs) holds a full copy; that first fs then syncs the game to all remaining file servers.
Similarly, cafes can fetch data from every reachable fs and from other cafes downloading the same game.
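As a toy model of that hand-off (hypothetical, not our real scheduler), the uploader finishes once one fs is complete, and that fs seeds everyone else:

```python
def upload(num_pieces: int, fs_names: list):
    """Uploader pushes pieces until one fs holds a full copy, then stops."""
    have = {fs: set() for fs in fs_names}
    have[fs_names[0]] = set(range(num_pieces))  # first fs gets everything
    return have

def internal_sync(have: dict, num_pieces: int):
    """The first complete fs seeds the remaining file servers."""
    seeder = next(fs for fs, pieces in have.items() if len(pieces) == num_pieces)
    for fs in have:
        have[fs] |= have[seeder]
    return have

have = upload(4, ["fs-jakarta", "fs-surabaya", "fs-medan"])  # made-up names
have = internal_sync(have, 4)
assert all(len(pieces) == 4 for pieces in have.values())
```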
The system worked well until two problems appeared:
1. Another system was built on top of this one, to be rolled out to many more countries. A CDN is billed by traffic volume, offers higher bandwidth, and is more stable than our self-managed servers.
2. One country already on this system uploaded a lot of games, and our fs recycling logic likely had a bug, so the servers ran out of disk space.
Here is our solution:
Our new fs server application supports two protocols, and our new client supports three:
1. P2P: the BitTorrent-based protocol. It is still used for uploads and internal sync.
2. Piece-based HTTP: this protocol accepts HTTP requests formatted like BitTorrent piece requests and metadata requests, with the content hash embedded in the URI.
There is no need to worry about a stale cached copy blocking the latest version, because once a game is updated, the hash changes and so does the URI.
The server works like an upload-only BitTorrent peer: it can serve a request as soon as it holds that piece. It is also friendly to reverse proxies, since each piece is neither too small nor too large.
3. File-based HTTP: this is client-side only, because the server needs no extra work beyond setting up an nginx instance to expose the server-side file system.
Protocol 3 is not recommended: it requires every URL to be unique, and ISP-level caches are hard to control.
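The reason protocol 2 is immune to stale caches can be shown with a few lines of Python (a sketch only; the real URI layout is not shown above, so the path format and names below are made up):

```python
import hashlib

def info_hash(metadata: bytes) -> str:
    """Identifier derived from the game's metadata, BitTorrent-style."""
    return hashlib.sha1(metadata).hexdigest()

def piece_uri(base: str, metadata: bytes, index: int) -> str:
    """Hypothetical piece-request URI: the hash is part of the path,
    so updating a game changes every URI, and stale cache entries
    are simply never requested again."""
    return f"{base}/{info_hash(metadata)}/pieces/{index}"

v1 = piece_uri("http://fs.example.com", b"game metadata v1", 0)
v2 = piece_uri("http://fs.example.com", b"game metadata v2", 0)
assert v1 != v2  # new version, new URI: no cache can serve the old piece
```

This is the opposite of the old file-by-file scheme, where v1 and v2 shared one URL and every cache between the cafe and the CDN could keep serving v1.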
I ran a test with the following structure.
Games were uploaded in country A, and an nginx instance in country B served as the edge server, acting as a caching reverse proxy.
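The country-B edge server is, in essence, a plain nginx proxy cache. A minimal configuration along these lines would do it (hypothetical hostnames, paths, and sizes, not our production config):

```nginx
proxy_cache_path /var/cache/nginx/pieces levels=1:2
                 keys_zone=pieces:64m max_size=200g inactive=30d;

server {
    listen 80;

    location / {
        proxy_pass http://fs-country-a.example.com;  # origin fs in country A
        proxy_cache pieces;
        # Piece URIs embed a content hash, so long cache lifetimes are safe:
        # an updated game arrives under new URIs instead of overwriting old ones.
        proxy_cache_valid 200 30d;
        proxy_cache_use_stale error timeout;
    }
}
```

Because the URIs are content-addressed, the cache never needs explicit invalidation, which is what makes such a dumb edge server sufficient.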
This is a screenshot of the first download:
The speed was less than 4 MB/s.
And this is a screenshot of the second download:
The speed was more than 30 MB/s, and no traffic reached our origin file servers any more.
Although our previous system could contribute upload traffic while syncing with other file servers, syncing takes time even for games that may never be downloaded from a given server. This happened in Indonesia: we have many file servers under different ISPs with poor bandwidth, and a large game that nobody ever downloads could be syncing while a hot game shared the same limited bandwidth.
The new system is also easier to deploy than before. In self-hosted countries we can add more nginx servers as reverse proxies; in other countries we can use a CDN for the same purpose. Reverse proxies and CDNs request and cache data only when it is actually requested, so we avoid syncing games that may never be downloaded.