The moment you hear the term p2p, the first thing that comes to your
mind will most likely be BitTorrent. It
is a fantastic file sharing protocol and in a way it rewrote the way we
have always understood file transfers on the Internet.
Curious geeks can refer to this detailed
specification to understand how the protocol works.
It is advanced technology and a brilliant way to solve the age old
problem of scalability and web traffic overloads which have crashed the
best of web servers.
The key concepts used in BitTorrent are SHA1 checksum computations, peer
to peer networking (of course), randomization of the download order and
the concept of Pareto efficiency.
But hang on for a minute. Let us first understand p2p and how it
differs from client server protocols like HTTP, FTP and SMTP.
Key p2p concepts
A peer to peer network builds a layer of redundancy and creates
interrelationships between the participants of a p2p protocol. In a
client server interaction, by contrast, the same server deals with each
client in a separate two way relationship. It does not share any
information with the other clients downloading at the same time.
In other words, if you are downloading a page from slashdot and 1000
other people are also downloading the very same page, then the web
server at slashdot has to serve every one of those requests itself and
gets loaded. That shared load is the only interaction between the
clients.
Having many clients and one server has a negative side effect: it
limits the scalability of web servers in a big way. One cannot
realistically provision for rare instances of heavy load.
This problem dogged the evolution of Internet file sharing mechanisms
for a long time, until BitTorrent came along and rewrote the rules of
the game. In BitTorrent, the file that is being downloaded does not
come from one single server. Instead, portions of the file are
downloaded and uploaded among the participating "peers".
What is a BitTorrent peer?
It is a client or a server that has a portion of the file available.
It also has to be connected to other peers. In other words, at any given
point of time, if there are 40 people downloading a file, then each of
the participants, viz. each of the computers interested in obtaining
the file, is also simultaneously uploading portions of it to other
computers.
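To see how a file spreads when every downloader is also an uploader,
here is a toy simulation in Python. It is only a sketch under loud
assumptions: the function name simulate_swarm, the round based timing
and the limit of one piece per peer per round are all made up for
illustration, and it ignores bandwidth as well as the way real
BitTorrent clients pick pieces and peers.

```python
import random


def simulate_swarm(num_peers=40, num_pieces=16, seed=0):
    """Toy model (assumption): peer 0 holds every piece, the rest start empty."""
    rng = random.Random(seed)
    # have[i] is the set of piece indexes that peer i currently holds.
    have = [set(range(num_pieces))] + [set() for _ in range(num_peers - 1)]
    uploads = [0] * num_peers  # how many pieces each peer has served
    rounds = 0
    while any(len(h) < num_pieces for h in have):
        rounds += 1
        # Only pieces held when the round began can be served in this round.
        snapshot = [set(h) for h in have]
        for me in range(num_peers):
            missing = [p for p in range(num_pieces) if p not in have[me]]
            rng.shuffle(missing)  # ask for pieces in random order
            for piece in missing:
                sources = [o for o in range(num_peers)
                           if o != me and piece in snapshot[o]]
                if sources:
                    src = rng.choice(sources)  # any holder will do
                    have[me].add(piece)
                    uploads[src] += 1
                    break  # at most one piece per peer per round
    return rounds, uploads


if __name__ == "__main__":
    rounds, uploads = simulate_swarm()
    print("rounds needed:", rounds)
    print("pieces served by the original holder:", uploads[0])
    print("pieces served by everyone else:", sum(uploads[1:]))
```

With the defaults, the 39 downloaders need 39 x 16 = 624 piece transfers
in total. The point of the exercise is that the original holder serves
only a part of them; the rest come from fellow downloaders.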
The notion of client and server blurs when we enter the p2p world.
Every node acts as a server as well as a client. But there is a
significant problem here. It is called network address translation, or
NAT.
What is wrong with NAT?
If you wish to know NAT in graphic detail, read the article on NAT
in the reference section of this article. I will only give a brief
introduction. Due to the shortage of IPv4 addresses on the Internet,
many devices (mostly modems and routers) allocate private IP addresses
to machines on a company Intranet, or in a house if using ADSL.
These private IP addresses belong to fixed netblocks and are
technically known as RFC1918 addresses. The netblocks are
192.168.0.0/16, 172.16.0.0/12 and 10.0.0.0/8. Now these addresses are
not unique across the Internet. Consequently they are unroutable on the
public Internet, and a machine that holds one cannot directly be a
participant in BitTorrent or any other p2p protocol.
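If you want to check whether an address falls in one of the RFC1918
netblocks, Python's standard ipaddress module makes it a few lines. The
helper name is_rfc1918 and the sample addresses are mine, chosen only
for illustration.

```python
import ipaddress

# The three private IPv4 netblocks set aside by RFC1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the IPv4 address lies in an RFC1918 netblock."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in RFC1918_BLOCKS)


if __name__ == "__main__":
    for addr in ["192.168.1.10", "172.31.255.255", "172.32.0.1", "8.8.8.8"]:
        print(addr, is_rfc1918(addr))
```

Note that 172.16.0.0/12 stops at 172.31.255.255, so 172.32.0.1 is an
ordinary public address.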
Forget p2p. On their own they cannot even participate in any Internet
client server protocol. Then how can machines behind NAT access the
Internet? They use the globally unique public IP address allocated to
the router or modem that does the NATing. Since many machines share a
single (or a few) public IP addresses, the NAT device assigns an
ephemeral public port (TCP or UDP) to each outgoing conversation with
the Internet.
This works without hassle when we are the client, but not when we are
the server. The client uses ephemeral ports anyway, so this is not a
problem. A server, however, has to be reachable on a port that others
know in advance, and an ephemeral mapping cannot offer that. You can
emulate a fixed port mapping for a machine behind NAT by using a
technique known as port forwarding. This involves configuring the
router/modem to redirect requests coming to a certain public port to a
certain port on a certain private IP address.
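The difference between an ephemeral mapping and a forwarded port is
easier to see in a small model. The ToyNat class below is a hypothetical
illustration, not how any real router is implemented: it keeps only a
lookup table, starts handing out public ports at an arbitrary number,
and leaves out timeouts and any filtering by remote address. The IP
addresses are documentation examples.

```python
import itertools


class ToyNat:
    """Hypothetical NAT table: ephemeral mappings plus static forwards."""

    def __init__(self, public_ip, first_port=49152):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)  # next free public port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Client case: create an ephemeral mapping on first use."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            public_port = next(self._ports)
            while public_port in self.inbound:  # skip ports already in use
                public_port = next(self._ports)
            self.outbound[key] = public_port
            self.inbound[public_port] = key
        return (self.public_ip, self.outbound[key])

    def forward_port(self, public_port, private_ip, private_port):
        """Server case: a fixed mapping configured by hand."""
        self.inbound[public_port] = (private_ip, private_port)

    def translate_inbound(self, public_port):
        """Where does a packet for this public port go? None means dropped."""
        return self.inbound.get(public_port)


if __name__ == "__main__":
    nat = ToyNat("203.0.113.5")
    print(nat.translate_outbound("192.168.1.10", 51000))
    print(nat.translate_inbound(6881))  # nobody listens here yet
    nat.forward_port(6881, "192.168.1.10", 6881)
    print(nat.translate_inbound(6881))
```

Until forward_port is called, a request arriving at public port 6881 has
nowhere to go. That is exactly the trouble a server behind NAT faces.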
Things have got really complicated now, so we will move on. But if
this is the case, then how does BitTorrent work?
How can BitTorrent nodes act as servers? Obviously it is a problem, and
you have similar issues with many other protocols, including the
Session Initiation Protocol (SIP) and the Real-time Transport Protocol
(RTP).
One of the ways Skype solved this problem is by using a technique
called UDP hole punching. UDP hole punching does not require you to
configure port forwarding, and things work seamlessly. But it is a
complex technique. And p2p protocols in general are complex too.
P2P is used not only for file sharing, but also for instant
messaging, VoIP and for many other applications with future growth
potential. Multimedia, whiteboarding, Video on Demand and so on come to
mind.
In most p2p protocols, there is a concept of a tracker or supernode
which tracks the various peers/participants of the protocol. It is the
job of the tracker to keep the state of the individual nodes
participating in the protocol.
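The bookkeeping a tracker does can be pictured with a few lines of
Python. This is a minimal in-memory sketch: the class ToyTracker, the
announce method and its arguments are hypothetical names of mine, and a
real tracker speaks a network protocol and expires peers that go
silent.

```python
from collections import defaultdict


class ToyTracker:
    """Hypothetical tracker: remembers which peers share which file."""

    def __init__(self):
        # file identifier -> set of (ip, port) pairs currently taking part
        self.swarms = defaultdict(set)

    def announce(self, file_id, ip, port, leaving=False):
        """Record that a peer joined (or left); return the other peers."""
        swarm = self.swarms[file_id]
        peer = (ip, port)
        if leaving:
            swarm.discard(peer)
        else:
            swarm.add(peer)
        return sorted(swarm - {peer})


if __name__ == "__main__":
    tracker = ToyTracker()
    print(tracker.announce("some-file", "203.0.113.5", 6881))    # first peer
    print(tracker.announce("some-file", "198.51.100.7", 6881))   # second peer
    tracker.announce("some-file", "203.0.113.5", 6881, leaving=True)
    print(tracker.announce("some-file", "198.51.100.7", 6881))
```

The tracker never touches the file itself. It only tells each newcomer
whom to talk to.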
UDP hole punching is an effective technique to get through firewalls,
NAT devices and routers and enable p2p applications to work without any
configuration. This is achieved through the concept of a rendezvous
server or mediator node that tracks the ephemeral port mapping at each
peer.
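The idea can be walked through without any real network. The simulation
below is a sketch resting on big assumptions: every class and function
name is invented, each NAT is given one fixed public endpoint that it
reuses for every destination, and a NAT lets a packet in only if the
inside machine has already sent something to that sender. Real NAT
devices vary, and hole punching does not work with all of them.

```python
class ToyFilteringNat:
    """Hypothetical NAT: admits packets only from endpoints already contacted."""

    def __init__(self, public_endpoint):
        self.public_endpoint = public_endpoint  # (ip, port) seen from outside
        self.contacted = set()                  # remote endpoints we sent to

    def send_to(self, remote):
        """An outgoing packet opens a 'hole' for replies from that remote."""
        self.contacted.add(remote)
        return self.public_endpoint  # source address the remote side observes

    def accepts_from(self, remote):
        return remote in self.contacted


class Rendezvous:
    """Hypothetical mediator: notes the public endpoint of each peer."""

    def __init__(self, endpoint):
        self.endpoint = endpoint
        self.seen = {}  # peer name -> public (ip, port) observed by us

    def register(self, name, nat):
        self.seen[name] = nat.send_to(self.endpoint)

    def lookup(self, name):
        return self.seen[name]


def punch_demo():
    server = Rendezvous(("198.51.100.1", 9000))
    nat_a = ToyFilteringNat(("203.0.113.5", 40001))
    nat_b = ToyFilteringNat(("203.0.113.77", 40002))
    # Step 1: both peers contact the mediator, which learns their mappings.
    server.register("alice", nat_a)
    server.register("bob", nat_b)
    a_pub, b_pub = server.lookup("alice"), server.lookup("bob")
    # Step 2: before any punching, B's NAT would drop a packet from A.
    before = nat_b.accepts_from(a_pub)
    # Step 3: each peer sends to the other's public endpoint, opening a hole.
    nat_a.send_to(b_pub)
    nat_b.send_to(a_pub)
    after = nat_a.accepts_from(b_pub) and nat_b.accepts_from(a_pub)
    return before, after


if __name__ == "__main__":
    print(punch_demo())
```

The mediator is needed only for the introduction. Once both holes are
open, the two peers talk to each other directly.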
All this complexity adds a great deal of resilience, fault tolerance and
scalability by managing the churn effectively. Any number of
participants can come and go, the file transfer may get interrupted any
number of times, but we will get the file correctly thanks to the SHA1
checksums.
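Here is a small Python sketch of how per-piece SHA1 checksums catch a
bad download. The helper names and the tiny 4 byte piece size are
assumptions made for the demonstration; the list of expected checksums
is taken to come from a trusted source.

```python
import hashlib

PIECE_SIZE = 4  # bytes; absurdly small, just for the demonstration


def split_pieces(data, piece_size=PIECE_SIZE):
    """Cut the data into fixed size pieces (the last one may be shorter)."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def piece_hashes(data, piece_size=PIECE_SIZE):
    """SHA1 digest of every piece, in order."""
    return [hashlib.sha1(p).digest() for p in split_pieces(data, piece_size)]


def verify_piece(index, piece, expected_hashes):
    """Accept a downloaded piece only if its SHA1 matches the expected one."""
    return hashlib.sha1(piece).digest() == expected_hashes[index]


if __name__ == "__main__":
    original = b"peer to peer file"
    expected = piece_hashes(original)
    pieces = split_pieces(original)
    print(all(verify_piece(i, p, expected) for i, p in enumerate(pieces)))
    print(verify_piece(0, b"pear", expected))  # a corrupted first piece
```

A piece that fails the check is simply thrown away and fetched again,
possibly from a different peer, which is why interruptions and
unreliable peers do not spoil the final file.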
A lot of real life applications lend themselves easily to the P2P model
as we will see below.
The future of the Internet
I have already mentioned many of the key applications that use P2P and
that will continue to use P2P in today's Internet.
The future holds bright prospects for P2P computing as more and more
media rich applications move to the Internet for content delivery.
Even television streaming and telephony applications will get integrated
into instant messaging and presence networks. All this needs a solid P2P