Peer-to-Peer Systems
An Introduction
|
Prof. David Bernstein
James Madison University
|
|
Computer Science Department
|
bernstdh@jmu.edu
|
Motivation
- The Concept:
- Share storage and bandwidth of a large number of (small)
computers
- The Traditional Argument for Peer-to-Peer Systems:
- The time required to distribute a file is smaller
than in client-server systems
Distributing a File: Notation
- \(N\) denotes the number of machines that want the file
- \(F\) denotes the size of the file (in bits)
- \(u_S\) denotes the source's upload rate
- \(u_i\) denotes the upload rate of machine \(i\)
- \(d_i\) denotes the download rate of machine \(i\)
- \(d_{\mbox{min}} = \min\{d_1, d_2, \ldots, d_N\}\)
- \(D\) denotes the total time
Distributing a File: Client Server
- Observations:
- The server must transmit \(N \cdot F\) bits in total
which must take at least \(N \cdot F / u_{S}\) seconds
- The slowest peer cannot download the file in less than
\(F / d_{\mbox{min}}\) seconds
- Results:
- Obvious: \(D \geq \max \left\{ \frac{NF}{u_S}, \frac{F}{d_{\mbox{min}}} \right\}\)
- With Some Thought: \(D = \max \left\{ \frac{NF}{u_S}, \frac{F}{d_{\mbox{min}}} \right\}\)
Distributing a File: Peer-to-Peer
- Observations:
- The minimum time is at least \(F/u_S\)
- The slowest peer cannot download the file in less than
\(F / d_{\mbox{min}}\) seconds
- The system must deliver a total of \(N \cdot F\) bits
and the total upload rate is
\(u_{\mbox{total}} = u_S + u_1 + \cdots + u_N\)
- Results:
- Obvious: \(D \geq \max \left\{ \frac{F}{u_S}, \frac{F}{d_{\mbox{min}}}, \frac{NF}{u_\mbox{total}} \right\}\)
- With Some Thought: \(D \approx \max \left\{ \frac{F}{u_S}, \frac{F}{d_{\mbox{min}}}, \frac{NF}{u_\mbox{total}} \right\}\)
The Coordination Challenges
- No central administration
- Heterogeneity
- No trust
- Changing participants (i.e., churn)
- Unreliable participants
Services:
- "Required":
- Locate a provider of content
- Retrieve content
- Optional:
Central Directory Systems
- Example:
- Process:
- Provider registers availability of file with directory
- Requester queries directory for a provider of file
- Directory responds with a provider
- Requester requests file from provider
- Provider provides file
Central Directory Systems (cont.)
- Advantages:
- Simple
- Facilitates search
- Disadvantages:
- Not robust (i.e., single point of failure)
- Centralized accountability
Flooding Systems
- Example:
- Process:
- Requester looks for a file by flooding neighbors who
flood their neighbors (if necessary)
- One or more nodes responds to requester
- Requester selects a provider
- Requester requests file from provider
- Provider provides file to requester
Flooding Systems (cont.)
- Advantages:
- Disadvantages:
- Hard to find "unpopular" content
- No guarantees
- Flooding causes scaling problems
Flooding Systems with Super Peers
- Example:
- The Concept:
- A hybrid of directory based and flooding-based
- Some nodes are super peers that know about the nodes
"under" them
Chunked Systems
- Example:
- Storage Process:
- Content is split into chunks
- Chunks are stored on different nodes
- Location Process:
- Content has an associated tracker node
- Retrieval Process:
- Chunks are received from different nodes
- The last chunks are downloaded from multiple nodes
(to avoid halting)
Incentivized Participation
- Objective:
- Ensure that nodes are both providers and requesters
- Approaches:
- Only provide to others who provide
- Order/prioritize nodes by amount provided
There's Always More to Learn