Multi-threaded proxy
Step 1: Implement a barebones proxy
The proxy opens a TCP server socket to listen for TCP connections on a particular port. When a new connection is set up, the proxy reads in an HTTP request; your code needs to be compatible with HTTP GET requests sent by the curl HTTP client.
The proxy should then pass on that HTTP request to the relevant webserver, download the data received, and send it back to the client on the TCP connection that the client had established. You need to ensure that you can handle interleaving of requests from multiple clients by maintaining sufficient state so that you return a response received from a webserver back to the correct client.
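As a minimal sketch of the request-reading step, the helper below splits a proxy-style request line into host, port, and path. It assumes the client sends a line of the form "GET http://host[:port]/path HTTP/1.x", which is what curl typically sends when pointed at an HTTP proxy. The name parse_get_request and its signature are hypothetical, not part of any library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses "GET http://host[:port]/path HTTP/1.x".
 * Fills host, port (default 80) and path (default "/").
 * Returns 0 on success, -1 if the line is not a well-formed proxy GET. */
int parse_get_request(const char *req, char *host, size_t host_len,
                      int *port, char *path, size_t path_len)
{
    assert(req != NULL && host != NULL && port != NULL && path != NULL);

    const char *prefix = "GET http://";
    size_t plen = strlen(prefix);
    if (strncmp(req, prefix, plen) != 0)
        return -1;
    const char *p = req + plen;

    /* The host name ends at ':', '/', a space, or the end of the line. */
    size_t n = strcspn(p, ":/ \r\n");
    if (n == 0 || n >= host_len)
        return -1;
    memcpy(host, p, n);
    host[n] = '\0';
    p += n;

    /* Optional ":port". */
    *port = 80;
    if (*p == ':') {
        char *end;
        long v = strtol(p + 1, &end, 10);
        if (end == p + 1 || v <= 0 || v > 65535)
            return -1;
        *port = (int)v;
        p = end;
    }

    /* The path runs up to the space that precedes "HTTP/1.x". */
    n = strcspn(p, " \r\n");
    if (p[n] != ' ')
        return -1;
    if (n == 0) {
        if (path_len < 2)
            return -1;
        strcpy(path, "/");
    } else {
        if (*p != '/' || n >= path_len)
            return -1;
        memcpy(path, p, n);
        path[n] = '\0';
    }
    return 0;
}
```

The host and port tell the proxy where to open the outgoing connection, and the path is what it would place on the request line it forwards to the webserver.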
Rather than constantly polling sockets to receive data, you should implement network I/O code using the select call. Also, make sure to handle all errors so that your implementation is robust.
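The select-based approach can be sketched roughly as follows. This is one possible shape, not a complete event loop: wait_readable and send_all are hypothetical helper names, and the timeout parameter is an assumption added for illustration. The first helper blocks in select until a descriptor has data, and the second copes with send writing fewer bytes than requested.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: waits in select() until at least one of
 * fds[0..n-1] is readable or timeout_ms elapses. Stores the readable
 * descriptors in ready[] and returns their count, or -1 on error. */
int wait_readable(const int *fds, int n, int timeout_ms, int *ready)
{
    assert(fds != NULL && ready != NULL);
    fd_set rset;
    for (;;) {
        int maxfd = -1;
        FD_ZERO(&rset);
        for (int i = 0; i < n; i++) {
            if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
                return -1;
            FD_SET(fds[i], &rset);
            if (fds[i] > maxfd)
                maxfd = fds[i];
        }
        /* select() may modify the timeout, so rebuild it on every retry. */
        struct timeval tv;
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        if (select(maxfd + 1, &rset, NULL, NULL, &tv) >= 0)
            break;
        if (errno != EINTR)   /* retry only when interrupted by a signal */
            return -1;
    }
    int count = 0;
    for (int i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &rset))
            ready[count++] = fds[i];
    return count;
}

/* Hypothetical helper: send() may accept only part of the buffer,
 * so keep calling it until every byte is written or an error occurs. */
int send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t k = send(fd, buf, len, 0);
        if (k < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += k;
        len -= (size_t)k;
    }
    return 0;
}
```

In a proxy, the array passed to wait_readable would hold the listening socket plus every open client and webserver socket, and a per-descriptor table would record which client each webserver socket belongs to.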
Step 2: Make the proxy multithreaded
To improve the throughput offered, make your proxy multi-threaded. The proxy would work in the same way as above, except that whenever data is to be received, a new thread is spawned to do so. However, you should limit the number of threads in your code to at most 20, rather than let the number of threads grow unbounded.
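One common way to respect the 20-thread limit is a fixed pool of worker threads fed by a queue, instead of spawning a fresh thread per receive. The sketch below takes that approach; it is an illustration under stated assumptions, not the required design. All names (pool_create, pool_submit, pool_shutdown) and the queue size of 64 are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_THREADS 20   /* thread limit from the assignment */
#define QUEUE_CAP   64   /* assumed queue size, chosen for illustration */

typedef void (*job_fn)(void *);

struct pool {
    pthread_t threads[MAX_THREADS];
    int nthreads;
    struct { job_fn fn; void *arg; } queue[QUEUE_CAP];  /* ring buffer */
    int head, count, stopping;
    long completed;                     /* jobs finished so far */
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
};

static void *worker(void *p)
{
    struct pool *pool = p;
    pthread_mutex_lock(&pool->lock);
    for (;;) {
        while (pool->count == 0 && !pool->stopping)
            pthread_cond_wait(&pool->not_empty, &pool->lock);
        if (pool->count == 0)           /* stopping and queue drained */
            break;
        job_fn fn = pool->queue[pool->head].fn;
        void *arg = pool->queue[pool->head].arg;
        pool->head = (pool->head + 1) % QUEUE_CAP;
        pool->count--;
        pthread_cond_signal(&pool->not_full);
        pthread_mutex_unlock(&pool->lock);
        fn(arg);                        /* run the job without the lock */
        pthread_mutex_lock(&pool->lock);
        pool->completed++;
    }
    pthread_mutex_unlock(&pool->lock);
    return NULL;
}

/* Starts up to nthreads workers; returns NULL if nthreads is out of
 * range or no worker could be started. */
struct pool *pool_create(int nthreads)
{
    if (nthreads < 1 || nthreads > MAX_THREADS)
        return NULL;
    struct pool *pool = calloc(1, sizeof *pool);
    if (pool == NULL)
        return NULL;
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->not_empty, NULL);
    pthread_cond_init(&pool->not_full, NULL);
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&pool->threads[pool->nthreads], NULL, worker, pool) != 0)
            break;
        pool->nthreads++;
    }
    if (pool->nthreads == 0) {
        free(pool);
        return NULL;
    }
    return pool;
}

/* Queues a job; blocks while the queue is full. */
void pool_submit(struct pool *pool, job_fn fn, void *arg)
{
    assert(pool != NULL && fn != NULL);
    pthread_mutex_lock(&pool->lock);
    while (pool->count == QUEUE_CAP)
        pthread_cond_wait(&pool->not_full, &pool->lock);
    int tail = (pool->head + pool->count) % QUEUE_CAP;
    pool->queue[tail].fn = fn;
    pool->queue[tail].arg = arg;
    pool->count++;
    pthread_cond_signal(&pool->not_empty);
    pthread_mutex_unlock(&pool->lock);
}

/* Lets queued jobs finish, joins the workers, frees the pool, and
 * returns the number of jobs that were completed. */
long pool_shutdown(struct pool *pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->stopping = 1;
    pthread_cond_broadcast(&pool->not_empty);
    pthread_mutex_unlock(&pool->lock);
    for (int i = 0; i < pool->nthreads; i++)
        pthread_join(pool->threads[i], NULL);
    long done = pool->completed;
    pthread_mutex_destroy(&pool->lock);
    pthread_cond_destroy(&pool->not_empty);
    pthread_cond_destroy(&pool->not_full);
    free(pool);
    return done;
}
```

With this shape, the select loop would call pool_submit whenever a socket becomes readable, passing a job that receives and forwards the data. The bounded queue keeps a burst of requests from growing memory without limit.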
Step 3: Add a cache to the proxy
To further improve throughput, add a cache to your proxy server. At any given time, the cache should contain at most 10 MB of data. To evict data from the cache when there is a cache miss, implement the Least-Recently-Used (LRU) cache replacement policy, i.e., always insert the most recently accessed object into the cache and, to make room for it, evict the appropriate number of least recently used objects. Ensure that reads from and writes to your cache are thread-safe.
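The LRU policy with a byte budget can be sketched as a doubly linked list kept in most-recently-used order and guarded by a single mutex. This is a minimal illustration: the names (cache_init, cache_get, cache_put) are hypothetical, keys are assumed to be request URLs, and lookup is a linear scan where a real implementation would likely add a hash table.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct entry {
    char *key, *data;              /* key is assumed to be the URL */
    size_t len;
    struct entry *prev, *next;
};

struct cache {
    struct entry *head, *tail;     /* head = most recent, tail = least */
    size_t used, capacity;         /* bytes stored and byte budget */
    pthread_mutex_t lock;
};

void cache_init(struct cache *c, size_t capacity)
{
    assert(c != NULL);
    c->head = c->tail = NULL;
    c->used = 0;
    c->capacity = capacity;        /* the assignment's limit is 10 MB */
    pthread_mutex_init(&c->lock, NULL);
}

static void unlink_entry(struct cache *c, struct entry *e)
{
    if (e->prev) e->prev->next = e->next; else c->head = e->next;
    if (e->next) e->next->prev = e->prev; else c->tail = e->prev;
}

static void push_front(struct cache *c, struct entry *e)
{
    e->prev = NULL;
    e->next = c->head;
    if (c->head) c->head->prev = e; else c->tail = e;
    c->head = e;
}

static struct entry *find(struct cache *c, const char *key)
{
    for (struct entry *e = c->head; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e;
    return NULL;
}

static void drop(struct cache *c, struct entry *e)
{
    unlink_entry(c, e);
    c->used -= e->len;
    free(e->key); free(e->data); free(e);
}

/* Copies up to buf_len bytes of the object into buf and marks it most
 * recently used. Returns the object's full length, or -1 on a miss. */
long cache_get(struct cache *c, const char *key, char *buf, size_t buf_len)
{
    long result = -1;
    pthread_mutex_lock(&c->lock);
    struct entry *e = find(c, key);
    if (e != NULL) {
        unlink_entry(c, e);
        push_front(c, e);
        memcpy(buf, e->data, e->len < buf_len ? e->len : buf_len);
        result = (long)e->len;
    }
    pthread_mutex_unlock(&c->lock);
    return result;
}

/* Stores a copy of data under key, evicting least recently used
 * entries until it fits. Returns 0, or -1 if it can never fit. */
int cache_put(struct cache *c, const char *key, const char *data, size_t len)
{
    if (len > c->capacity)
        return -1;
    struct entry *e = malloc(sizeof *e);
    char *k = malloc(strlen(key) + 1);
    char *d = malloc(len ? len : 1);
    if (!e || !k || !d) { free(e); free(k); free(d); return -1; }
    strcpy(k, key);
    memcpy(d, data, len);
    e->key = k; e->data = d; e->len = len;

    pthread_mutex_lock(&c->lock);
    struct entry *old = find(c, key);
    if (old != NULL)
        drop(c, old);              /* replace a stale copy */
    while (c->used + len > c->capacity)
        drop(c, c->tail);          /* evict from the LRU end */
    push_front(c, e);
    c->used += len;
    pthread_mutex_unlock(&c->lock);
    return 0;
}
```

cache_get copies the data out while holding the lock, rather than returning a pointer into the cache, so another thread cannot evict the object while the caller is still reading it.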
Step 4: Deploy the proxy on EC2
Sign up for Amazon EC2 (http://aws.amazon.com/ec2/) and provision a Micro instance; Amazon offers one Micro instance that runs Linux for free (http://aws.amazon.com/free/). Make sure you set up a Micro instance and that you use Linux, else you will be charged for your usage of EC2. Run your proxy on an EC2 Micro instance and test it by fetching web pages via it, running curl on departmental machines at UCR.
Requirements and Deliverables
All your code must be written in C. Network I/O should use the select, recv, and send calls. Use the pthread library for threading and inter-thread synchronization. The following two deliverables are expected at the end of this project, both due before class on 31st January 2011.
• Deploy your proxy on EC2 and email me the address at which your proxy is accessible. The address expected is of the format hostname:port or IPAddress:port. I will then evaluate your implementation with my test suite, which will fetch web pages via your proxy.
• Email me an archive that contains all your source code.
This project is worth 20 points.
10 points: I will go over your source code to ensure that your implementations of (a) select-based socket programming, (b) the thread pool, and (c) the thread-safe cache look correct.
10 points: I will fetch web pages via your proxy using the curl HTTP client. Your proxy must be robust enough to handle web requests to arbitrary web sites. The submission that yields the maximum throughput will be granted 10 points. The number of points awarded to all others will be in proportion to the fraction of this maximum throughput their proxy offers; for example, a proxy that achieves half the maximum throughput earns 5 points.