For example, to start downloading from byte offset 1024, the request message looks like this:

    GET /image/index_r4_c1.jpg HTTP/1.1
    Accept: */*
    Referer: http://192.168.3.120:8080
    Accept-Language: zh-cn
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.0.3705)
    Host: 192.168.3.120:8080
    Range:bytes=1024-
    Connection: Keep-Alive

With this header the server sends the file starting from byte 1024.

In other words, to implement a multi-threaded download, each thread only has to add 'Range:bytes=*' to the HTTP headers it sends, with its own byte range in place of the *!

The code is as follows:

    # Python 2 (httplib, print statement)
    import sys
    from httplib import HTTPConnection
    from threading import Thread, RLock

    parts = []
    thread_amount = 5
    PART_LENGTH = 1024      # bytes read per call
    lock = RLock()          # guards seek + write on the shared file

    class Part(Thread):
        def __init__(self, NO, resource):
            Thread.__init__(self, name='part_%s' % NO)
            self.resource = resource
            r = resource    # for short only
            self.NO = NO
            chunk = r.content_length // thread_amount
            self.pos_start = chunk * NO
            # Range ends are inclusive; the last part also takes the remainder
            if NO == thread_amount - 1:
                self.pos_end = r.content_length - 1
            else:
                self.pos_end = self.pos_start + chunk - 1
            self.length = self.pos_end - self.pos_start + 1
            self.downloaded = 0
            self.speed = 0
            parts.append(self)

        def run(self):
            http = HTTPConnection(self.resource.host)
            headers = {'Range': 'bytes=%s-%s' % (self.pos_start, self.pos_end)}
            http.request('GET', self.resource.url, headers=headers)
            resp = http.getresponse()
            if resp.status != 206:
                # the server ignored the Range header
                raise IOError('expected 206 Partial Content, got %s' % resp.status)
            while self.downloaded < self.length:
                data = resp.read(min(PART_LENGTH, self.length - self.downloaded))
                if not data:    # connection closed early
                    break
                self.ongetdata(data)
            http.close()

        def ongetdata(self, data):
            lock.acquire()
            try:
                self.resource.F.seek(self.pos_start + self.downloaded, 0)
                self.resource.F.write(data)
            finally:
                lock.release()
            # count what was actually received, not PART_LENGTH
            self.downloaded += len(data)

    class Resource:
        def __init__(self, url):
            # get host & url (assumes the URL starts with 'http://')
            n = url.find('/', 7)
            self.host = url[7:n]
            self.url = url[n:]
            # get length; HEAD returns the headers without the body
            http = HTTPConnection(self.host)
            http.request('HEAD', self.url)
            resp = http.getresponse()
            self.content_length = int(resp.getheader('Content-Length'))
            http.close()
            # get filename & create a file of the final size before download
            n = url.rfind('/')
            self.filename = url[n + 1:]
            print self.filename
            self.F = open(self.filename, 'wb+')
            self.F.write('x' * self.content_length)

    def begin_download(url):
        # get the host and url
        r = Resource(url)
        for i in range(thread_amount):
            p = Part(i, r)
            p.start()
        # wait for every part before closing the file
        for p in parts:
            p.join()
        r.F.close()

    try:
        thread_amount = int(sys.argv[2])
    except (IndexError, ValueError):
        thread_amount = 1

    begin_download(sys.argv[1])

Let's test it:

    python test.py http://tn4.cn3.yahoo.com/image/d43/ab6a1f9ede0aee6b0c.jpeg 8

A file named ab6a1f9ede0aee6b0c.jpeg then shows up in the folder, so it basically works!

This is my first program in Python, so it is written rather casually. It is very simple: it only downloads, with no progress display and no resumable downloads. I'll add those when I have time.

PS: Implementing P2SP downloads like Xunlei (Thunder) is actually simple too: each thread just fetches its own part from a different source! That would need a web database.

PS: Thanks to andelf for the support!!
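
One detail of the Range header is worth spelling out: both ends of a range are inclusive, so `bytes=0-1023` asks for 1024 bytes, and `bytes=1024-` means "from offset 1024 to the end". Below is a minimal sketch, in Python 3, of how a file of known length could be divided among N threads without overlap, with the remainder going to the last part. The helper names `split_ranges` and `range_header` are hypothetical and are not part of the program above.

```python
def split_ranges(content_length, thread_amount):
    """Return inclusive (start, end) byte offsets, one pair per thread."""
    if content_length <= 0 or thread_amount <= 0:
        raise ValueError("content_length and thread_amount must be positive")
    # Never create more parts than there are bytes.
    n = min(thread_amount, content_length)
    chunk = content_length // n
    ranges = []
    for no in range(n):
        start = no * chunk
        # The last part absorbs the remainder of the integer division.
        end = content_length - 1 if no == n - 1 else start + chunk - 1
        ranges.append((start, end))
    return ranges


def range_header(start, end=None):
    """Build the Range header value; end=None means 'to the end of the file'."""
    if end is None:
        return "bytes=%d-" % start
    return "bytes=%d-%d" % (start, end)


if __name__ == "__main__":
    # Example: a 10-byte file split across 3 threads.
    for start, end in split_ranges(10, 3):
        print(range_header(start, end))
```

Run as a script, this prints `bytes=0-2`, `bytes=3-5` and `bytes=6-9`: three ranges that together cover all 10 bytes exactly once.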