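There is also a write-up of a hand-rolled HTTPCLIENT tool that is pretty good. Its source can be decompiled directly, so I won't post it here, out of respect for the original. It seems like good study material for beginners, since I'm a beginner myself, hehe.
Link: http://www.javaeye.com/topic/900931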
I. Introduction
1. Purpose: HTTP/0.9 -> HTTP/1.0 -> HTTP/1.1
2. Requirements: the key words MUST, REQUIRED, SHOULD
3. Terminology: connection, message, request, response, resource, entity, representation, content negotiation, variant, client, user agent, server, origin server, proxy, gateway, cache, cacheable, first-hand, explicit expiration time, heuristic expiration time, age, freshness lifetime, fresh, stale, semantically transparent, validator (an entity tag or a Last-Modified time), upstream/downstream, inbound/outbound
4. Overall operation: request/response, intermediaries
II. Notational Conventions and Generic Grammar
1. Augmented BNF: name = definition, "literal", rule1 | rule2, (rule1 rule2), *rule, [rule], N rule, #rule, ; comment, implied *LWS
2. Basic rules: OCTET, CHAR, UPALPHA, LOALPHA, ALPHA, DIGIT, CTL, CR, LF, SP, HT, <">
III. Protocol Parameters
1. HTTP version: HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
2. Uniform Resource Identifiers (URI): the combination of Uniform Resource Locators (URL) and Uniform Resource Names (URN)
    http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
3. Date/time formats: three layouts are accepted
    Sun, 06 Nov 1994 08:49:37 GMT  ; RFC 822, updated by RFC 1123
    Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
    Sun Nov  6 08:49:37 1994       ; ANSI C's asctime() format
4. Character sets: in this document the term "character set" refers to a method that uses one or more tables to convert a sequence of octets into a sequence of characters. Also covers the missing charset case.
    charset = token
5. Content codings: used mainly to allow a document to be compressed (source coding). The registry contains the tokens gzip, compress, deflate, identity.
    content-coding = token
6. Transfer codings: meant to ensure "safe transport" through the network (channel coding). Also covers the chunked transfer coding.
    transfer-coding = "chunked" | transfer-extension
    transfer-extension = token *( ";" parameter )
7. Media types: also covers canonicalization and text defaults, and multipart types
    media-type = type "/" subtype *( ";" parameter )
    type = token
    subtype = token
8. Product tokens:
    product = token ["/" product-version]
    product-version = token
9. Quality values: qvalue = ( "0" [ "." 0*3DIGIT ] ) | ( "1" [ "." 0*3("0") ] )
10. Language tags:
    language-tag = primary-tag *( "-" subtag )
    primary-tag = 1*8ALPHA
    subtag = 1*8ALPHA
11. Entity tags:
    entity-tag = [ weak ] opaque-tag
    weak = "W/"
    opaque-tag = quoted-string
12. Range units:
    range-unit = bytes-unit | other-range-unit
    bytes-unit = "bytes"
    other-range-unit = token
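To make the three date layouts in item 3 concrete, here is a minimal Python sketch that tries each one in turn. The function name `parse_http_date` and the constant `HTTP_DATE_FORMATS` are made up for illustration, and the sketch assumes the default C/English locale, because `%a`, `%A` and `%b` are locale-dependent.

```python
from datetime import datetime, timezone

# The three HTTP-date layouts from item 3, written as strptime patterns.
HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",   # RFC 822, updated by RFC 1123
    "%A, %d-%b-%y %H:%M:%S GMT",   # RFC 850, obsoleted by RFC 1036
    "%a %b %d %H:%M:%S %Y",        # ANSI C asctime()
)


def parse_http_date(value: str) -> datetime:
    """Try each layout in turn; HTTP dates are always GMT, so tag the result as UTC."""
    for fmt in HTTP_DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"not an HTTP-date: {value!r}")


if __name__ == "__main__":
    for text in ("Sun, 06 Nov 1994 08:49:37 GMT",
                 "Sunday, 06-Nov-94 08:49:37 GMT",
                 "Sun Nov  6 08:49:37 1994"):
        print(parse_http_date(text).isoformat())
```

All three sample strings describe the same instant, so a receiver that accepts every layout can normalize them to one value before comparing dates.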
IV. HTTP Message
1. Message types:
    HTTP-message = Request | Response ; HTTP/1.1 messages
    generic-message = start-line *(message-header CRLF) CRLF [ message-body ]
    start-line = Request-Line | Status-Line
2. Message headers: HTTP header fields comprise general-header, request-header, response-header, and entity-header fields
    message-header = field-name ":" [ field-value ]
    field-name = token
    field-value = *( field-content | LWS )
    field-content = <the OCTETs making up the field-value and consisting of either *TEXT or combinations of token, separators, and quoted-string>
3. Message body: message-body = entity-body | <entity-body encoded as per Transfer-Encoding>
4. Message length: the factors that determine it
5. General header fields: general-header = Cache-Control | Connection | Date | Pragma | Trailer | Transfer-Encoding | Upgrade | Via | Warning
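The generic-message rule in item 1 can be illustrated with a small parser. This is only a sketch under simplifying assumptions: the whole message is already in memory, the body is returned untouched (no Transfer-Encoding handling), and `parse_message` is a hypothetical helper name.

```python
def parse_message(raw: bytes):
    """Split a raw HTTP message into (start-line, headers, body).

    Follows generic-message = start-line *(message-header CRLF) CRLF [ message-body ].
    """
    head, _, body = raw.partition(b"\r\n\r\n")      # the empty line ends the header section
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = []
    for line in lines[1:]:
        if line[:1] in (" ", "\t") and headers:
            # A line starting with SP or HT continues the previous field-value (LWS folding).
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            headers.append((name, value.strip()))
    return start_line, headers, body


if __name__ == "__main__":
    raw = (b"HTTP/1.1 200 OK\r\n"
           b"Content-Type: text/html\r\n"
           b"Content-Length: 5\r\n"
           b"\r\n"
           b"hello")
    print(parse_message(raw))
```

Headers are kept as an ordered list of pairs rather than a dict, because a field name may legitimately appear more than once.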
V. Request
The first line carries the method to be applied to the resource, the identifier of the resource, and the protocol version in use.
    Request = Request-Line *(( general-header | request-header | entity-header ) CRLF) CRLF [ message-body ]
1. Request-Line: Request-Line = Method SP Request-URI SP HTTP-Version CRLF
    Method: the method token indicates the method to be performed on the resource identified by the Request-URI
    Method = "OPTIONS" | "GET" | "HEAD" | "POST" | "PUT" | "DELETE" | "TRACE" | "CONNECT" | extension-method
    extension-method = token
    Request-URI: a Uniform Resource Identifier that identifies the resource the request applies to
    Request-URI = "*" | absoluteURI | abs_path | authority
    Request-Line examples:
    GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1
    GET /pub/WWW/TheProject.html HTTP/1.1
    Host: www.w3.org
2. The resource identified by a request: the exact resource identified by an Internet request is determined by both the Request-URI and the Host header field
3. Request header fields: request-header = Accept | Accept-Charset | Accept-Encoding | Accept-Language | Authorization | Expect | From | Host | If-Match | If-Modified-Since | If-None-Match | If-Range | If-Unmodified-Since | Max-Forwards | Proxy-Authorization | Range | Referer | TE | User-Agent
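Items 1 and 2 can be sketched in a few lines of Python: split the Request-Line on single spaces, then combine the Request-URI with the Host header to find the resource. The helper names are hypothetical, and only the absoluteURI and abs_path forms are handled; "*" and the authority form are left out of this sketch.

```python
def parse_request_line(line: str):
    """Request-Line = Method SP Request-URI SP HTTP-Version CRLF"""
    method, request_uri, version = line.rstrip("\r\n").split(" ")
    return method, request_uri, version


def identify_resource(request_uri: str, host_header: str) -> str:
    """Return the full URL a request refers to.

    An absoluteURI already names the host, so the Host header is ignored;
    an abs_path is resolved against the Host header.
    """
    if request_uri.lower().startswith("http://"):
        return request_uri
    if request_uri.startswith("/"):
        return "http://" + host_header + request_uri
    raise ValueError("only absoluteURI and abs_path are handled in this sketch")


if __name__ == "__main__":
    method, uri, version = parse_request_line("GET /pub/WWW/TheProject.html HTTP/1.1\r\n")
    print(method, version, identify_resource(uri, "www.w3.org"))
```

Both example requests from item 1 resolve to the same resource, which is the point of item 2.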
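VI. Response
After receiving and interpreting a request message, the server responds with an HTTP response message.
    Response = Status-Line *(( general-header | response-header | entity-header ) CRLF) CRLF [ message-body ]
1. Status-Line: Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
    Status code: a 3-digit integer result code of the attempt to understand and satisfy the request. The classes are 1xx, 2xx, 3xx, 4xx, 5xx; the defined codes run from 100 to 505, and extension codes are allowed.
2. Response header fields: response-header = Accept-Ranges | Age | ETag | Location | Proxy-Authenticate | Retry-After | Server | Vary | WWW-Authenticate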
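As a rough illustration of the Status-Line rule, the sketch below splits the line and maps the first digit of the status code to its class. `parse_status_line` and `STATUS_CLASSES` are names invented for this example.

```python
# The first digit of the Status-Code defines the class of response.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_status_line(line: str):
    """Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF"""
    # The Reason-Phrase may itself contain spaces, so split at most twice.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"Status-Code must be three digits: {code!r}")
    return version, int(code), reason, STATUS_CLASSES.get(int(code[0]), "Unknown")


if __name__ == "__main__":
    print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```

Because an unrecognized code can still be treated according to its class, a client that only looks at the first digit keeps working when it meets an extension code.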
VII. Entity
Request and response messages may carry an entity unless otherwise restricted. An entity consists of entity-header fields and an entity-body, although some responses include only the entity-headers.
1. Entity header fields:
    entity-header = Allow | Content-Encoding | Content-Language | Content-Length | Content-Location | Content-MD5 | Content-Range | Content-Type | Expires | Last-Modified | extension-header
    extension-header = message-header
2. Entity body:
    entity-body = *OCTET
    entity-body := Content-Encoding( Content-Type( data ) )
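The two-layer model entity-body := Content-Encoding( Content-Type( data ) ) can be shown with a short round trip. This is a sketch only: it assumes a text entity, gzip as the sole content coding, and a Content-Type value that always carries a charset parameter; `encode_entity` and `decode_entity` are hypothetical helpers.

```python
import gzip


def encode_entity(text: str, charset: str = "iso-8859-1"):
    """Build an entity: apply the media type first, then the content coding."""
    data = text.encode(charset)          # inner layer: Content-Type( data )
    body = gzip.compress(data)           # outer layer: Content-Encoding( ... )
    headers = {
        "Content-Type": f"text/plain; charset={charset}",
        "Content-Encoding": "gzip",
        "Content-Length": str(len(body)),   # length of the encoded body
    }
    return headers, body


def decode_entity(headers: dict, body: bytes) -> str:
    """Undo the layers in reverse order: content coding first, then charset."""
    if headers.get("Content-Encoding") == "gzip":
        body = gzip.decompress(body)
    charset = headers["Content-Type"].split("charset=")[1]
    return body.decode(charset)


if __name__ == "__main__":
    headers, body = encode_entity("hello entity")
    print(headers)
    print(decode_entity(headers, body))
```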
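VIII. Connections
1. Persistent connections: advantages; persistent connections are the default behavior of any HTTP connection; a client that supports persistent connections may pipeline its requests; proxy servers
2. Message transmission requirements: persistent connections and flow control; monitoring connections for error status messages; use of the 100 (Continue) status; client behavior if the server closes the connection prematurely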
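IX. Method Definitions
1. Safe and idempotent methods
    Safe methods: GET and HEAD should not have the significance of taking any action other than retrieval
    Idempotent methods: a sequence of requests that never has side effects is idempotent
2. OPTIONS: a request for information about the communication options available on the request/response chain identified by the Request-URI
3. GET: retrieve whatever information is identified by the Request-URI
4. HEAD: identical to GET, except that the server must not return a message-body in the response
5. POST: the actual function performed by POST is determined by the server
6. PUT: requests that the enclosed entity be stored under the supplied Request-URI
7. DELETE: requests that the origin server delete the resource identified by the Request-URI
8. TRACE: used to invoke a remote, application-layer loop-back of the request message
9. CONNECT: reserved for use with a proxy that can dynamically switch to being a tunnel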
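One practical use of item 1 is deciding whether a request may be repeated automatically, for example after a dropped connection. The sketch below encodes the method properties as I understand them from RFC 2616 section 9.1 (GET and HEAD are safe; GET, HEAD, PUT, DELETE, OPTIONS and TRACE are idempotent). The names `SAFE_METHODS`, `IDEMPOTENT_METHODS` and `can_retry_automatically` are invented for illustration.

```python
# Safe methods: should do nothing beyond retrieval.
SAFE_METHODS = frozenset({"GET", "HEAD"})

# Idempotent methods: N identical requests have the same effect as one.
IDEMPOTENT_METHODS = SAFE_METHODS | {"PUT", "DELETE", "OPTIONS", "TRACE"}


def can_retry_automatically(method: str) -> bool:
    """A client may repeat a request without asking the user only if it is idempotent."""
    return method.upper() in IDEMPOTENT_METHODS


if __name__ == "__main__":
    for m in ("GET", "PUT", "POST"):
        print(m, can_retry_automatically(m))
```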
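X. Status Code Definitions
1. Informational 1xx: 100 Continue, 101 Switching Protocols
2. Successful 2xx: 200 OK, 201 Created, 202 Accepted, 203 Non-Authoritative Information, 204 No Content, 205 Reset Content, 206 Partial Content
3. Redirection 3xx: 300 Multiple Choices, 301 Moved Permanently, 302 Found, 303 See Other, 304 Not Modified, 305 Use Proxy, 306 (Unused), 307 Temporary Redirect
4. Client Error 4xx: 400 Bad Request, 401 Unauthorized, 402 Payment Required, 403 Forbidden, 404 Not Found, 405 Method Not Allowed, 406 Not Acceptable, 407 Proxy Authentication Required, 408 Request Timeout, 409 Conflict, 410 Gone, 411 Length Required, 412 Precondition Failed, 413 Request Entity Too Large, 414 Request-URI Too Long, 415 Unsupported Media Type, 416 Requested Range Not Satisfiable, 417 Expectation Failed
5. Server Error 5xx: 500 Internal Server Error, 501 Not Implemented, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout, 505 HTTP Version Not Supported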
XI. Access Authentication: optional
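XII. Content Negotiation
HTTP provides several mechanisms for "content negotiation", that is, selecting the best representation for a given response when multiple representations are available.
1. Server-driven negotiation: the best representation for a response is selected by an algorithm located at the server
2. Agent-driven negotiation: the best representation for a response is selected by the user agent after it receives an initial response from the origin server
3. Transparent negotiation: a combination of server-driven and agent-driven negotiation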
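XIII. Caching in HTTP
HTTP is typically used for distributed information systems, where performance can be improved by the use of response caches.
1. Caching: cache correctness; warnings; cache-control mechanisms; explicit user agent warnings; exceptions to the rules and warnings; client-controlled behavior
2. Expiration model: server-specified expiration; heuristic expiration; age calculations; expiration calculations; disambiguating expiration values; disambiguating multiple responses
3. Validation model: when a cache wants to use a stale entry to answer a client's request, it must first check with the origin server whether the cached entry is still usable. Covers Last-Modified dates; entity tag cache validators; weak and strong validators; rules for when to use entity tags and Last-Modified dates; non-validating conditionals.
4. Response cacheability: unless specifically constrained, a caching system may always store a successful response as a cache entry
5. Constructing responses from caches: end-to-end and hop-by-hop headers; non-modifiable headers; combining headers; combining byte ranges
6. Caching negotiated responses
7. Shared and non-shared caches
8. Errors or incomplete response cache behavior
9. Side effects of GET and HEAD
10. Invalidation after updates or deletions
11. Write-through mandatory
12. Cache replacement
13. History lists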
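The age and expiration calculations named in item 2 boil down to a little arithmetic. The sketch below follows the age algorithm as I recall it from RFC 2616 section 13.2.3, with all times given as plain integer seconds. The function names and sample numbers are made up for illustration.

```python
def current_age(age_value, date_value, request_time, response_time, now):
    """Estimate how old a cached response is, in seconds.

    age_value     - value of the Age header received (0 if absent)
    date_value    - value of the Date header received
    request_time  - local time when the request was sent
    response_time - local time when the response was received
    now           - current local time
    """
    apparent_age = max(0, response_time - date_value)
    corrected_received_age = max(apparent_age, age_value)
    response_delay = response_time - request_time
    corrected_initial_age = corrected_received_age + response_delay
    resident_time = now - response_time
    return corrected_initial_age + resident_time


def is_fresh(freshness_lifetime, age):
    """A response is fresh while its freshness lifetime exceeds its current age."""
    return freshness_lifetime > age


if __name__ == "__main__":
    age = current_age(age_value=10, date_value=1000,
                      request_time=1000, response_time=1002, now=1040)
    print(age, is_fresh(60, age))
```

In practice the freshness lifetime would come from a max-age directive or from Expires minus Date; here it is passed in directly to keep the sketch short.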
XIV. Header Field Definitions
1. Accept:
    Accept = "Accept" ":" #( media-range [ accept-params ] )
    media-range = ( "*/*" | ( type "/" "*" ) | ( type "/" subtype ) ) *( ";" parameter )
    accept-params = ";" "q" "=" qvalue *( accept-extension )
    accept-extension = ";" token [ "=" ( token | quoted-string ) ]
    Example 1: Accept: audio/*; q=0.2, audio/basic
    Example 2: Accept: text/plain; q=0.5, text/html, text/x-dvi; q=0.8, text/x-c
2. Accept-Charset:
    Accept-Charset = "Accept-Charset" ":" 1#( ( charset | "*" ) [ ";" "q" "=" qvalue ] )
    Example: Accept-Charset: iso-8859-5, unicode-1-1;q=0.8
3. Accept-Encoding:
    Accept-Encoding = "Accept-Encoding" ":" 1#( codings [ ";" "q" "=" qvalue ] )
    codings = ( content-coding | "*" )
    Example: Accept-Encoding: gzip;q=1.0, identity; q=0.5, *;q=0
4. Accept-Language:
    Accept-Language = "Accept-Language" ":" 1#( language-range [ ";" "q" "=" qvalue ] )
    language-range = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) ) | "*" )
    Example: Accept-Language: da, en-gb;q=0.8, en;q=0.7
5. Accept-Ranges:
    Accept-Ranges = "Accept-Ranges" ":" acceptable-ranges
    acceptable-ranges = 1#range-unit | "none"
    Example: Accept-Ranges: bytes
6. Age:
    Age = "Age" ":" age-value
    age-value = delta-seconds
7. Allow:
    Allow = "Allow" ":" #Method
    Example: Allow: GET, HEAD, PUT
8. Authorization: Authorization = "Authorization" ":" credentials
9. Cache-Control:
    Cache-Control = "Cache-Control" ":" 1#cache-directive
    cache-directive = cache-request-directive | cache-response-directive
    cache-request-directive = "no-cache" | "no-store" | "max-age" "=" delta-seconds | "max-stale" [ "=" delta-seconds ] | "min-fresh" "=" delta-seconds | "no-transform" | "only-if-cached" | cache-extension
    cache-response-directive = "public" | "private" [ "=" <"> 1#field-name <"> ] | "no-cache" [ "=" <"> 1#field-name <"> ] | "no-store" | "no-transform" | "must-revalidate" | "proxy-revalidate" | "max-age" "=" delta-seconds | "s-maxage" "=" delta-seconds | cache-extension
    cache-extension = token [ "=" ( token | quoted-string ) ]
    Subtopics: what is cacheable; what may be stored by caches; modifications of the basic expiration mechanism; cache revalidation and reload controls; the no-transform directive; cache control extensions
10. Connection:
    Connection = "Connection" ":" 1#(connection-token)
    connection-token = token
    Example: Connection: close
11. Content-Encoding:
    Content-Encoding = "Content-Encoding" ":" 1#content-coding
    Example: Content-Encoding: gzip
12. Content-Language:
    Content-Language = "Content-Language" ":" 1#language-tag
    Example: Content-Language: mi, en
13. Content-Length:
    Content-Length = "Content-Length" ":" 1*DIGIT
    Example: Content-Length: 3495
14. Content-Location: Content-Location = "Content-Location" ":" ( absoluteURI | relativeURI )
15. Content-MD5:
    Content-MD5 = "Content-MD5" ":" md5-digest
    md5-digest = <base64 of 128 bit MD5 digest as per RFC 1864>
16. Content-Range:
    Content-Range = "Content-Range" ":" content-range-spec
    content-range-spec = byte-content-range-spec
    byte-content-range-spec = bytes-unit SP byte-range-resp-spec "/" ( instance-length | "*" )
    byte-range-resp-spec = (first-byte-pos "-" last-byte-pos) | "*"
    instance-length = 1*DIGIT
    Example (the first 500 bytes): bytes 0-499/1234
17. Content-Type:
    Content-Type = "Content-Type" ":" media-type
    Example: Content-Type: text/html; charset=ISO-8859-4
18. Date: also covers the operation of origin servers without a clock
    Date = "Date" ":" HTTP-date
    Example: Date: Tue, 15 Nov 1994 08:12:31 GMT
19. ETag:
    ETag = "ETag" ":" entity-tag
    Example: ETag: W/"xyzzy"
20. Expect:
    Expect = "Expect" ":" 1#expectation
    expectation = "100-continue" | expectation-extension
    expectation-extension = token [ "=" ( token | quoted-string ) *expect-params ]
    expect-params = ";" token [ "=" ( token | quoted-string ) ]
21. Expires:
    Expires = "Expires" ":" HTTP-date
    Example: Expires: Thu, 01 Dec 1994 16:00:00 GMT
22. From:
    From = "From" ":" mailbox
    Example: From: webmaster@w3.org
23. Host: Host = "Host" ":" host [ ":" port ] ; Section 3.2.2
24. If-Match:
    If-Match = "If-Match" ":" ( "*" | 1#entity-tag )
    Example: If-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
25. If-Modified-Since:
    If-Modified-Since = "If-Modified-Since" ":" HTTP-date
    Example: If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT
26. If-None-Match:
    If-None-Match = "If-None-Match" ":" ( "*" | 1#entity-tag )
    Example: If-None-Match: W/"xyzzy", W/"r2d2xxxx", W/"c3piozzzz"
27. If-Range: If-Range = "If-Range" ":" ( entity-tag | HTTP-date )
28. If-Unmodified-Since:
    If-Unmodified-Since = "If-Unmodified-Since" ":" HTTP-date
    Example: If-Unmodified-Since: Sat, 29 Oct 1994 19:43:31 GMT
29. Last-Modified:
    Last-Modified = "Last-Modified" ":" HTTP-date
    Example: Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT
30. Location:
    Location = "Location" ":" absoluteURI
    Example: Location: http://www.w3.org/pub/WWW/People.html
31. Max-Forwards: Max-Forwards = "Max-Forwards" ":" 1*DIGIT
32. Pragma:
    Pragma = "Pragma" ":" 1#pragma-directive
    pragma-directive = "no-cache" | extension-pragma
    extension-pragma = token [ "=" ( token | quoted-string ) ]
33. Proxy-Authenticate: Proxy-Authenticate = "Proxy-Authenticate" ":" 1#challenge
34. Proxy-Authorization: Proxy-Authorization = "Proxy-Authorization" ":" credentials
35. Range: covers byte ranges and range retrieval requests
    Range = "Range" ":" ranges-specifier
36. Referer: Referer = "Referer" ":" ( absoluteURI | relativeURI )
37. Retry-After: Retry-After = "Retry-After" ":" ( HTTP-date | delta-seconds )
38. Server: Server = "Server" ":" 1*( product | comment )
39. TE:
    TE = "TE" ":" #( t-codings )
    t-codings = "trailers" | ( transfer-extension [ accept-params ] )
    Example: TE: trailers, deflate;q=0.5
40. Trailer: Trailer = "Trailer" ":" 1#field-name
41. Transfer-Encoding:
    Transfer-Encoding = "Transfer-Encoding" ":" 1#transfer-coding
    Example: Transfer-Encoding: chunked
42. Upgrade:
    Upgrade = "Upgrade" ":" 1#product
    Example: Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11
43. User-Agent:
    User-Agent = "User-Agent" ":" 1*( product | comment )
    Example: User-Agent: CERN-LineMode/2.15 libwww/2.17b3
44. Vary: Vary = "Vary" ":" ( "*" | 1#field-name )
45. Via:
    Via = "Via" ":" 1#( received-protocol received-by [ comment ] )
    received-protocol = [ protocol-name "/" ] protocol-version
    protocol-name = token
    protocol-version = token
    received-by = ( host [ ":" port ] ) | pseudonym
    pseudonym = token
    Example: Via: 1.0 ricky, 1.1 ethel, 1.1 fred, 1.0 lucy
46. Warning:
    Warning = "Warning" ":" 1#warning-value
    warning-value = warn-code SP warn-agent SP warn-text [SP warn-date]
    warn-code = 3DIGIT
    warn-agent = ( host [ ":" port ] ) | pseudonym
    warn-text = quoted-string
    warn-date = <"> HTTP-date <">
47. WWW-Authenticate: WWW-Authenticate = "WWW-Authenticate" ":" 1#challenge
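Many of the Accept-style headers above share the same "item ; q=qvalue" shape, so one small parser covers the common case. The following is a minimal sketch with a hypothetical `parse_accept` helper. It only orders media ranges by their q value and ignores the specificity rules, other parameters, and quoted strings.

```python
def parse_accept(value: str):
    """Return (media-range, q) pairs, highest quality first."""
    ranges = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0]
        q = 1.0                               # a missing q parameter means q=1
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip() == "q":
                q = float(val)
        ranges.append((media_range, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    header = "text/plain; q=0.5, text/html, text/x-dvi; q=0.8, text/x-c"
    for media_range, q in parse_accept(header):
        print(media_range, q)
```

Run on Example 2 from the Accept entry, this puts text/html and text/x-c first, then text/x-dvi, then text/plain.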
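XV. Security Considerations
Some suggestions, but no definitive solutions.
1. Personal information: abuse of server log information; transfer of sensitive information; encoding sensitive information in URIs; privacy issues connected to Accept headers
2. Attacks based on file and path names
3. DNS spoofing
4. Location headers and spoofing
5. Content-Disposition issues
6. Authentication credentials and idle clients
7. Proxies and caching: denial of service attacks on proxies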
XVI. Acknowledgments
XVII. References
XVIII. Authors' Addresses
XIX. Appendices