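I read the English edition directly, because reviews on Douban say the Chinese translation is a mess. The original is really well written and easy to follow, and having already watched that computer networking video course helped a lot with reading this book.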

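Lately I keep recommending these two resources to the colleagues around me, but nobody has taken me up on it. Wouldn't learning together and discussing things be even better? Oh well, I'll just enjoy it on my own... (side note: the soap operas were right after all, from now on I'll just do my own thing)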

This note mainly records what I learned about the application-layer protocols in Chapter 2, for future reference (I skipped Chapter 1).

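One more aside: a while ago I discussed the feasibility of splitting up a front-end project with Teacher Bangbang (邦邦老师), and was once again won over by the breadth of knowledge and depth of insight on display. I write front-end code for now, but computing knowledge carries over everywhere, and a solid foundation is what really matters. Time to study hard!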

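Update, 7.11: I feel there is not much point in continuing notes like this... so I'm going to stop... which also speeds up my progress... This one is already written, so I'll keep it as a supplement to the earlier course.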


Network Application Architectures

Application architecture is designed by the application developer and dictates how the application is structured over the various end systems.

There are two main kinds:

  1. client-server architecture: there is an always-on host, called the server, which services requests from many other hosts called clients
  2. P2P architecture: there is minimal (or no) reliance on dedicated servers in data centers. Instead the application exploits direct communication between pairs of intermittently connected hosts, called peers. The peers are not owned by the service provider, but are instead desktops and laptops controlled by users

process

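Reading this part felt like a revelation!!! I instantly mellowed out: front end or back end, from the point of view of a network application we are all programming against processes, so I have no idea what everyone keeps arguing about every day... With so much still to learn, who has the energy for that (muttering under my breath)... This part also gave me a small epiphany: I shouldn't confine myself to the front end. The world of computing is huge, so why not look around everywhere first and only then decide what I really like?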

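I got too excited reading this part... and took no notes, so I'll go back to the book when I need it. I only wrote down the description below, which feels important: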

Socket is the interface between the application process and the transport-layer protocol


A Brief Introduction to the Transport Layer

  1. Transport Services Available to Applications

    • Reliable data transfer
    • Throughput(p122)

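      The transport-layer protocol should be able to offer some control over throughput (the book says "guarantee", which feels slightly off to me: if the network really fluctuates, the transport layer has no way to guarantee anything. As I recall, the earlier video course never said the transport layer can guarantee throughput either, and together with the passage from the book below, "control" seems like the better word)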

      Because other sessions will be sharing the bandwidth along the network path, and because these other sessions will be coming and going, the available throughput can fluctuate with time.

      If the transport protocol cannot provide this throughput, the application would need to encode at a lower rate or may have to give up

      Depending on how they are affected by throughput, network applications can be divided into:

      • bandwidth-sensitive applications: applications that have throughput requirements, such as multimedia applications
      • elastic applications: can make use of as much, or as little, throughput as happens to be available, for example electronic mail, file transfer, and web transfers
    • timing

    Timeliness builds on top of throughput

    • security

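    For example SSL. SSL is not a third Internet transport protocol on the same level as TCP and UDP; it is an enhancement of TCP, implemented in the application layer, that encrypts the data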

  2. TCP

    1. Connection-oriented service

      1. TCP has the client and server exchange transport-layer control information with each other before the application-level messages begin to flow. The so-called handshaking procedure alerts the client and server, allowing them to prepare for an onslaught of packets
      2. After the handshaking phase, a TCP connection is said to exist between the sockets of the two processes
      3. The connection is a full-duplex connection in that the two processes can send messages to each other over the connection at the same time
    2. Reliable data transfer service

    Neither TCP nor UDP provides throughput or timing guarantees; time-sensitive applications have to meet their requirements by other means


HTTP (default port 80)

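  1. persistent vs non-persistent connections: when each request and its response travel over their own separate TCP connection, this is called a non-persistent connection; conversely, when multiple requests and responses travel over the same TCP connection, it is a persistent connection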

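    Drawbacks of non-persistent connections:

    1. A brand-new connection must be established and maintained for every request. The TCP buffers and TCP variables each connection needs have to be kept on both the client and the server, which is a heavy burden for a server that serves a large number of connections to many clients
    2. Delivering each object takes 2 RTTs: one RTT to establish the TCP connection, and another RTT to request and receive the object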

      [Figure: persistent connection]
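
To put the two-RTT cost into numbers, here is a rough back-of-the-envelope model. It is my own illustration rather than something from the book: it assumes the objects are fetched strictly one after another (no parallel connections, no pipelining), ignores server processing time, and the function names and sample values are made up.

```python
def non_persistent_time(num_objects: int, rtt: float, transmit: float) -> float:
    """Serial non-persistent HTTP: every object pays 1 RTT for the TCP
    handshake plus 1 RTT for the request/response, plus its transmission time."""
    return num_objects * (2 * rtt + transmit)


def persistent_time(num_objects: int, rtt: float, transmit: float) -> float:
    """Persistent HTTP without pipelining: the handshake RTT is paid once,
    then each object costs 1 RTT plus its transmission time."""
    return rtt + num_objects * (rtt + transmit)


if __name__ == "__main__":
    # Hypothetical page: 1 base HTML file + 10 images, RTT = 100 ms,
    # 10 ms to push each object onto the link.
    n, rtt, transmit = 11, 0.100, 0.010
    print(f"non-persistent: {non_persistent_time(n, rtt, transmit):.2f} s")
    print(f"persistent:     {persistent_time(n, rtt, transmit):.2f} s")
```

With these made-up numbers the model gives roughly 2.31 s versus 1.31 s, so almost all of the difference comes from the repeated handshakes.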

  2. HTTP message format

    • request message
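    1. Consists of a request line and header lines
    2. Host header line: specifies the host (and port) the request is addressed to; this header line matters for web proxy caches
    3. Connection: close. The browser tells the server that it does not want a persistent connection; the server can close the connection after sending the requested object (it only closes once the client is known to have received the response)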
    • response message
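    1. Consists of: a status line, header lines, and the entity body
    2. Connection: close. Tells the client that the server will close the connection after sending the message (once it has been received)
    3. Date: it is the time when the server retrieves the object from its file system, inserts the object into the response message, and sends the response message (as I understand it, simply the time the response was sent)
    4. Last-Modified: used for object caching (the local cache and the web server use this field to check whether a cached copy is stale)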
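
A minimal sketch of the message layout described above, assuming nothing more than plain string handling. The helper names `build_request` and `parse_response_head` are mine, and the host, path, and date are example values only.

```python
def build_request(method: str, path: str, headers: dict[str, str]) -> str:
    """Assemble a request line followed by header lines, ended by a blank line."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_response_head(raw: str) -> tuple[str, dict[str, str]]:
    """Split a response into its status line and a dict of header lines.
    Header names are lower-cased because they are case-insensitive."""
    head, _, _body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers


# Example values, not taken from a real server.
request = build_request(
    "GET", "/somedir/page.html",
    {"Host": "www.someschool.edu", "Connection": "close"},
)
status, headers = parse_response_head(
    "HTTP/1.1 200 OK\r\n"
    "Connection: close\r\n"
    "Last-Modified: Tue, 18 Aug 2015 15:11:03 GMT\r\n"
    "\r\n"
    "(data data data...)"
)
print(request)
print(status, headers)
```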
  3. cookie

    1. Components:
      1. a cookie header line in the HTTP response message
      2. a cookie header line in the HTTP request message
      3. a cookie file kept on the user’s end system and managed by the user’s browser
      4. a back-end database at the web site
    2. cookies can be used to identify a user. The first time a user visits a site, the user can provide a user identification(possibly his or her name). During the subsequent sessions, the browser passes a cookie header to the server, thereby identifying the user to the server. Cookies can thus be used to create a user session layer on top of HTTP.
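
The four cookie components can be sketched with Python's standard `http.cookies` module. This is only an illustration under my own assumptions: the identifier `1678`, the in-memory `backend_db` dict, and the variable names are made up, and no real HTTP traffic is involved.

```python
from http.cookies import SimpleCookie

# (4) Back-end database at the web site, keyed by the cookie identifier.
backend_db: dict[str, list[str]] = {}

# (1) First visit: the server creates an identifier and returns it in a
#     Set-Cookie header line of the HTTP response.
server_cookie = SimpleCookie()
server_cookie["id"] = "1678"                     # hypothetical identifier
set_cookie_line = server_cookie.output()         # header line for the response
backend_db["1678"] = []

# (3) The browser stores the cookie in its own cookie file (here: an object).
browser_jar = SimpleCookie()
browser_jar.load(set_cookie_line.split(":", 1)[1].strip())

# (2) Later requests carry a Cookie header line with the stored identifier.
cookie_line = "Cookie: " + "; ".join(
    f"{name}={morsel.value}" for name, morsel in browser_jar.items()
)

# The server recognises the user and records the activity.
user_id = browser_jar["id"].value
backend_db[user_id].append("GET /index.html")
print(set_cookie_line, "|", cookie_line, "|", backend_db)
```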
  4. web caching (a web cache is also called a proxy server)

    1. Process:
      1. The browser establishes a TCP connection to the web cache and sends an HTTP request for the object to the web cache
      2. The web cache checks to see if it has a copy of the object stored locally. If it does, the web cache returns the object within an HTTP response message to the client browser
      3. If the web cache does not have the object, the web cache opens a TCP connection to the origin server. The web cache then sends an HTTP request for the object into the cache-to-server TCP connection. After receiving this request, the origin server sends the object within an HTTP response to the web cache
      4. When the web cache receives the object, it stores a copy in its local storage and sends a copy, within an HTTP response, to the client browser (over the existing TCP connection between the client browser and the web cache)
    2. Benefits of caching
      1. Substantially reduces the response time for client requests
      2. Substantially reduces network traffic
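
The four-step cache process above boils down to "look locally first, otherwise fetch, store, and forward". A minimal in-memory sketch, assuming a hypothetical `fetch_from_origin` callable stands in for the cache-to-server HTTP request; the class name and the URL are made up too.

```python
from typing import Callable


class WebCache:
    """Toy proxy cache: serves stored copies, asks the origin only on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: dict[str, bytes] = {}
        self.origin_requests = 0        # how often the origin had to be contacted

    def get(self, url: str) -> bytes:
        if url in self._store:          # step 2: a local copy exists
            return self._store[url]
        self.origin_requests += 1       # step 3: miss, go to the origin server
        obj = self._fetch(url)
        self._store[url] = obj          # step 4: keep a copy, then forward it
        return obj


def fake_origin(url: str) -> bytes:
    """Stand-in for the origin server; returns made-up content."""
    return f"<object at {url}>".encode()


cache = WebCache(fake_origin)
first = cache.get("http://www.someschool.edu/campus.gif")
second = cache.get("http://www.someschool.edu/campus.gif")
print(first == second, cache.origin_requests)
```

The second request never reaches the origin, which is where both the lower response time and the reduced traffic come from.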
  5. conditional GET

    An HTTP request message is a so-called conditional GET message if (1) the request message uses the GET method and (2) the request message includes an If-Modified-Since header line

    1. Process

      1. On behalf of a requesting browser, a proxy cache sends a request message to a web server

        GET /fruit/kivi.gif HTTP/1.1
        HOST: www.exotiquecuisine.com
      2. the web server sends a response message with the requested object to the cache

        HTTP/1.1 200 OK
        Date: Sat, 3 Oct 2015 15:39:29
        Server: Apache/1.3.0(Unix)
        Last-Modified: Wed, 9 Sep 2015 09:23:24
        Content-Type: image/gif
        (data data data...)
      3. One week later, another browser requests the same object via the cache, and the object is still in the cache. Since this object may have been modified at the web server in the past week, the cache performs an up-to-date check by issuing a conditional GET

        GET /fruit/kivi.gif HTTP/1.1
        HOST: www.exotiquecuisine.com
        If-modified-since: Wed, 9 Sep 2015 09:23:24
      4. The web server sends a response message to the cache:

        HTTP/1.1 304 Not Modified
        Date: Sat, 10 Oct 2015 15:39:29
        Server: Apache/1.3.0 (Unix)
        (empty entity body)
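
The decision the web server makes in the last step can be sketched as a date comparison. This is my own simplified illustration, not how any particular server implements it; `conditional_get_status` is a made-up name, and the dates reuse the format from the exchange above.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_get_status(last_modified: str,
                           if_modified_since: Optional[str]) -> str:
    """Return the status line text the origin server would answer with."""
    if if_modified_since is None:
        return "200 OK"                  # plain GET: always send the object
    modified_at = parsedate_to_datetime(last_modified)
    cached_at = parsedate_to_datetime(if_modified_since)
    if modified_at <= cached_at:
        return "304 Not Modified"        # cached copy is current, empty body
    return "200 OK"                      # object changed, send it again


unchanged = conditional_get_status("Wed, 9 Sep 2015 09:23:24",
                                   "Wed, 9 Sep 2015 09:23:24")
changed = conditional_get_status("Thu, 8 Oct 2015 12:00:00",
                                 "Wed, 9 Sep 2015 09:23:24")
print(unchanged, "|", changed)
```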

E-mail Protocols

  1. Internet mail system
    1. user agents
    2. mail servers
    3. the mail transfer protocol (SMTP)
  2. Simple Mail Transfer Protocol(SMTP)
    SMTP has two sides:

    1. a client side, which executes on the sender’s mail server
    2. a server side, which executes on the recipient’s mail server

    Both the client and server sides of SMTP run on every mail server. When a mail server sends mail to other mail servers, it acts as an SMTP client. When a mail server receives mail from other mail servers, it acts as an SMTP server.

  3. Some SMTP details and how it differs from HTTP

    1. SMTP requires binary multimedia data to be encoded to ASCII(7-bit ASCII format) before being sent over SMTP; and it requires the corresponding ASCII message to be decoded back to binary after SMTP transport
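    2. After the TCP connection is established, an SMTP-level handshake is still required (SMTP uses persistent connections)
    3. The client side issues the command that closes the connection
    4. To handle a document consisting of text and images, HTTP encapsulates each object in its own HTTP response message, while SMTP places all of the message's objects into one message
    5. SMTP is a push protocol, HTTP is a pull protocol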
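
The "binary must be encoded to 7-bit ASCII" requirement is easy to see in code. The sketch below uses Base64 as one common way to do this; the notes above do not name a specific encoding, so treat that choice, and the made-up attachment bytes, as my assumptions.

```python
import base64

# Made-up binary payload; several of these bytes are outside 7-bit ASCII.
binary_attachment = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])

# Before sending: binary -> ASCII-only text.
encoded = base64.b64encode(binary_attachment)
is_seven_bit = all(byte < 128 for byte in encoded)

# After transport: ASCII text -> the original binary.
decoded = base64.b64decode(encoded)

print(encoded.decode("ascii"), is_seven_bit, decoded == binary_attachment)
```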
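  4. POP3 (Post Office Protocol, version 3)
    Characteristics:

    1. The protocol is simple and does not keep state across sessions
    2. Folder hierarchies cannot be created on the remote server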
  5. IMAP (Internet Mail Access Protocol)
    Characteristics:
    1. An IMAP server maintains user state information across IMAP sessions (e.g. the names of the folders and which messages are associated with which folders)
    2. Allows a user agent to download only parts of a message
  6. E-mail protocols and their communicating entities

    With a mail client:

      graph LR;
     AliceAgent["Alice's agent"] --> |SMTP|AliceServer["Alice's mail server"] --> |SMTP|BobServer["Bob's mail server"] --> |POP3,IMAP,HTTP|BobAgent["Bob's agent"]

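    With a browser (web-based e-mail; here Alice's browser also talks to her mail server over HTTP, while the servers still use SMTP between themselves):

      graph LR;
     AliceAgent["Alice's agent"] --> |HTTP|AliceServer["Alice's mail server"] --> |SMTP|BobServer["Bob's mail server"] --> |HTTP|BobAgent["Bob's agent"]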

DNS (Domain Name System) (default port: 53)

The DNS is (1)a distributed database implemented in a hierarchy of DNS servers, and (2)an application-layer protocol that allows hosts to query the distributed database

  1. Lookup process

    What happens after a URL is typed into the browser's address bar, to get from the hostname to an IP address:

    1. The same user machine runs the client side of the DNS application
    2. The browser extracts the hostname from the URL and passes the hostname to the client side of the DNS application
    3. The DNS client sends a query containing the hostname to a DNS server
    4. The DNS client eventually receives a reply, which includes the IP address for the hostname
    5. Once the browser receives the IP address from DNS, it can initiate a TCP connection to the HTTP server process located at port 80 at that IP address
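
Step 2 of the lookup (pulling the hostname out of the URL) can be sketched with the standard library. The helper name `host_and_port` and the URLs are made up; in a real program, steps 3 and 4 would then be a call such as `socket.getaddrinfo(host, port)`, which I leave out here because it needs a working resolver.

```python
from urllib.parse import urlsplit


def host_and_port(url: str) -> tuple[str, int]:
    """Extract the hostname the DNS client must resolve, plus the TCP port
    the browser will connect to once it has the IP address."""
    parts = urlsplit(url)
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname or "", parts.port or default_port


print(host_and_port("http://www.someschool.edu/index.html"))
print(host_and_port("http://example.com:8080/a/b.gif"))
```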
  2. Other services DNS provides

    1. host aliasing
    2. mail server aliasing
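    3. load distribution (the same service runs on several different hosts, and requests are spread across them)
    4. a nearby DNS server caches the desired IP address, which reduces the delay that DNS lookups add
  3. Underlying transport protocol: UDP

  4. The DNS hierarchy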

      graph LR;
     Root["Root DNS Server"] --- Com["com DNS server"];
     Com["com DNS server"] --- facebook["facebook.com DNS servers"];
     Com["com DNS server"] --- amazon["amazon.com DNS servers"];
     Root["Root DNS Server"] --- Org["org DNS server"];
     Org["org DNS server"] --- pbs["pbs.org DNS server"];
     Root["Root DNS Server"] --- Edu["edu DNS server"];
     Edu["edu DNS server"] --- nyu["nyu.edu DNS servers"];
     Edu["edu DNS server"] --- umass["umass.edu DNS servers"];

    In the diagram above, the level of the DNS servers goes down from left to right

    1. Leftmost: root DNS servers, provide the IP addresses of the TLD servers
    2. Middle: TLD (top-level domain) servers, provide the IP addresses of the authoritative DNS servers
    3. Rightmost: authoritative DNS servers, provide the actual IP addresses
  5. local DNS server

This kind of server is not part of the hierarchy above.

When a host makes a DNS query, the query is sent to the local DNS server, which acts as a proxy, forwarding the query into the DNS server hierarchy

An example of a DNS lookup that goes through the local DNS server:

  graph TD;
    client["requesting host(cse.nyu.edu)"] -->|"1(recursive query)"|Proxy["local DNS server(dns.nyu.edu)"];
    Proxy["local DNS server(dns.nyu.edu)"] -->|8| client["requesting host(cse.nyu.edu)"];
    Proxy["local DNS server(dns.nyu.edu)"] -->|"2(iterative query)"|Root["Root DNS server"];
    Root["Root DNS server"] -->|3| Proxy["local DNS server(dns.nyu.edu)"];
    Proxy["local DNS server(dns.nyu.edu)"] -->|"4(iterative query)"| TLD["TLD DNS server"];
    TLD["TLD DNS server"] -->|5| Proxy["local DNS server(dns.nyu.edu)"];
    Proxy["local DNS server(dns.nyu.edu)"] -->|"6(iterative query)"| Authoritative["Authoritative DNS server(dns.umass.edu)"];
    Authoritative["Authoritative DNS server(dns.umass.edu)"] -->|7| Proxy["local DNS server(dns.nyu.edu)"];

As the diagram shows, DNS queries come in two kinds: recursive and iterative.

The local DNS server can cache (1) the mapping between a server's hostname and its IP address, and also (2) the IP addresses of TLD servers (which lets it skip the root DNS server in the query chain)
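
The query chain and the two kinds of caching can be simulated with a few dictionaries. Everything here is a toy model of my own: the tables, the server names, and the 192.0.2.x addresses are placeholders, and treating the last two labels of a hostname as "the domain" is a simplification.

```python
# Hypothetical, hard-coded stand-ins for the three levels of the hierarchy.
ROOT = {"edu": "edu-tld-server"}
TLD = {"edu-tld-server": {"umass.edu": "dns.umass.edu"}}
AUTHORITATIVE = {
    "dns.umass.edu": {
        "gaia.cs.umass.edu": "192.0.2.20",
        "www.umass.edu": "192.0.2.10",
    }
}


class LocalDNSServer:
    """Answers a host's recursive query by issuing iterative queries itself."""

    def __init__(self) -> None:
        self.host_cache: dict[str, str] = {}   # hostname -> IP address
        self.tld_cache: dict[str, str] = {}    # TLD -> TLD server
        self.queries_sent = 0                  # iterative queries issued so far

    def resolve(self, hostname: str) -> str:
        if hostname in self.host_cache:        # cached mapping: no queries
            return self.host_cache[hostname]
        labels = hostname.split(".")
        tld, domain = labels[-1], ".".join(labels[-2:])
        if tld not in self.tld_cache:          # steps 2-3: ask a root server
            self.queries_sent += 1
            self.tld_cache[tld] = ROOT[tld]
        self.queries_sent += 1                 # steps 4-5: ask the TLD server
        authoritative = TLD[self.tld_cache[tld]][domain]
        self.queries_sent += 1                 # steps 6-7: ask the authoritative server
        address = AUTHORITATIVE[authoritative][hostname]
        self.host_cache[hostname] = address
        return address


local = LocalDNSServer()
first = local.resolve("gaia.cs.umass.edu")     # full chain: root, TLD, authoritative
after_first = local.queries_sent
second = local.resolve("www.umass.edu")        # TLD server cached: root is skipped
after_second = local.queries_sent
third = local.resolve("gaia.cs.umass.edu")     # hostname cached: nothing is sent
print(first, second, third, after_first, after_second, local.queries_sent)
```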

  6. DNS records

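Format: (Name, Value, Type, TTL)

The meaning of Name and Value depends on Type. TTL (time to live) determines when the record is removed from a cache once it has expired

For example:

if Type = A, then Name is a hostname and Value is the IP address for the hostname. Thus, a Type A record provides the standard hostname-to-IP address mapping

The book gives four types (A, NS, CNAME, MX; see p. 169). A mail server can share its domain name with other services; the lookups are told apart by record type, and those records are stored in this same format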
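
A small sketch of how the record type keeps a mail lookup apart from an ordinary address lookup on the same name. The `ResourceRecord` class, the `lookup` helper, and every record value below are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    name: str
    value: str
    type: str
    ttl: int        # seconds the record may stay in a cache


# Invented records: the same name answers differently depending on Type.
RECORDS = [
    ResourceRecord("example.com", "192.0.2.1", "A", 3600),
    ResourceRecord("example.com", "mail.example.com", "MX", 3600),
    ResourceRecord("mail.example.com", "192.0.2.25", "A", 3600),
]


def lookup(records: list[ResourceRecord], name: str, rtype: str) -> list[str]:
    """Return the values of all records matching both the name and the type."""
    return [r.value for r in records if r.name == name and r.type == rtype]


web_address = lookup(RECORDS, "example.com", "A")
mail_server = lookup(RECORDS, "example.com", "MX")
print(web_address, mail_server)
```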


P2P

BitTorrent

  1. torrent: the collection of all peers participating in the distribution of a particular file is called a torrent
  2. tracker: each torrent has an infrastructure node called a tracker. When a peer joins a torrent, it registers itself with the tracker and periodically informs the tracker that it is still in the torrent
  3. neighboring peers: all the peers with which a particular peer succeeds in establishing a TCP connection(A peer’s neighbouring peers will fluctuate over time)
  4. rarest first: the idea is to determine from among the chunks a particular peer does not have, the chunks that are the rarest among her neighbours(that is the chunks that have the fewest repeated copies among her neighbours) and then request those rarest chunks first
  5. unchoked: a particular peer continually measures the rate at which she receives bits and determines the four peers that are feeding her bits at the highest rate. These four peers are said to be unchoked (recalculated every 10 s)
  6. optimistically unchoked: every 30s, the particular peer picks one additional neighbour at random and sends it chunks

All other neighbouring peers besides these five peers (four top peers and one probing peer) are "choked", that is, they do not receive any chunks from the particular peer (Alice in the book's example).
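
Rarest first and the unchoking rule are both just orderings, so they are easy to sketch. This is my own simplified model, not BitTorrent's actual implementation: the function names, peer names, chunk numbers, and rates are all made up, and ties are broken arbitrarily (by chunk number, or by input order for peers).

```python
import random
from collections import Counter


def rarest_first(my_chunks: set[int],
                 neighbour_chunks: dict[str, set[int]]) -> list[int]:
    """Order the chunks I am missing by how few neighbours hold a copy."""
    counts = Counter(c for chunks in neighbour_chunks.values() for c in chunks)
    missing = [c for c in counts if c not in my_chunks]
    return sorted(missing, key=lambda c: (counts[c], c))


def pick_unchoked(receive_rates: dict[str, float], top_n: int = 4) -> list[str]:
    """The top_n neighbours currently feeding me bits at the highest rate."""
    return sorted(receive_rates, key=lambda p: receive_rates[p], reverse=True)[:top_n]


def optimistic_unchoke(receive_rates: dict[str, float],
                       unchoked: list[str], rng: random.Random) -> str:
    """One additional neighbour picked at random from the choked ones."""
    return rng.choice([p for p in receive_rates if p not in unchoked])


request_order = rarest_first(
    my_chunks={0},
    neighbour_chunks={"A": {0, 1, 2}, "B": {1, 2, 3}, "C": {1}},
)
rates = {"a": 10.0, "b": 50.0, "c": 30.0, "d": 20.0, "e": 40.0, "f": 5.0}
unchoked = pick_unchoked(rates)
probe = optimistic_unchoke(rates, unchoked, random.Random(0))
print(request_order, unchoked, probe)
```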


video streaming and content distribution networks

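In order to provide continuous playout, the network must provide an average throughput to the streaming application that is at least as large as the bit rate of the compressed video. (I did not find bit rate explained at this point in the book and did not look it up elsewhere. My first impression was that it is similar to fps, but it is the number of bits of encoded video per second, so it measures data volume rather than frames.)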

  1. Implementation

    1. DASH(Dynamic Adaptive Streaming over HTTP): In DASH, the video is encoded into several different versions, with each version having a different bit rate and, correspondingly, a different quality level
    2. There is also a manifest file, which provides a URL for each version along with its bit rate

    The download process under this setup:

    The client first requests the manifest file and learns about the various versions. The client then selects one chunk at a time by specifying a URL and a byte range in an HTTP GET request message for each chunk. While downloading chunks, the client also measures the received bandwidth and runs a rate determination algorithm to select the chunk to request next
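
The book does not spell out the rate determination algorithm, so the sketch below is only one plausible, deliberately naive rule of my own: pick the highest bit rate that the measured throughput can sustain, and fall back to the lowest version otherwise. The manifest contents and URLs are invented.

```python
def choose_version(manifest: dict[int, str], measured_bps: float) -> str:
    """Pick the URL of the highest-bit-rate version that fits the measured
    throughput; fall back to the lowest version if none fits."""
    affordable = [rate for rate in manifest if rate <= measured_bps]
    best = max(affordable) if affordable else min(manifest)
    return manifest[best]


# Invented manifest: bit rate in bits per second -> URL of that version.
MANIFEST = {
    300_000: "http://video.example.com/movie_300k.mp4",
    1_000_000: "http://video.example.com/movie_1m.mp4",
    3_000_000: "http://video.example.com/movie_3m.mp4",
}

print(choose_version(MANIFEST, 1_500_000))   # enough for 1 Mbps, not for 3 Mbps
print(choose_version(MANIFEST, 100_000))     # below every version: lowest one
```

A real client would re-run this choice for every chunk as its bandwidth estimate changes, which is what makes the streaming adaptive.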

  2. Acceleration with CDNs (Content Distribution Networks)

    What a CDN does:

    1. determine a suitable CDN server cluster for that client at that time. Strategies for selecting the cluster:
      1. The CDN can learn the location of the LDNS server and choose the cluster geographically closest to it. The problem is that this ignores bandwidth fluctuations and breaks down when the client uses a remote LDNS
      2. CDN can have each of its clusters periodically send probes to all of the LDNSs around (real-time measurements). The problem is that many LDNSs are configured to not respond to such probes
    2. redirect the client's request to a server in that cluster
      graph LR;
     client --- server["webpage server(www.netcinema.com)"];
     client -->|1| local["local DNS server"];
     local["local DNS server"] -->|2| authoritative["netcinema authoritative DNS server"];
     authoritative["netcinema authoritative DNS server"] --> local["local DNS server"];
     local["local DNS server"] -->|3| CDN["kingCDN authoritative server"];
     CDN["kingCDN authoritative server"] --> local["local DNS server"];
     local["local DNS server"] --> client;
     client -->|4| KingCDN["KingCDN content distribution server"];
     KingCDN["KingCDN content distribution server"] --> client;

socket programming with UDP

[Figure: UDP socket]
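
Since the figure did not survive, here is a minimal UDP sketch to stand in for it. It is my own condensed version, assuming both ends run in one script over the loopback interface; the uppercase-echo behaviour and the message text are just an example.

```python
import socket

# Server side: one socket, bound to a port, no connection setup at all.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
server.settimeout(2)
server_addr = server.getsockname()

# Client side: every datagram carries the destination address explicitly.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2)
client.sendto(b"hello udp", server_addr)

# The server learns the client's address from the datagram it receives
# and uses it to send the reply back.
message, client_addr = server.recvfrom(2048)
server.sendto(message.upper(), client_addr)

reply, _ = client.recvfrom(2048)
print(reply.decode())

client.close()
server.close()
```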


socket programming with TCP

On the server side there are two kinds of socket (two steps):

  1. welcoming socket: the initial point of contact for all clients wanting to communicate with the server
  2. newly created server-side connection socket: subsequently created for communicating with each client

From the application’s perspective, the client’s socket and the server’s connection socket are directly connected by a pipe:

[Figure: socket with TCP]

[Figure: TCP socket]
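
A minimal sketch of the two server-side sockets, again my own condensed version with client and server in a single script over loopback. The variable names `welcoming` and `connection` follow the terms above; the uppercase echo is just an example.

```python
import socket

# Welcoming socket: the initial point of contact for all clients.
welcoming = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
welcoming.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
welcoming.listen(1)
welcoming.settimeout(2)

# Client: connect() triggers the TCP handshake with the welcoming socket.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.settimeout(2)
client.connect(welcoming.getsockname())

# accept() returns a NEW socket dedicated to this particular client.
connection, client_addr = welcoming.accept()
connection.settimeout(2)

# From here on, client and connection socket behave like two ends of a pipe.
client.sendall(b"hello tcp")
data = connection.recv(1024)           # short message, one recv is enough here
connection.sendall(data.upper())
reply = client.recv(1024)
print(reply.decode())

client.close()
connection.close()
welcoming.close()
```

Unlike the UDP case, no address is attached to the individual sends: the connection socket already knows who is at the other end.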


References

  1. https://book.douban.com/subject/30280001/