How the Golang http server works: a code walkthrough


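  1. This article is based on Go 1.7.1. All Go standard library code quoted here comes from go/src/net/http/server.go.
  2. A lot of code is quoted, which makes the article feel cluttered, but leaving code out would frustrate readers who want to follow the source. To be honest, the writing itself is a bit disorganized, so reading it takes some patience...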

Parting the clouds, only to find fog

Implementing a simple web server in Go is very easy:

package main

import (
    "io"
    "log"
    "net/http"
)

func main() {
    http.HandleFunc("/", HelloServer)

    log.Fatal(http.ListenAndServe(":8080", nil))
}

func HelloServer(w http.ResponseWriter, req *http.Request) {
    io.WriteString(w, "hello world!\n")
}

After starting this program, visiting http://127.0.0.1:8080 in a browser shows "hello world!". That raises the first question: what does http.ListenAndServe do, and how is it tied to http.HandleFunc? Let's start with the http.ListenAndServe function:

func ListenAndServe(addr string, handler Handler) error {
    server := &Server{Addr: addr, Handler: handler}
    return server.ListenAndServe()
}

This function is simple: it builds a variable server of type Server from its arguments and then calls that type's ListenAndServe() method. The Server struct looks like this:

// A Server defines parameters for running an HTTP server.
// The zero value for Server is a valid configuration.
type Server struct {
    Addr         string        // TCP address to listen on, ":http" if empty
    Handler      Handler       // handler to invoke, http.DefaultServeMux if nil
    ReadTimeout  time.Duration // maximum duration before timing out read of the request
    WriteTimeout time.Duration // maximum duration before timing out write of the response
    TLSConfig    *tls.Config   // optional TLS config, used by ListenAndServeTLS

    // MaxHeaderBytes controls the maximum number of bytes the
    // server will read parsing the request header's keys and
    // values, including the request line. It does not limit the
    // size of the request body.
    // If zero, DefaultMaxHeaderBytes is used.
    MaxHeaderBytes int

    // TLSNextProto optionally specifies a function to take over
    // ownership of the provided TLS connection when an NPN/ALPN
    // protocol upgrade has occurred. The map key is the protocol
    // name negotiated. The Handler argument should be used to
    // handle HTTP requests and will initialize the Request's TLS
    // and RemoteAddr if not already set. The connection is
    // automatically closed when the function returns.
    // If TLSNextProto is nil, HTTP/2 support is enabled automatically.
    TLSNextProto map[string]func(*Server, *tls.Conn, Handler)

    // ConnState specifies an optional callback function that is
    // called when a client connection changes state. See the
    // ConnState type and associated constants for details.
    ConnState func(net.Conn, ConnState)

    // ErrorLog specifies an optional logger for errors accepting
    // connections and unexpected behavior from handlers.
    // If nil, logging goes to os.Stderr via the log package's
    // standard logger.
    ErrorLog *log.Logger

    disableKeepAlives int32     // accessed atomically.
    nextProtoOnce     sync.Once // guards setupHTTP2_* init
    nextProtoErr      error     // result of http2.ConfigureServer if used
}

As the comment says, the Server type defines the parameters for running an HTTP server. It also has four exported methods:

func (srv *Server) ListenAndServe() error
func (srv *Server) ListenAndServeTLS(certFile, keyFile string) error
func (srv *Server) Serve(l net.Listener) error
func (srv *Server) SetKeepAlivesEnabled(v bool)

The ListenAndServe method listens on the TCP address given by srv.Addr and then calls srv.Serve() to handle requests on incoming connections:

// ListenAndServe listens on the TCP network address srv.Addr and then
// calls Serve to handle requests on incoming connections.
// Accepted connections are configured to enable TCP keep-alives.
// If srv.Addr is blank, ":http" is used.
// ListenAndServe always returns a non-nil error.
func (srv *Server) ListenAndServe() error {
    addr := srv.Addr
    if addr == "" {
        addr = ":http"
    }
    ln, err := net.Listen("tcp", addr)
    if err != nil {
        return err
    }
    return srv.Serve(tcpKeepAliveListener{ln.(*net.TCPListener)})
}
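To make the two steps concrete, here is a minimal sketch that does by hand what ListenAndServe does: open a listener with net.Listen, then pass it to Serve. The helper name fetchOnce, the address 127.0.0.1:0 (port 0 lets the OS pick a free port) and the single test request are assumptions made for this sketch. Unlike the real method, it does not wrap the listener in tcpKeepAliveListener.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
)

// fetchOnce opens a TCP listener, hands it to Server.Serve, issues one
// GET against it and returns the status code and body.
// (io.ReadAll needs Go 1.16+; older versions can use ioutil.ReadAll.)
func fetchOnce() (int, string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: any free port
	if err != nil {
		return 0, "", err
	}
	defer ln.Close()

	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "hello world!\n")
	})
	srv := &http.Server{Handler: mux}
	go srv.Serve(ln) // returns a non-nil error once ln is closed

	resp, err := http.Get("http://" + ln.Addr().String() + "/")
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body), err
}

func main() {
	code, body, err := fetchOnce()
	fmt.Println(code, body, err)
}
```

Passing your own listener to Serve is also the usual route when the listener is not a plain TCP socket on a fixed port, for example in tests.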

Next, srv.Serve:

// Serve accepts incoming connections on the Listener l, creating a
// new service goroutine for each. The service goroutines read requests and
// then call srv.Handler to reply to them.
//
// For HTTP/2 support, srv.TLSConfig should be initialized to the
// provided listener's TLS Config before calling Serve. If
// srv.TLSConfig is non-nil and doesn't include the string "h2" in
// Config.NextProtos, HTTP/2 support is not enabled.
//
// Serve always returns a non-nil error.
func (srv *Server) Serve(l net.Listener) error {
    defer l.Close()
    if fn := testHookServerServe; fn != nil {
        fn(srv, l)
    }
    var tempDelay time.Duration // how long to sleep on accept failure

    if err := srv.setupHTTP2_Serve(); err != nil {
        return err
    }

    // TODO: allow changing base context? can't imagine concrete
    // use cases yet.
    baseCtx := context.Background()
    ctx := context.WithValue(baseCtx, ServerContextKey, srv)
    ctx = context.WithValue(ctx, LocalAddrContextKey, l.Addr())
    for {
        rw, e := l.Accept()
        if e != nil {
            if ne, ok := e.(net.Error); ok && ne.Temporary() {
                if tempDelay == 0 {
                    tempDelay = 5 * time.Millisecond
                } else {
                    tempDelay *= 2
                }
                if max := 1 * time.Second; tempDelay > max {
                    tempDelay = max
                }
                srv.logf("http: Accept error: %v; retrying in %v", e, tempDelay)
                time.Sleep(tempDelay)
                continue
            }
            return e
        }
        tempDelay = 0
        c := srv.newConn(rw)
        c.setState(c.rwc, StateNew) // before Serve can return
        go c.serve(ctx)
    }
}
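The retry delay used in Serve's accept loop can be isolated into a small function to see how it grows. This is only a sketch of the arithmetic (start at 5ms, double on each consecutive temporary error, cap at 1s); the function name nextDelay is made up here and does not exist in net/http.

```go
package main

import (
	"fmt"
	"time"
)

// nextDelay mirrors the back-off arithmetic in Serve: the first
// temporary Accept error waits 5ms, every further consecutive error
// doubles the wait, and the wait never exceeds 1s.
func nextDelay(prev time.Duration) time.Duration {
	if prev == 0 {
		return 5 * time.Millisecond
	}
	next := prev * 2
	if max := 1 * time.Second; next > max {
		next = max
	}
	return next
}

func main() {
	var d time.Duration
	for i := 0; i < 10; i++ {
		d = nextDelay(d)
		fmt.Println(d)
	}
}
```

In Serve itself the delay is reset to 0 after every successful Accept, so the back-off only grows while errors keep coming.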

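Serve runs an infinite loop: for every connection it accepts, it starts a goroutine to serve that connection (later requests on the same kept-alive connection are served by the same goroutine). A new type, conn, shows up here; the variable c returned by srv.newConn(rw) is of this type. Here is the struct: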

// A conn represents the server side of an HTTP connection.
type conn struct {
    // server is the server on which the connection arrived.
    // Immutable; never nil.
    server *Server

    // rwc is the underlying network connection.
    // This is never wrapped by other types and is the value given out
    // to CloseNotifier callers. It is usually of type *net.TCPConn or
    // *tls.Conn.
    rwc net.Conn

    // remoteAddr is rwc.RemoteAddr().String(). It is not populated synchronously
    // inside the Listener's Accept goroutine, as some implementations block.
    // It is populated immediately inside the (*conn).serve goroutine.
    // This is the value of a Handler's (*Request).RemoteAddr.
    remoteAddr string

    // tlsState is the TLS connection state when using TLS.
    // nil means not TLS.
    tlsState *tls.ConnectionState

    // werr is set to the first write error to rwc.
    // It is set via checkConnErrorWriter{w}, where bufw writes.
    werr error

    // r is bufr's read source. It's a wrapper around rwc that provides
    // io.LimitedReader-style limiting (while reading request headers)
    // and functionality to support CloseNotifier. See *connReader docs.
    r *connReader

    // bufr reads from r.
    // Users of bufr must hold mu.
    bufr *bufio.Reader

    // bufw writes to checkConnErrorWriter{c}, which populates werr on error.
    bufw *bufio.Writer

    // lastMethod is the method of the most recent request
    // on this connection, if any.
    lastMethod string

    // mu guards hijackedv, use of bufr, (*response).closeNotifyCh.
    mu sync.Mutex

    // hijackedv is whether this connection has been hijacked
    // by a Handler with the Hijacker interface.
    // It is guarded by mu.
    hijackedv bool
}

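As the comment says, this struct represents the server side of an HTTP connection. The type has many methods; we only cover the one called above, func (c *conn) serve(ctx context.Context), because that is what each goroutine runs. It is long, so only the part relevant to this analysis is kept: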

func (c *conn) serve(ctx context.Context) {
    ...
    
    serverHandler{c.server}.ServeHTTP(w, w.req)
    
    ...
}

This brings in serverHandler and its ServeHTTP method:

// serverHandler delegates to either the server's Handler or
// DefaultServeMux and also handles "OPTIONS *" requests.
type serverHandler struct {
    srv *Server
}

func (sh serverHandler) ServeHTTP(rw ResponseWriter, req *Request) {
    handler := sh.srv.Handler
    if handler == nil {
        handler = DefaultServeMux
    }
    if req.RequestURI == "*" && req.Method == "OPTIONS" {
        handler = globalOptionsHandler{}
    }
    handler.ServeHTTP(rw, req)
}

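serverHandler is a thin wrapper around a *Server. In c.serve above, a serverHandler is constructed from c.server and its ServeHTTP method is called. That method looks at the Handler field of c.server: if it is non-nil, it is used as is; if it is nil, the package-level variable DefaultServeMux is used instead. (An "OPTIONS *" request is a special case and goes to globalOptionsHandler.) Either way, what finally gets called is the ServeHTTP method of a Handler. So let's look at Handler: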

type Handler interface {
    ServeHTTP(ResponseWriter, *Request)
}

Handler turns out to be an interface with a single method, ServeHTTP. The next task is to find out who implements it. Before that, let's summarize what we have covered so far.

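  1. Starting from http.ListenAndServe, we found the Server type, which describes a server that runs an HTTP service. http.ListenAndServe simply calls that type's ListenAndServe method, which in turn calls Serve. In Serve, a goroutine is started for every accepted connection to run the serve method of the conn type.
  2. We then looked at the conn type, which represents the server side of an HTTP connection. Its serve method calls the ServeHTTP method of the Handler interface.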
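Serve starts one goroutine per accepted connection, not per request. A rough way to observe the difference is the ConnState callback on Server, which reports StateNew once for each new connection. The sketch below uses an httptest server and sends several sequential requests through one client; the helper name countConns is an assumption of this sketch, and how many connections are actually reused depends on the client's keep-alive behaviour.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// countConns sends n sequential requests through one client and reports
// how many connections the server saw (StateNew events) and how many
// requests the handler served.
func countConns(n int) (int64, int64, error) {
	var newConns, served int64
	ts := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt64(&served, 1)
		io.WriteString(w, "ok\n")
	}))
	// ConnState is called whenever a client connection changes state.
	ts.Config.ConnState = func(c net.Conn, s http.ConnState) {
		if s == http.StateNew {
			atomic.AddInt64(&newConns, 1)
		}
	}
	ts.Start()
	defer ts.Close()

	client := ts.Client()
	for i := 0; i < n; i++ {
		resp, err := client.Get(ts.URL)
		if err != nil {
			return 0, 0, err
		}
		io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
		resp.Body.Close()
	}
	return atomic.LoadInt64(&newConns), atomic.LoadInt64(&served), nil
}

func main() {
	conns, reqs, err := countConns(3)
	fmt.Println("connections:", conns, "requests:", reqs, "err:", err)
}
```

With keep-alive in effect, the connection count should come out lower than the request count, which matches the accept loop shown in Serve.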

The analysis so far simply followed the call chain. It is a little messy but fairly simple, and it mainly involves two types, Server and conn, plus one interface, Handler.

Next comes the most important player in Go's HTTP package: ServeMux.

Parting the clouds to see the sun

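In the analysis above we saw that the Handler field of Server falls back to a default value, DefaultServeMux. It is a package-level global variable of type *ServeMux. Since ServeHTTP can be called on it, it should implement the Handler interface, and indeed it does.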
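One way to check that claim is a compile-time assertion: assigning a value to a variable of type http.Handler only compiles if the value's type has the ServeHTTP method. The sketch below does this for *http.ServeMux and for hitCounter, a hypothetical type invented for this example, and then calls the custom handler directly through httptest.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// hitCounter is a hypothetical type for this sketch. A ServeHTTP method
// with the right signature is all it takes to satisfy http.Handler.
type hitCounter struct {
	hits int64
}

func (c *hitCounter) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	n := atomic.AddInt64(&c.hits, 1)
	fmt.Fprintf(w, "hit %d\n", n)
}

// Compile-time checks: both types implement http.Handler.
var (
	_ http.Handler = (*http.ServeMux)(nil)
	_ http.Handler = (*hitCounter)(nil)
)

// serveTwice calls the handler twice without any network listener and
// returns the body of the second response.
func serveTwice() string {
	h := &hitCounter{}
	var rec *httptest.ResponseRecorder
	for i := 0; i < 2; i++ {
		rec = httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest("GET", "/", nil))
	}
	return rec.Body.String()
}

func main() {
	fmt.Print(serveTwice())
}
```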

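ServeMux is Go's HTTP request multiplexer. It dispatches each request to the handler registered for the pattern that matches the request URL. The matching rules are quoted below from the Go documentation; they are essentially what you would expect: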

Patterns name fixed, rooted paths, like "/favicon.ico", or rooted subtrees, like "/images/" (note the trailing slash). Longer patterns take precedence over shorter ones, so that if there are handlers registered for both "/images/" and "/images/thumbnails/", the latter handler will be called for paths beginning "/images/thumbnails/" and the former will receive requests for any other paths in the "/images/" subtree.

Note that since a pattern ending in a slash names a rooted subtree, the pattern "/" matches all paths not matched by other registered patterns, not just the URL with Path == "/".

If a subtree has been registered and a request is received naming the subtree root without its trailing slash, ServeMux redirects that request to the subtree root (adding the trailing slash). This behavior can be overridden with a separate registration for the path without the trailing slash. For example, registering "/images/" causes ServeMux to redirect a request for "/images" to "/images/", unless "/images" has been registered separately.

Patterns may optionally begin with a host name, restricting matches to URLs on that host only. Host-specific patterns take precedence over general patterns, so that a handler might register for the two patterns "/codesearch" and "codesearch.google.com/" without also taking over requests for "http://www.google.com/".

ServeMux also takes care of sanitizing the URL request path, redirecting any request containing . or .. elements or repeated slashes to an equivalent, cleaner URL.
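The precedence rules quoted above can be tried out with ServeMux.Handler, which returns the pattern it selected along with the handler. The following is a small sketch that registers the patterns from the documentation on a fresh mux; the helper name matchedPattern and the sample paths are assumptions of this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// matchedPattern registers a few patterns on a fresh ServeMux and
// reports which pattern ServeMux.Handler selects for the given path.
func matchedPattern(path string) string {
	mux := http.NewServeMux()
	noop := func(w http.ResponseWriter, r *http.Request) {}
	mux.HandleFunc("/", noop)
	mux.HandleFunc("/favicon.ico", noop)
	mux.HandleFunc("/images/", noop)
	mux.HandleFunc("/images/thumbnails/", noop)

	_, pattern := mux.Handler(httptest.NewRequest("GET", path, nil))
	return pattern
}

func main() {
	paths := []string{"/favicon.ico", "/images/a.png", "/images/thumbnails/a.png", "/about"}
	for _, p := range paths {
		fmt.Printf("%-26s -> %q\n", p, matchedPattern(p))
	}
}
```

The longer subtree pattern wins for paths under "/images/thumbnails/", and "/" acts as the catch-all for everything that no other pattern matches.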

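Back to the program at the beginning: the second argument to http.ListenAndServe is nil, so, per the analysis above, the server uses DefaultServeMux, the default variable of type *ServeMux, as its Handler. So how does http.HandleFunc register HelloServer with DefaultServeMux?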

http.HandleFunc is defined as follows (http.Handle is shown alongside it):

// HandleFunc registers the handler function for the given pattern
// in the DefaultServeMux.
// The documentation for ServeMux explains how patterns are matched.
func HandleFunc(pattern string, handler func(ResponseWriter, *Request)) {
    DefaultServeMux.HandleFunc(pattern, handler)
}

// Handle registers the handler for the given pattern
// in the DefaultServeMux.
// The documentation for ServeMux explains how patterns are matched.
func Handle(pattern string, handler Handler) { DefaultServeMux.Handle(pattern, handler) }

As you can see, it just calls the HandleFunc method of ServeMux. So let's look at ServeMux first:

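type ServeMux struct {
    mu    sync.RWMutex        // lock; requests are handled concurrently, so access to m must be synchronized
    m     map[string]muxEntry // routing table: maps each registered pattern string to its mux entry
    hosts bool                // whether any patterns contain hostnames
}

type muxEntry struct {
    explicit bool    // true if the entry was registered explicitly (false for the implicit redirect entry)
    h        Handler // the handler for this pattern
    pattern  string  // the pattern string
}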

// NewServeMux allocates and returns a new ServeMux.
func NewServeMux() *ServeMux { return new(ServeMux) }

// DefaultServeMux is the default ServeMux used by Serve.
var DefaultServeMux = &defaultServeMux

var defaultServeMux ServeMux

// Handle registers the handler for the given pattern.
// If a handler already exists for pattern, Handle panics.
func (mux *ServeMux) Handle(pattern string, handler Handler)

// HandleFunc registers the handler function for the given pattern.
func (mux *ServeMux) HandleFunc(pattern string, handler func(ResponseWriter, *Request))

// ServeHTTP dispatches the request to the handler whose
// pattern most closely matches the request URL.
func (mux *ServeMux) ServeHTTP(w ResponseWriter, r *Request)

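The code block above lists the definition of ServeMux and several of its important methods (some method bodies are shown later, where they are discussed). The call chain is now visible: http.HandleFunc --> func (mux *ServeMux) HandleFunc(pattern string, handler func(ResponseWriter, *Request)) --> func (mux *ServeMux) Handle(pattern string, handler Handler). In other words, what is ultimately called is the Handle method of ServeMux. Sometimes we register a handler with http.Handle instead; internally it also calls func (mux *ServeMux) Handle(pattern string, handler Handler).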

One detail deserves attention here. The second parameter of ServeMux's Handle method has type Handler, which is an interface. The second parameter of func (mux *ServeMux) HandleFunc(pattern string, handler func(ResponseWriter, *Request)) has type func(ResponseWriter, *Request); in our example that is the HelloServer function. A plain function like this has no ServeHTTP(ResponseWriter, *Request) method, so it does not implement the Handler interface and cannot be passed to Handle directly. The body of HandleFunc looks like this:

func (mux *ServeMux) HandleFunc(pattern string, handler func(ResponseWriter, *Request)) {
    mux.Handle(pattern, HandlerFunc(handler))
}

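Here handler is wrapped in HandlerFunc, after which it can be passed to Handle as a Handler. What is this wrapping? At first glance it looks like a function call, but it is actually a type conversion. That's right, HandlerFunc is a type: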

type HandlerFunc func(ResponseWriter, *Request)

// ServeHTTP calls f(w, r).
func (f HandlerFunc) ServeHTTP(w ResponseWriter, r *Request) {
    f(w, r)
}

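The HandlerFunc type also has a ServeHTTP method, which means it implements the Handler interface. And because our handler function has the same signature as the underlying type of HandlerFunc, the conversion is legal. HandlerFunc is therefore an adapter: it lets an ordinary function serve as an HTTP handler, as long as its signature is func(ResponseWriter, *Request). That is also why handler functions registered this way must have exactly this signature. It is a very neat idiom in Go.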
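The adapter can be seen in isolation with a few lines of code. In this sketch, hello is an ordinary function, and viaAdapter (a name made up for the example) converts it to http.HandlerFunc and then calls it purely through the http.Handler interface, using httptest in place of a real connection.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// hello is an ordinary function. It has no methods, so on its own it
// does not satisfy http.Handler.
func hello(w http.ResponseWriter, r *http.Request) {
	io.WriteString(w, "hello world!\n")
}

// viaAdapter converts hello to http.HandlerFunc (a type conversion, not
// a function call) and invokes it through the Handler interface.
func viaAdapter() string {
	var h http.Handler = http.HandlerFunc(hello)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest("GET", "/", nil))
	return rec.Body.String()
}

func main() {
	fmt.Print(viaAdapter())
}
```

Assigning hello to an http.Handler variable without the conversion would not compile, which is exactly the gap HandlerFunc fills.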

OK, now let's see how the Handle method registers a handler:

// Handle registers the handler for the given pattern.
// If a handler already exists for pattern, Handle panics.
func (mux *ServeMux) Handle(pattern string, handler Handler) {
    mux.mu.Lock()
    defer mux.mu.Unlock()

    if pattern == "" {
        panic("http: invalid pattern " + pattern)
    }
    if handler == nil {
        panic("http: nil handler")
    }
    if mux.m[pattern].explicit {
        panic("http: multiple registrations for " + pattern)
    }

    if mux.m == nil {
        mux.m = make(map[string]muxEntry)
    }
    mux.m[pattern] = muxEntry{explicit: true, h: handler, pattern: pattern}

    if pattern[0] != '/' {
        mux.hosts = true
    }

    // Helpful behavior:
    // If pattern is /tree/, insert an implicit permanent redirect for /tree.
    // It can be overridden by an explicit registration.
    n := len(pattern)
    if n > 0 && pattern[n-1] == '/' && !mux.m[pattern[0:n-1]].explicit {
        // If pattern contains a host name, strip it and use remaining
        // path for redirect.
        path := pattern
        if pattern[0] != '/' {
            // In pattern, at least the last character is a '/', so
            // strings.Index can't be -1.
            path = pattern[strings.Index(pattern, "/"):]
        }
        url := &url.URL{Path: path}
        mux.m[pattern[0:n-1]] = muxEntry{h: RedirectHandler(url.String(), StatusMovedPermanently), pattern: pattern}
    }
}
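Two behaviours of Handle are easy to verify from the outside: registering the same pattern twice panics, and registering "/tree/" makes a request for "/tree" come back as a permanent redirect. The sketch below checks both on a fresh ServeMux; the helper names registerTwice and redirectFor are assumptions of this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

func noop(w http.ResponseWriter, r *http.Request) {}

// registerTwice reports whether registering the same pattern twice on
// one ServeMux panics.
func registerTwice(pattern string) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	mux := http.NewServeMux()
	mux.HandleFunc(pattern, noop)
	mux.HandleFunc(pattern, noop) // second registration of the same pattern
	return false
}

// redirectFor registers only "/tree/" and then requests "/tree". It
// returns the status code and the Location header of the response.
func redirectFor() (int, string) {
	mux := http.NewServeMux()
	mux.HandleFunc("/tree/", noop)
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest("GET", "/tree", nil))
	return rec.Code, rec.Header().Get("Location")
}

func main() {
	fmt.Println("duplicate registration panics:", registerTwice("/tree/"))
	code, loc := redirectFor()
	fmt.Println(code, loc)
}
```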

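As the Handle code shows, registration amounts to creating the map[string]muxEntry if it does not exist yet and adding an entry to it. The key is the pattern and the value holds the handler plus some bookkeeping. That covers registration. Next: when requests arrive, how does the server pick the handler that was registered earlier?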

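Recall from earlier that a goroutine is started for each accepted connection, and that this goroutine ends up calling the ServeHTTP method of the Handler field (of interface type Handler) in the Server struct. In our example that Handler is DefaultServeMux (of type *ServeMux), so what runs is the ServeHTTP method of ServeMux. Here it is: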

// ServeHTTP dispatches the request to the handler whose
// pattern most closely matches the request URL.
func (mux *ServeMux) ServeHTTP(w ResponseWriter, r *Request) {
    if r.RequestURI == "*" {
        if r.ProtoAtLeast(1, 1) {
            w.Header().Set("Connection", "close")
        }
        w.WriteHeader(StatusBadRequest)
        return
    }
    h, _ := mux.Handler(r)
    h.ServeHTTP(w, r)
}

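If the request URI is "*", the method responds with StatusBadRequest (setting "Connection: close" for HTTP/1.1 and later) and returns. Otherwise it calls mux.Handler, shown here: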

// Handler returns the handler to use for the given request,
// consulting r.Method, r.Host, and r.URL.Path. It always returns
// a non-nil handler. If the path is not in its canonical form, the
// handler will be an internally-generated handler that redirects
// to the canonical path.
//
// Handler also returns the registered pattern that matches the
// request or, in the case of internally-generated redirects,
// the pattern that will match after following the redirect.
//
// If there is no registered handler that applies to the request,
// Handler returns a ``page not found'' handler and an empty pattern.
func (mux *ServeMux) Handler(r *Request) (h Handler, pattern string) {
    if r.Method != "CONNECT" {
        if p := cleanPath(r.URL.Path); p != r.URL.Path {
            _, pattern = mux.handler(r.Host, p)
            url := *r.URL
            url.Path = p
            return RedirectHandler(url.String(), StatusMovedPermanently), pattern
        }
    }

    return mux.handler(r.Host, r.URL.Path)
}

// handler is the main implementation of Handler.
// The path is known to be in canonical form, except for CONNECT methods.
func (mux *ServeMux) handler(host, path string) (h Handler, pattern string) {
    mux.mu.RLock()
    defer mux.mu.RUnlock()

    // Host-specific pattern takes precedence over generic ones
    if mux.hosts {
        h, pattern = mux.match(host + path)
    }
    if h == nil {
        h, pattern = mux.match(path)
    }
    if h == nil {
        h, pattern = NotFoundHandler(), ""
    }
    return
}

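The core job of these functions is to look up, by the request's host and URL path, the handler in the map built during registration, and return it. With the handler in hand, ServeHTTP goes on to call h.ServeHTTP(w, r). At registration time our function was converted to the HandlerFunc type, so that is the type stored in the map, and the ServeHTTP method invoked here is the one defined on HandlerFunc: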

// ServeHTTP calls f(w, r).
func (f HandlerFunc) ServeHTTP(w ResponseWriter, r *Request) {
    f(w, r)
}

In other words, it simply runs the handler function we registered.

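That completes the walkthrough, though it probably still feels messy. So let me borrow astaxie's summary of this flow from the book "build-web-application-with-golang". For the program at the beginning, the whole flow goes like this: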

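  • First, http.HandleFunc is called, which internally does the following in order:

    1. Calls the HandleFunc method of DefaultServeMux
    2. Calls the Handle method of DefaultServeMux
    3. Adds the handler and its routing pattern to the map[string]muxEntry of DefaultServeMux
  • Next, http.ListenAndServe(":8080", nil) is called, which does the following in order:

    1. Instantiates a Server
    2. Calls the ListenAndServe method of Server
    3. Calls net.Listen("tcp", addr) to listen on the port
    4. Starts a for loop that calls Accept in its body
    5. For each accepted connection, instantiates a conn and starts a goroutine to serve it: go c.serve(ctx)
    6. Reads each request: w, err := c.readRequest()
    7. Checks whether the handler is nil; if no handler was set (as in this example), uses DefaultServeMux
    8. Calls the ServeHTTP method of the handler
    9. In this example, that leads into DefaultServeMux.ServeHTTP
    10. Selects a handler based on the request and calls its ServeHTTP
    11. Handler selection:

      • Check whether any route matches the request (by iterating over the muxEntry values of the ServeMux)
      • If a route matches, call the ServeHTTP method of that route's handler
      • If no route matches, call the ServeHTTP method of NotFoundHandler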
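The last step of the flow above, falling back to NotFoundHandler, can be observed end to end with a test server. This is a minimal sketch using httptest.NewServer and a mux with a single pattern; the helper name fetch and the paths "/hello" and "/missing" are assumptions of this example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetch starts a test server whose ServeMux only knows "/hello",
// requests the given path over a real connection and returns the
// status code and body.
func fetch(path string) (int, string, error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/hello", func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "hello world!\n")
	})
	ts := httptest.NewServer(mux)
	defer ts.Close()

	resp, err := http.Get(ts.URL + path)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body), err
}

func main() {
	for _, p := range []string{"/hello", "/missing"} {
		code, body, err := fetch(p)
		fmt.Printf("%s -> %d %q %v\n", p, code, body, err)
	}
}
```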

Extras

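As the analysis shows, writing a simple web server in Go is so easy because Go not only provides the mechanisms and interfaces but also ships a ready-made implementation. For example, Go implements ServeMux, provides a built-in global variable DefaultServeMux, and offers functions and methods such as HandleFunc, which make it very easy to register handlers and dispatch requests.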

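Of course, we do not have to use Go's built-in ServeMux; we can use our own. On the one hand this gives more flexibility (at the cost of writing more code ourselves); on the other hand, some people consider a global variable built into the standard library to be poor design and practice. For example, the program at the beginning can be rewritten with our own "ServeMux":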

package main

import (
    "io"
    "log"
    "net/http"
)

type MyMux struct{}

func (m *MyMux) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    if r.URL.Path == "/" {
        HelloServer(w, r)
        return
    }

    http.NotFound(w, r)
    return
}

func main() {
    mux := &MyMux{}

    log.Fatal(http.ListenAndServe(":8080", mux))
}

func HelloServer(w http.ResponseWriter, req *http.Request) {
    io.WriteString(w, "hello world!\n")
}

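As you can see, implementing your own "ServeMux" just means implementing the Handler interface. This is only a toy example; in practice, the routing logic is where most of the work lies. So I think the built-in implementation is quite good: at the very least it saves developers a lot of work...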
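To give a feel for where the routing work starts, here is a slightly more general sketch than MyMux: a table-driven router that maps exact paths to handler functions. The type tableMux and the routes "/" and "/ping" are hypothetical and exist only for this example; unlike ServeMux, it does no subtree matching, path cleaning or redirects.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// tableMux is a hypothetical router for this sketch: it only supports
// exact path matches.
type tableMux map[string]http.HandlerFunc

// ServeHTTP makes tableMux an http.Handler.
func (m tableMux) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if h, ok := m[r.URL.Path]; ok {
		h(w, r)
		return
	}
	http.NotFound(w, r)
}

// status runs one request through a small routing table and returns
// the response status code.
func status(path string) int {
	m := tableMux{
		"/":     func(w http.ResponseWriter, r *http.Request) { io.WriteString(w, "hello world!\n") },
		"/ping": func(w http.ResponseWriter, r *http.Request) { io.WriteString(w, "pong\n") },
	}
	rec := httptest.NewRecorder()
	m.ServeHTTP(rec, httptest.NewRequest("GET", path, nil))
	return rec.Code
}

func main() {
	for _, p := range []string{"/", "/ping", "/nope"} {
		fmt.Println(p, status(p))
	}
}
```

A value of this type can be passed as the second argument to http.ListenAndServe in the same way as the MyMux pointer in the program above.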

Besides ServeMux, the Server can be customized too, if we want finer-grained control over it:

s := &http.Server{
    Addr:           ":8080",
    Handler:        myHandler,
    ReadTimeout:    10 * time.Second,
    WriteTimeout:   10 * time.Second,
    MaxHeaderBytes: 1 << 20,
}
log.Fatal(s.ListenAndServe())
