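Framework: The Routing System

1 Project Overview

1.1 Background and Goals

1.1.1 Background

A web framework's routing system returns different content depending on the URL the user requests.

In day-to-day web development, you usually write the logic functions first, then configure the routes, and only then serve requests. As a project grows, the cost of maintaining the routes grows with it.

So is there a way to have the framework maintain the routes itself?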

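1.1.2 Goals

[Basic] When a user visits different URLs, the framework routes each request to a specific handler function.
[Advanced] The framework generates routes automatically from the developer's handler functions.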

1.2 Terminology

environ: the dict object in which the WSGI gateway wraps the request received by the HTTP server
handler: a butterfly request-handling function

1.3 Roadmap

Support traditional-style HTTP interfaces
Support RESTful-style HTTP interfaces

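2 Requirements Analysis

2.1 Functional Requirements

The URL path requested by the user is recorded in environ['PATH_INFO'].

Approach

Access --- in essence, how the framework finds the handler for the request path in environ['PATH_INFO']; this calls for a mapping between the two.
Automation --- in essence, how to generate that mapping automatically from the handlers.

Expected behavior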

curl "http://127.0.0.1:8585/x/ping"                         ===>handlers/x/__init__.py:ping()
curl "http://127.0.0.1:8585/x/hello?str_info=world"         ===>handlers/x/__init__.py:hello(str_info=world)
curl -d '{"str_info":"world"}' http://127.0.0.1:8585/x/hello===>handlers/x/__init__.py:hello(str_info=world)
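
The expected calls above imply that both query-string parameters and a JSON body end up as keyword arguments of the handler. A minimal sketch of that step, written for Python 3, might look like the following. The helper name extract_params, the stand-in hello handler, and the rule that body values override query values are all assumptions made for illustration, not butterfly's actual behavior.

```python
import json
from urllib.parse import parse_qs


def extract_params(query_string="", body=b""):
    """Collect handler keyword arguments from a query string and/or a JSON body.

    Hypothetical helper: letting body values override query values is an assumption.
    """
    params = {}
    # parse_qs maps each key to a list of values; keep only the first one
    for key, values in parse_qs(query_string).items():
        params[key] = values[0]
    if body:
        payload = json.loads(body.decode("utf-8"))
        if isinstance(payload, dict):
            params.update(payload)
    return params


def hello(req, str_info):
    # Stand-in for handlers/x/__init__.py:hello; req is unused in this sketch
    return "hello " + str_info


# GET /x/hello?str_info=world
print(hello(None, **extract_params(query_string="str_info=world")))
# POST /x/hello with a JSON body
print(hello(None, **extract_params(body=b'{"str_info":"world"}')))
```

Both calls reach the handler with str_info="world", which matches the two hello lines in the expected behavior.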

2.2 Non-functional Requirements

2.3 Survey

2.3.1 A Simple Routing Example

A request for /home/index returns "Hello Home"; a request for /login returns "welcome to login our site. "

#!/usr/bin/python
# coding=utf8
import __init__

def simple_app(environ, start_response):
    response_headers = [('Content-type','text/plain')]
    start_response('200 OK', response_headers)
    return ['My Own Hello World!']

def RunServer(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/html')])
    # Return a different string depending on the URL.
    # 1 Get the URL. When a request arrives, the WSGI server calls RunServer
    #   and passes the request data in environ, together with start_response.
    request_url = environ['PATH_INFO']
    print request_url
    # 2 Respond according to the URL
    # print environ  # set a breakpoint here to inspect what environ contains
    # A WSGI app returns an iterable of body chunks, so wrap each string in a list.
    if request_url == '/home/index':
        return ["Hello Home"]
    elif request_url == '/login':
        return ["welcome to login our site. "]
    else:
        return ['<h1>404!</h1>']
s = __init__.CherryPyWSGIServer(("0.0.0.0", 8383), RunServer, perfork=1)
s.start()

This is only a simple example; we need a better way to handle environ['PATH_INFO'].
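
One step toward "a better way" is to replace the if/elif chain with a lookup table. The sketch below is written for Python 3 (so bodies are bytes) and uses only the WSGI calling convention; the names ROUTES, app, and call are hypothetical, and unlike the example above it sends a real 404 status for unknown paths.

```python
def home_index(environ):
    return "Hello Home"


def login(environ):
    return "welcome to login our site. "


# Hypothetical route table: request path -> view function
ROUTES = {
    "/home/index": home_index,
    "/login": login,
}


def app(environ, start_response):
    """WSGI app that dispatches on PATH_INFO through the ROUTES dict."""
    view = ROUTES.get(environ.get("PATH_INFO", ""))
    if view is None:
        start_response("404 Not Found", [("Content-Type", "text/html")])
        return [b"<h1>404!</h1>"]
    start_response("200 OK", [("Content-Type", "text/html")])
    return [view(environ).encode("utf-8")]


def call(path):
    """Drive the app without a server and return (status, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], body


print(call("/login"))
print(call("/missing"))
```

Adding a route now means adding one dict entry, which is the same shape of mapping that the detailed design later stores in apicube.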

2.3.2 Django

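In Django, a browser request is first matched against the project-level URLs, which then lead to the URLs of the individual apps. These URL patterns are kept in a list and are matched in order, from first to last.

Django supports dynamic routes:

We often need to capture part of the URL and pass it as an argument to the view function that handles the request.

For example, take the URL https://127.0.0.1:8585/user/4869

Besides matching this URL, we also need to pass the number 4869 to the function so that it can look up the user. That is, a fragment of the URL is captured and passed to the view function as an argument.

The syntax for passing a named parameter from the URL is: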

(?P<name>pattern)

name is the name of the parameter to pass, and pattern is the pattern to match. For example,

url(r'^user/(?P<userid>[0-9]+)$', views.detail),

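The routing system passes the data matched by the regex group to views.detail() as an argument, so views.detail() takes an extra parameter named userid. Since it is an ordinary function parameter, userid can be given a default value.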

def detail(request,userid='1'):
    users = User.objects.filter(id=userid)
    ....
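
The named-group mechanism can be tried with Python's re module alone. This is only a sketch of the idea, not Django's resolver: dispatch is a hypothetical helper and detail is a stand-in view that returns a string instead of querying User.

```python
import re

# Same pattern as in the url() example above
USER_PATTERN = re.compile(r'^user/(?P<userid>[0-9]+)$')


def detail(request, userid='1'):
    # Stand-in view; a real one would look the user up in the database
    return "user %s" % userid


def dispatch(path, request=None):
    """Call detail() with the named groups as keyword arguments, or return None."""
    m = USER_PATTERN.match(path)
    if m is None:
        return None
    # groupdict() gives {'userid': '4869'}; the captured value is a string
    return detail(request, **m.groupdict())


print(dispatch("user/4869"))
print(dispatch("user/abc"))
```

Note that the captured userid arrives as a string; converting it to an integer is left to the view in this style of routing.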

Route parameter passing in Django 2

Django 2 changes the syntax for passing route parameters slightly.

urlpatterns = [
    path('user/<int:userid>/', views.detail),
]

2.3.3 Tornado

Matching is done with regular expressions.

[
    (r'/setting/(.+)', web.SettingHandler)
]

2.3.4 Flask

In Flask, routes are attached to each view function with a decorator, and the same URL can serve different purposes depending on the request method.

Example:

from flask import Flask

app = Flask(__name__)


@app.route('/')
def index():
    return '<h1>The Flask web app has started......</h1>'


@app.route('/user/<name>')
def user(name):
    return '<h1>Hello, %s!</h1>' % name


if __name__ == '__main__':
    app.run(debug=True)

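werkzeug routing logic

In fact, Flask's core routing logic is implemented in werkzeug. Let's first look at the routing features werkzeug provides.

>>> from werkzeug.routing import Map, Rule
>>> m = Map([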
...     Rule('/', endpoint='index'),
...     Rule('/downloads/', endpoint='downloads/index'),
...     Rule('/downloads/<int:id>', endpoint='downloads/show')
... ])
>>> urls = m.bind("example.com", "/")
>>> urls.match("/", "GET")
('index', {})
>>> urls.match("/downloads/42")
('downloads/show', {'id': 42})

>>> urls.match("/downloads")
Traceback (most recent call last):
  ...
RequestRedirect: http://example.com/downloads/
>>> urls.match("/missing")
Traceback (most recent call last):
  ...
NotFound: 404 Not Found

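The code above demonstrates werkzeug's core routing features:

Adding routing rules (m.add can also be used)
Binding the routing map to a specific environment (m.bind)
Matching a URL (urls.match)

On success, match returns the endpoint name and a dict of arguments; otherwise it may raise a redirect or a 404 exception.

The match implementation

How does werkzeug implement match? Map holds a list of Rule objects; when matching, it calls rule.match on each in turn, and the first rule that matches is the result. The code of Rule.match is as follows: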

def match(self, path):
        """Check if the rule matches a given path. Path is a string in the
        form ``"subdomain|/path(method)"`` and is assembled by the map.  If
        the map is doing host matching the subdomain part will be the host
        instead.

        If the rule matches a dict with the converted values is returned,
        otherwise the return value is `None`.
        """
        if not self.build_only:
            m = self._regex.search(path)
            if m is not None:
                groups = m.groupdict()

                result = {}
                for name, value in iteritems(groups):
                    try:
                        value = self._converters[name].to_python(value)
                    except ValidationError:
                        return
                    result[str(name)] = value
                if self.defaults:
                    result.update(self.defaults)

                return result

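The logic is as follows: the precompiled regular expression is matched against the actual path; every matched group is converted to its corresponding Python value and stored in a dict (this becomes the arguments passed to the view function), and the dict is returned.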
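
To make the compile-then-convert idea concrete, here is a heavily simplified sketch. It is not werkzeug's code: MiniRule, CONVERTERS, and match_any are hypothetical names, only the int and default string converters are modeled, and redirects, subdomains, and HTTP methods are ignored.

```python
import re

# Hypothetical converter table: name -> (regex fragment, conversion function)
CONVERTERS = {"int": (r"\d+", int), "string": (r"[^/]+", str)}
PLACEHOLDER = re.compile(r"<(?:(?P<conv>\w+):)?(?P<name>\w+)>")


class MiniRule(object):
    def __init__(self, rule, endpoint):
        self.endpoint = endpoint
        self._to_python = {}
        pattern, pos = "", 0
        # Compile the rule once: literal text is escaped, placeholders become named groups
        for m in PLACEHOLDER.finditer(rule):
            pattern += re.escape(rule[pos:m.start()])
            fragment, to_python = CONVERTERS[m.group("conv") or "string"]
            pattern += "(?P<%s>%s)" % (m.group("name"), fragment)
            self._to_python[m.group("name")] = to_python
            pos = m.end()
        pattern += re.escape(rule[pos:])
        self._regex = re.compile("^" + pattern + "$")

    def match(self, path):
        """Return a dict of converted values, or None if the rule does not match."""
        m = self._regex.match(path)
        if m is None:
            return None
        return {name: self._to_python[name](value)
                for name, value in m.groupdict().items()}


def match_any(rules, path):
    """Try each rule in order and return (endpoint, args) for the first match."""
    for rule in rules:
        args = rule.match(path)
        if args is not None:
            return rule.endpoint, args
    return None


rules = [
    MiniRule("/", "index"),
    MiniRule("/downloads/", "downloads/index"),
    MiniRule("/downloads/<int:id>", "downloads/show"),
]
print(match_any(rules, "/downloads/42"))
print(match_any(rules, "/missing"))
```

The first call yields the endpoint 'downloads/show' with id converted to the integer 42, and the second yields None where werkzeug would raise NotFound.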

3 Overall Design

3.1 System Architecture

+-------------------------------+
|                               |
|           route               | routing system
|                               |
+-------------------------------+
    |           |          |
    |           |          |
+--------+ +--------+ +--------+
|handler1| |handler2| |handler3|
+--------+ +--------+ +--------+

3.2 Design and Trade-offs

3.2.1 Maximum Route Depth

Two levels are enough for now, as follows:

/{handler}
/{app}/{handler}

3.2.2 Whether to Support Dynamic Routes

For example, RESTful style:

/user/<username>

Not supported for now.

3.3 Potential Risks

4 Detailed Design

4.1 Mapping Dict from URL Path to Handler

The mapping is stored in a dict: routes are placed in the apicube dict.

apicube demo

{
    '/apidemo/ping': <xlib.protocol_json.Protocol object at 0x7f0c97e93e90>,
    '/apidemo/hello': <xlib.protocol_json.Protocol object at 0x7f0c97e93ed0>
}

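Writing routes into the apicube dict

The handler for a request path is looked up in the apicube dict; one- and two-level routes are supported.

Request PATH_INFO ==> route dict key   ==> actual function path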
/ping or /ping/   ==> /ping            ==> handlers/__init__.py::ping
/apidemo/ping     ==> /apidemo/ping    ==> handlers/apidemo/__init__.py::ping
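
The lookup in the table above can be sketched as a normalization step followed by a dict access. The function names normalize_path and find_handler are hypothetical, the sample apicube holds plain functions rather than Protocol objects, and collapsing repeated slashes is an assumption rather than documented butterfly behavior.

```python
def normalize_path(path_info):
    """Turn PATH_INFO into a route-dict key, or None if it is not 1-2 levels deep."""
    # Dropping empty segments maps both "/ping" and "/ping/" to "/ping"
    parts = [p for p in path_info.split("/") if p]
    if not 1 <= len(parts) <= 2:
        return None
    return "/" + "/".join(parts)


def find_handler(apicube, path_info):
    """Look up the handler for a request path; None means no route (404)."""
    key = normalize_path(path_info)
    if key is None:
        return None
    return apicube.get(key)


def ping(req):
    return "pong"


# Sample route dict for illustration only
apicube = {"/ping": ping, "/apidemo/ping": ping}

print(normalize_path("/ping/"))
print(find_handler(apicube, "/apidemo/ping") is ping)
print(find_handler(apicube, "/a/b/c"))
```

Because the key is computed once per request and the dict lookup is constant time, the cost of routing does not grow with the number of handlers.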

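4.2 Automatic Route Generation

By convention, the handlers package and its subpackages are imported automatically, and their handlers are then registered in the web routes.

One directory is designated as the folder for API code, for example handlers: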

project
├── ...
├── handlers
│   ├── api
│   │   ├── __init__.py
│   │   └── model.py
│   ├── __init__.py
│   └── auth
│       └── __init__.py
└── ...

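The __init__.py files at each level under the handlers folder may contain only API service code; these API functions can depend on other modules in the same directory.

4.2.1 Automatic Route Mapping Rules

Rule: project folder [...]/ API function

Example (handlers/__init__.py::echo below means that __init__.py under handlers contains a function named echo):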

handlers/__init__.py::echo ==>  /echo
handlers/api/__init__.py::hostinfo ==> /api/hostinfo
handlers/auth/__init__.py::login ==> /auth/login
handlers/auth/__init__.py::logout ==> /auth/logout
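
The mapping rule can be written as a small pure function from a dotted module name and a function name to a route path. route_for and its root parameter are hypothetical names chosen for this sketch.

```python
def route_for(module_name, func_name, root="handlers"):
    """Map (module, function) to a URL path, e.g. handlers.api + hostinfo -> /api/hostinfo."""
    if module_name == root:
        prefix = ""
    elif module_name.startswith(root + "."):
        # Strip the root package and turn the remaining dots into slashes
        prefix = "/" + module_name[len(root) + 1:].replace(".", "/")
    else:
        raise ValueError("%s is not under %s" % (module_name, root))
    return prefix + "/" + func_name


print(route_for("handlers", "echo"))
print(route_for("handlers.api", "hostinfo"))
print(route_for("handlers.auth", "login"))
```

These three calls reproduce the /echo, /api/hostinfo, and /auth/login rows of the example above.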

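4.2.2 Conditions for Auto-loading a Handler Function into the Routes

Condition: a non-private function whose first parameter is "req"

(1) Private functions, i.e. those whose names start with "_", are not loaded
(2) Classes are not loaded; that is, controllers in a handler module use functions to express their "functional abstraction"
(3) The name of the function's first formal parameter must be "req"
    When the function is called, the actual argument req wraps the HTTP request environ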

5 Implementation

5.1 Automatic Route Generation

5.1.1 Importing All Submodules/Subpackages

Importing a module

import importlib

m = importlib.import_module("test.add")

importlib example

How to import all submodules

To list all the modules in a package, use the pkgutil module rather than os.listdir().

import pkgutil

pkgutil.walk_packages(path=None, prefix='', onerror=None)

Each result has the form
(module_loader, name, ispkg)

name:package or module name
ispkg:is_sub_package

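Online sources say that iter_modules() and walk_packages() differ in that the latter iterates subpackages at every depth. In the test below, however, the two produced the same output. A likely reason is that walk_packages has to import each subpackage before it can descend into it; called with only a path and no prefix, it tries to import bare names such as 'ceshi1', which fails silently, so nothing below the first level is listed.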

-------------------------------------------
handlers
├── __init__.py
├── apidemo
│   └── __init__.py
├── auth
│   └── __init__.py
├── ceshi1
│   ├── __init__.py
│   ├── ceshi2
│   │   ├── __init__.py
│   │   └── xx.py
│   └── ceshi3
│       └── __init__.py
├── report
│   └── __init__.py
└── x
    └── __init__.py
-------------------------------------------

>>> import pkgutil
>>> for i in pkgutil.walk_packages(["handlers"]):
...    print i
...
(<pkgutil.ImpImporter instance at 0x10fb3bc20>, 'apidemo', True)
(<pkgutil.ImpImporter instance at 0x10fb3bc20>, 'auth', True)
(<pkgutil.ImpImporter instance at 0x10fb3bc20>, 'ceshi1', True)
(<pkgutil.ImpImporter instance at 0x10fb3bc20>, 'report', True)
(<pkgutil.ImpImporter instance at 0x10fb3bc20>, 'x', True)

>>> for i in pkgutil.iter_modules(["handlers"]):
...    print i
...
(<pkgutil.ImpImporter instance at 0x10fb3be60>, 'apidemo', True)
(<pkgutil.ImpImporter instance at 0x10fb3be60>, 'auth', True)
(<pkgutil.ImpImporter instance at 0x10fb3be60>, 'ceshi1', True)
(<pkgutil.ImpImporter instance at 0x10fb3be60>, 'report', True)
(<pkgutil.ImpImporter instance at 0x10fb3be60>, 'x', True)
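
The identical output above can be contrasted with a call that passes a prefix. The sketch below, written for Python 3, builds a throwaway package tree in a temporary directory (demo_handlers is a made-up name chosen to avoid clashing with a real handlers package) and compares three calls. It assumes no importable top-level module named ceshi1 exists in the environment.

```python
import importlib
import os
import pkgutil
import shutil
import sys
import tempfile

# Build demo_handlers/ceshi1/ceshi2, each with an empty __init__.py
root = tempfile.mkdtemp()
for rel in ("demo_handlers", "demo_handlers/ceshi1", "demo_handlers/ceshi1/ceshi2"):
    path = os.path.join(root, *rel.split("/"))
    os.makedirs(path)
    open(os.path.join(path, "__init__.py"), "w").close()

sys.path.insert(0, root)          # make demo_handlers importable
importlib.invalidate_caches()
pkg_dir = os.path.join(root, "demo_handlers")

# iter_modules never descends, with or without a prefix
flat = sorted(name for _, name, _ in
              pkgutil.iter_modules([pkg_dir], prefix="demo_handlers."))
# walk_packages descends when the prefixed names are importable
deep = sorted(name for _, name, _ in
              pkgutil.walk_packages([pkg_dir], prefix="demo_handlers."))
# without a prefix it tries to import "ceshi1", fails silently, and stays flat
no_prefix = sorted(name for _, name, _ in pkgutil.walk_packages([pkg_dir]))

print(flat)
print(deep)
print(no_prefix)

# Clean up the temporary tree
sys.path.remove(root)
shutil.rmtree(root)
```

Only the prefixed walk_packages call reports demo_handlers.ceshi1.ceshi2, which suggests the missing prefix, rather than the functions themselves, explains the identical listings.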

Example

import importlib
import pkgutil

def import_submodules(package):
    """
    Import a package and its direct subpackages (one level deep;
    plain modules inside the package are not imported).

    From http://stackoverflow.com/questions/3365740/how-to-import-all-submodules

    :param package: package (name or actual module)
    :type package: str | module
    :rtype: dict[str, types.ModuleType]

    Examples:
        local dir:
            handlers/__init__.py
            handlers/api/__init__.py
        input: package = "handlers"
        return:{
                'handlers.api': <module 'handlers.api' from '/home/users/meetbill/butterfly/handlers/api/__init__.pyc'>,
                'handlers':     <module 'handlers' from '/home/users/meetbill/butterfly/handlers/__init__.pyc'>
               }
    """
    if isinstance(package, str):
        package = importlib.import_module(package)
    # Always include the root package itself, whether given by name or as a module
    results = {package.__name__: package}
    for _loader, name, is_pkg in pkgutil.walk_packages(package.__path__):
        if is_pkg:
            full_name = package.__name__ + '.' + name
            results[full_name] = importlib.import_module(full_name)
    return results

5.1.2 Dynamically Importing Objects

Import different packages depending on conditions

Directory structure

├── a
│   ├── __init__.py
│   └── a.py
└── b
    ├── __init__.py
    └── c
        ├── __init__.py
        └── c.py

Program contents

# contents of c.py
args = {'a':1}

class Test_class:

    def helloworld(self):
        print "helloworld"
        return 0

# contents of a.py
import importlib

params = importlib.import_module('b.c.c')  # absolute import
params_ = importlib.import_module('.c.c', package='b')  # relative import

# Pull the needed objects out of the imported module
print params.args                     # get a variable
print params.Test_class               # get a class

test = params.Test_class()
print test.helloworld()    # call the helloworld method of class Test_class

Run

# set PYTHONPATH
$ export PYTHONPATH=$(pwd):$PYTHONPATH

# run
$ python a/a.py
{'a': 1}
b.c.c.Test_class
helloworld
0
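
When the attribute name is itself only known at run time, as it is when a route is resolved from a string, import_module is typically paired with getattr. A minimal sketch, using standard-library modules in place of the a/b/c packages above; load_object is a hypothetical helper name.

```python
import importlib


def load_object(module_name, attr_name):
    """Import a module by its dotted name and fetch one attribute from it by name."""
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Look up json.dumps purely from strings, then call it
dumps = load_object("json", "dumps")
print(dumps({"a": 1}))
```

A missing module raises ImportError and a missing attribute raises AttributeError, so a router built this way can turn either one into a 404.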

6 Further Reading

Plugin-oriented thinking and its implementation
