Python Concurrency: Coroutines and Asynchronous IO

Python开发者 (WeChat official account) · Python · 2017-01-14 20:54


Source: ZiWenXie

www.ziwenxie.site/2016/12/19/python-asynico/


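Introduction

With the rise of Node.js, most developers have heard of asynchronous programming by now. The Python community was slower than some other languages to support it, but Python 3.4 added asyncio, Python 3.5 added language-level support through the async/await syntax, and in the newly released Python 3.6 asyncio moved from provisional to stable status. This article uses Python 3.4+ to introduce the concepts of asynchronous programming and the use of asyncio.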

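What is a coroutine

In Python, concurrency is usually implemented with multiple threads or multiple processes. For CPU-bound tasks, the GIL means we normally use multiple processes. For IO-bound tasks, a thread releases the GIL while it waits on IO, so thread scheduling lets other threads run in the meantime and gives the appearance of concurrency.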
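The thread-based approach for IO-bound work can be sketched as follows. This is a minimal illustration, not code from the original article: `fake_io` is a hypothetical helper, and `time.sleep` is only a stand-in for a blocking IO call (CPython releases the GIL while a thread sleeps, as it does during blocking IO).

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(seconds):
    """Hypothetical stand-in for a blocking IO call."""
    time.sleep(seconds)  # the GIL is released while this thread sleeps
    return seconds

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as pool:
    # The three waits overlap, so the total is close to one wait, not three.
    results = list(pool.map(fake_io, [0.2, 0.2, 0.2]))
elapsed = time.perf_counter() - start

print(results, round(elapsed, 2))
```

Run sequentially, the three calls would take at least 0.6 seconds; with threads the elapsed time should be close to 0.2 seconds.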

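For IO-bound tasks there is another option: coroutines. Coroutines provide "concurrency" inside a single thread, and one major advantage over multithreading is that they avoid the cost of switching between threads, which makes them more efficient. Python's asyncio is built on coroutines. Before turning to asyncio, let's look at how generators can be used as coroutines to achieve concurrency.

Example 1

Let's start with a simple example to see what a coroutine is. If you are not familiar with generators, first read this highly voted answer on Stack Overflow (http://stackoverflow.com/questions/231767/what-does-the-yield-keyword-do).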

>>> def coroutine():
...     reply = yield 'hello'
...     yield reply
...
>>> c = coroutine()
>>> next(c)
'hello'
>>> c.send('world')
'world'
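To make the `next()`/`send()` handshake more concrete, here is a slightly longer sketch of a generator-based coroutine that keeps state between calls. The `averager` function is a hypothetical example and does not appear in the original article.

```python
def averager():
    """Generator-based coroutine: receives numbers via send(), yields the running mean."""
    total = 0.0
    count = 0
    average = None
    while True:
        # Pause here and hand `average` to the caller; resume when the
        # caller send()s the next value.
        value = yield average
        total += value
        count += 1
        average = total / count

avg = averager()
next(avg)  # prime the generator: run it up to the first yield
results = [avg.send(v) for v in (10, 20, 30)]
avg.close()

print(results)
```

The first `next()` call is needed because a generator can only receive a value through `send()` once it is paused at a `yield`.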

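Example 2

The following program simulates several students handing in homework to one teacher at the same time. The traditional approach would be multiple threads or processes, but here generators act as coroutines to simulate concurrency.

If the program below is hard to follow, skip ahead; it won't affect the rest of the article. Once you understand what coroutines really are, it will look simple when you come back to it.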

from collections import deque

def student(name, homeworks):
    for homework in homeworks.items():
        yield (name, homework[0], homework[1])  # the student "generates" homework for the teacher

class Teacher(object):
    def __init__(self, students):
        self.students = deque(students)

    def handle(self):
        """The teacher handles the students' homework."""
        while len(self.students):
            student = self.students.pop()  # take the next student from the right end
            try:
                homework = next(student)
                print('handling', homework[0], homework[1], homework[2])
            except StopIteration:
                pass  # this student has no homework left
            else:
                self.students.appendleft(student)  # requeue at the left end

Now let's call it.

Teacher([
    student('Student1', {'math': '1+1=2', 'cs': 'operating system'}),
    student('Student2', {'math': '2+2=4', 'cs': 'computer graphics'}),
    student('Student3', {'math': '3+3=5', 'cs': 'compiler construction'})
]).handle()

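Here is the output. With nothing more than a simple generator we have achieved concurrency, not parallelism, because the program runs in a single thread. The order of subjects for each student follows dict iteration order; on Python 3.7+, where dicts preserve insertion order, math is handled before cs.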

handling Student3 cs compiler construction
handling Student2 cs computer graphics
handling Student1 cs operating system
handling Student3 math 3+3=5
handling Student2 math 2+2=4
handling Student1 math 1+1=2
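The scheduling idea behind the Teacher class is a round-robin loop over generators. A minimal generic sketch of that loop is given here; `countdown` and `round_robin` are hypothetical names, and unlike the article's code this version takes tasks from the left and requeues on the right so that the order is first-in, first-out.

```python
from collections import deque

def countdown(name, n):
    """A toy task: yields (name, n), (name, n-1), ... down to 1."""
    while n > 0:
        yield (name, n)
        n -= 1

def round_robin(tasks):
    """Advance each generator one step in turn until all are exhausted."""
    queue = deque(tasks)
    trace = []
    while queue:
        task = queue.popleft()
        try:
            trace.append(next(task))  # run the task up to its next yield
        except StopIteration:
            continue                  # task finished: do not requeue it
        queue.append(task)            # task still has work: back of the line
    return trace

trace = round_robin([countdown('A', 2), countdown('B', 3)])
print(trace)
```

Each `yield` is a point where a task voluntarily gives control back to the scheduler, which is the same role `yield from` and `await` play in asyncio.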

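Implementing coroutines with the asyncio module

The asyncio module joined the standard library in Python 3.4. With it we can easily write coroutines that perform asynchronous IO.

In the code below we create a coroutine display_date(num, loop), which uses the yield from keyword to wait for the result of the coroutine asyncio.sleep(2). During those 2 seconds of waiting it yields control back to the event loop, so other coroutines can run until asyncio.sleep(2) returns. Note that the @asyncio.coroutine decorator was deprecated in Python 3.8 and removed in Python 3.11, so this form runs only on older interpreters.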

# coroutine.py
import asyncio
import datetime

@asyncio.coroutine  # declare a coroutine
def display_date(num, loop):
    end_time = loop.time() + 10.0
    while True:
        print("Loop: {} Time: {}".format(num, datetime.datetime.now()))
        if (loop.time() + 1.0) >= end_time:
            break
        yield from asyncio.sleep(2)  # suspend until the sleep(2) coroutine returns

loop = asyncio.get_event_loop()  # get an event loop
tasks = [display_date(1, loop), display_date(2, loop)]
loop.run_until_complete(asyncio.gather(*tasks))  # "block" until all tasks complete
loop.close()

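Here is the output. Notice the concurrency: the program takes only about 10 seconds from start to finish, and we used no multithreading or multiprocessing code. In a real project you can replace asyncio.sleep(seconds) with the corresponding IO task, such as database or disk file reads and writes.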

ziwenxie :: ~ » python coroutine.py
Loop: 1 Time: 2016-12-19 16:06:46.515329
Loop: 2 Time: 2016-12-19 16:06:46.515446
Loop: 1 Time: 2016-12-19 16:06:48.517613
Loop: 2 Time: 2016-12-19 16:06:48.517724
Loop: 1 Time: 2016-12-19 16:06:50.520005
Loop: 2 Time: 2016-12-19 16:06:50.520169
Loop: 1 Time: 2016-12-19 16:06:52.522452
Loop: 2 Time: 2016-12-19 16:06:52.522567
Loop: 1 Time: 2016-12-19 16:06:54.524889
Loop: 2 Time: 2016-12-19 16:06:54.525031
Loop: 1 Time: 2016-12-19 16:06:56.527713
Loop: 2 Time: 2016-12-19 16:06:56.528102

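Python 3.5 gives coroutines more direct support by introducing the async/await keywords. The code above can be rewritten with async def in place of @asyncio.coroutine and await in place of yield from, which makes it more concise and readable.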

import asyncio
import datetime

async def display_date(num, loop):  # declare a coroutine
    end_time = loop.time() + 10.0
    while True:
        print("Loop: {} Time: {}".format(num, datetime.datetime.now()))
        if (loop.time() + 1.0) >= end_time:
            break
        await asyncio.sleep(2)  # equivalent to yield from

loop = asyncio.get_event_loop()  # get an event loop
tasks = [display_date(1, loop), display_date(2, loop)]
loop.run_until_complete(asyncio.gather(*tasks))  # "block" until all tasks complete
loop.close()
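To check the claim that the waits overlap, the elapsed time can be measured directly. The sketch below is not from the original article: `fake_io` and `main` are hypothetical names, `asyncio.sleep` stands in for a real awaitable IO call, and `asyncio.run` requires Python 3.7+, which is newer than the versions the article targets.

```python
import asyncio
import time

async def fake_io(name, seconds):
    """Hypothetical stand-in for an awaitable IO operation."""
    await asyncio.sleep(seconds)  # control returns to the event loop here
    return name

async def main():
    # gather() runs both coroutines concurrently and keeps result order.
    return await asyncio.gather(fake_io('a', 0.3), fake_io('b', 0.3))

start = time.perf_counter()
results = asyncio.run(main())  # creates a loop, runs main(), closes the loop
elapsed = time.perf_counter() - start

print(results, round(elapsed, 2))
```

Two sequential 0.3 second waits would take at least 0.6 seconds, while here the total should be close to 0.3 seconds.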

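asyncio in detail

There are two ways to start the event loop: call run_until_complete, or call run_forever. run_until_complete registers its own done callback internally, which stops the loop when the future finishes. With run_forever you register your own callback through add_done_callback and decide yourself when to stop the loop. The two examples below show the difference.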

run_until_complete()

import asyncio

async def slow_operation(future):
    await asyncio.sleep(1)
    future.set_result('Future is done!')

loop = asyncio.get_event_loop()
future = asyncio.Future()
asyncio.ensure_future(slow_operation(future))
print(loop.is_running())  # False
loop.run_until_complete(future)
print(future.result())
loop.close()
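When the only goal is to get a value back, a separate Future is not strictly required, because run_until_complete returns the result of the coroutine it is given. A minimal sketch, assuming a shortened delay and using `asyncio.new_event_loop()` rather than the article's `get_event_loop()` so that it also runs on recent Python versions:

```python
import asyncio

async def slow_operation():
    await asyncio.sleep(0.1)  # shortened delay, for illustration only
    return 'Future is done!'

loop = asyncio.new_event_loop()
try:
    # The coroutine is wrapped in a Task, run to completion, and its
    # return value is handed back to the caller.
    result = loop.run_until_complete(slow_operation())
finally:
    loop.close()

print(result)
```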

run_forever()

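With run_forever the loop does not stop on its own, so in this example we attach our own callback with add_done_callback. It runs when the task (future) completes, performs the follow-up handling, and stops the loop.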

import asyncio

async def slow_operation(future):
    await asyncio.sleep(1)
    future.set_result('Future is done!')

def got_result(future):
    print(future.result())
    loop.stop()

loop = asyncio.get_event_loop()
future = asyncio.Future()
asyncio.ensure_future(slow_operation(future))
future.add_done_callback(got_result)
try:
    loop.run_forever()
finally:
    loop.close()
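add_done_callback is a method of the future itself, so a callback can also be attached when the loop is driven by run_until_complete. The following sketch illustrates this; the `seen` list and the shortened delay are assumptions added for the example, and the loop is created with `asyncio.new_event_loop()` so that it runs on recent Python versions.

```python
import asyncio

seen = []  # collects what the callback observed

async def slow_operation():
    await asyncio.sleep(0.1)
    return 'Future is done!'

def got_result(task):
    # Called by the event loop once the task has finished.
    seen.append(task.result())

loop = asyncio.new_event_loop()
try:
    task = loop.create_task(slow_operation())
    task.add_done_callback(got_result)
    loop.run_until_complete(task)  # stops the loop itself when the task is done
finally:
    loop.close()

print(seen)
```

Here the callback does not need to call loop.stop(), because run_until_complete takes care of stopping the loop.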

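One more thing to note: even after you call a coroutine function, the coroutine does not run unless the event loop is running. See the description in the official documentation below; I was bitten by this recently.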

Calling a coroutine does not start its code running – the coroutine object returned by the call doesn’t do anything until you schedule its execution. There are two basic ways to start it running: call await coroutine or yield from coroutine from another coroutine (assuming the other coroutine is already running!), or schedule its execution using the ensure_future() function or the AbstractEventLoop.create_task() method. Coroutines (and tasks) can only run when the event loop is running.
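The behaviour described in the quotation can be observed with a small experiment. In this sketch the `log` list and the `work` coroutine are hypothetical names introduced for illustration.

```python
import asyncio

log = []

async def work():
    log.append('work ran')
    return 42

coro = work()           # only creates a coroutine object; the body has not started
ran_before = list(log)  # snapshot taken before the loop runs: still empty

loop = asyncio.new_event_loop()
try:
    value = loop.run_until_complete(coro)  # now the body actually executes
finally:
    loop.close()

print(ran_before, log, value)
```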

Call

call_soon()

import asyncio

def hello_world(loop):
    print('Hello World')
    loop.stop()

loop = asyncio.get_event_loop()

# Schedule a call to hello_world()
loop.call_soon(hello_world, loop)

# Blocking call interrupted by loop.stop()
loop.run_forever()
loop.close()

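Here is the output. call_soon lets us register a callback in advance, to be run on the next iteration of the event loop, and the Handle it returns can be used to cancel that callback.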

Hello World
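Cancelling through the returned Handle can be sketched like this. The `calls` list is a hypothetical helper used to record which callbacks actually ran.

```python
import asyncio

calls = []

loop = asyncio.new_event_loop()
try:
    # call_soon returns a Handle for each scheduled callback.
    handle = loop.call_soon(calls.append, 'cancelled callback')
    loop.call_soon(calls.append, 'kept callback')
    loop.call_soon(loop.stop)  # stop the loop after the callbacks above

    handle.cancel()            # the first callback will be skipped
    loop.run_forever()
finally:
    loop.close()

print(calls)
```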

call_later()

import asyncio
import datetime

def display_date(end_time, loop):
    print(datetime.datetime.now())
    if (loop.time() + 1.0) < end_time:
        # Not finished yet: schedule this callback again in 1 second
        loop.call_later(1, display_date, end_time, loop)
    else:
        loop.stop()
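The source text breaks off before the code that drives this callback. The sketch below is an assumption about how it could be run, modelled on the call_soon example above and not taken from the original article. The `interval` parameter and the `stamps` list are hypothetical additions that keep the run short and make the result easy to inspect.

```python
import asyncio
import datetime

stamps = []  # timestamps recorded by each invocation

def display_date(end_time, loop, interval):
    stamps.append(datetime.datetime.now())
    print(stamps[-1])
    if loop.time() + interval < end_time:
        # Reschedule this callback to run again after `interval` seconds.
        loop.call_later(interval, display_date, end_time, loop, interval)
    else:
        loop.stop()

loop = asyncio.new_event_loop()
try:
    # Schedule the first call, then let the callback reschedule itself.
    end_time = loop.time() + 0.5
    loop.call_soon(display_date, end_time, loop, 0.1)
    loop.run_forever()  # returns once display_date calls loop.stop()
finally:
    loop.close()
```

Unlike call_soon, call_later takes a delay in seconds as its first argument, so each invocation is spaced roughly `interval` seconds after the previous one.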
