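This post introduces the concepts of synchronous programs, asynchronous programs, and coroutines. The practical part gives code samples and an asynchronous program that downloads images over the network. Async fits workloads that are network-based or file I/O-based.
Concepts
What is a synchronous program?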
A synchronous program is executed one step at a time. Even with conditional branching, loops and function calls, you can still think about the code in terms of taking one execution step at a time. When each step is complete, the program moves on to the next one.
A synchronous program simply runs step by step. Two common forms are batch processing programs and command-line programs.
What is an asynchronous program?
Asynchronous programming, or async for short, is a feature of many modern languages that allows a program to juggle multiple operations without waiting or getting hung up on any one of them. It’s a smart way to efficiently handle tasks like network or file I/O, where most of the program’s time is spent waiting for a task to finish.
Asynchronous programming is mainly used for network or file I/O. Some examples of tasks that work well with async:
- Web scraping.
- Network services (e.g., a web server or framework).
- Programs that coordinate results from multiple sources that take a long time to return values (for instance, simultaneous database queries).
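The last case, coordinating several slow sources, can be sketched with asyncio.gather. This is only an illustration, not a real database client: query, the source names, and the delays are hypothetical, and asyncio.sleep stands in for the network round trip. asyncio.run needs Python 3.7 or later.

```python
import asyncio
import time

async def query(source, delay):
    # Hypothetical stand-in for a slow database or network call
    await asyncio.sleep(delay)
    return f"result from {source}"

async def main():
    # gather runs the three awaitables concurrently and
    # returns their results in the order they were passed in
    return await asyncio.gather(
        query("db1", 0.3),
        query("db2", 0.2),
        query("db3", 0.1),
    )

started = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - started
print(results)
print(f"elapsed: {elapsed:.2f} s")
```

The elapsed time should be close to the slowest query (0.3 s) rather than the sum of all three (0.6 s).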
What is a coroutine?
A coroutine in Python is a function or method that can pause its execution and resume at a later point. Any task that needs to run asynchronously needs to be a coroutine. You define a coroutine with async def. Coroutines are awaitable and cannot be executed by simply calling the function: the call only returns a coroutine object, which must then be awaited or scheduled on an event loop.
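A small sketch of that last point. The function name answer and its return value are made up for illustration, and asyncio.run assumes Python 3.7 or later.

```python
import asyncio

async def answer():
    # Defined with async def, so this is a coroutine function
    await asyncio.sleep(0)
    return 42

coro = answer()                           # no body code has run yet
is_coroutine = asyncio.iscoroutine(coro)  # the call only built a coroutine object
value = asyncio.run(coro)                 # the event loop actually executes it
print(is_coroutine, value)
```

Calling answer() only builds the coroutine object; the body runs once the event loop drives it.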
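In Python, asynchronous code is implemented with coroutines.
Coroutines by themselves do not make code run faster, nor do they provide asynchrony. They only make code that already supports async easier to write.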
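To illustrate the "easier to write" point, here is a rough sketch of the same two timed steps written twice: once with hand-chained event loop callbacks, once as a coroutine. The function names and the 0.05 s delays are hypothetical, and asyncio.run assumes Python 3.7 or later.

```python
import asyncio

def callback_style(loop, done):
    # Two timed steps chained by hand through nested callbacks
    log = []

    def step2():
        log.append("step 2")
        done.set_result(log)

    def step1():
        log.append("step 1")
        loop.call_later(0.05, step2)

    loop.call_later(0.05, step1)

async def coroutine_style():
    # The same two timed steps, written top to bottom
    log = []
    await asyncio.sleep(0.05)
    log.append("step 1")
    await asyncio.sleep(0.05)
    log.append("step 2")
    return log

async def main():
    loop = asyncio.get_event_loop()  # the running loop, since we are inside a coroutine
    done = loop.create_future()
    callback_style(loop, done)
    return await done, await coroutine_style()

callback_log, coroutine_log = asyncio.run(main())
print(callback_log == coroutine_log)
```

Both versions rely on the same event loop and produce the same log. The coroutine version just reads like ordinary sequential code.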
Async support differs across Python versions. This post uses Python 3.6.
Examples
Summing numbers, implemented synchronously.
    import time

    def sleep():
        # Blocking sleep: nothing else can run during this second
        time.sleep(1)

    def sum(name, numbers):  # note: shadows the built-in sum()
        total = 0
        for number in numbers:
            sleep()
            total += number
        print('Task {}: Sum = {} \n'.format(name, total))

    starttime = time.time()
    # The two calls run one after the other: 2 s + 3 s, about 5 s in total
    tasks = [sum('A', [1, 2]), sum('B', [1, 2, 3])]
    print("Time: {} sec".format(time.time() - starttime))
The same idea implemented asynchronously.
    import asyncio
    import time

    async def sleep():
        print(f'Time: {time.time() - start:.2f}')
        # Non-blocking sleep: yields control to the event loop for one second
        await asyncio.sleep(1)

    async def sum(name, numbers):  # note: shadows the built-in sum()
        total = 0
        for number in numbers:
            print(f'Task {name}: Computing {total}+{number}')
            await sleep()
            total += number
        print(f'Task {name}: Sum = {total}\n')

    start = time.time()
    loop = asyncio.get_event_loop()
    # Schedule both coroutines on the loop so they can interleave
    tasks = [
        loop.create_task(sum("A", [1, 2])),
        loop.create_task(sum("B", [1, 2, 3])),
    ]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()

    end = time.time()
    print(f'Time: {end-start:.2f} sec')
    # Time: 3.01 sec
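The roughly 3 seconds reported by the asynchronous version follow from simple arithmetic: each task sleeps one second per number, so run back to back the two tasks need 2 + 3 = 5 seconds, while with overlapping sleeps the longer task alone sets the total. The sketch below only computes these expected durations and ignores scheduling overhead. SLEEP_PER_NUMBER is a name introduced here for illustration.

```python
# Seconds slept per number, as in the sleep() helper of the examples
SLEEP_PER_NUMBER = 1

tasks = {"A": [1, 2], "B": [1, 2, 3]}

# Each task sleeps once per number it adds
per_task = {name: len(numbers) * SLEEP_PER_NUMBER for name, numbers in tasks.items()}

sequential = sum(per_task.values())  # tasks run back to back
concurrent = max(per_task.values())  # sleeps overlap, so the longest task dominates

print(per_task, sequential, concurrent)
# {'A': 2, 'B': 3} 5 3
```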
A real-world example: downloading images asynchronously.
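    import os
    import ast
    import asyncio
    import time

    import requests
    import pandas as pd  # presumably used to load ids (loading code not shown)

    download_dir = "bg_dir2"

    def fetch(url, path):
        # Blocking download and write; meant to run in a worker thread
        content = requests.get(url, allow_redirects=True).content
        with open(path, 'wb') as f:
            f.write(content)

    async def asy_down(lst):
        os.makedirs(download_dir, exist_ok=True)
        loop = asyncio.get_event_loop()
        jobs = []
        for ll in lst:
            if not isinstance(ll, str):
                continue
            ll = ast.literal_eval(ll)  # each entry is the string form of a URL list
            for url in ll:
                imgname = url.split('/')[-1]
                image_suffix = imgname.split(".")[-1]
                # Keep short suffixes ending in g/G (jpg, jpeg, png, ...)
                if len(image_suffix) < 5 and image_suffix.lower().endswith('g'):
                    path = os.path.join(download_dir, imgname)
                    if not os.path.exists(path):
                        # requests is blocking and cannot be awaited directly,
                        # so each download is handed to the default thread pool
                        jobs.append(loop.run_in_executor(None, fetch, url, path))
        # Wait for all downloads; a failed one does not abort the others
        await asyncio.gather(*jobs, return_exceptions=True)

    start = time.time()
    loop = asyncio.get_event_loop()
    # ids is not defined in the original post; asy_down expects an iterable
    # of strings, each the string form of a list of URLs
    tasks = [loop.create_task(asy_down(ids))]
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()
    print("async down images time: {} seconds".format(time.time() - start))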
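One caveat with this example: requests.get is a blocking call, so it only overlaps with other downloads when it runs off the event loop thread. Below is a minimal sketch of that idea in isolation, assuming Python 3.7 or later for asyncio.run. blocking_download and the file names are hypothetical, and time.sleep stands in for the HTTP request so the snippet runs without network access.

```python
import asyncio
import time

def blocking_download(name):
    # Hypothetical stand-in for requests.get(...): blocks its own thread
    time.sleep(0.25)
    return name

async def main(names):
    loop = asyncio.get_event_loop()
    # Each blocking call is handed to the default thread pool,
    # so the event loop stays free while the threads wait
    futures = [loop.run_in_executor(None, blocking_download, n) for n in names]
    return await asyncio.gather(*futures)

started = time.perf_counter()
downloaded = asyncio.run(main(["a.jpg", "b.png", "c.jpeg", "d.jpg"]))
elapsed = time.perf_counter() - started
print(downloaded, f"{elapsed:.2f} s")
```

Four blocking calls of 0.25 s each should finish in well under the 1 s they would take one after another, because they wait in parallel threads.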