The First Step in Python Optimization: Profiling in Practice

Python开发者 (WeChat official account) · Python · 2017-04-13 21:24

Source: iPytLab, column author at 伯乐在线 (Jobbole)

http://python.jobbole.com/87621/

Preface

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. – Donald Knuth

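Let's open with a famous quote to set the tone.

When we really do need to start optimizing a Python program, the first step is not to optimize blindly but to analyze the existing program, find its performance bottlenecks, and then target those specifically. Only then does the time and effort we spend on optimization pay off as much as it can.

This article shows how to use Python's built-in profiler elegantly, works through a profiling exercise on one of the author's chemical kinetics programs, introduces commonly used tools for visualizing profiling results, and finally makes a first attempt at optimizing the Python program.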

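Main text

About profiling

Profiling means analyzing how code relates to the resources it uses. It helps us analyze running time so that we can find a program's bottlenecks, and it helps us analyze memory usage so that we can prevent memory leaks.

The tools that help us do this are profilers, which fall into two main categories:

Event-based profiling

Statistical profiling

For the concepts in more detail, see the Wikipedia article on profiling.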
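To make the first category concrete, here is a minimal sketch of event-based profiling built on the standard library's sys.setprofile hook. The helper name count_calls and the test function fib are hypothetical and chosen only for illustration; real profilers record timings as well, which this sketch leaves out.

```python
import sys
from collections import Counter


def count_calls(func, *args, **kwargs):
    """Run func and count Python-level 'call' events per function name."""
    counts = Counter()

    def tracer(frame, event, arg):
        # The interpreter invokes this hook on call and return events.
        if event == "call":
            counts[frame.f_code.co_name] += 1

    sys.setprofile(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        # Always remove the hook, even if func raises.
        sys.setprofile(None)
    return result, counts


def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result, counts = count_calls(fib, 5)
print(result, counts["fib"])
```

A statistical profiler would instead sample the call stack at intervals, trading exact call counts for lower overhead.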

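Python's profilers

The most commonly used profiling tools in Python are cProfile, line_profiler, and memory_profiler. Each helps us analyze the performance of Python code in a different way. Here we focus on cProfile, which ships with Python, and use it to analyze and optimize a program.

cProfile

Quick start

I'll start with a simple example from the official documentation to show basic cProfile usage.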

import cProfile
import re

cProfile.run('re.compile("foo|bar")')

Profiling output:

      197 function calls (192 primitive calls) in 0.002 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    0.001    0.001 <string>:1(<module>)
     1    0.000    0.000    0.001    0.001 re.py:212(compile)
     1    0.000    0.000    0.001    0.001 re.py:268(_compile)
     1    0.000    0.000    0.000    0.000 sre_compile.py:172(_compile_charset)
     1    0.000    0.000    0.000    0.000 sre_compile.py:201(_optimize_charset)
     4    0.000    0.000    0.000    0.000 sre_compile.py:25(_identityfunction)
   3/1    0.000    0.000    0.000    0.000 sre_compile.py:33(_compile)

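The report tells us a good deal:

In total, 197 function calls were monitored, 192 of which were primitive calls (that is, calls not induced by recursion)
The total running time was 0.002 seconds
The list is sorted by standard name, that is, by the printed string (numbers are compared as strings too)
In the list:

ncalls is the number of calls (two values mean the function recursed: total calls / primitive calls)
tottime is the time spent inside the function itself (excluding time spent in the functions it calls)
percall (first occurrence) is tottime / ncalls
cumtime is the cumulative time; unlike tottime, it includes time spent in the functions it calls
percall (second occurrence) is cumtime divided by the number of primitive calls
The last column gives the file name, line number, and function name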
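The two-value form of ncalls is easy to reproduce. The sketch below profiles a small recursive function and captures the report in a string; countdown is a hypothetical function used only for illustration.

```python
import cProfile
import io
import pstats


def countdown(n):
    # Recurses n times, so one outer call produces n + 1 calls in total.
    if n > 0:
        countdown(n - 1)


profiler = cProfile.Profile()
profiler.runcall(countdown, 3)

# Send the report to a string buffer instead of standard output.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumtime").print_stats("countdown")
report = stream.getvalue()
print(report)
```

The countdown row shows ncalls as 4/1: four calls in total, only one of which was not induced by recursion.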

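A more elegant approach

Python provides a number of interfaces that let us profile flexibly, chiefly two classes: the Profile class in the cProfile module and the Stats class in the pstats module.

With these two classes we can wrap the profiling logic so that it can be reused flexibly elsewhere in a project.

First, a short summary of the most commonly used Profile and Stats interfaces: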

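The Profile class:

enable(): start collecting profiling data
disable(): stop collecting profiling data
create_stats(): stop collecting profiling data and record the results internally as the current profile
print_stats(): create a Stats object from the current profile and print the results
dump_stats(filename): write the current profiling results to a file (binary format)
runcall(func, *args, **kwargs): collect profiling data for a call to func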
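As a rough illustration of these methods, the sketch below profiles the same function in two ways and saves one of the profiles to disk. The function busy_sum and the file name busy_sum.prof are hypothetical.

```python
import cProfile
import os
import tempfile


def busy_sum(n):
    return sum(i * i for i in range(n))


# Option 1: bracket the code of interest with enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
total = busy_sum(1000)
profiler.disable()

# Option 2: let runcall() do the bracketing; it passes the return value through.
other = cProfile.Profile()
same_total = other.runcall(busy_sum, 1000)

# Persist the first profile (binary format) for later inspection.
stats_path = os.path.join(tempfile.mkdtemp(), "busy_sum.prof")
profiler.dump_stats(stats_path)
print(total, same_total, stats_path)
```

runcall() is convenient when the code to profile is a single function call, while enable() and disable() suit an arbitrary stretch of code.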

The Stats class

The Stats class in the pstats module helps us read and manipulate stats files (binary format).

import pstats

p = pstats.Stats('stats.prof')

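Stats accepts either a stats file name or a cProfile.Profile object directly as its data source.

strip_dirs(): remove the leading path information from every file name in the report
dump_stats(filename): write the profiling data held by the Stats object to a file (same effect as cProfile.Profile.dump_stats())
sort_stats(*keys): sort the report; the keys are applied in the order given and include calls, cumtime, and others. For the full list see https://docs.python.org/2/library/profile.html#pstats.Stats.sort_stats
reverse_order(): reverse the current ordering
print_stats(*restrictions): print the report to standard output. *restrictions narrows down what is printed, and the restrictions are applied one after another. For example, (10, 1.0, ".*.py.*") first keeps the first 10 lines, then keeps 100% of those, and finally keeps only the entries that match the regular expression.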
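The Stats methods return the Stats object, so they can be chained. The sketch below shows one such chain with two restrictions; square and sum_of_squares are hypothetical functions used only for illustration, and the report is captured in a string through the stream argument of Stats.

```python
import cProfile
import io
import pstats


def square(x):
    return x * x


def sum_of_squares(n):
    return sum(square(i) for i in range(n))


profiler = cProfile.Profile()
profiler.runcall(sum_of_squares, 100)

stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
# Sort by call count, keep the top 3 rows, then keep only rows matching "square".
stats.strip_dirs().sort_stats("calls").print_stats(3, "square")
report = stream.getvalue()
print(report)
```

Because the restrictions are applied in order, swapping them to ("square", 3) would filter by name first and only then cut the list to three rows.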

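With these interfaces we can use the profiler more elegantly. For example, we can write a decorator that takes an argument, so that profiling any function in the project is just a matter of applying the decorator.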

import cProfile
import pstats
import os


# Definition of the profiling decorator
def do_cprofile(filename):
    """
    Decorator for function profiling.
    """
    def wrapper(func):
        def profiled_func(*args, **kwargs):
            # Profile only when the PROFILING environment variable is set.
            DO_PROF = os.getenv("PROFILING")
            if DO_PROF:
                profile = cProfile.Profile()
                profile.enable()
                result = func(*args, **kwargs)
                profile.disable()
                # Sort stats by internal time.
                sortby = "tottime"
                ps = pstats.Stats(profile).sort_stats(sortby)
                ps.dump_stats(filename)
            else:
                result = func(*args, **kwargs)
            return result
        return profiled_func
    return wrapper
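A possible way to use such a decorator is sketched below. To keep the example self-contained it redefines do_cprofile in a slightly different form: functools.wraps, runcall(), and try/finally are additions that are not in the original code, and the function mean and the file name mean.prof are hypothetical.

```python
import cProfile
import functools
import os
import pstats
import tempfile


def do_cprofile(filename):
    """Variant of the decorator above; profiles only when PROFILING is set."""
    def wrapper(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def profiled_func(*args, **kwargs):
            if not os.getenv("PROFILING"):
                return func(*args, **kwargs)
            profile = cProfile.Profile()
            try:
                return profile.runcall(func, *args, **kwargs)
            finally:
                # Written even if func raises, so a failed run still leaves data.
                pstats.Stats(profile).dump_stats(filename)
        return profiled_func
    return wrapper


stats_file = os.path.join(tempfile.mkdtemp(), "mean.prof")


@do_cprofile(stats_file)
def mean(values):
    return sum(values) / len(values)


os.environ["PROFILING"] = "1"  # switch profiling on for this process
average = mean([1, 2, 3, 4])

# Load the saved profile and sort it at reporting time.
pstats.Stats(stats_file).strip_dirs().sort_stats("tottime").print_stats(5)
```

The sort is applied when the report is printed because dump_stats() stores the raw data rather than a sorted view. With PROFILING unset, the decorated function runs without the profiler, so the decorator can stay in place in production code.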