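Book Giveaway | Automated Analysis and Modeling of Patient Data: Python Can Do More Than You Think

AI科技大本营 · WeChat Official Account · AI · 2020-12-15 21:12

Main Text

Author | 李秋键

Editor | 晋兆雨

Header image | Downloaded by CSDN from 视觉中国 (Visual China)

*Book giveaway at the end of the article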

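Organizing spreadsheets is among the most tedious and boring tasks in our work, study, and daily life, and it eats up a large share of our time. In this article we use Python to model patient data and generate forms automatically, saving medical staff a great deal of time.

The final forms are shown below. Each patient's form is saved as a separate Word document: with one click the script analyzes the Excel data, generates the forms, and gives each patient a specific assessment.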

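Preparation

We use Python 3.6.5. The modules used are as follows:

The openpyxl library reads the Excel spreadsheet data.
The datetime module handles and generates dates and times.
The docx library is the python-docx package, a powerful package for creating .docx documents. It covers almost all commonly used Word features, including paragraphs, page breaks, tables, images, headings, and styles.

Next, place the data file to be processed in the same directory as the script. Part of the file's data is shown in the figure below:

The file we need is named data.xlsx.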

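Table Generation

The fixed items we need to generate are shown in the table below:

They mainly comprise the item name, assessment content, assessment result, and assessment grade.

Part of the code is as follows: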

from docx.enum.table import WD_ALIGN_VERTICAL
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.shared import Pt

# Fill in the header cells; `table` is a python-docx table created earlier (not shown in this excerpt)
table.cell(0, 0).merge(table.cell(1, 0))  # merge cells
table.cell(0, 0).text = "Item"
table.cell(0, 0).paragraphs[0].runs[0].font.bold = True  # bold
table.cell(0, 0).paragraphs[0].runs[0].font.size = Pt(12)  # font size
table.cell(0, 0).vertical_alignment = WD_ALIGN_VERTICAL.CENTER  # center vertically
table.cell(0, 0).paragraphs[0].paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER  # center horizontally
table.cell(0, 1).merge(table.cell(1, 1))  # merge cells
table.cell(0, 1).text = "Assessment content"
table.cell(0, 1).paragraphs[0].runs[0].font.bold = True  # bold
table.cell(0, 1).paragraphs[0].runs[0].font.size = Pt(12)  # font size
table.cell(0, 1).vertical_alignment = WD_ALIGN_VERTICAL.CENTER  # center vertically
table.cell(0, 1).paragraphs[0].paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER  # center horizontally
table.cell(0, 2).merge(table.cell(1, 2))  # merge cells
table.cell(0, 2).text = "Assessment result"
table.cell(0, 2).paragraphs[0].runs[0].font.bold = True  # bold
table.cell(0, 2).paragraphs[0].runs[0].font.size = Pt(12)  # font size
table.cell(0, 2).vertical_alignment = WD_ALIGN_VERTICAL.CENTER  # center vertically
table.cell(0, 2).paragraphs[0].paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER  # center horizontally
table.cell(0, 3).merge(table.cell(0, 5))  # merge cells
table.cell(0, 3).text = "Assessment grade"
table.cell(0, 3).paragraphs[0].runs[0].font.bold = True  # bold
table.cell(0, 3).paragraphs[0].runs[0].font.size = Pt(12)  # font size
table.cell(0, 3).vertical_alignment = WD_ALIGN_VERTICAL.CENTER  # center vertically
table.cell(0, 3).paragraphs[0].paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER  # center horizontally
table.cell(1, 3).text = "Good"
table.cell(1, 3).paragraphs[0].runs[0].font.bold = True  # bold
table.cell(1, 3).paragraphs[0].runs[0].font.size = Pt(12)  # font size
table.cell(1, 3).vertical_alignment = WD_ALIGN_VERTICAL.CENTER  # center vertically
table.cell(1, 3).paragraphs[0].paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER  # center horizontally
table.cell(1, 4).text = "Fair"
table.cell(1, 4).paragraphs[0].runs[0].font.bold = True  # bold
table.cell(1, 4).paragraphs[0].runs[0].font.size = Pt(12)  # font size
table.cell(1, 4).vertical_alignment = WD_ALIGN_VERTICAL.CENTER  # center vertically
table.cell(1, 4).paragraphs[0].paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER  # center horizontally
table.cell(1, 5).text = "Poor"
table.cell(1, 5).paragraphs[0].runs[0].font.bold = True  # bold
table.cell(1, 5).paragraphs[0].runs[0].font.size = Pt(12)  # font size
table.cell(1, 5).vertical_alignment = WD_ALIGN_VERTICAL.CENTER  # center vertically
table.cell(1, 5).paragraphs[0].paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER  # center horizontally
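
The header code above repeats the same formatting lines for every cell. As a hedged sketch (not the author's code), the layout could instead be described as data and applied in a loop. HEADER_SPEC, style_header_cell, and fill_header are hypothetical names introduced here, the labels are English renderings of the header texts, and the python-docx imports are done lazily inside the styling function only so that the layout logic can be tried without the package installed.

```python
# Header layout as data: (anchor cell, cell to merge with or None, label).
# All names in this sketch are illustrative assumptions, not part of the original script.
HEADER_SPEC = [
    ((0, 0), (1, 0), "Item"),
    ((0, 1), (1, 1), "Assessment content"),
    ((0, 2), (1, 2), "Assessment result"),
    ((0, 3), (0, 5), "Assessment grade"),
    ((1, 3), None, "Good"),
    ((1, 4), None, "Fair"),
    ((1, 5), None, "Poor"),
]


def style_header_cell(cell):
    """Make a python-docx cell bold, 12 pt, and centered in both directions."""
    # Lazy imports: only needed when a real python-docx table is styled.
    from docx.enum.table import WD_ALIGN_VERTICAL
    from docx.enum.text import WD_ALIGN_PARAGRAPH
    from docx.shared import Pt

    paragraph = cell.paragraphs[0]
    run = paragraph.runs[0]
    run.font.bold = True
    run.font.size = Pt(12)
    cell.vertical_alignment = WD_ALIGN_VERTICAL.CENTER
    paragraph.paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER


def fill_header(table, spec=HEADER_SPEC, style=style_header_cell):
    """Merge, label, and style every header cell listed in `spec`."""
    for (row, col), merge_with, label in spec:
        if merge_with is not None:
            table.cell(row, col).merge(table.cell(*merge_with))
        cell = table.cell(row, col)  # re-fetch after the merge, as the original code does
        cell.text = label
        style(cell)
```

With a real document this would be called as fill_header(table) once the table has at least two rows and six columns; passing a different style callable makes it possible to reuse the same loop for body cells.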

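Data Calculation, Matching, and Integration

(1) Data cleaning:

First we read the data, which includes cleaning it: Excel date values do not match the standard date format, so we use datetime to convert them to standard dates: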

import datetime
from openpyxl import load_workbook

workbook = load_workbook('./data.xlsx')    # path to the xlsx file
booksheet = workbook.active                # the active sheet (the first one by default)
rows = iter(booksheet.rows)
# The first row holds the column names; map each name to its column index
index = [col.value for col in next(rows)]
dct = {j: i for i, j in enumerate(index)}
# `now` is set to the date of the questionnaire survey
now = datetime.date(2020, 7, 15)
for i, row in enumerate(rows):
    # one person's record
    one_person = [col.value for col in row]
    # compute the age in completed years
    birth = one_person[dct['bd']]
    if (now.month, now.day) < (birth.month, birth.day):
        age = now.year - birth.year - 1    # birthday not yet reached this year
    else:
        age = now.year - birth.year
    person_name = one_person[dct['name']]
    person_id = one_person[dct['id']]
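
The date handling described above can be sketched as two small helpers. This is a minimal sketch under stated assumptions, not the author's implementation: to_date and compute_age are hypothetical names, a numeric cell is assumed to be an Excel serial date in the default 1900 date system, and a text cell is assumed to be formatted as YYYY-MM-DD.

```python
import datetime

# Day 0 of Excel's default (1900) date system; valid for dates from 1900-03-01 onward.
EXCEL_EPOCH = datetime.date(1899, 12, 30)


def to_date(value):
    """Normalize a cell value to datetime.date (hypothetical helper)."""
    if isinstance(value, datetime.datetime):
        return value.date()
    if isinstance(value, datetime.date):
        return value
    if isinstance(value, (int, float)):
        # Assumption: a number is an Excel serial date (whole days since the epoch).
        return EXCEL_EPOCH + datetime.timedelta(days=int(value))
    if isinstance(value, str):
        # Assumption: text dates are written as YYYY-MM-DD.
        return datetime.datetime.strptime(value.strip(), "%Y-%m-%d").date()
    raise TypeError("unsupported date value: %r" % (value,))


def compute_age(birth, today):
    """Age in completed years on `today`."""
    birth = to_date(birth)
    today = to_date(today)
    # True (counts as 1) when this year's birthday has not been reached yet.
    before_birthday = (today.month, today.day) < (birth.month, birth.day)
    return today.year - birth.year - int(before_birthday)
```

Comparing (month, day) tuples covers every case in one expression, including the survey falling exactly on the birthday.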

(2) Grade determination:

First count the number of chronic diseases, then determine the grade:

# Count the chronic diseases according to item 1.2
# nb2l and nb2m each count as one disease when the cell is not empty (None is treated as empty)
sum_disease = one_person[dct['nb2a']] + one_person[dct['nb2b']] + one_person[dct['nb2c']] \
    + one_person[dct['nb2d']] + one_person[dct['nb2e']] + one_person[dct['nb2f']] \
    + one_person[dct['nb2g']] + one_person[dct['nb2h']] + one_person[dct['nb2i']] \
    + one_person[dct['nb2j']] + one_person[dct['nb2k']] + int(bool(one_person[dct['nb2l']])) \
    + int(bool(one_person[dct['nb2m']]))
# 1. Disease status
t22 = ''
t22 = t22 + 'You have ' + str(sum_disease) + ' chronic disease(s) in total\n'
# polypharmacy
is_multi = one_person[dct['nb3']] == 4
if is_multi:
    t22 = t22 + 'You are taking multiple medications' + '\n'
# grade determination
my_list = []
my_list.append(one_person[dct['nb1']])
my_list.append(one_person[dct['nb3']])
my_list.append(one_person[dct['nb4']])
if 4 in my_list:
    # disease-status dimension grade: poor
    table.cell(3, 5).text = chr(8730)  # check mark (√)
    table.cell(3, 5).paragraphs[0].runs[0].font.size = Pt(12)  # font size
    table.cell(3, 5).vertical_alignment = WD_ALIGN_VERTICAL.CENTER  # center vertically
    table.cell(3, 5).paragraphs[0].paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER  # center horizontally
elif 3 in my_list:
    # disease-status dimension grade: fair
    table.cell(3, 4).text = chr(8730)  # check mark (√)
    table.cell(3, 4).paragraphs[0].runs[0].font.size = Pt(12)  # font size
    table.cell(3, 4).vertical_alignment = WD_ALIGN_VERTICAL.CENTER  # center vertically
    table.cell(3, 4).paragraphs[0].paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER  # center horizontally
else:
    # disease-status dimension grade: good
    table.cell(3, 3).text = chr(8730)  # check mark (√)
    table.cell(3, 3).paragraphs[0].runs[0].font.size = Pt(12)  # font size
    table.cell(3, 3).vertical_alignment = WD_ALIGN_VERTICAL.CENTER  # center vertically
    table.cell(3, 3).paragraphs[0].paragraph_format.alignment = WD_ALIGN_PARAGRAPH.CENTER  # center horizontally
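
The grading rule above (any score of 4 means the worst grade, otherwise any score of 3 means the middle grade, otherwise the best grade) can be separated from the table formatting. The following is a minimal sketch; disease_grade and GRADE_COLUMNS are hypothetical names, and the grade labels are English renderings of the three header cells.

```python
# Column of the grade table that receives the check mark for each grade.
GRADE_COLUMNS = {"good": 3, "fair": 4, "poor": 5}


def disease_grade(scores):
    """Return the grade implied by the worst score in `scores`."""
    if 4 in scores:
        return "poor"
    if 3 in scores:
        return "fair"
    return "good"
```

With these in place, the three near-identical branches reduce to looking up one column, for example table.cell(3, GRADE_COLUMNS[disease_grade(my_list)]), and formatting that single cell.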

(3) Data calculation:

Here we compute values from the numbers in Excel and give an assessment based on the results. The code is as follows:

a = one_person[dct['ng1a']]
b = one_person[dct['ng1b']]
c = one_person[dct['ng2a']]
d = one_person[dct['ng2b']]
# Fall back to the threshold value when a measurement is missing
if a is None:
    a = 12
if b is None:
    b = 12
if c is None:
    c = 7.5
if d is None:
    d = 7.5
print(one_person[dct['ng1a']])  # debug output
# One point for each averaged measurement that is below its threshold
body = ((a + b) / 2 < 12) + ((c + d) / 2 < 7.5)
# sex: 1 = male, 2 = female
if one_person[dct['sex']] == 1:
    leg = ((one_person[dct['ng3a']] + one_person[dct['ng3b']]) / 2) >= 34
    hand = ((one_person[dct['ng4a1']] + one_person[dct['ng4a2']]
             + one_person[dct['ng4b1']] + one_person[dct['ng4b2']]) / 4) >= 28
else:
    leg = ((one_person[dct['ng3a']] + one_person[dct['ng3b']]) / 2) >= 33
    hand = ((one_person[dct['ng4a1']] + one_person[dct['ng4a2']]
             + one_person[dct['ng4b1']] + one_person[dct['ng4b2']]) / 4) >= 18
body = body + leg + hand
# 6. Body measurements
t27 = ''
if body >= 2:
    t27 = t27 + 'You are not at risk of muscle weakness' + '\n'
else:
    t27 = t27 + 'You are at risk of muscle weakness' + '\n'
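
The scoring above awards one point per criterion and reports no risk when at least two points are reached. As a hedged sketch, the same arithmetic can be written as a pure function that is easy to check by hand. The names mean_or_default, physical_score, and has_risk are hypothetical; treating a missing leg or hand measurement as 0 is an assumption made here, since the original code does not handle that case.

```python
def mean_or_default(values, default):
    """Average `values`, substituting `default` for missing (None) entries."""
    filled = [default if v is None else v for v in values]
    return sum(filled) / len(filled)


def physical_score(sex, ng1, ng2, leg, hand):
    """Count how many of the four criteria are met (0 to 4).

    sex: 1 = male, 2 = female, as in the source data.
    ng1, ng2, leg, hand: lists of repeated measurements.
    """
    leg_cutoff, hand_cutoff = (34, 28) if sex == 1 else (33, 18)
    score = 0
    score += mean_or_default(ng1, 12) < 12       # missing values fall back to the threshold
    score += mean_or_default(ng2, 7.5) < 7.5
    score += mean_or_default(leg, 0) >= leg_cutoff    # assumption: missing counts as 0
    score += mean_or_default(hand, 0) >= hand_cutoff  # assumption: missing counts as 0
    return int(score)


def has_risk(score):
    """Fewer than two criteria met means the form reports a risk."""
    return score < 2
```

For a male patient with physical_score(1, [10, 11], [7, 7], [35, 35], [30, 30, 30, 30]), all four criteria are met, so has_risk returns False.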

# Count the correct steps when subtracting 7 repeatedly, starting from 100
compute_correct = 0
a = one_person[dct['nd4a']]
if a is None:
    a = 0
b = one_person[dct['nd4b']]
if b is None:
    b = 0
c = one_person[dct['nd4c']]
if c is None:
    c = 0
d = one_person[dct['nd4d']]
if d is None:
    d = 0
e = one_person[dct['nd4e']]
if e is None:
    e = 0
compute_correct = (a == 100 - 7) + (b == a - 7) \
                  + (c == b - 7) \
                  + (d == c - 7) \
                  + (e == d - 7)
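
The comparison chain above checks each answer against the previous answer minus 7, so one slip does not invalidate the later steps. A minimal sketch of that rule as a loop follows; count_correct_subtractions is a hypothetical name, and applying the same rule to every answer in the list (including a fifth one) is an assumption based on the pattern of the visible terms.

```python
def count_correct_subtractions(answers, start=100, step=7):
    """Count answers that equal the previous value minus `step`.

    A missing answer (None) is treated as 0, as in the code above.
    """
    correct = 0
    previous = start
    for answer in answers:
        if answer is None:
            answer = 0
        correct += answer == previous - step  # True counts as 1
        previous = answer                     # the next step is judged against this answer
    return correct
```

For example, the answers [93, 85, 78, 71, 64] contain one wrong step (85 instead of 86), and the remaining steps are each 7 below the preceding answer, so four steps count as correct.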
