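Many people picture programmers as plaid shirts, greasy hair or no hair at all, a laptop backpack that never leaves their shoulders, glasses, and heavy dark circles under the eyes. How could someone like that ever find a girlfriend?!
In reality, programmers can look nothing like that stereotype.
Better still, we can use our technical skills to work out which girl suits us best, and to analyze what it would take to win over the girl we like.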
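Scraping the 「我主良缘」 site with the Requests library
Program logic
Use the requests library to fetch data from the target site.
Then process the fetched data and filter out the information you want.
Finally, save that information to a database.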
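The three steps above can be sketched as one small loop. This is only a minimal illustration, not the article's actual code: `crawl`, `fake_fetch`, and the `username`/`height` fields are hypothetical names made up for the example, and the fetch and save steps are passed in as functions so the sketch runs offline. In the real script they would wrap `requests.get` and a MongoDB insert.

```python
def crawl(fetch, wanted, save, pages):
    """Run fetch -> filter -> save for each page; return how many records were saved."""
    saved = 0
    for page in pages:
        records = fetch(page)        # step 1: fetch the raw records for one page
        if not records:              # a failed request yields nothing to process
            continue
        for record in records:
            if wanted(record):       # step 2: keep only the records we care about
                save(record)         # step 3: persist the record
                saved += 1
    return saved


# Hypothetical stand-in for the network call (assumption: one list of dicts per page).
def fake_fetch(page):
    sample = {
        1: [{'username': 'a', 'height': 165}, {'username': 'b', 'height': 150}],
        2: None,  # simulates a failed request
        3: [{'username': 'c', 'height': 170}],
    }
    return sample.get(page)


store = []  # stands in for the database
count = crawl(fake_fetch, lambda r: r['height'] >= 160, store.append, range(1, 4))
print(count, [r['username'] for r in store])  # prints: 2 ['a', 'c']
```

Keeping the three stages separate makes it easy to swap the in-memory list for a real database later without touching the fetch or filter logic.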
Parsing the site
Setting the age
Setting the gender parameter
Setting the height parameter
Setting the salary parameter
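The age, height, and salary steps all do the same thing: they take a number typed by the user and map it onto one of the fixed bands used in the search parameters. A table-driven version might look like the sketch below. The band values are taken from the full code in this post; the names `to_range`, `salary_code`, `AGE_RANGES`, `HEIGHT_RANGES`, and `SALARY_BANDS` are hypothetical, introduced only for this illustration.

```python
AGE_RANGES = [(21, 30), (31, 40), (41, 50)]
HEIGHT_RANGES = [(151, 160), (161, 170), (171, 180), (181, 190)]
# (lower bound, code); 20000 itself still belongs to code 4, so code 5 starts at 20001
SALARY_BANDS = [(2000, 2), (5000, 3), (10000, 4), (20001, 5)]


def to_range(value, ranges, default=(0, 0)):
    """Return the (start, end) band containing value, or the default if none matches."""
    for low, high in ranges:
        if low <= value <= high:
            return low, high
    return default


def salary_code(money):
    """Return the code of the highest salary band whose lower bound money reaches."""
    code = 0  # same fallback value the full code uses when nothing matches
    for lower, band_code in SALARY_BANDS:
        if money >= lower:
            code = band_code
    return code


print(to_range(25, AGE_RANGES), to_range(162, HEIGHT_RANGES), salary_code(8000))
# prints: (21, 30) (161, 170) 3
```

With the bands stored as data, adding or adjusting a range is a one-line change rather than another `elif` branch.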
Querying data that matches the criteria
Saving images
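The image step names each file after the MD5 hash of its content, so the same picture downloaded twice ends up under the same filename and is written only once. The sketch below isolates that idea from the network call. It is an illustration under assumptions: `save_bytes` is a hypothetical helper, and it takes the already-downloaded bytes (in the real script, `response.content`) as its argument.

```python
import os
import tempfile
from hashlib import md5


def save_bytes(content, folder='images2', ext='jpg'):
    """Write content to folder/<md5 of content>.<ext>; return (path, was_written)."""
    os.makedirs(folder, exist_ok=True)
    name = '{0}.{1}'.format(md5(content).hexdigest(), ext)
    path = os.path.join(folder, name)
    if os.path.exists(path):
        return path, False           # identical content was already saved
    with open(path, 'wb') as f:
        f.write(content)
    return path, True


# Usage: saving the same bytes twice writes the file only once.
demo_dir = tempfile.mkdtemp()
first = save_bytes(b'abc', folder=demo_dir)
second = save_bytes(b'abc', folder=demo_dir)
print(first[1], second[1])  # prints: True False
```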
Saving data to the MongoDB database
Complete code
import requests
import xlwt
import xlsxwriter
import os
from hashlib import md5
from urllib.parse import urlencode
from config import *  # expected to provide MONGO_URL and MONGO_DB
import pymongo

# Connect to MongoDB using the settings from config.py
client = pymongo.MongoClient(MONGO_URL)
db = client[MONGO_DB]
'''
Parse the site
'''
def get_one(page, startage, endage, gender, startheight, endheight, salary):
    headers = {
        'Referer': 'http://www.lovewzly.com/jiaoyou.html',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.170 Safari/537.36'
    }
    params = {
        'page': page,
        'startage': startage,
        'endage': endage,
        'gender': gender,
        'cityid': '52',
        'startheight': startheight,
        'endheight': endheight,
        'marry': '1',
        'education': '40',
        'salary': salary
    }
    base_url = 'http://www.lovewzly.com/api/user/pc/list/search?'
    url = base_url + urlencode(params)
    print(url)
    try:
        # A timeout keeps a stalled request from hanging the whole run
        response = requests.get(url, headers=headers, timeout=10)
        if response.status_code == 200:
            return response.json()
    except requests.exceptions.RequestException:
        # Network errors from requests are not the built-in ConnectionError
        pass
    # Any failure or non-200 response yields None instead of retrying forever
    return None
'''
Set the age
'''
def query_age():
    '''Age filter'''
    age = int(input('Enter the preferred age (e.g. 25): '))
    if 21 <= age <= 30:
        startage = 21
        endage = 30
    elif 31 <= age <= 40:
        startage = 31
        endage = 40
    elif 41 <= age <= 50:
        startage = 41
        endage = 50
    else:
        # Outside every band: fall back to 0, 0
        startage = 0
        endage = 0
    return startage, endage
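'''
Set the gender parameter
'''
def query_sex():
    '''Gender filter'''
    sex = input('Enter the preferred gender (e.g. female): ')
    # 1 = male, 2 = female; anything other than male defaults to female
    if sex.strip().lower() in ('male', 'm', '男'):
        gender = 1
    else:
        gender = 2
    return gender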
'''
Set the height parameter
'''
def query_height():
    '''Height filter'''
    height = int(input('Enter the preferred height in cm (e.g. 162): '))
    if 151 <= height <= 160:
        startheight = 151
        endheight = 160
    elif 161 <= height <= 170:
        startheight = 161
        endheight = 170
    elif 171 <= height <= 180:
        startheight = 171
        endheight = 180
    elif 181 <= height <= 190:
        startheight = 181
        endheight = 190
    else:
        # Outside every band: fall back to 0, 0
        startheight = 0
        endheight = 0
    return startheight, endheight
'''
Set the salary parameter
'''
def query_money():
    '''Salary filter'''
    money = int(input('Enter the preferred monthly salary (e.g. 8000): '))
    if 2000 <= money < 5000:
        salary = 2
    elif 5000 <= money < 10000:
        salary = 3
    elif 10000 <= money <= 20000:
        salary = 4
    elif 20000 <= money:
        salary = 5
    else:
        salary = 0
    return salary
'''
Query data that matches the criteria
'''
def query_data():
    print('Enter your filter criteria to start matchmaking')
    gender = query_sex()
    startheight, endheight = query_height()
    startage, endage = query_age()
    salary = query_money()
    # Fetch result pages 1 through 9
    for i in range(1, 10):
        data = get_one(i, startage, endage, gender, startheight, endheight, salary)
        if not data:
            # Skip pages whose request failed
            continue
        for item in get_person(data):
            save_to_monogo(item)
            save_image(item)
'''
Save images
'''
def save_image(item):
    if not os.path.exists('images2'):
        os.mkdir('images2')
    try:
        image_url = item.get('avatar')
        response = requests.get(image_url)
        if response.status_code == 200:
            # Name the file after the MD5 of its content so duplicates share one filename
            file_path = '{0}/{1}.{2}'.format('images2', md5(response.content).hexdigest(), 'jpg')
            if not os.path.exists(file_path):