Column: Dance with GenAI
Everything about generative AI (AIGC)

AI for Financial Investing: Batch-Downloading SZSE Public REITs Prospectuses

Dance with GenAI · WeChat Official Account · 2024-06-15 07:32


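Open the Shenzhen Stock Exchange (SZSE) public REITs prospectus page, press F12, and inspect the Network tab to find the real request URL: https://reits.szse.cn/api/disc/announcement/annList?random=0.3555675437003616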

{
  "announceCount": 39,
  "data": [
    {
      "id": "80bc99a7-8a04-4803-b42a-d9cca1e6c5d5",
      "annId": 1220300147,
      "title": "华夏华润商业REIT:华夏华润商业资产封闭式基础设施证券投资基金招募说明书更新",
      "content": null,
      "publishTime": "2024-06-08 00:00:00",
      "attachPath": "/disc/disk03/finalpage/2024-06-08/a77d6a34-c4eb-4dcf-9b16-7c2ce856ebdd.PDF",
      "attachFormat": "PDF",
      "attachSize": 6265,
      "secCode": ["180601"],
      "secName": ["华夏华润商业REIT"],
      "bondType": null,
      "bigIndustryCode": null,
      "bigCategoryId": null,
      "smallCategoryId": null,
      "channelCode": null,
      "_index": "ows_disclosure-20180825"
    },

The response is JSON, and the PDF path is in the "attachPath" field: "/disc/disk03/finalpage/2024-06-08/a77d6a34-c4eb-4dcf-9b16-7c2ce856ebdd.PDF"

Open the download page and check its URL: https://disc.static.szse.cn/disc/disk03/finalpage/2024-06-08/a77d6a34-c4eb-4dcf-9b16-7c2ce856ebdd.PDF

So the prefix to prepend is "https://disc.static.szse.cn".
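The prefix rule can be expressed in a few lines. This is a minimal sketch, assuming the response has the shape shown in the sample; the helper name `build_pdf_url` and the constant `PDF_HOST` are illustrative names chosen here, not part of the site's API.

```python
from urllib.parse import urljoin

# Assumption: the static host observed in the browser's address bar.
PDF_HOST = "https://disc.static.szse.cn"


def build_pdf_url(attach_path: str) -> str:
    """Join the static host with the relative attachPath from the JSON."""
    return urljoin(PDF_HOST, attach_path)


# One record trimmed from the sample response.
record = {
    "title": "华夏华润商业REIT:华夏华润商业资产封闭式基础设施证券投资基金招募说明书更新",
    "attachPath": "/disc/disk03/finalpage/2024-06-08/a77d6a34-c4eb-4dcf-9b16-7c2ce856ebdd.PDF",
}

pdf_url = build_pdf_url(record["attachPath"])
print(pdf_url)
```

Because "attachPath" already starts with a slash, plain string concatenation would work as well; `urljoin` simply avoids a doubled or missing slash if the path format ever changes.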

Enter the following prompt in DeepSeek:

You are a Python programming expert. Write a Python script that follows these steps:

Request URL: https://reits.szse.cn/api/disc/announcement/annList?random=0.3555675437003616
Request method: POST
Status code: 200 OK
Remote address: 58.251.50.138:443
Referrer policy: strict-origin-when-cross-origin
Request payload: {"seDate":["",""],"channelCode":["reits-xxpl"],"bigCategoryId":["directions"],"pageSize":50,"pageNum":1}

Request headers:
Accept: application/json, text/javascript, */*; q=0.01
Accept-Encoding: gzip, deflate, br, zstd
Accept-Language: zh-CN,zh;q=0.9,en;q=0.8
Connection: keep-alive
Content-Length: 104
Content-Type: application/json
Host: reits.szse.cn
Origin: https://reits.szse.cn
Referer: https://reits.szse.cn/disclosure/index.html
Sec-Ch-Ua: "Google Chrome";v="125", "Chromium";v="125", "Not.A/Brand";v="24"
Sec-Ch-Ua-Mobile: ?0
Sec-Ch-Ua-Platform: "Windows"
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-origin
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36
X-Request-Type: ajax
X-Requested-With: XMLHttpRequest

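Get the response returned by the page; it is nested JSON data;

Locate the value of the "title" key under the "data" key; this is the title of the PDF file;

Locate the value of the "attachPath" key under the "data" key; this is the path of the PDF file. Prepend "https://disc.static.szse.cn" to it to form the complete PDF download URL;

Download the PDF file and save it to the folder: F:\AI自媒体内容\AI炒股\REITs

Note: print status information at every step;

PDF titles may contain special characters that are not allowed in Windows file names, so clean the title before using it to name the PDF file;

After each PDF download, pause for a random 3-6 seconds;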

Source code:

import requests
import json
import os
import time
import random
import re

# Define the request URL and request headers
url = "https://reits.szse.cn/api/disc/announcement/annList?random=0.3555675437003616"
headers = {
    "Accept": "application/json, text/javascript, */*; q=0.01",
    # Note: requests decodes br/zstd only when the optional brotli/zstandard packages are installed
    "Accept-Encoding": "gzip, deflate, br, zstd",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
    "Connection": "keep-alive",
    "Content-Type": "application/json",
    "Host": "reits.szse.cn",
    "Origin": "https://reits.szse.cn",
    "Referer": "https://reits.szse.cn/disclosure/index.html",
    "Sec-Ch-Ua": '"Google Chrome";v="125", "Chromium";v="125", "Not.A/Brand";v="24"',
    "Sec-Ch-Ua-Mobile": "?0",
    "Sec-Ch-Ua-Platform": '"Windows"',
    "Sec-Fetch-Dest": "empty",
    "Sec-Fetch-Mode": "cors",
    "Sec-Fetch-Site": "same-origin",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
    "X-Request-Type": "ajax",
    "X-Requested-With": "XMLHttpRequest"
}

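The listing is cut off at this point in the source. As a rough sketch only, and not the author's actual code, the remaining steps described in the prompt could look like the following. The function names, the reduced header set, the timeout values, and the `max_items` parameter are all assumptions made for illustration; the endpoint, payload, host prefix, and save folder are taken from the prompt. `requests` is imported inside `download_all` so the pure helpers can be exercised without network access.

```python
import os
import random
import re
import time

API_URL = "https://reits.szse.cn/api/disc/announcement/annList?random=0.3555675437003616"
PDF_HOST = "https://disc.static.szse.cn"
SAVE_DIR = r"F:\AI自媒体内容\AI炒股\REITs"
PAYLOAD = {"seDate": ["", ""], "channelCode": ["reits-xxpl"],
           "bigCategoryId": ["directions"], "pageSize": 50, "pageNum": 1}
# Assumption: a reduced header set is enough for the server to answer.
HEADERS = {"Content-Type": "application/json",
           "Referer": "https://reits.szse.cn/disclosure/index.html",
           "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
           "X-Requested-With": "XMLHttpRequest"}


def safe_name(title):
    """Replace characters that Windows forbids in file names."""
    return re.sub(r'[\\/:*?"<>|]', "_", title).strip()


def extract_items(response_json):
    """Return (title, pdf_url) pairs from the nested JSON response."""
    items = []
    for record in response_json.get("data") or []:
        title, path = record.get("title"), record.get("attachPath")
        if title and path:
            items.append((title, PDF_HOST + path))
    return items


def download_all(save_dir=SAVE_DIR, max_items=None):
    """Fetch the announcement list, then download each PDF with a pause."""
    import requests  # third-party dependency, needed only for network calls

    os.makedirs(save_dir, exist_ok=True)
    print("Requesting announcement list...")
    resp = requests.post(API_URL, headers=HEADERS, json=PAYLOAD, timeout=30)
    resp.raise_for_status()
    items = extract_items(resp.json())[:max_items]
    print(f"Found {len(items)} PDF(s)")
    for title, pdf_url in items:
        target = os.path.join(save_dir, safe_name(title) + ".pdf")
        print(f"Downloading {pdf_url}")
        pdf = requests.get(pdf_url, timeout=60)
        pdf.raise_for_status()
        with open(target, "wb") as fh:
            fh.write(pdf.content)
        print(f"Saved {target}")
        time.sleep(random.uniform(3, 6))  # random 3-6 second pause
```

Calling `download_all()` runs the whole flow; `download_all(max_items=2)` limits a trial run to the first two files. The sample response reports 39 announcements and the payload asks for 50 per page, so a single request should cover the list, but a longer list would need further requests with a higher "pageNum".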





