Column: AEii国际应用能源
Publishes news from the applied energy field, introduces the work of the International Applied Energy Innovation Institute (AEii), promotes outstanding applied-energy projects, and fosters cooperation in the applied energy field

[Latest Original Paper in Applied Energy] Ship Energy Scheduling with a DQN-CE Algorithm Combining Bi-directional LSTM and an Attention Mechanism

AEii国际应用能源 · WeChat official account · 2023-07-02 18:30

Main text

Original article:

Ship Energy Scheduling with DQN-CE Algorithm Combining Bi-directional LSTM and Attention Mechanism

Original link:

https://www.sciencedirect.com/science/article/pii/S0306261923007420

Highlights

Built an intelligent energy-scheduling environment for the AES power system.

Combined BI-LSTM with an attention mechanism to propose the BI-LSTM-Att action network.

Improved the DQN algorithm to propose the DQN-CE algorithm.

Compared multiple methods across several scenarios on an all-electric ferry, verifying the effectiveness and superiority of the proposed approach.

Abstract (translated)

Ship energy scheduling based on deep reinforcement learning (DRL) is an important current research direction, and the deep Q-learning algorithm (DQN) has already been applied successfully to energy scheduling in many cases. Addressing two shortcomings of DQN in all-electric-ship (AES) energy scheduling, namely the insufficient performance of a multilayer perceptron (MLP) as the action network and the degraded convergence caused by DQN's overestimation of Q values, this paper proposes a DQN cross-entropy (DQN-CE) energy-scheduling algorithm that combines a bi-directional LSTM with an attention mechanism (BI-LSTM-Att). First, BI-LSTM-Att replaces the MLP as DQN's action network; on this basis, an improved DRL algorithm, DQN-CE, is proposed: a cross-entropy loss between the actions that the target network and the action network predict for the next time step is added to the DQN loss function. Simulation experiments on an all-electric ferry show that, compared with the original DQN energy-scheduling algorithm, the proposed algorithm reduces economic consumption by 4.11%, increases utilization of the energy storage system by 24.4%, and shortens the agent's training-exploration time by 31.3%. Finally, the algorithm's effectiveness and superiority are further verified on a new case.
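The action network described above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not the paper's implementation: the layer sizes and the single-layer additive-attention form are assumptions.

```python
import torch
import torch.nn as nn

class BiLSTMAtt(nn.Module):
    """Bi-directional LSTM with an attention pooling layer, used as a DQN
    action (Q-value) network. Sizes and attention form are illustrative."""
    def __init__(self, state_dim, hidden_dim, n_actions):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.att = nn.Linear(2 * hidden_dim, 1)    # one score per time step
        self.head = nn.Linear(2 * hidden_dim, n_actions)

    def forward(self, x):                          # x: (batch, seq, state_dim)
        h, _ = self.lstm(x)                        # (batch, seq, 2*hidden)
        w = torch.softmax(self.att(h), dim=1)      # attention weights over seq
        ctx = (w * h).sum(dim=1)                   # weighted context vector
        return self.head(ctx)                      # Q-values, one per action
```

Replacing the MLP with such a network lets the agent weight the informative time steps of a load/price history rather than flattening them into one vector.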

For more articles on "Energy scheduling", see:

https://www.sciencedirect.com/search?qs=Energy%20scheduling&pub=Applied%20Energy&cid=271429

Abstract

At present, ship energy scheduling based on deep reinforcement learning (DRL) is an important research direction, in which the deep Q-learning algorithm (DQN) has been successfully applied to energy scheduling. To address the shortcomings of DQN in all-electric-ship (AES) energy scheduling, namely the insufficient performance of a multilayer perceptron (MLP) as the action network and the degraded convergence caused by overestimation of the Q value, this paper proposes a DQN cross-entropy (DQN-CE) energy-scheduling algorithm combining a bi-directional LSTM and an attention mechanism (BI-LSTM-Att). First, BI-LSTM-Att is used instead of an MLP as the action network of DQN; then an improved DRL algorithm called DQN-CE is proposed, in which a cross-entropy loss between the actions predicted for the next time step by the target network and the action network is added to the DQN loss. Simulation experiments on an all-electric ferry show that, compared with the original DQN energy-scheduling algorithm, the proposed algorithm reduces economic consumption by 4.11%, increases the utilization rate of the energy storage system by 24.4%, and reduces the exploration time of agent training by 31.3%. Finally, the effectiveness and superiority of the algorithm are further verified in a new case study.
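The extra loss term can be illustrated with a minimal PyTorch sketch. The weighting coefficient `beta` and the use of softmax distributions over next-state Q-values are assumptions about how the paper's "cross-entropy of predicted next-step actions" term might be realized, not its exact formulation.

```python
import torch
import torch.nn.functional as F

def dqn_ce_loss(q_net, target_net, batch, gamma=0.99, beta=0.1):
    """Standard DQN TD loss plus a cross-entropy term aligning the action
    distributions the online and target networks predict for the next state.
    `beta` is an assumed weighting coefficient."""
    s, a, r, s2, done = batch
    # TD target from the frozen target network (standard DQN)
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target_net(s2).max(1).values
        target = r + gamma * (1 - done) * q_next
    td_loss = F.mse_loss(q, target)
    # cross-entropy between next-state action distributions of both networks
    p_target = torch.softmax(target_net(s2), dim=1).detach()
    logp_online = torch.log_softmax(q_net(s2), dim=1)
    ce_loss = -(p_target * logp_online).sum(dim=1).mean()
    return td_loss + beta * ce_loss
```

Intuitively, the cross-entropy term penalizes the online network for drifting away from the target network's next-step action preferences, which is one way to damp the Q-value overestimation the paper targets.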

Keywords

Energy scheduling

AES

DQN

MLP

BI-LSTM-Att

DQN-CE

Fig. 7. BI-LSTM-Att action network.

Fig. 8. DQN-CE training flow with experience playback and parameter freezing mechanism.

Fig. 14. Scheduling results.

Fig. 16. Visualization result of attention mechanism. (a) BI-LSTM-Att (DQN). (b) BI-LSTM-Att (DQN-CE).






