Reinforcement Learning (4): Actor-Critic Methods

Popularity: 14  Published: 2023-12-12 01:06:30.0

Main idea:

Policy Network (Actor)

Value Network (Critic):
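
As a reminder of how the two networks usually fit together, the relation below is the standard actor-critic approximation of the state-value function. The parameter symbols \theta (policy network) and w (value network) are assumptions introduced here for illustration.

```latex
V(s; \theta, w) = \sum_{a} \pi(a \mid s; \theta)\, q(s, a; w)
```

Here \pi(a \mid s; \theta) is the actor's probability of taking action a in state s, and q(s, a; w) is the critic's score for that action.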

An intuitive analogy:

Train the Neural Networks

Steps:

Update value network q using TD
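
A minimal sketch of one common form of this TD update, assuming a SARSA-style target. To stay self-contained it uses a lookup table in place of a neural network, so the "gradient" is just the TD error on a single table entry. The function name `td_update` and the hyperparameters `alpha` and `gamma` are hypothetical choices for illustration.

```python
from collections import defaultdict


def td_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """One SARSA-style TD step on a tabular critic q[(state, action)]."""
    # TD target: observed reward plus the discounted estimate at the next pair.
    td_target = r + gamma * q[(s_next, a_next)]
    # TD error: gap between the current estimate and the target.
    td_error = q[(s, a)] - td_target
    # Gradient descent on 0.5 * td_error**2 with respect to q[(s, a)].
    q[(s, a)] -= alpha * td_error
    return td_error


# Assumed toy transition: state 0, action 1, reward 1.0, then state 1, action 0.
q = defaultdict(float)
delta = td_update(q, s=0, a=1, r=1.0, s_next=1, a_next=0)
```

With a neural critic, the same TD error would instead scale the gradient of q with respect to the network weights.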

Update policy network π using policy gradient
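
A minimal sketch of a policy gradient step in which the critic's value weights the gradient of the log-probability of the chosen action. It assumes a tabular softmax policy rather than a neural network. The names `policy_gradient_step`, `theta`, and `beta` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(theta, s, a, q_value, beta=0.1):
    """Ascend q_value * grad log pi(a | s) for a tabular softmax policy."""
    probs = softmax(theta[s])  # probabilities under the current parameters
    for b in range(len(theta[s])):
        # d log pi(a|s) / d theta[s][b] = 1{b == a} - pi(b|s)
        grad_log_pi = (1.0 if b == a else 0.0) - probs[b]
        theta[s][b] += beta * q_value * grad_log_pi
    return theta


# Assumed example: one state, two actions, critic scores the chosen action at 2.0.
theta = {0: [0.0, 0.0]}
policy_gradient_step(theta, s=0, a=1, q_value=2.0)
new_probs = softmax(theta[0])
```

Because the critic's score is positive in this example, the step raises the probability of the action that was taken.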

Actor-Critic Method

Summary of Algorithm
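
The loop below is a rough sketch of how the two updates are typically interleaved in one actor-critic iteration, not a definitive implementation. It assumes a tabular actor and critic in place of neural networks, and a made-up one-state environment (`toy_step`) in which action 1 pays reward 1 and action 0 pays nothing. All names and hyperparameters are hypothetical.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an action index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs)[0]


def toy_step(state, action):
    """Hypothetical one-state environment: action 1 pays 1, action 0 pays 0."""
    return state, float(action)


def actor_critic(steps=500, alpha=0.1, beta=0.05, gamma=0.9, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0]  # actor: softmax logits for the single state
    q = [0.0, 0.0]      # critic: tabular q(s, a) for the single state
    s = 0
    for _ in range(steps):
        probs = softmax(theta)
        a = sample(probs, rng)                 # 1. sample an action from the actor
        s_next, r = toy_step(s, a)             # 2. act, observe reward and next state
        a_next = sample(probs, rng)            # 3. next action, used only in the target
        q_t, q_next = q[a], q[a_next]          # 4. evaluate the critic
        td_error = q_t - (r + gamma * q_next)  # 5. TD error
        q[a] -= alpha * td_error               # 6. critic: gradient descent step
        for b in range(2):                     # 7. actor: gradient ascent step
            grad_log_pi = (1.0 if b == a else 0.0) - probs[b]
            theta[b] += beta * q_t * grad_log_pi
        s = s_next
    return softmax(theta), q


probs, q = actor_critic()
```

The actor step uses the critic's value from before the critic update; some formulations use the TD error there instead, which tends to reduce variance.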

Summary

Policy Network and Value Network

Training
