学科分类
/ 1
1 个结果
  • 简介:Theadaptivecriticheuristichasbeenapopularalgorithminreinforcementlearning(RL)andapproximatedynamicprogramming(ADP)alike.ItisoneofthefirstRLandADPalgorithms.RLandADPalgorithmsareparticularlyusefulforsolvingMarkovdecisionprocesses(MDPs)thatsufferfromthecursesofdimensionalityandmodeling.Manyreal-worldproblems,however,tendtobesemi-Markovdecisionprocesses(SMDPs)inwhichthetimespentineachtransitionoftheunderlyingMarkovchainsisitselfarandomvariab...

  • 标签: 适应批评家 演员批评家 Semi-Markov 近似动态编程 加强学习