Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/24394
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Gao, S | - |
dc.contributor.author | Shi, H | - |
dc.contributor.author | Wang, F | - |
dc.contributor.author | Wang, Z | - |
dc.contributor.author | Zhang, S | - |
dc.contributor.author | Li, Y | - |
dc.contributor.author | Sun, Y | - |
dc.date.accessioned | 2022-04-05T14:45:52Z | - |
dc.date.available | 2022-04-05T14:45:52Z | - |
dc.date.issued | 2022-02-16 | - |
dc.identifier.citation | Gao, S. et al. (2022) ‘Deterministic policy optimization with clipped value expansion and long-horizon planning’, Neurocomputing, 483, pp. 299–310. doi:10.1016/j.neucom.2022.02.022. | en_US |
dc.identifier.issn | 0925-2312 | - |
dc.identifier.uri | https://bura.brunel.ac.uk/handle/2438/24394 | - |
dc.description.sponsorship | National Key R&D Program of China (2019YFC1906201); National Natural Science Foundation of China (91748122). | en_US |
dc.format.extent | 299 - 310 | - |
dc.language.iso | en | en_US |
dc.publisher | Elsevier | en_US |
dc.subject | model-based reinforcement learning | en_US |
dc.subject | policy gradient | en_US |
dc.subject | sample efficiency | en_US |
dc.subject | planning | en_US |
dc.subject | imitation learning | en_US |
dc.title | Deterministic policy optimization with clipped value expansion and long-horizon planning | en_US |
dc.type | Article | en_US |
dc.identifier.doi | https://doi.org/10.1016/j.neucom.2022.02.022 | - |
dc.relation.isPartOf | Neurocomputing | - |
pubs.publication-status | Published | - |
pubs.volume | 483 | - |
dc.identifier.eissn | 1872-8286 | - |
Appears in Collections: Dept of Computer Science Embargoed Research Papers
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
FullText.pdf | Embargoed until 16 Feb 2024 | 2.84 MB | Adobe PDF |
Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.