1 year ago

#385831

Dzartz94

How do I discretise a continuous observation and action space in Python?

My professor has asked me to apply a Policy Iteration method to the Pendulum-v1 environment in OpenAI Gym.

Pendulum-v1 has the following observation and action spaces:

Observation

Type: Box(3)

Num  Observation  Min   Max
0    cos(theta)   -1.0  1.0
1    sin(theta)   -1.0  1.0
2    theta dot    -8.0  8.0

Actions

Type: Box(1)

Num  Action        Min   Max
0    Joint effort  -2.0  2.0
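
For concreteness, here is roughly how I imagine binning these ranges. The bin counts (15 per observation dimension, 9 torque levels) and the helper name obs_to_state are arbitrary choices of mine, not anything prescribed by Gym:

```python
import numpy as np

N_BINS = 15     # bins per observation dimension (arbitrary choice)
N_ACTIONS = 9   # number of discrete torque levels (arbitrary choice)

# Observation ranges taken from the table above: cos(theta), sin(theta), theta dot
obs_low = np.array([-1.0, -1.0, -8.0])
obs_high = np.array([1.0, 1.0, 8.0])

# Interior bin edges for each observation dimension
obs_edges = [np.linspace(lo, hi, N_BINS + 1)[1:-1]
             for lo, hi in zip(obs_low, obs_high)]

# Discrete torque values spanning the action range [-2, 2]
discrete_actions = np.linspace(-2.0, 2.0, N_ACTIONS)

def obs_to_state(obs):
    """Map a continuous 3-vector observation to a single integer state index."""
    idx = [int(np.digitize(o, edges)) for o, edges in zip(obs, obs_edges)]
    return int(np.ravel_multi_index(idx, (N_BINS,) * 3))

n_states = N_BINS ** 3  # 15**3 = 3375 discrete states
```

The torque passed to env.step would then be np.array([discrete_actions[a]]) for a discrete action index a.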

From my understanding, Policy Iteration requires a discrete action space, a discrete observation space, and known transition probabilities, as in the FrozenLake OpenAI Gym environment. I know there are methods designed for Box-type data over a continuous range, but the requirement is to apply a "correct" Policy Iteration method and explain why it doesn't work.

Does anyone have a source, know of a code repo, or could help me with how to discretise the action and observation data and apply Policy Iteration to it? Everything I have read tells me this is a bad way to solve the problem, and I cannot find anyone who has actually implemented this method on Pendulum-v1.
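
For reference, this is the kind of standard tabular Policy Iteration I would want to run once everything is discretised. P and R here are placeholders for transition and reward tables that would still have to be estimated somehow (e.g. by stepping the environment once from the centre of each state bin with each discrete torque); their shapes and names are my own assumptions, not part of any Gym API:

```python
import numpy as np

def policy_iteration(P, R, gamma=0.95, tol=1e-6):
    """Standard tabular Policy Iteration.

    P: (n_states, n_actions, n_states) transition probabilities (assumed given)
    R: (n_states, n_actions) expected immediate rewards (assumed given)
    """
    n_states, n_actions, _ = P.shape
    policy = np.zeros(n_states, dtype=int)  # start with the first action everywhere
    V = np.zeros(n_states)

    while True:
        # Policy evaluation: iterate the Bellman expectation backup to convergence
        while True:
            P_pi = P[np.arange(n_states), policy]   # (n_states, n_states)
            R_pi = R[np.arange(n_states), policy]   # (n_states,)
            V_new = R_pi + gamma * P_pi @ V
            converged = np.max(np.abs(V_new - V)) < tol
            V = V_new
            if converged:
                break

        # Policy improvement: act greedily with respect to the current V
        Q = R + gamma * P @ V                       # (n_states, n_actions)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            return policy, V
        policy = new_policy
```

The part I cannot figure out is how to build honest P and R tables for Pendulum-v1 in the first place, which is why I am asking.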

Tags: python, reinforcement-learning, openai-gym, discretization, openai-api

0 Answers
