Questions tagged [keras-rl]
keras-rl: a framework for reinforcement learning with Keras.
11 questions
7
votes
1 answer
What are the effects of clipping the reward on stability?
I am looking to stabilize my DQN results. I found that clipping is one technique to do it, but I did not understand it completely!
1- What are the effects of clipping the reward, clipping the gradient, and clipping the error on stability, and how does…
user10296606
- 1,784
- 5
- 17
- 31
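For context, the three kinds of clipping named in this question are usually applied at different points of the training loop. A minimal sketch, assuming TensorFlow's Keras API (the threshold values are illustrative, not from the question):

    import numpy as np
    import tensorflow as tf

    # 1) Reward clipping: bound the reward before storing the transition.
    raw_reward = 3.7                        # illustrative value
    clipped_reward = np.clip(raw_reward, -1.0, 1.0)

    # 2) Gradient clipping: bound gradient norms inside the optimizer.
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)

    # 3) Error clipping: the Huber loss grows linearly instead of
    #    quadratically beyond delta, so large TD errors don't explode.
    loss = tf.keras.losses.Huber(delta=1.0)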
4
votes
1 answer
How to implement clipping the reward in DQN in Keras
How do I implement clipping the reward in DQN in Keras? In particular, how do I implement the reward clipping itself?
Is this pseudocode correct:

    if reward < -threshold:
        reward = -1
    elif reward > threshold:
        reward = 1
    elif -threshold…
user10296606
- 1,784
- 5
- 17
- 31
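In keras-rl the idiomatic hook for this is a Processor. A minimal sketch, assuming the goal is simply to clip rewards into [-1, 1] (the threshold-scaling variant from the question would slot into the same method):

    import numpy as np
    from rl.core import Processor

    class ClippedRewardProcessor(Processor):
        """Clip every reward into [-1, 1] before the agent sees it."""
        def process_reward(self, reward):
            return np.clip(reward, -1.0, 1.0)

    # Hooked into the agent like any keras-rl processor, e.g.:
    # dqn = DQNAgent(..., processor=ClippedRewardProcessor())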
4
votes
1 answer
What is a minimal setup to solve the CartPole-v0 with DQN?
I solved CartPole-v0 with a CEM agent pretty easily (experiments and code), but I am struggling to find a setup that works with DQN.
Do you know which parameters should be adjusted so that the mean reward is about 200 for this problem?
What I…
Martin Thoma
- 18,630
- 31
- 92
- 167
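For reference, the keras-rl repository ships a CartPole DQN example; a condensed sketch in that style looks like the code below (the hyperparameters are illustrative and not a guarantee of a mean reward of 200):

    import gym
    from keras.models import Sequential
    from keras.layers import Dense, Flatten
    from keras.optimizers import Adam
    from rl.agents.dqn import DQNAgent
    from rl.policy import BoltzmannQPolicy
    from rl.memory import SequentialMemory

    env = gym.make('CartPole-v0')
    nb_actions = env.action_space.n

    # Small MLP mapping the (window, state) observation to Q-values.
    model = Sequential([
        Flatten(input_shape=(1,) + env.observation_space.shape),
        Dense(16, activation='relu'),
        Dense(16, activation='relu'),
        Dense(nb_actions, activation='linear'),
    ])

    dqn = DQNAgent(model=model, nb_actions=nb_actions,
                   memory=SequentialMemory(limit=50000, window_length=1),
                   nb_steps_warmup=100, target_model_update=1e-2,
                   policy=BoltzmannQPolicy())
    dqn.compile(Adam(lr=1e-3), metrics=['mae'])
    dqn.fit(env, nb_steps=50000, visualize=False, verbose=1)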
3
votes
1 answer
Evaluating a trained Reinforcement Learning Agent?
I am new to training reinforcement learning agents. I have read about the PPO algorithm and used the stable-baselines library to train an agent with PPO. My question is: how do I evaluate a trained RL agent? Consider, for a regression or…
chink
- 555
- 9
- 17
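For readers with the same question: stable-baselines ships an evaluation helper that rolls out the trained policy for several episodes and aggregates the rewards. A minimal sketch (PPO2 and CartPole-v1 are stand-ins, since the question does not name the exact model or environment):

    import gym
    from stable_baselines import PPO2
    from stable_baselines.common.evaluation import evaluate_policy

    env = gym.make('CartPole-v1')                # placeholder environment
    model = PPO2('MlpPolicy', env).learn(10000)

    # Run the trained policy deterministically over n episodes and
    # report mean/std of the episode rewards.
    mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)
    print(f'mean reward: {mean_reward:.1f} +/- {std_reward:.1f}')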
2
votes
0 answers
Actions taken by agent / agent performance not improving
Hi, I am trying to develop an RL agent using the PPO algorithm. My agent takes an action (CFM) to keep a state variable called RAT between 24 and 24.5. I am using the PPO algorithm from the stable-baselines library to train my agent. I have trained the agent…
chink
- 555
- 9
- 17
2
votes
2 answers
Keras models break when I add batch normalization
I'm creating the model for a DDPG agent (keras-rl version), but I'm having some trouble with errors whenever I try adding batch normalization to the first of the two networks.
Here is the creation function as I'd like it to be:
def…
axon
- 23
- 4
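For comparison, a standalone Keras actor with batch normalization that builds without errors can look like the sketch below (layer sizes are illustrative; whether it then trains correctly inside keras-rl's DDPGAgent is a separate question):

    from keras.models import Sequential
    from keras.layers import Dense, BatchNormalization, Flatten

    def build_actor(observation_shape, nb_actions):
        """Hypothetical actor: Dense -> BatchNorm blocks, tanh output."""
        model = Sequential()
        model.add(Flatten(input_shape=(1,) + observation_shape))
        model.add(Dense(64, activation='relu'))
        model.add(BatchNormalization())
        model.add(Dense(64, activation='relu'))
        model.add(BatchNormalization())
        model.add(Dense(nb_actions, activation='tanh'))
        return model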
1
vote
0 answers
Is "nb_steps_warmup" set for each episode or globally?
When I configure a DQN agent, nb_steps_warmup can be set. Is this parameter set for each episode or once globally?
What I am trying to ask is: imagine I have a game environment that takes about 3000 steps max per episode. The DQN is fitted as…
StefanOverFlow
- 111
- 1
1
vote
1 answer
Formulation of a reward structure
I am new to reinforcement learning and am experimenting with training RL agents.
I have a doubt about reward formulation: from a given state, if an agent takes a good action I give a positive reward, and if the action is bad, I give a negative reward.…
chink
- 555
- 9
- 17
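As one concrete illustration of that pattern (hypothetical numbers, echoing the RAT band from the asker's other question): a shaped reward that is positive inside the target band and increasingly negative with the size of the violation outside it:

    def reward_fn(rat):
        """Hypothetical shaped reward: +1 inside the 24-24.5 band,
        negative and proportional to the violation outside it."""
        low, high = 24.0, 24.5
        if low <= rat <= high:
            return 1.0
        # Distance to the nearest band edge, used as a penalty.
        violation = (low - rat) if rat < low else (rat - high)
        return -violation

    print(reward_fn(24.2))   # 1.0
    print(reward_fn(25.0))   # -0.5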
1
vote
0 answers
with tf.device(DEVICE): model = modellib.MaskRCNN(mode="inference", model_dir=LOGS_DIR, config=config)
    ValueError  Traceback (most recent call last)
    /miniconda/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords)
        509    as_ref=input_arg.is_ref,
    --> 510…
shiva
- 11
- 1
0
votes
0 answers
Using reinforcement learning for binary classification
I want to build an agent for binary classification. I have a large dataset with two labels (0 and 1), and I want to build an agent to predict the labels. I have built a deep model, and now I want to build the agent. I use keras-rl2, but there is a problem: for dqn…
sdbvuf sbjdsfdib
- 1
- 1
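One common framing (an assumption about the setup, not stated in the question) is to wrap the dataset as a custom gym environment with one-step episodes, where the action is the predicted label and the reward is +1 or -1:

    import gym
    import numpy as np
    from gym import spaces

    class ClassifyEnv(gym.Env):
        """One-step episodes: observe one sample, predict its label."""
        def __init__(self, X, y):           # X: 2-D features, y: 0/1 labels
            self.X, self.y = X, y
            self.action_space = spaces.Discrete(2)      # the predicted label
            self.observation_space = spaces.Box(
                low=-np.inf, high=np.inf, shape=(X.shape[1],), dtype=np.float32)
            self.i = 0

        def reset(self):
            self.i = np.random.randint(len(self.X))
            return self.X[self.i].astype(np.float32)

        def step(self, action):
            # +1 for a correct prediction, -1 for a wrong one; episode ends.
            reward = 1.0 if action == self.y[self.i] else -1.0
            return self.X[self.i].astype(np.float32), reward, True, {}

An instance of this env can then be passed to a keras-rl2 DQNAgent's fit() like any gym environment.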
0
votes
1 answer
Q-Learning experience replay: how to feed the neural network?
I'm trying to replicate the DQN Atari experiment. My DQN isn't performing well; while checking other people's code, I saw something about experience replay which I don't understand. First, when you define your CNN, in the first layer you have to…
Joaquin
- 1
- 3
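For readers landing here: the heart of experience replay is sampling stored transitions into a minibatch and feeding whole state batches through the network, not single frames. A minimal numpy sketch (all names are illustrative; q_net is any Keras model mapping a state batch to per-action Q-values):

    import random
    import numpy as np
    from collections import deque

    buffer = deque(maxlen=100000)          # stores (s, a, r, s_next, done)

    def replay_step(q_net, batch_size=32, gamma=0.99):
        # Assumes the buffer already holds at least batch_size transitions.
        batch = random.sample(buffer, batch_size)
        s, a, r, s_next, done = map(np.array, zip(*batch))

        # Bellman targets: only the taken action's Q-value is updated.
        q_next = q_net.predict(s_next).max(axis=1)
        targets = q_net.predict(s)
        targets[np.arange(batch_size), a] = r + gamma * q_next * (1 - done)

        q_net.train_on_batch(s, targets)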