Questions tagged [reward]

A reward is the environment's feedback in a reinforcement-learning setting. A reward function describes how an agent is rewarded for its actions in a given state.

When an agent takes a step, the feedback it receives from the environment is known as the reward.
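Concretely, a reward function is just a mapping from a transition to a scalar; a minimal sketch (the states and values below are illustrative, not tied to any question on this page):

```python
# Minimal illustration: a reward function maps a transition
# (state, action, next_state) to a scalar returned by the
# environment after each step. All values here are made up.

GOAL_STATE = 3  # hypothetical terminal state

def reward(state: int, action: int, next_state: int) -> float:
    """Return +1 for reaching the goal, a small penalty otherwise."""
    if next_state == GOAL_STATE:
        return 1.0
    return -0.01  # per-step penalty encourages shorter paths

print(reward(2, 1, 3))  # 1.0 (reached the goal)
print(reward(0, 1, 1))  # -0.01 (ordinary step)
```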

4 questions
1
vote
1 answer

What is a good reward function when the objective is to minimize the average along with the variance?

I am trying to formulate a problem in which we minimize the average resource allocated to different users. Due to some inherent properties of the environment, allocations for some users can be minimized easily, while for other users it is difficult due to…
user3656142
  • 181
  • 1
  • 6
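One common way to reward both a low mean and a low spread is a weighted combination of the two; a sketch (the trade-off weight `lam` is a hypothetical tuning parameter, not from the question):

```python
import statistics

def reward(allocations, lam=0.5):
    """Negative of (mean + lam * std): higher reward means a lower
    average allocation AND less variance across users.
    lam is an illustrative trade-off weight."""
    mean = statistics.fmean(allocations)
    spread = statistics.pstdev(allocations)
    return -(mean + lam * spread)

# Equal allocations score better than unequal ones with the same mean:
print(reward([2.0, 2.0, 2.0]))  # -2.0 (zero spread)
print(reward([1.0, 2.0, 3.0]))  # lower, because of the spread term
```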
0
votes
0 answers

Frozen baseline for policy gradient rewards

I have a continuous reinforcement learning problem for which I use policy gradients with a baseline to decrease the variance of the gradients. The baseline I use is the moving average of the rewards obtained during the last 10 time steps…
aby
  • 11
  • 1
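The moving-average baseline described in the excerpt can be sketched as follows; the advantage (reward minus baseline) is what would multiply the log-probability gradient in the policy-gradient update (the window of 10 follows the question; everything else is illustrative):

```python
from collections import deque

class MovingAverageBaseline:
    """Baseline = mean reward over the last `window` steps.
    Subtracting it from the current reward yields the advantage
    used in the policy-gradient update. Sketch only."""

    def __init__(self, window: int = 10):
        self.rewards = deque(maxlen=window)  # drops oldest automatically

    def advantage(self, reward: float) -> float:
        baseline = (sum(self.rewards) / len(self.rewards)
                    if self.rewards else 0.0)
        self.rewards.append(reward)
        return reward - baseline

b = MovingAverageBaseline(window=3)
for r in [1.0, 1.0, 4.0]:
    print(b.advantage(r))  # 1.0, then 0.0, then 3.0
```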
0
votes
0 answers

What can I infer if large negative penalties are not increasing?

I am running a deep RL algorithm with a custom reward function that I defined. I run the algorithm for at least 500 epochs, printing the total reward received by the actor network for each epoch. It is around $-10^5$ for the first epoch. After the…
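When total rewards sit around $-10^5$, the absolute magnitude matters less than the trend across epochs; a sketch that compares early and late moving averages to check whether learning is making progress (the window size and the simulated numbers are illustrative):

```python
def is_improving(epoch_rewards, window=50):
    """Compare the mean total reward of the first and last `window`
    epochs. A flat curve suggests the penalty terms dominate the
    learning signal or that training has stalled. Sketch only."""
    early = sum(epoch_rewards[:window]) / window
    late = sum(epoch_rewards[-window:]) / window
    return late > early

# Simulated run: rewards climb from -1e5 toward -9e4 over 500 epochs.
rewards = [-1e5 + 20 * i for i in range(500)]
print(is_improving(rewards))  # True
```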
0
votes
1 answer

How to write a reward function that optimizes for profit and revenue?

So I want to write a reward function for a reinforcement learning model that picks products to display to a customer. Each product has a profit margin percentage. Higher-priced products have a higher profit margin but a lower probability of being…
JimDoe
  • 23
  • 4
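One natural reward for this kind of trade-off is the expected profit of a display: purchase probability times the profit on a sale. A sketch (all prices, margins, and probabilities below are invented for illustration):

```python
def expected_profit_reward(price: float, margin: float,
                           p_purchase: float) -> float:
    """Reward = expected profit of showing one product:
    probability of purchase times the profit earned on a sale."""
    return p_purchase * price * margin

# A cheap, likely purchase can out-earn an expensive, unlikely one:
print(expected_profit_reward(price=20.0, margin=0.10, p_purchase=0.50))  # 1.0
print(expected_profit_reward(price=200.0, margin=0.30, p_purchase=0.01))
```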