allow ending the episode for MaxStepsReached #4453

Merged
chriselion merged 3 commits into master from MLA-1345-manual-MaxStepReached on Sep 9, 2020

Conversation

chriselion
Contributor

@chriselion chriselion commented Sep 3, 2020

Proposed change(s)

This came up when trying to use MaxSteps on the match-3 sample. If you have multiple Agents in the scene using on-demand decisions, their step count will get incremented whenever the Academy steps, even if they're not doing anything. For training, you can work around this by requesting a decision each step, but for inference or heuristic mode when you might want to show animations of the Agent "boards", the decision request frames get out of sync and you end up hitting MaxSteps too soon.
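
To make the drift concrete, here is a toy model in plain C# with no Unity dependency. `ToyAcademy`, `ToyAgent`, and `StepDriftDemo` are hypothetical stand-ins, not ML-Agents classes. The only behavior borrowed from the description above is the assumption that every academy step increments every agent's step count, whether or not that agent requested a decision.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an agent; NOT the ML-Agents Agent class.
class ToyAgent
{
    public int MaxStep;        // episode budget, counted in academy steps
    public int StepCount;      // incremented on every academy step
    public int DecisionCount;  // incremented only when a decision is requested
    public bool HitMaxStep;

    public void RequestDecision() { DecisionCount++; }

    public void AcademyStepped()
    {
        StepCount++;
        if (MaxStep > 0 && StepCount >= MaxStep) HitMaxStep = true;
    }
}

// Hypothetical stand-in for the academy: stepping it steps every agent.
class ToyAcademy
{
    public List<ToyAgent> Agents = new List<ToyAgent>();

    public void Step()
    {
        foreach (var agent in Agents) agent.AcademyStepped();
    }
}

static class StepDriftDemo
{
    // Counts how many decisions an agent gets before MaxStep is hit when it
    // requests a decision only once every `framesPerDecision` academy steps
    // (for example, because it is waiting for an animation to finish).
    public static int DecisionsBeforeMaxStep(int maxStep, int framesPerDecision)
    {
        var agent = new ToyAgent { MaxStep = maxStep };
        var academy = new ToyAcademy();
        academy.Agents.Add(agent);

        int frame = 0;
        while (!agent.HitMaxStep)
        {
            if (frame % framesPerDecision == 0) agent.RequestDecision();
            academy.Step();
            frame++;
        }
        return agent.DecisionCount;
    }
}
```

In this model, `DecisionsBeforeMaxStep(100, 1)` returns 100, while `DecisionsBeforeMaxStep(100, 10)` returns 10: the agent that waits between decisions uses up the same step budget with a tenth of the decisions.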

I think this is the best solution without breaking existing behavior (a better approach would be to only count decision or action steps towards MaxSteps, but that's a breaking change). An alternative would be to make DoneReason public and allow that as an optional parameter to EndEpisode.

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

https://jira.unity3d.com/browse/MLA-1345

Types of change(s)

  • New feature

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

Contributor

@vincentpierre vincentpierre left a comment

I think the logic is cleaner, but I have some reservations about adding the new public method. Happy to hear your thoughts.

/// </remarks>
/// <seealso cref="OnEpisodeBegin"/>
/// <seealso cref="EndEpisode"/>
public void EpisodeMaxStepReached()
Contributor

I would prefer to call this Interrupted rather than MaxStepReached. The reason is that MaxStepReached will not (I hope) remain the only reason an agent can be interrupted. A user-triggered scene reset should have the same effect as MaxStepReached (for example, if the Agents reset through no fault of their own).

Contributor Author

ok, so just rename this to EpisodeInterrupted (and change the docs accordingly)?
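
As a rough sketch of how the renamed method might serve the on-demand-decision case from the PR description: the agent counts its own decisions and signals an interruption itself, instead of relying on a step count driven by the academy. `ToyBoardAgent`, `ToyDoneReason`, and `DecisionBudget` are hypothetical and written in plain C# without Unity. Only the method names `EndEpisode` and `EpisodeInterrupted` come from this thread, and the real Agent class will differ.

```csharp
using System;

// Hypothetical enum illustrating the idea of a "done reason"; names are made up.
enum ToyDoneReason { DoneCalled, Interrupted }

// Hypothetical agent; NOT the ML-Agents Agent class.
class ToyBoardAgent
{
    public int DecisionBudget;  // assumed per-episode limit, counted in decisions
    public int DecisionCount { get; private set; }
    public int CompletedEpisodes { get; private set; }
    public ToyDoneReason? LastDoneReason { get; private set; }

    // The episode ended because of something the agent did (success or failure).
    public void EndEpisode() => Finish(ToyDoneReason.DoneCalled);

    // The episode was cut short for an external reason, such as a step limit
    // or a user-triggered reset.
    public void EpisodeInterrupted() => Finish(ToyDoneReason.Interrupted);

    public void RequestDecision()
    {
        DecisionCount++;
        // Only decisions count toward the budget, so idle frames spent
        // waiting on animations do not shorten the episode.
        if (DecisionBudget > 0 && DecisionCount >= DecisionBudget)
        {
            EpisodeInterrupted();
        }
    }

    void Finish(ToyDoneReason reason)
    {
        LastDoneReason = reason;
        CompletedEpisodes++;
        DecisionCount = 0;  // start the next episode with a fresh count
    }
}
```

Keeping the two entry points separate lets calling code express why the episode stopped, which is the distinction the rename to Interrupted is meant to capture.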

@chriselion chriselion merged commit fce3521 into master Sep 9, 2020
@delete-merged-branch delete-merged-branch bot deleted the MLA-1345-manual-MaxStepReached branch September 9, 2020 22:37
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 10, 2021