Jobs which complete successfully or end on an error are going back into the queue to be rerun rather than being removed from it. This is a known issue and we are working on it.
A new part of the epilog was introduced, designed to clear any locks on /dev/ipath which jobs left behind. This worked, but if there were no locks to clear in the first place it raised an error.
Extra debugging and logging was added to find and record the above problem, and any future ones like it.
Once the original problem was resolved, the debugging (which retained the job's state as 'active' on nodes where it failed) remained, and certain jobs kept reusing this data, reporting that they had failed when in fact they no longer should have.
So the first part was fixed on Friday, and the last step was resolved this morning (Monday).
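For illustration only, here is a minimal sketch of the kind of guard that avoids the first failure: a cleanup step that treats "nothing to clear" as success. It is not the actual epilog. The function name `clear_stale_locks` and the `find_holders` and `release` callables are hypothetical stand-ins for however the real epilog finds and releases locks on /dev/ipath.

```python
from typing import Callable, Iterable


def clear_stale_locks(
    find_holders: Callable[[], Iterable[int]],
    release: Callable[[int], None],
) -> int:
    """Release leftover lock holders and return an epilog-style exit code.

    Both callables are hypothetical placeholders supplied by the caller.
    Returns 0 on success, including when there was nothing to clear,
    and 1 only if releasing a holder actually failed.
    """
    exit_code = 0
    for pid in find_holders():
        try:
            release(pid)
        except OSError:
            # A real failure to release is the only case reported as an error.
            exit_code = 1
    return exit_code
```

The point of the design is that an empty result from `find_holders` falls straight through the loop and returns 0, so a clean node is never reported as a failed job.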