Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-44554: refactor pdb targets (and internal tweaks) #26992

Merged
merged 14 commits into from
Jul 19, 2021

Conversation

jaraco
Copy link
Member

@jaraco jaraco commented Jul 2, 2021

  • Refactor module/script handling to share an interface (check method).
  • Import functools and adjust tests for the new line number for find_function.
  • Use cached_property for details.
  • Add blurb.

https://bugs.python.org/issue44554

Automerge-Triggered-By: GH:jaraco


if not run_as_module:
mainpyfile = os.path.realpath(mainpyfile)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line changes the meaning of mainpyfile mid-function. It's because of this behavior that ScriptTarget contains an 'orig' property (to keep both representations), even though the "real path" form is the preferred one except during error checking.

@iritkatriel
Copy link
Member

This changes the behavior of pdb, right?

@jaraco
Copy link
Member Author

jaraco commented Jul 2, 2021

This PR extracts the refactoring only, but not the fix for bpo-44461, as discussed in iritkatriel#16.

I would be careful with it - pdb invokes code in several different way (from command line, from the pdb prompt, from Pdb subclasses that various debuggers implement).

One place this refactoring could go awry is if there were other callers of Pdb._runmodule, because this code changes the parameter type for that method. By my estimation, that method was only ever called from main().

Please do have a look at this approach as it focuses on the refactoring and ignores the concerns for now in fixing the reported issue.

@jaraco
Copy link
Member Author

jaraco commented Jul 2, 2021

This changes the behavior of pdb, right?

I do not believe this changes the behavior except for a couple subtle differences. I'll point those out in the diff.

Copy link
Member

@iritkatriel iritkatriel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A refactor changes the code to improve its structure without changing its behavior. I don't think you can call this a refactor.

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@@ -1674,28 +1706,19 @@ def main():
print(_usage)
sys.exit(2)

commands = []
run_as_module = False
for opt, optarg in opts:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, the opts would be iterated over just once. In this new code, they are iterated over three times. Any performance degradation is negligible and outweighed by the value of disentangling the concerns.


target.check()

sys.argv[:] = args # Hide "pdb.py" and pdb options from argument list
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the previous implementation, assignment of sys.argv[:] would happen before the sys.path change. In the new code, the sys.path change happens during the ScriptTarget.check. I contend these two operations are independent and can happen in any order.

@iritkatriel
Copy link
Member

Please do have a look at this approach as it focuses on the refactoring and ignores the concerns for now in fixing the reported issue.

This PR doesn't really have anything to do with the reported issue. It should be a new BPO.

return res

def check(self):
if not os.path.exists(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, the existence check was done on mainpyfile before it was mutated to a "real path". Now the existence check is the real path. If compatibility is needed, this line could be changed to:

Suggested change
if not os.path.exists(self):
if not os.path.exists(self.orig):

But in my opinion, the existence check on either form is equivalent, and the form presented is more readable and distracts less from the primary intention for self.orig (the error message).

@jaraco
Copy link
Member Author

jaraco commented Jul 2, 2021

A refactor changes the code to improve its structure without changing its behavior. I don't think you can call this a refactor.

Perhaps I'm using the term too loosely. I believe the changes do not change the behavior or interfaces for the intended use-cases.

@jaraco
Copy link
Member Author

jaraco commented Jul 2, 2021

This PR doesn't really have anything to do with the reported issue. It should be a new BPO.

Respectfully, I disagree. The whole point of this refactor is to enable an earlier check for a runpy module target, symmetric to the syntax check on a script target, a technique which has been shown to fix the issue (thanks to your analysis and test).

@jaraco jaraco changed the title bpo-44461: refactor pdb targets bpo-44461: refactor pdb targets (and internal tweaks) Jul 2, 2021
@iritkatriel
Copy link
Member

A refactor changes the code to improve its structure without changing its behavior. I don't think you can call this a refactor.

Perhaps I'm using the term too loosely. I believe the changes do not change the behavior or interfaces for the intended use-cases.

I think you are using it too loosely. A refactor improves the code without changing its behavior. As I said, pdb has quite a few different use cases. I don't know it well enough to tell which changes are minor and which will break someone's debugger. Any behavior change needs to be associated with a bpo and news item that states clearly what the change is, not sneaked into a refactor.

I'm also not sure about the motivation here. We have a fix for the reported bug.

@jaraco
Copy link
Member Author

jaraco commented Jul 2, 2021

I think you are using it too loosely. A refactor improves the code without changing its behavior.

I've updated the blurb and PR title to reflect that it's more than a refactor. But I do genuinely expect this change to result in no significant behavior change. Can you point to a behavior change that you think is more than a refactor?

I'm also not sure about the motivation here. We have a fix for the reported bug.

The motivation comes from my observation here that when the target is a script, there's a separate check for its existence (and similarly later a separate trap for bad syntax) that doesn't go through the Pdb interactive module. In my opinion, passing an invalid module to -m is equivalent to passing a non-existent script and should be handled similarly. I would not expect the interactive prompt to appear if a non-executable module is indicated as a runpy target. In other words, I don't expect Pdb to break into runpy code any more than I would expect it to break into CPython code if a script didn't exist or had a syntax error.

As I analyzed the issue, following your analysis and initial patch, I felt uneasy with adding the .botframe property. I asked myself, why is the botframe property needed for only this use case? And as you found, it's because the Bdb.run is never called. That got me thinking - the Pdb interactive prompt probably shouldn't be invoked at all if .run is never reached. So I asked myself, why is .run never reached in this case, and it's because runpy is raising an exception loading the module and checking it for runability. And that's what led me to the consideration that the module should be checked for runability early, and a failure to pass runability shouldn't trigger the "Unhandled Exception" logic and interactive prompt.

@jaraco
Copy link
Member Author

jaraco commented Jul 2, 2021

We have a fix for the reported bug.

I believe that by first applying this refactor, then applying the test and fix in iritkatriel#16, you get a superior fix, avoiding adding a .botframe property to Bdb where it was previously not needed and avoiding the need to duplicately check for _user_requested_quit (although I suspect that change may also be independently beneficial). Additionally, with this refactor, the fix for the issue becomes a few lines in a single location (implementing ModuleTarget.check) instead of altering behavior in other classes and states.

@iritkatriel
Copy link
Member

Can you point to a behavior change that you think is more than a refactor?

Since it handles bad modules differently (we agree on that right?) it's already more than a refactor.

That may be a minor change, it might be a good change, I don't know. I need to spend some time reviewing it. I've moved on to work on something else between last week and now. I don't have time at the moment to convince myself that your change is safe. I'll have a look when I have time, unless someone else will before me.

@jaraco
Copy link
Member Author

jaraco commented Jul 2, 2021

Since it handles bad modules differently (we agree on that right?)

Sorry this wasn't clear. In this change, ModuleTarget.check is empty, matching the existing behavior. Bad modules are not handled. This change does not fix bpo-44461, but only refactors the code such that there is a designated place for such a check to exist. I would expect to merge this change prior to the fix for bpo-44461.

But as I think about it, perhaps the concerns this change addresses could be enumerated in its own bug.

I don't have time at the moment

No problem. I understand. Thanks for taking the time to review as much as you have already and for keeping me apprised of your schedule so I'm not left feeling abandoned.

@jaraco
Copy link
Member Author

jaraco commented Jul 2, 2021

I have made the requested changes; please review again

@bedevere-bot
Copy link

Thanks for making the requested changes!

@iritkatriel: please review the changes made to this pull request.

@bedevere-bot bedevere-bot requested a review from iritkatriel July 2, 2021 22:49
@jaraco
Copy link
Member Author

jaraco commented Jul 18, 2021

In fe8aea5, I've pushed an additional refactor that consolidates _run(:ModuleTarget) and _run(:ScriptTarget) by implementing the differential behaviors on each target. This approach dramatically simplifies the implementation of _run but also clarifies the purpose of each target.

This change has the additional benefit of restoring the association of the comments with their function. Prior to this change, the two block comments in _run_script were disassociated from their corresponding implementation in _run_module.

Copy link
Member

@iritkatriel iritkatriel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This version is indeed clearer. Can spec() and details() be _spec() and _details()?

@jaraco
Copy link
Member Author

jaraco commented Jul 18, 2021

This version is indeed clearer. Can spec() and details() be _spec() and _details()?

Done in c91ba80.

Copy link
Member

@iritkatriel iritkatriel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@miss-islington
Copy link
Contributor

@jaraco: Status check is done, and it's a success ❌ .

@miss-islington
Copy link
Contributor

@jaraco: Status check is done, and it's a success ✅ .

@miss-islington
Copy link
Contributor

Sorry, I can't merge this PR. Reason: 3 of 5 required status checks are expected..

@miss-islington
Copy link
Contributor

@jaraco: Status check is done, and it's a success ✅ .

@miss-islington miss-islington merged commit 2c20558 into main Jul 19, 2021
@miss-islington miss-islington deleted the bpo-44461-refactor-targets branch July 19, 2021 01:00
shihai1991 added a commit to shihai1991/cpython that referenced this pull request Jul 20, 2021
* origin/main: (1146 commits)
  bpo-42064: Finalise establishing sqlite3 global state (pythonGH-27155)
  bpo-44678: Separate error message for discontinuous padding in binascii.a2b_base64 strict mode (pythonGH-27249)
  correct spelling (pythonGH-27076)
  bpo-44524: Add missed __name__ and __qualname__ to typing module objects (python#27237)
  bpo-27513: email.utils.getaddresses() now handles Header objects (python#13797)
  Clean up comma usage in Doc/library/functions.rst (python#27083)
  bpo-42238: Fix small rst issue in NEWS.d/. (python#27238)
  bpo-41972: Tweak fastsearch.h string search algorithms (pythonGH-27091)
  bpo-44340: Add support for building with clang full/thin lto (pythonGH-27231)
  bpo-44661: Update property_descr_set to use vectorcall if possible. (pythonGH-27206)
  bpo-44645: Check for interrupts on any potentially backwards edge (pythonGH-27216)
  bpo-41546: make pprint (like print) not write to stdout when it is None (pythonGH-26810)
  bpo-44554: refactor pdb targets (and internal tweaks) (pythonGH-26992)
  bpo-43086: Add handling for out-of-spec data in a2b_base64 (pythonGH-24402)
  bpo-44561: Update hyperlinks in Doc/distributing/index.rst (python#27032)
  bpo-42355: symtable.get_namespace() now checks whether there are multiple or any namespaces found (pythonGH-23278)
  bpo-44654: Do not export the union type related symbols (pythonGH-27223)
  bpo-44633: Fix parameter substitution of the union type with wrong types. (pythonGH-27218)
  bpo-44654: Refactor and clean up the union type implementation (pythonGH-27196)
  bpo-20291: Fix MSVC warnings in getargs.c (pythonGH-27211)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants