Make pyvenv style virtual environments easier to configure when embedding Python #66409
In an embedded system, the 'python' executable itself is not run; the Python interpreter is initialised in process explicitly using Py_Initialize(). To find the location of the Python installation, an elaborate sequence of checks is run, as implemented in calculate_path() in Modules/getpath.c. The primary mechanism is usually to search PATH for a 'python' executable and use that as a starting point. From there it backtracks up the file system from the bin directory to arrive at what would be the perceived equivalent of PYTHONHOME, and the lib/pythonX.Y directory under that for the matching version X.Y of Python being initialised is then used.

Problems often occur with the way this search is done, though. For example, suppose someone is not using the system Python installation but has installed a different version of Python under /usr/local. At run time the correct Python shared library would be loaded from /usr/local/lib, but because the 'python' executable is found in /usr/bin, /usr is used as sys.prefix instead of /usr/local.

This can cause two distinct problems. The first is that there is no Python installation at all under /usr corresponding to the Python version which was embedded, with the result that the 'site' module cannot be imported and startup fails. The second is that there is a Python installation with the same major/minor version but potentially a different patch revision, or one compiled with different binary API flags or a different Unicode character width. The Python interpreter in this case may well be able to start up, but the mismatch between the Python modules or extension modules and the core Python library that was actually linked can cause odd errors or crashes.

Anyway, that is the background. For an embedded system, the way this problem was overcome was to use Py_SetPythonHome() to forcibly override what should be used for PYTHONHOME, so that the correct installation was found and used at runtime.
This works quite happily even for Python virtual environments constructed using 'virtualenv', allowing the embedded system to run in a virtual environment distinct from the main Python installation it was created from. However, it doesn't work if the virtual environment was created using pyvenv. One can easily illustrate the problem without even using an embedded system:

$ which python3.4
/Library/Frameworks/Python.framework/Versions/3.4/bin/python3.4
$ pyvenv-3.4 py34-pyvenv
$ py34-pyvenv/bin/python
Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 00:54:21)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.prefix
'/private/tmp/py34-pyvenv'
>>> sys.path
['', '/Library/Frameworks/Python.framework/Versions/3.4/lib/python34.zip', '/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4', '/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/plat-darwin', '/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/lib-dynload', '/private/tmp/py34-pyvenv/lib/python3.4/site-packages']
$ PYTHONHOME=/tmp/py34-pyvenv python3.4
Fatal Python error: Py_Initialize: unable to load the file system codec
ImportError: No module named 'encodings'
Abort trap: 6

The basic problem is that in a pyvenv virtual environment there is no duplication of the contents of lib/pythonX.Y; the only thing in there is the site-packages directory. When you start the 'python' executable directly from the pyvenv virtual environment, the startup checks detect this and consult pyvenv.cfg to extract the:

home = /Library/Frameworks/Python.framework/Versions/3.4/bin

setting, and from that derive where the actual runtime files are. When PYTHONHOME or Py_SetPythonHome() is used, however, the getpath.c checks blindly believe that value is authoritative:
    /* If PYTHONHOME is set, we believe it unconditionally */
    if (home) {
        wchar_t *delim;
        wcsncpy(prefix, home, MAXPATHLEN);
        prefix[MAXPATHLEN] = L'\0';
        delim = wcschr(prefix, DELIM);
        if (delim)
            *delim = L'\0';
        joinpath(prefix, lib_python);
        joinpath(prefix, LANDMARK);
        return 1;
    }
Because of this, the problem above occurs: the proper runtime directories aren't included in sys.path, with the result that the 'encodings' module cannot even be found.

What I believe should occur is that PYTHONHOME should not be believed unconditionally. Instead there should be a check for a pyvenv.cfg file in that directory; if one exists, recognise that it is a pyvenv style virtual environment and make the same sort of adjustments that would be made based on what that pyvenv.cfg file contains.

For the record, this issue is affecting Apache/mod_wsgi, and right now the only workaround I have is to tell people that in addition to setting the configuration setting corresponding to PYTHONHOME, they must use configuration settings with the same effect as:

PYTHONPATH=/Library/Frameworks/Python.framework/Versions/3.4/lib/python34.zip:/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4:/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/plat-darwin:/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/lib-dynload

so that the correct runtime files are found. I am still trying to work out a more permanent workaround I can add to the mod_wsgi code itself, since I can't rely on a fix for existing Python versions with pyvenv support. The only other option is to tell people not to use pyvenv and to use virtualenv instead.

Right now I can offer no actual patch, as the getpath.c code is scary enough that I'm not even sure at this point where the check should be incorporated or how. The only thing I can surmise is that because the current check for pyvenv.cfg occurs before the search for the prefix, it isn't consulted:
        joinpath(tmpbuffer, env_cfg);
        env_file = _Py_wfopen(tmpbuffer, L"r");
        if (env_file == NULL) {
            errno = 0;
            reduce(tmpbuffer);
            reduce(tmpbuffer);
            joinpath(tmpbuffer, env_cfg);
            env_file = _Py_wfopen(tmpbuffer, L"r");
            if (env_file == NULL) {
                errno = 0;
            }
        }
        if (env_file != NULL) {
            /* Look for a 'home' variable and set argv0_path to it, if found */
            if (find_env_config_value(env_file, L"home", tmpbuffer)) {
                wcscpy(argv0_path, tmpbuffer);
            }
            fclose(env_file);
            env_file = NULL;
        }
    }

    pfound = search_for_prefix(argv0_path, home, _prefix, lib_python);
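For reference, the pyvenv.cfg lookup that this getpath.c excerpt performs can be modelled in Python. This is a simplified sketch, not the actual implementation: the two reduce() calls correspond to retrying one directory level up, and the function name here is hypothetical.

```python
import os

def find_venv_home(executable_dir):
    """Simplified model of the pyvenv.cfg check in getpath.c: look for
    pyvenv.cfg alongside the executable, then one directory up, and
    return its 'home' value if present."""
    for candidate in (executable_dir, os.path.dirname(executable_dir)):
        cfg = os.path.join(candidate, "pyvenv.cfg")
        try:
            with open(cfg, encoding="utf-8") as f:
                for line in f:
                    key, sep, value = line.partition("=")
                    if sep and key.strip() == "home":
                        return value.strip()
        except FileNotFoundError:
            continue
    return None
```

The suggestion in the original report amounts to running this same check against the PYTHONHOME value instead of believing that value unconditionally.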
Yeah, PEP-432 (my proposal to redesign the startup sequence) could just as well be subtitled "getpath.c hurts my brain" :P

One tricky part here is going to be figuring out how to test this - perhaps adding a new test option to _testembed and then running it both inside and outside a venv.
Graham pointed out that setting PYTHONHOME ends up triggering the same control flow through getpath.c as calling Py_SetPythonHome, so this can be tested just with pyvenv and a suitably configured environment. It may still be a little tricky though, since we normally run the pyvenv tests in isolated mode to avoid spurious failures due to bad environment settings...
Some more experiments, comparing an installed vs uninstalled Python. One failure mode is that setting PYTHONHOME just plain breaks running from a source checkout (setting PYTHONHOME to the checkout directory also fails):

$ ./python -m venv --without-pip /tmp/issue22213-py35
$ /tmp/issue22213-py35/bin/python -c "import sys; print(sys.base_prefix, sys.base_exec_prefix)"
/usr/local /usr/local
$ PYTHONHOME=/usr/local /tmp/issue22213-py35/bin/python -c "import sys; print(sys.base_prefix, sys.base_exec_prefix)"
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
Aborted (core dumped)

Trying after running "make altinstall" (which I had previously done for 3.4) is a bit more enlightening:

$ python3.4 -m venv --without-pip /tmp/issue22213-py34
$ /tmp/issue22213-py34/bin/python -c "import sys; print(sys.base_prefix, sys.base_exec_prefix)"
/usr/local /usr/local
$ PYTHONHOME=/usr/local /tmp/issue22213-py34/bin/python -c "import sys; print(sys.base_prefix, sys.base_exec_prefix)"
/usr/local /usr/local
$ PYTHONHOME=/tmp/issue22213-py34 /tmp/issue22213-py34/bin/python -c "import sys; print(sys.base_prefix, sys.base_exec_prefix)"
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
Aborted (core dumped)
$ PYTHONHOME=/tmp/issue22213-py34:/usr/local /tmp/issue22213-py34/bin/python -c "import sys; print(sys.base_prefix, sys.base_exec_prefix)"
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
Aborted (core dumped)
[ncoghlan@lancre py34]$ PYTHONHOME=/usr/local:/tmp/issue22213-py34/bin /tmp/issue22213-py34/bin/python -c "import sys; print(sys.base_prefix, sys.base_exec_prefix)"
/usr/local /tmp/issue22213-py34/bin

I think what this is actually showing is that there's a fundamental conflict between mod_wsgi's expectation of being able to set PYTHONHOME to point to the virtual environment, and the way PEP-405 virtual environments actually work. With PEP-405, all the operations in getpath.c expect to execute while pointing to the *base* environment: where the standard library lives. It is then up to site.py to later adjust the base prefix location, as can be demonstrated by the fact that pyvenv.cfg isn't processed if processing of the site module is disabled:

$ /tmp/issue22213-py34/bin/python -c "import sys; print(sys.prefix, sys.exec_prefix)"
/tmp/issue22213-py34 /tmp/issue22213-py34
$ /tmp/issue22213-py34/bin/python -S -c "import sys; print(sys.prefix, sys.exec_prefix)"
/usr/local /usr/local

At this point in time, there isn't an easy way for an embedding application to say "here's the standard library, here's the virtual environment with user packages" - it's necessary to just override the path calculations entirely. Allowing that kind of more granular configuration is one of the design goals of PEP-432, so adding that as a dependency here.
It is actually very easy for me to work around, and I released a new mod_wsgi version today which works. When I get a Python home option, instead of calling Py_SetPythonHome() with it, I append '/bin/python' to it and call Py_SetProgramName() instead.
Excellent! If I recall correctly, that works because we resolve the symlink when looking for the standard library, but not when looking for the venv configuration file. I also suspect this is all thoroughly broken on Windows - there are so many configuration operations and platform specific considerations that need to be accounted for in getpath.c these days that it has become close to incomprehensible :( One of my main goals with PEP-432 is actually to make it possible to rewrite the path configuration code in a more maintainable way - my unofficial subtitle for that PEP is "getpath.c must die!" :)
I only make the change to Py_SetProgramName() on UNIX and not Windows. This is because back in mod_wsgi 1.0 I actually did use Py_SetProgramName(), but it didn't seem to work in a sane way on Windows, so I changed to Py_SetPythonHome(), which worked on both Windows and UNIX. The latest versions of mod_wsgi haven't been updated yet to even build on Windows, so I'm not worrying about Windows right now.
That workaround would definitely deserve being wrapped in a higher-level API invokable by embedding applications, IMHO.
(Added Victor, Eric, and Steve to the nosy list here, as I'd actually forgotten about this until issue bpo-35706 reminded me.)

Core of the problem: the embedding APIs don't currently offer a Windows-compatible way of setting up "use this base Python and this venv site-packages", and the way of getting it to work on other platforms is pretty obscure.
Victor may be thinking about it from time to time (or perhaps it's time to make the rest of the configuration changes plans concrete so we can all help out?), but I'd like to see this as either:
In the latter case, the main python.exe also gets to define its behavior. So for the most part, we should be able to remove getpath[p].c and move it into the site module, then make that our Python initialization step. This would also mean that if you are embedding Python but not allowing imports (e.g. as only a calculation engine), you don't have to do the dance of _denying_ all lookups, you simply don't initialize them. But as far as I know, we don't have a concrete vision for "how will consumers embed Python in their apps" that can translate into work - we're still all individually pulling in slightly different directions. Sorting that out is most important - having someone willing to do the customer engagement work to define an actual set of requirements and desirables would be fantastic.

Yeah, I mainly cc'ed Victor and Eric since making this easier ties into one of the original design goals for PEP-432 (even though I haven't managed to persuade either of them to become co-authors of that PEP yet).
PEP-432 will allow fine control over the parameters used to initialize Python. Sadly, I failed to agree with Nick Coghlan and Eric Snow on the API. The current implementation (_PyCoreConfig and _PyMainInterpreterConfig) has some flaws (it doesn't clearly separate the early initialization from the Unicode-ready state; the interpreter contains both main and core config, whereas some options are duplicated in both configs; etc.). See also bpo-35706.
I just closed 35706 as a duplicate of this one (the titles are basically identical, which feels like a good hint ;) ) It seems that the disagreement about the design is fundamentally a disagreement between a "quick, painful but complete fix" and "slow, careful improvements with a transition period". Both are valid approaches, and since Victor is putting actual effort in right now he gets to "win", but I do think we can afford to move faster. It seems the main people who will suffer from the pain here are embedders (who are already suffering pain) and the core developers (who explicitly signed up for pain!). But without knowing the end goal, we can't accelerate. Currently PEP-432 is the best description we have, and it looks like Victor has been heading in that direction too (deliberately? I don't know :) ). But it seems like a good time to review it, replace the "here's the current state of things" with "here's an imaginary ideal state of things" and fill the rest with "here are the steps to get there without breaking the world". By necessity, it touches a lot of people's contributions to Python, but it also has the potential to seriously improve even more people's ability to _use_ Python (for example, I know an app that you all would recognize the name of who is working on embedding Python right now and would _love_ certain parts of this side of things to be improved). Nick - has the steering council been thinking about ways to promote collaborative development of ideas like this? I'm thinking an Etherpad style environment for the brainstorm period (in lieu of an in-person whiteboard session) that's easy for us all to add our concerns to, that can then be turned into something more formal. Nick, Victor, Eric, (others?) - are you interested in having a virtual whiteboard session to brainstorm how the "perfect" initialization looks? And probably a follow-up to brainstorm how to get there without breaking the world? 
I don't think we're going to get to be in the same room anytime before the language summit, and it would be awesome to have something concrete to discuss there.
Technically, the API already exists and is exposed as a private API:
I'm not really sure of the benefit compared to the current initialization API using Py_xxx global configuration variables (ex: Py_IgnoreEnvironmentFlag) and Py_Initialize(). _PyCoreConfig basically exposes *all* input parameters used to initialize Python, much more than just the global configuration variables and the few functions that can be called before Py_Initialize():
Well, it's a strange story. At the beginning, I had a very simple use case... it took me more or less one year to implement it :-) My use case was to add... a new -X utf8 command line option:
If the utf8 mode is enabled (PEP-540), the encoding must be set to UTF-8, all configuration must be removed and the whole configuration (env vars, cmdline, etc.) must be read again from scratch :-) To be able to do that, I had to collect *every single* thing which has an impact on the Python initialization: all things that I moved into _PyCoreConfig. ... but I didn't want to break the backward compatibility, so I had to keep support for Py_xxx global configuration variables... and also the few initialization functions like Py_SetPath() or Py_SetStandardStreamEncoding(). Later it becomes very dark, my goal became very unclear and I looked at the PEP-432 :-) Well, I wanted to expose _PyCoreConfig somehow, so I looked at the PEP-432 to see how it can be exposed.
_PyCoreConfig "API" makes some things way simpler. Maybe it was already possible to do them previously but it was really hard, or maybe it was just not possible. If a _PyCoreConfig field is set, it has priority over any other way to initialize the field: _PyCoreConfig has the highest priority. For example, _PyCoreConfig allows one to completely ignore the code which computes sys.path (and related variables) by setting the "path configuration" directly:
The code which initializes these fields is really complex. Without _PyCoreConfig, it's hard to make sure that these fields are properly initialized as an embedder would like.
Sorry, I'm not sure of the API / structures, but when I discussed with Eric Snow at the latest sprint, we identified different steps in the Python initialization:
-- Once I experimented with reorganizing _PyCoreConfig and _PyMainInterpreterConfig to avoid redundancy: add a _PyPreConfig which contains only the fields which are needed before _PyMainInterpreterConfig. With that change, _PyMainInterpreterConfig (and _PyPreConfig) *contained* _PyCoreConfig. But the change became very large, and I wasn't sure that it was a good idea, so I abandoned my change.
-- Ok, something else. _PyCoreConfig (and _PyMainInterpreterConfig) contain memory allocated on the heap. Problem: Python initialization changes the memory allocator. Code using _PyCoreConfig requires some "tricks" to ensure that the memory is *freed* with the same allocator used to *allocate* it. I created bpo-35265 "Internal C API: pass the memory allocator in a context" to pass a "context" to a lot of functions, a context which contains the memory allocator but can contain more things later. The idea of a "context" came during the discussion about a new C API: stop relying on any global variable or "shared state", and *explicitly* pass a context to all functions. With that, it becomes possible to imagine having two interpreters running in the same thread "at the same time". Honestly, I'm not really sure that it's fully possible to implement this idea... Python has *so many* shared states, like *everywhere*. It's really a giant project to move these shared states into structures and pass pointers to these structures. So again, I abandoned my experimental change.

-- Memory allocator, context, different structures for configuration... it's really not an easy topic :-( There are so many constraints put on a single API! The conservative option at this point is to keep the API private. ... Maybe we can explain how to use the private API, but very explicitly warn that this API is experimental and can be broken anytime... And I plan to break it, to avoid redundancy between core and main configuration for example. ... I hope that these explanations give you a better idea of the big picture and the challenges :-)
Thanks, Victor, that's great information.
This is why I'm keen to design the ideal *user* API first (that is, write the examples of how you would use it) and then figure out how we can make it fit. It's kind of the opposite approach from what you've been doing to adapt the existing code to suit particular needs. For example, imagine instead of all the PySet*() functions followed by Py_Initialize() you could do this:

    PyObject *runtime = PyRuntime_Create();

    /* optional calls */
    PyRuntime_SetAllocators(runtime, &my_malloc, &my_realloc, &my_free);
    PyRuntime_SetHashSeed(runtime, 12345);

    const char *init_code = """
    sys.executable = argv0
    sys.prefix = os.path.dirname(argv0)
    sys.path = [os.getcwd(), sys.prefix, os.path.join(sys.prefix, "Lib")]
    pyvenv = os.path.join(sys.prefix, "pyvenv.cfg")
    try:
        with open(pyvenv, "r", encoding="utf-8") as f:  # *only* utf-8 support at this stage
            for line in f:
                if line.startswith("home"):
                    sys.path.append(line.partition("=")[2].strip())
                    break
    except FileNotFoundError:
        pass
    if sys.platform == "win32":
        sys.stdout = open("CONOUT$", "w", encoding="utf-8")
    else:
        # no idea if this is right, but you get the idea
        sys.stdout = open("/dev/tty", "w", encoding="utf-8")
    """;

    PyObject *globals = PyDict_New();
    /* only UTF-8 support at this stage */
    PyDict_SetItemString(globals, "argv0", PyUnicode_FromString(argv[0]));
    PyRuntime_Initialize(runtime, init_code, globals);
    Py_DECREF(globals);

    PyEval_EvalString("open('file.txt', 'w', encoding='gb18030').close()");

    /* may as well reuse DECREF for consistency */
    Py_DECREF(runtime);

Maybe it's a terrible idea? Honestly I'd be inclined to do other big changes at the same time (make PyObject opaque and interface driven, for example). My point is that if the goal is to "move the existing internals around" then that's all we'll ever achieve. If we can say "the goal is to make this example work" then we'll be able to do much more.
On Wed, Feb 13, 2019 at 10:56 AM Steve Dower <[email protected]> wrote:
Count me in. This is a pretty important topic and doing this would
On Wed, Feb 13, 2019 at 5:09 PM Steve Dower <[email protected]> wrote:
That makes sense. :)
FYI, we already have a _PyRuntimeState struct (see
Note that one motivation behind PEP-432 (and its config structs) is to I don't know that you consciously intended to move away from the dense
Hmm, there are two ways we could go with this: keep using TLS (or
Nice. I like that this keeps the init code right by where it's used,
I definitely like the approach of directly embedding the Python code
Nah, we definitely want to maximize simplicity and your example offers
Definitely! Those aren't big blockers on cleaning up initialization
Yep. I suppose part of the problem is that the embedding use cases
Steve, you're describing the goals of PEP-432 - design the desired API, then write the code to implement it. So while Victor's goal was specifically to get PEP-540 implemented, mine was just to make it so working on the startup sequence was less awful (and in particular, to make it possible to rewrite getpath.c in Python at some point). Unfortunately, it turns out that redesigning a going-on-thirty-year-old startup sequence takes a while, as we first have to discover what all the global settings actually *are* :)

https://www.python.org/dev/peps/pep-0432/#invocation-of-phases describes an older iteration of the draft API design that was reasonably accurate at the point where Eric merged the in-development refactoring as a private API (see https://bugs.python.org/issue22257 and https://www.python.org/dev/peps/pep-0432/#implementation-strategy for details). However, that initial change was basically just a skeleton - we didn't migrate many of the settings over to the new system at that point (although we did successfully split the import system initialization into two parts, so you can enable builtin and frozen imports without necessarily enabling external ones).

The significant contribution that Victor then made was to actually start migrating settings into the new structure, adapting it as needed based on the goals of PEP-540. Eric updated quite a few more internal APIs as he worked on improving the subinterpreter support. Between us, we also made a number of improvements to https://docs.python.org/3/c-api/init.html based on what we learned in the process of making those changes.

So I'm completely open to changing the details of the API that PEP-432 is proposing, but the main lesson we've learned from what we've done so far is that CPython's long history of embedding support *does* constrain what we can do in practice, so it's necessary to account for feasibility of implementation when considering what we want to offer.
Ideally, the next step would be to update PEP-432 with a status report on what was learned in the development of Python 3.7 with the new configuration structures, and what the internal startup APIs actually look like now. Even though I reviewed quite a few of Victor and Eric's PRs, even I don't have a clear overall picture of where we are now, and I suspect Victor and Eric are in a similar situation.
Note also that Eric and I haven't failed to agree with Victor on an API, as Victor hasn't actually written a concrete proposal *for* a public API (neither as a PR updating PEP-432, nor as a separate PEP). The current implementation does NOT follow the PEP as written, because _Py_CoreConfig ended up with all the settings in it, when it's supposed to be just the bare minimum needed to get the interpreter to a point where it can run Python code that only accesses builtin and frozen modules.
Since I haven't really written them down anywhere else, noting some items I'm aware of from the Python 3.7 internals work that haven't made their way back into the PEP-432 public API proposal yet:
I created bpo-36142: "Add a new _PyPreConfig step to Python initialization to setup memory allocator and encodings".
I wrote the PEP-587 "Python Initialization Configuration", which has been accepted. It allows one to completely override the "Path Configuration". I'm not sure that it fully implements what is requested here, but it should now be easier to tune the Path Configuration. See:
To Victor:
Just in case this is of help to anyone, I found a way to use venvs in embedded Python.
# Point sys.executable at the venv's interpreter, drop any site-packages
# entries already on sys.path, then re-run site initialisation so the
# venv's pyvenv.cfg is honoured.
import sys
sys.executable = r"Path to the python executable inside the venv"
path = sys.path
for i in range(len(path) - 1, -1, -1):
    if path[i].find("site-packages") > 0:
        path.pop(i)
import site
site.main()
del sys, path, i, site
If you just want to be able to import modules from the venv, and you know the path to it, it's simpler to just do:

import sys
sys.path.append(r"path to venv\Lib\site-packages")

Updating sys.executable is only necessary if you're going to use libraries that try to re-launch themselves, but any embedding application is going to have to do that anyway.
To Steve: I want the embedded venv to have the same sys.path as if you were running the venv's python interpreter. My method, for instance, takes into account the include-system-site-packages option in pyvenv.cfg. It also sets sys.prefix in the same way as the venv's python interpreter.
I can just say that sorting this issue (and PEP-0432) out would be great!
The better approach is to just set the search paths you want to have, and leave the python.c-specific functionality to python.c.

venv initialization is not supported in any manner other than the

There are more than enough configuration options to initialise exactly the runtime you want.
@zooba In my use case, the runtime I want to embed is available as a virtual environment set up with standard package installation tooling, so having to reimplement the virtual environment
The fact that setting (the init config equivalent of)
@benh
I think we're differing on a few terms/concepts here:
I really don't want to encourage embedding of "whatever version of Python I found on disk". The only time that's going to lead to a good experience is when it's the system interpreter, which isn't "whatever" version. But taking an arbitrary install and doing anything other than launching

If you can ensure that your users are using the base runtime you expect, then if you really need the same search paths, I'd really suggest running the interpreter with
I know which runtime I am embedding (the primary project is creating and shipping a fully integrated set of Python base runtimes, and then virtual environments that use those runtimes). Further allowing embedding applications to use the environments directly via the CPython C API is an alternative we're considering to running components in the environments as subprocesses and communicating with them via FastAPI (essentially, we'd be adapting the existing C++ application server that is used for Typescript component embedding, rather than shipping a separate Python-only application server implementation solely for the Python components and having to maintain two separate implementations of the local client application authentication and authorisation bits).

At the very least, those embedding apps need to set

While it's undocumented, that has turned out to also have the effect of getting

I do think it would be a good idea to document a way to check for runtime compatibility before attempting to start the embedded interpreter when relying on setting
(the arcane failures you get when mixing and matching 3.11 venvs with a Python 3.12 runtime, and vice-versa, are cryptic enough that I agree we don't want to risk giving the impression that folks can point an embedding config at an arbitrary virtual environment and expect it to magically work even when the specified interpreter's ABI is inconsistent with what the embedding application expects)
Yeah, this is the intended use of setting the executable. It's expecting/relying on it also doing the pyvenv.cfg lookup that I don't want to commit to. Specifying the search path normally should also resolve
Agreed. Do we not have a pre-init function that returns the hex version already? (I guess I've never worried too much because Windows doesn't have a versionless DLL that gives you non-stable ABI.)
For dynamic linking, the shared object is versioned, so

From a docs point of view, here's my suggestion for a path forward:
Given the necessary infrastructure in the embedding app to run the version check, the future-proof code wouldn't be that much harder to write than just the version check.
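The version check described here can be sketched by querying the target environment's interpreter before initialising the embedded runtime. This is a hedged sketch under assumptions not stated in the thread: the bin/python layout is the POSIX convention, the function names are hypothetical, and a real embedder might need a stricter check (ABI flags, patch level).

```python
import subprocess

def venv_version(venv_dir):
    """Ask the environment's own interpreter for its (major, minor)
    version, so the embedder can refuse to start on a mismatch."""
    out = subprocess.check_output(
        [venv_dir + "/bin/python", "-c",
         "import sys; print('%d.%d' % sys.version_info[:2])"],
        text=True)
    return tuple(int(part) for part in out.strip().split("."))

def check_compatible(venv_dir, expected):
    """True if the venv's interpreter matches the (major, minor)
    version the embedding application was built against."""
    return venv_version(venv_dir) == expected
```

An embedding application would run this probe first, and only then hand the environment's paths to its interpreter configuration.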
I missed this detail (and it's not been brought up by anyone else who I've argued this with 😆 ). So you already need to know which version you're going to be loading. We make good enough guarantees during a feature release that I'd be okay with labelling these as "may change in future versions without a deprecation period". I'm not sure that anything is future-proof though - the future-proof way is to hard code the version you expect and then check again whenever you update to a newer one... most people don't consider that future proof!

Perhaps this is best as a "how to" style page, rather than something that might be confused for a specification? These things come up occasionally, and they don't really have a good home (e.g. we recently changed matplotlib to statically link the C++ runtime to avoid conflicts with other modules, which is not "our" responsibility, but also not obvious how to do it).
I'm going back to the original post, and addressing the raised issues directly based on the current state of things, to understand if they are still valid.
The current code tries determining

Lines 559 to 594 in 7900a85
But looking into the code, I found that this functionality is only available on Windows, or macOS builds using
The Python home always refers to the base installation. Setting the executable to the venv's interpreter is what makes the venv configuration apply. Here's an example, as implemented in cocotb/cocotb#4293:
Applying this to your use case in Apache/mod_wsgi: you should only need to set the executable.

Looking at your code, as far as I can tell, it seems your issues come from backwards compatibility, as you allow users to select which Python environment to use via a configuration directive. My recommendation would be to deprecate these options for Python 2, and replace them with options to specify the Python interpreter path instead.
|
GH-127972 fixes using the loaded library. Considering that, as long as the PRs go through, I believe the issues raised in the original post are adequately addressed.

@ncoghlan, the history is a bit too extensive and dense for me to parse right now; would you be able to summarize your particular issues? Perhaps also note whether you think they are addressed by the proposed changes? |
@FFY00 This part is the same solution I found as well:
However, that's also the behaviour that @zooba is reluctant to elevate from "This happens to work right now, in the current CPython implementation" to "This is the supported way to set up a venv-style environment in an embedding application that is either not linked to CPython, or linked to the same base runtime as the venv is configured to use". Hence the proposed compromise in #66409 (comment), where we would document that it currently works, but be clear that it isn't guaranteed to keep working in the future and explain what to do instead (specifically, do a runtime query of the target environment and use it to perform a version compatibility check, and then appropriately configure the embedded runtime)
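The runtime query described in the compromise above could look something like the following sketch. This is not an existing CPython API: the helper names are illustrative, and a C embedding application would do the equivalent before initialization, comparing against its compile-time `PY_VERSION_HEX`.

```python
# Sketch of a pre-configuration compatibility check: ask the target
# environment's interpreter for its version, then compare the feature
# release (major/minor) against what the embedding app expects.
# Helper names are illustrative, not an existing API.
import subprocess
import sys

def target_hexversion(python_exe):
    """Query the interpreter at python_exe for its sys.hexversion."""
    out = subprocess.check_output(
        [python_exe, "-c", "import sys; print(sys.hexversion)"],
        text=True,
    )
    return int(out.strip())

def is_feature_release_compatible(python_exe, expected_hexversion=sys.hexversion):
    """True when major/minor (the top 16 bits of hexversion) match."""
    return (target_hexversion(python_exe) >> 16) == (expected_hexversion >> 16)
```

Only after this check passes would the embedder go on to configure the runtime to use the target environment.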
Thanks for the background. I agree with documenting the current behaviour. From https://docs.python.org/3/c-api/init_config.html#c.PyConfig.program_name:
|
Long term I'm keen to see a separation between "parameters for things we want" and "parameters for things that we use to infer other things" in the standard initialization. So I think the most correct workaround would be to document how to infer module search paths from the contents of a pyvenv.cfg file.

The middle ground we have now, where embedders just have to poke a black box until it works, is the worst of all worlds. Let's not ingrain it any further. |
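For illustration, the inference described here can be sketched in a few lines: given the path to an environment's interpreter, locate pyvenv.cfg the way the current path machinery does (beside the executable, then one directory up) and read its `home` key. The function names are hypothetical, not a CPython API.

```python
# Sketch of the pyvenv.cfg lookup an embedder could perform explicitly:
# search next to the executable, then one directory up, and read the
# 'home' key naming the base installation. Helper names are illustrative.
from pathlib import Path

def find_pyvenv_cfg(executable):
    """Look for pyvenv.cfg beside the executable, then one level up."""
    exe_dir = Path(executable).resolve().parent
    for candidate_dir in (exe_dir, exe_dir.parent):
        candidate = candidate_dir / "pyvenv.cfg"
        if candidate.is_file():
            return candidate
    return None

def read_home(cfg_path):
    """Return the 'home' value, which names the base installation."""
    for line in Path(cfg_path).read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "home":
            return value.strip()
    return None
```

An embedder could then combine `home` with a version check to build explicit module search paths, rather than relying on the inference happening implicitly inside the runtime.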
I used the way @FFY00 initialized Python with 3.13 and it works well (just setting the executable). One needs to have the original (not venv) Python on the PATH.

A question also arises after initialization: my exe loads and initializes a custom Python module. This is not an issue for me. Part of this is how to find out the current executable on Windows or Linux, but I think that doesn't belong here. |
I'd strongly recommend using
I thought we had a configuration field for this already? If not, then yeah, it should be safe enough to just overwrite it. However, if someone just does |
For the "emulating a venv" use case, I'd recommend setting Setting |
Hmm, I think my last comment was misunderstood. The only exception to this is that my executable loads and initializes a custom third-party Python module which needs a secret. And my executable has a different name so that it can easily be identified in the MS Windows task manager. So, I am not "emulating" a venv.

Hmm... when venv creates the file Scripts\python.exe in the .venv, does it do some DLL load magic? |
Could the reason for this difference between python.exe and my own executable be related to this (from #112984 (comment))?
And it seems somewhat strange that the |
I think I would disagree with this recommendation as |
You say "in my software" but also "using venv in a normal way", so I'm not clear whether you are making an application that includes a Python runtime (for which the embeddable ZIP is intended), or just trying to avoid running the installer on your own machine?
There's nothing magical - they're in the same directory, and Windows uses that as the biggest hint to find a dependency. Your executable either needs python313.dll in the same directory, or it needs its own launcher that doesn't directly depend on python313.dll, but is able to AddDllDirectory and then launch the executable that depends on it (the setting should be inherited). Alternatively, you could make the second executable a DLL and then LoadLibrary it, which will use the added DLL directory to resolve python313.dll and then you can call through your own interface. On second thought, this is probably the better approach.

Ultimately, Python is designed for a Unix-style system where all references are embedded absolute paths and nothing can be relocated. On Windows, the closest way to emulate this is to include your own copy of the runtime in your application directory (and this is what the embeddable distro is for).

It sounds like the best way to do what you seem to be doing (relying on someone else's install of Python) is to get them to provide the
Some ways to install Python don't allow directly loading its DLL, so launching a child process is the only way to do it. It would theoretically be possible to test this at runtime and only use the child process as a fallback (I have actually just been writing code that does this), but again, it's less reliable than using the proper interfaces.

We also get to take proper advantage of Windows preferring the application directory to ensure that we load the correct dependencies, or else we may crash at runtime. Previously, we either got lucky, or we had to copy practically the entire runtime into the venv to make it work reliably. With a small, self-contained launcher, we don't have to worry about either.

Out of interest, why do you find this strange? What were you expecting instead, and why? |
Neither, nor. After installing Python and creating a venv with
So, basically, my executable is "python.exe + a properly configured 3rd party module".
No, they are not.
By looking into the source code of
So that is the reason for the creation of the launcher. My executable code does not use this logic, which is quite complex and requires very good knowledge of the Windows process creation API. Duplicating this logic in my own code would be the worst solution. Even if that "find python home (including DLL) from a virtual environment by parsing pyvenv.cfg" logic were exposed, I think adding the Python home to the PATH is the most reasonable solution for my use-case. Probably my use-case is so special that I'll have to live with this little inconvenience.
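The PATH workaround described here amounts to something like the following sketch (a hypothetical helper, shown in Python for brevity): prepend the base installation ("home") directory to PATH in the child's environment, so that the matching Python runtime DLL can be resolved from there.

```python
# Rough sketch of the PATH workaround: prepend the base installation
# directory to PATH before spawning, so the matching Python runtime
# library is found next to the base interpreter. The helper name is
# illustrative, not part of any API.
import os
import subprocess

def run_with_home_on_path(command, home_dir):
    """Run command with home_dir prepended to PATH in the child env."""
    env = dict(os.environ)
    env["PATH"] = home_dir + os.pathsep + env.get("PATH", "")
    return subprocess.run(command, env=env, capture_output=True, text=True)
```

As the thread notes, this is less robust than a launcher using AddDllDirectory, but it avoids duplicating the Windows process creation logic.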
I can't resist saying: this is exactly what venv is doing. |
We control both
If it makes you feel any better, we've copy-pasted that code around more than a few times. It tries to handle many of the edge cases that people expect to Just Work, and mostly handles the most common ones (at least those presumed by POSIX developers, who expect it to work like
I'm still intrigued by this. Not that I doubt it, but I've only come across one of these concepts before that couldn't be solved in some easier way.1 Can you provide any hint as to how this mechanism works? Or a link to docs, if it's documented somewhere? I'm interested. But as you say:
Yeah, I'm afraid so. Unless there's a sudden spate of similar irresolvable requests, I'm not keen to make the venv resolution a supported API here. (Supported API meaning anyone can call it, the behaviour is consistent across all platforms, safe across most [mis]configurations, and guaranteed not to change without a proper deprecation cycle. The internal behaviour of
|