Have plugin include K8SPhysicalHostName in the hold message #1480

Closed
bbockelm opened this issue Jul 12, 2024 · 2 comments · Fixed by #1891

@bbockelm
Collaborator

On some hosts, the "EP name" is meaningless (it is the randomly generated pod name), but the machine's K8SPhysicalHostName attribute records the "real" hostname.

We should add this name to the metadata we include in the plugin's hold message. If present, it should be recorded as the hostname attribute; if K8SPhysicalHostName is not present, then no hostname should be put in the error message (as it would duplicate other parts of the error message).

So, example:

Attempt #1: from osg-kansas-city-stashcache.nrp.internet2.edu:8443: transfer error: \
   Unable to read /path-facility/data/foo; network dropped connection on reset (5m47.8s since start) \
   (Version: 7.9.2; Site: UNL-PATH)

would become:

Attempt #1: from osg-kansas-city-stashcache.nrp.internet2.edu:8443: transfer error: \
   Unable to read /path-facility/data/foo; network dropped connection on reset (5m47.8s since start) \
   (Version: 7.9.2; Site: UNL-PATH; Hostname: foo.unl.edu)
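
The rule above (append Hostname only when K8SPhysicalHostName is available) could be sketched roughly as follows. This is an illustration in Python, not the plugin's actual code; the function name `format_metadata` and its parameters are hypothetical.

```python
def format_metadata(version, site, hostname=None):
    """Build the parenthesized metadata suffix of a hold message.

    Hypothetical helper: `hostname` is expected to hold the value of
    K8SPhysicalHostName, or None/"" when that attribute is absent.
    """
    fields = [("Version", version), ("Site", site)]
    # Add Hostname only when a physical hostname is known; otherwise omit
    # the field entirely, since it would duplicate other parts of the message.
    if hostname:
        fields.append(("Hostname", hostname))
    return "(" + "; ".join(f"{key}: {value}" for key, value in fields) + ")"


if __name__ == "__main__":
    print(format_metadata("7.9.2", "UNL-PATH"))
    print(format_metadata("7.9.2", "UNL-PATH", "foo.unl.edu"))
```

With a hostname of `foo.unl.edu`, the suffix matches the proposed message; without one, it is unchanged from the current message.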
@bbockelm bbockelm added enhancement New feature or request client Issue affecting the OSDF client labels Jul 12, 2024
@bbockelm bbockelm added this to the v7.10.0 milestone Jul 12, 2024
@turetske
Collaborator

turetske commented Jul 22, 2024

@bbockelm I'm not finding that attribute in the condor documentation at all for MachineAd. Is it custom or would it be under Machine or ClientMachine? Or am I looking in the wrong place entirely? If it is custom, would it be in the machine ad or the class ad? My instinct is the machine ad, but I want to be sure.

@bbockelm
Collaborator Author

You want to look for the file via the _CONDOR_MACHINE_AD environment variable. See this documentation section: https://htcondor.readthedocs.io/en/latest/users-manual/env-of-job.html#extra-environment-variables-htcondor-sets-for-jobs
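
As a rough sketch of that lookup: the code below reads the file named by `_CONDOR_MACHINE_AD` and pulls out K8SPhysicalHostName. It assumes the machine ad file holds one `Name = Value` pair per line with string values in double quotes, and it matches the attribute name case-insensitively. The function names are hypothetical, and Python is used only for illustration.

```python
import os


def parse_physical_hostname(ad_text):
    """Return the K8SPhysicalHostName string from machine ad text, or None."""
    for line in ad_text.splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a "Name = Value" line
        if name.strip().lower() != "k8sphysicalhostname":
            continue
        value = value.strip()
        # Accept only a non-empty, double-quoted string literal.
        if len(value) > 2 and value.startswith('"') and value.endswith('"'):
            return value[1:-1]
        return None
    return None


def physical_hostname(environ=None):
    """Look up the machine ad via _CONDOR_MACHINE_AD; None if unavailable."""
    environ = os.environ if environ is None else environ
    ad_path = environ.get("_CONDOR_MACHINE_AD")
    if not ad_path:
        return None  # not running under HTCondor, or variable not set
    try:
        with open(ad_path, encoding="utf-8") as ad_file:
            return parse_physical_hostname(ad_file.read())
    except OSError:
        return None  # a missing or unreadable ad means "no hostname"


if __name__ == "__main__":
    print(physical_hostname())
```

Returning None on every failure path keeps the hold message unchanged whenever the attribute cannot be found.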

@turetske turetske modified the milestones: v7.10.0, v7.12.0 Dec 2, 2024
@turetske turetske added the critical High priority for next release label Dec 2, 2024
@turetske turetske removed the critical High priority for next release label Dec 13, 2024
@turetske turetske modified the milestones: v7.12.0, v7.13.0 Dec 13, 2024
@turetske turetske modified the milestones: v7.13.0, parking-lot Jan 9, 2025
@turetske turetske linked a pull request Jan 13, 2025 that will close this issue
@jhiemstrawisc jhiemstrawisc modified the milestones: v7.13.0, v7.14 Jan 30, 2025