Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable PROPFIND of Director health test API #2002

Merged
merged 4 commits into from
Feb 14, 2025

Conversation

jhiemstrawisc
Copy link
Member

The cache's xrdcl-pelican plugin switched from doing a HEAD request against requested resources to a PROPFIND request, the answer to which is used by the cache to determine whether something is an object or a collection. That's hunky dory, except that it broke the pseudo Director origin responsible for generating health test files, because the Director's health test API wasn't handling the propfinds.

This commit adds PROPFIND handling and generates the XML needed by the cache to understand the response. Note that I constructed the XML by pinging a cache with a real PROPFIND and then templating the response with file name, size, and time stamp.

One item remaining here is an actual E2E test of the setup -- while the unit tests are passing just fine, they passed just fine when everything broke... a better setup is forthcoming.

@jhiemstrawisc jhiemstrawisc added cache Issue relating to the cache component director Issue relating to the director component monitoring labels Feb 11, 2025
@bbockelm
Copy link
Collaborator

The reason why we don't have unit test coverage here is that scitokens.cfg circular dependencies required circular logic (the audience URL in the scitokens.cfg requires the port number but the port number is only known after scitokens.cfg is parsed). The director uses audience URLs when generating tokens for the cache tests.

So, any future work for an end-to-end test requires:

  1. Having a test-specific mode where the director doesn't add an audience. Don't love that because we are testing different code paths than we run in production.
  2. (New!) Take advantage of [XrdSciTokens] Automatically add WLCG-audiences upon request xrootd/xrootd#2417 - now in the test container since this is in our patch build. That PR has XRootD put in the correct port, not the writer of the scitokens.cfg, breaking the circular dependency mentioned above.

I think (2) is the right way to go -- but it'll require a separate follow-up to leverage the xrootd functionality first. So, I'm OK with splitting E2E out into a separate PR.

@jhiemstrawisc
Copy link
Member Author

I don't follow -- /pelican/monitoring is a public namespace, appearing in the cache's authfile as:

u * /pelican/monitoring lr

Have I completely missed something related to tokens and health test files?

@jhiemstrawisc jhiemstrawisc marked this pull request as ready for review February 11, 2025 16:13
@jhiemstrawisc jhiemstrawisc added this to the v7.14 milestone Feb 11, 2025
@jhiemstrawisc jhiemstrawisc added test Improvements to the test suite critical High priority for next release labels Feb 11, 2025

Verified

This commit was signed with the committer’s verified signature.
renovate-bot Mend Renovate
The cache's `xrdcl-pelican` plugin switched from doing a HEAD request against
requested resources to a PROPFIND request, the answer to which is used by the
cache to determine whether something is an object or a collection. That's
hunky dory, except that it broke the pseudo Director origin responsible for
generating health test files, because the Director's health test API wasn't
handling the propfinds.

This commit adds PROPFIND handling and generates the XML needed by the cache to
understand the response. Note that I constructed the XML by pinging a cache with
a real PROPFIND and then templating the response with file name, size, and time
stamp.

One item remaining here is an actual E2E test of the setup -- while the unit tests
are passing just fine, they passed just fine when everything broke... a better
setup is forthcoming.
This creates a new package for testing these types of E2E situations where
we couldn't otherwise import a package (in this case, the test might live
in the Director package, except we can't import fed_test_utils there b.c.
of cyclic imports).
@jhiemstrawisc
Copy link
Member Author

Adding a second reviewer to help load balance some of these reviews while we're in crunch time for 7.14

Copy link
Collaborator

@turetske turetske left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes. I also don't know what Brian is asking for either.

federation test. However, that package cannot be imported by Director, Cache, or Origin tests directly
because the fed test itself _must_ import those packages, leading to a cyclical dependency.

The `github_scripts` directory contains a similar set of CI tests, but it's easier to write rigorous
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are benefits to both. Bash tests allow us to run integration tests using the commands and potential environment that are in the real word and are important to do. I would are that instead of "it's easier", instead to say that this "allows us to leverage the benefits of Golang's testing package".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually do think "easier" is an accurate description for some tests. For example, in the e2e cache test from this PR, I'm using the fact that I can directly call functions from our go packages to simplify some aspects of E2E testing while still spinning up an entire federation. This lets me tap into specific functionality while still testing that the whole communication pipeline for token fetching works as needed. This would be very difficult to do with raw bash.

@jhiemstrawisc
Copy link
Member Author

That I can tell, all these failing tests are related to #1942. Until the underlying issue is resolved, let's plan to ignore test failures when they're related only to the memstat stuff. We could chase green boxes by re-running everything until we get lucky, but I don't think that's a good use of our time when the failures are isolated to a known bad test.

@turetske turetske merged commit ae9291f into PelicanPlatform:main Feb 14, 2025
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cache Issue relating to the cache component critical High priority for next release director Issue relating to the director component monitoring test Improvements to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The Director doesn't handle PROPFIND when /pelican/monitoring is requested
3 participants