Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add PQL support, refactor and reorganize code #201

Merged
merged 22 commits into from
Apr 15, 2021

Conversation

gdubicki
Copy link
Contributor

@gdubicki gdubicki commented Mar 16, 2021

Basic Not so basic anymore but and working PQL support. ;)

There are 2 ways of using PQL:

  • as a type of query for existing methods, like nodes() - this gives you casting to our rich types like Node, but limits ability to use projections,
  • as a direct call to the pql endpoint - this gives all the features of PQL but returns raw JSON = list of dicts,

While working on this I revamped the user docs.

While doing that I have added info on how to use the ssl_verify to provide the CA root certificate (fixes #143) and added missing info about using the Basic auth. I also improved some examples, fixing #129.

UPDATE: the scope and API have changed since I created this PR, see the comments below.

@gdubicki gdubicki mentioned this pull request Mar 16, 2021
@coveralls
Copy link

coveralls commented Mar 16, 2021

Coverage Status

Coverage increased (+1.5%) to 84.529% when pulling 2b0ba7e on gdubicki:pql into 1f474f8 on voxpupuli:master.

@gdubicki gdubicki marked this pull request as ready for review March 16, 2021 19:18
@gdubicki gdubicki marked this pull request as draft March 21, 2021 12:35
@gdubicki gdubicki mentioned this pull request Mar 21, 2021
@gdubicki gdubicki requested a review from bastelfreak March 21, 2021 13:48
@gdubicki gdubicki changed the title Add PQL support Add PQL support, revamp the user docs Mar 21, 2021
@gdubicki gdubicki marked this pull request as ready for review March 21, 2021 20:56
Copy link
Member

@b4ldr b4ldr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im no expect on pypuppetdb but this looks like a great addition to me thanks

@gdubicki
Copy link
Contributor Author

gdubicki commented Mar 22, 2021

Frankly I am not very happy with the proposed API design. We would be repeating ourselves when using PQL and rich types:

db.nodes(pql="nodes { (...) })
#  ^^^^^      ^^^^^   
#      \ ugly /

To remove this duplication we could ask the user to strip PQL of the "nodes {" and "}" parts but I think that would be wrong. Apart from other reasons this will make creating or testing the query elsewhere (f.e. in the Query view of the Puppetboard) harder.

So this leaves us with changing the method name. Maybe we should rethinkg the "pql()" method?

How about we autocast when projections are not used, like this:

# this would return Nodes
db.pql("nodes { (...) }")

# this would return Inventories
db.pql("inventory { (...) }")

...and automatically switch to dicts when projection is detection:

# this would return dicts
db.pql("inventory[foobar] { (...) }")

...and an optional extra parameter to disable the autocasting completely:

# this would return dicts
db.pql("nodes { (...) }", raw=True)

But then I would NOT add the pql parameter to existing methods like nodes() as "there should be only way to do it". 🧘‍♂️

Thoughts?

@bastelfreak
Copy link
Member

I don't know enough Python to give a proper statement here. Maybe @sebastianrakel is able to review this.

Copy link
Member

@b4ldr b4ldr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

db.nodes(pql="nodes { (...) })
#  ^^^^^      ^^^^^   
#      \ ugly /

I also thought this looked a bit strange however ...

To remove this duplication we could ask the user to strip PQL of the "nodes {" and "}" parts but I think that would be wrong.

I thought this was worse (and so didn't comment)

How about we autocast when projections are not used, like this:
...and an optional extra parameter to disable the autocasting completely:

I like the idea of auto casting however i think the raw pql method and the autocasting method should be two separate methods

def _pql(self, pql, **kwargs):
   elements = self._query(pql=pql, **kwargs)
       for element in elements:
           yield element

def pql(self, pql, **kwargs):
     for element in _pql(pql, **kwargs):
        try:
            yield # cast element
       except:
            yield element

Further as to my comments in the review i think it could make sense to change the code so that the _pql method dosn't depend on _query. instead split out the logic to make the actual request into its own method _request and have _query and _pql both call _request

But then I would NOT add the pql parameter to existing methods like nodes() as "there should be only way to do it". lotus_position_man

agreed

>>> value = "Debian"
>>> }
>>> }
>>> """
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here an else where missing a trailing )

@gdubicki
Copy link
Contributor Author

Thanks a lot for the insightful review, @b4ldr ! I will try to apply your suggestions within the next few days.

@gdubicki gdubicki changed the title Add PQL support, revamp the user docs Add PQL support, refactor and reorganize code, revamp the user docs Apr 5, 2021
@gdubicki
Copy link
Contributor Author

gdubicki commented Apr 5, 2021

I have applied your suggestions about the code, @b4ldr (not the docs yet, but I have them on my TODO list, I won't forget!) and while doing that I got carried away a bit...

I have refactored the code a lot.

Firstly I have splitted the BaseAPI from a single > 1000 LOC long api.py into separate classes and files by the query endpoints, in a way almost exactly like they are listed in the PuppetDB docs:

  • pdb/query/v4 - PqlAPI
  • pdb/query/v4/* - QueryAPI
  • pdb/cmd/v1 - CommandAPI
  • status/v1/** - StatusAPI
  • pdb/meta/v1* - MetadataAPI
  • metrics/* - MetricsAPI

(I have done a similar split in the tests.)

They all inherit from BaseAPI where I left the code shared by all the APIs and I used multiple inheritance to combine those APIs into a single API class that should be used externally.

Secondly have I pulled out the methods to create the entities like Node into those classes, as create_from_dict() methods.

I am sure that this is better than what we had before, but not sure if this is the ideal design. What do you think, @b4ldr ?

@b4ldr
Copy link
Member

b4ldr commented Apr 7, 2021

Just a quick note to say this is on my list and thanks for all the effort, hope to give it a review over the weekend

@bastelfreak
Copy link
Member

hey, this diff is quite big. Is it possible to split it into multiple PRs? that makes it easier to review and better for a generated changelog.

@gdubicki
Copy link
Contributor Author

gdubicki commented Apr 8, 2021

Yes, I know that the PR is big. I don't like such PRs too, but IMHO this code badly needed refactoring.

So there are 3 groups of changes here: 1. PQL support added, 2. Reorganization and very small refactoring of the rest of the code, 3. Docs update.

The refactoring in 2. is minimal to keep the PR manageable. I just pulled some code out into separate methods and did not change the existing logic, with one exception: I fixed #202 . I changed those 2 lines mentioned in that issue. So most of 2. is moving the code around.

I can pretty easily get the 3. out of this PR. Would that be enough, @bastelfreak ?

@bastelfreak
Copy link
Member

getting 3 out would be a good start. can you do that, we review and merge the new PR and afterwards rebase #201 ?

@gdubicki
Copy link
Contributor Author

gdubicki commented Apr 8, 2021

I moved docs update out of this PR but I think that we have a misunderstanding about that, @bastelfreak .

Docs update is related to the other changes here as not only it add the docs for the new PQL API but also puts PQL in the first place as the recommended way of querying PuppetDB.

So IMHO we should review this PR first and then then I'll create the PR with docs.

Copy link
Member

@b4ldr b4ldr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good to me as you mentioned most of the changes are just moving code around. Im not sure i like the big mix-in API class however i think the organisation is much better. however it would be good to try all of the mix-in classes implement unique methods. i,e. some way to prevent other class implementing query

The main risk i see is if someone is using BaseAPI directly which seems unlikely to me but still worth highlighting. I think it would be very good and increase my confidence if we could show that this new version works with puppetboard (with no alterations). Is this something you are able to test?

return self._make_request(url, request_method, payload)

# TODO: deduplicate this - see QueryAPI.nodes()
def pql(self, pql, with_status=False, unreported=2, with_event_numbers=True):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't feel right to have node specific parameters for such a generic pql method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, but I don't have a better idea..

@gdubicki
Copy link
Contributor Author

gdubicki commented Apr 9, 2021

I think this looks good to me as you mentioned most of the changes are just moving code around. Im not sure i like the big mix-in API class however i think the organisation is much better. however it would be good to try all of the mix-in classes implement unique methods. i,e. some way to prevent other class implementing query

Frankly I don't think that we should implement protection from something like this - this problem should be caught by the tests or in a code review.

The main risk i see is if someone is using BaseAPI directly which seems unlikely to me but still worth highlighting. I think it would be very good and increase my confidence if we could show that this new version works with puppetboard (with no alterations). Is this something you are able to test?

I did such test and found only one bug that I fixed in 9e3048f. (I also splitted other.py into a separate file for each class - better to do it now, that another reorg in the future.)

If we get this merged then I propose to create a pre-release of pypuppetdb and a pre-release of puppetboard depending on it. Then everyone interested (like me :)) can test it. Only after we won't find bugs after some time we would release a final release. What do you think?

@b4ldr
Copy link
Member

b4ldr commented Apr 10, 2021

Frankly I don't think that we should implement protection from something like this - this problem should be caught by the tests or in a code review.

Perhaps my language was to terse i meant protections as in unit test, definitely not some type of run time check (not even sure how that would look) so i think we agree here.

I did such test and found only one bug that I fixed in 9e3048f. (I also splitted other.py into a separate file for each class - better to do it now, that another reorg in the future.)

Awesome <3

If we get this merged then I propose to create a pre-release of pypuppetdb and a pre-release of puppetboard depending on it. Then everyone interested (like me :)) can test it. Only after we won't find bugs after some time we would release a final release. What do you think?

This sound like a good plan to me

Copy link
Member

@sebastianrakel sebastianrakel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks very clean, but i'm missing a test TestVersionCmp. Is this right?

@gdubicki
Copy link
Contributor Author

The code looks very clean, but i'm missing a test TestVersionCmp. Is this right?

Oh, I forgot that I removed versioncmp() because it was unused here. But you made me realize that it may be used in someone's code using this library. I'll just restore it as I don't want to break anyone's code without a good reason.

Greg Dubicki added 6 commits April 11, 2021 11:08
and refactor the testsby splitting them into
more files too, like the API code.
plus we are not using nosetests
and don't build RPM so remove traces
of it
Greg Dubicki added 10 commits April 11, 2021 11:20
as we don't really have any other arguments
that you can pass here except the listed ones
now to avoid having to have another reorg
in the future and lose git history again
- a bug found when testing this
version of the pypuppet code with
a current version of Puppetboard
(v3.0.0).
@gdubicki gdubicki changed the title Add PQL support, refactor and reorganize code, revamp the user docs Add PQL support, refactor and reorganize code Apr 11, 2021
@gdubicki
Copy link
Contributor Author

I cleaned up the history but I didn't squash everything as I think that some development history kept in the separate commits and their messages may be useful.

@bastelfreak : you are the 3rd reviewer, are you ok with emerging this?

gdubicki added a commit to gdubicki/pypuppetdb that referenced this pull request Apr 11, 2021
and generally revamp the docs. Some fixes:
* add info on how to use the ssl_verify to provide
  the CA root certificate (fixes voxpupuli#143),
* improve some examples, fixing voxpupuli#129,
gdubicki added a commit to gdubicki/pypuppetdb that referenced this pull request Apr 11, 2021
and generally revamp the docs. Some fixes:
* add info on how to use the ssl_verify to provide
  the CA root certificate (fixes voxpupuli#143),
* improve some examples, fixing voxpupuli#129,
@gdubicki
Copy link
Contributor Author

A draft of the PR for updating the docs to add PQL and the new code structure is here: gdubicki#1

@bastelfreak bastelfreak merged commit 894703c into voxpupuli:master Apr 15, 2021
gdubicki added a commit to gdubicki/pypuppetdb that referenced this pull request Apr 15, 2021
and generally revamp the docs. Some fixes:
* add info on how to use the ssl_verify to provide
  the CA root certificate (fixes voxpupuli#143),
* improve some examples, fixing voxpupuli#129,
bastelfreak added a commit that referenced this pull request Apr 15, 2021
Document new pql() method and reorg from PR #201
@gdubicki gdubicki linked an issue Apr 16, 2021 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

expose pql API ssl_verify
5 participants