-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add arrow support for timestamp and bytea #724
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #724 +/- ##
==========================================
- Coverage 48.81% 48.75% -0.06%
==========================================
Files 70 70
Lines 6887 6895 +8
==========================================
Hits 3362 3362
- Misses 3061 3069 +8
Partials 464 464
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
⏱️ Benchmark resultsComparing with e675cf0
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good, I have a couple of questions on the impact of the changes
@@ -183,7 +183,7 @@ func GenTestData(table *schema.Table) schema.CQTypes { | |||
data[i] = uuidArrayColumn | |||
case schema.TypeInet: | |||
inetColumn := &schema.Inet{} | |||
if err := inetColumn.Set("192.0.2.1/24"); err != nil { | |||
if err := inetColumn.Set("192.0.2.0/24"); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's because the bitmask filters out the final .1
, so it amounts to the same, but if Arrow rejects this type of IP we may need to handle it better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think 192.0.2.1./24
is basically the same as 192.0.2.0/24
after applying the mask.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yevgenypats Right, but was this change necessary for the tests to pass, or was it just some cleanup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe Im confusing something about inet vs cidr . but I did need it for a test to pass because our own arrow inet
converted it to 192.0.2.0/24
but I think this happened just from the original IPNet.String
conversion so I suspect it should affect anything but maybe I missed something and didn't get to the root cause of this, not sure.
@@ -438,7 +438,7 @@ var calculateCQIDPrimaryKeyTestCases = []struct { | |||
"MacAddressValue": "aa:bb:cc:dd:ee:ff", | |||
"MacAddressArrayValue": []string{"aa:bb:cc:dd:ee:ff", "11:22:33:44:55:66"}, | |||
}, | |||
ExpectedValue: &UUID{Bytes: [16]uint8{0xa6, 0x76, 0x76, 0xfb, 0x4c, 0x95, 0x51, 0x2d, 0x9e, 0xcd, 0xf4, 0xc4, 0xae, 0xc1, 0x2a, 0xf5}, Status: 0x2}, | |||
ExpectedValue: &UUID{Bytes: [16]uint8{0x82, 0xaa, 0xa2, 0xc7, 0x75, 0x10, 0x5c, 0x6d, 0xb8, 0xef, 0x18, 0x49, 0x33, 0x99, 0x4a, 0x9d}, Status: 0x2}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this PR is backwards compatible I would expect existing tests to pass. Is this related to the change in testdata/testdata.go
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is backward compat the the binary can read older encoding but the encoding for the binary types moving forward will be base64. I don't think it should impact anything though.
@@ -41,7 +42,7 @@ func (dst *Bytea) Equal(src CQType) bool { | |||
|
|||
func (dst *Bytea) String() string { | |||
if dst.Status == Present { | |||
return hex.EncodeToString(dst.Bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do we use this String
method? Is this not a breaking change as we return a different format of the string?
Trying to understand the impact of this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it is a breaking change (I did add a way for us to support both conversion in Set
) but yes moving forward we will use base64 encoding for bytea to work like arrow. I can add a new type but I don't think we have anything that depends on that.
Unrelated but I've disabled the codecov status checks in https://app.codecov.io/account/gh/cloudquery/yaml/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving with the concern that we needed to update test data to make the tests pass. I would expect it to pass with existing data.
If we don't expect to receive data in the old format I guess this is ok
🤖 I have created a release *beep* *boop* --- ## [1.42.0](v1.41.0...v1.42.0) (2023-03-06) ### Features * Add arrow support for timestamp and bytea ([#724](#724)) ([c2e84c3](c2e84c3)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Blocked by cloudquery/plugin-sdk#724 but ready for initial review and discussion. I can do a short walkthrough for anyone up for review. Apache arrow fork is here (We use it until [this](apache/arrow#34454) is merged): https://github.com/cloudquery/arrow/tree/feat_extension_builder. Some more notes: - Right now we are just migrating the json writer/reader to use apache arrow so we can roll format by format and see if there are any real world issues before we roll this to everywhere instead of our own type system.
This should unblock cloudquery/filetypes#87
Add supports for timestamp to decode arrow string and for byte to decode base64 (backward compat is saved)