Skip to content

Add scale method to Moments #850

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Sep 23, 2020
Merged

Add scale method to Moments #850

merged 5 commits into from
Sep 23, 2020

Conversation

johnynek
Copy link
Collaborator

This is a follow up to #829 which adds the scale method to Moments.

I had thought we would have to wait to break binary compatibility, but by writing the case class methods by hand, we can actually make the internal state a double and add the scale method to Moments.

This offers the ability to do exponentially decayed moments, which can be useful to aggregate over long time scales but not have very old values dominate the results.

It's a bit ugly to write this code out by hand, but binary compatibility is worth it, IMO.

PTAL @eigenvariable @regadas

Copy link
Contributor

@eigenvariable eigenvariable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another approach that might be worth exploring: declare a new version of moments where count is a Double instead of a Long, with a trait that implements all of the shared business logic between both versions in terms of a Numeric (or whatever the best mechanism is for making this generic). Overriding methods like equals is just kind of a lot here.

else
/* don't return junk when the moment is not defined */
Double.NaN
math.sqrt(m0D) * m3 / math.pow(m2, 1.5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer filtering on count > 2 is a semantic change that's probably worth at least calling out in the release notes or something.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed it is. I will call it out, but since we returned NaN before, we were already in a "silent-ish" failure more.

@johnynek
Copy link
Collaborator Author

Yeah. I could just copy all the code and leave the old one for binary compatibility but I don't really want to have to two copies in the repository. The mima check shows we needed to make all these changes to support binary compatibility.

It really wasn't that much work to implement everything.

@codecov-commenter
Copy link

codecov-commenter commented Jul 23, 2020

Codecov Report

Merging #850 into develop will increase coverage by 0.04%.
The diff coverage is 77.14%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #850      +/-   ##
===========================================
+ Coverage    89.29%   89.34%   +0.04%     
===========================================
  Files          117      117              
  Lines         9326     9348      +22     
  Branches       365      371       +6     
===========================================
+ Hits          8328     8352      +24     
+ Misses         998      996       -2     
Impacted Files Coverage Δ
...main/scala/com/twitter/algebird/MomentsGroup.scala 79.24% <75.75%> (-11.55%) ⬇️
...scala/com/twitter/algebird/CorrelationMonoid.scala 86.76% <100.00%> (-0.08%) ⬇️
.../main/scala/com/twitter/algebird/BloomFilter.scala 95.13% <0.00%> (+0.44%) ⬆️
.../main/scala/com/twitter/algebird/Approximate.scala 90.32% <0.00%> (+1.61%) ⬆️
.../main/scala/com/twitter/algebird/Applicative.scala 58.06% <0.00%> (+3.22%) ⬆️
.../main/scala/com/twitter/algebird/Successible.scala 91.66% <0.00%> (+4.16%) ⬆️
...in/scala/com/twitter/algebird/scalacheck/Gen.scala 100.00% <0.00%> (+4.65%) ⬆️
...src/main/scala/com/twitter/algebird/Interval.scala 90.43% <0.00%> (+6.95%) ⬆️
...n/scala/com/twitter/algebird/SuccessibleLaws.scala 92.30% <0.00%> (+15.38%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8b7bd86...6bc4189. Read the comment docs.

@eigenvariable
Copy link
Contributor

I don't doubt that this was straightforward to write, but it's pretty difficult to read and I suspect maintain. Ultimately, algebird is your baby more so than anyone else's, and IMO you've earned the right to make these sorts of aesthetic decisions for the library. But my 2¢ is that this would all be much cleaner by just defining a ScalableMoments class and sharing as much code as possible between the two.

@@ -94,6 +188,8 @@ class MomentsMonoid extends Monoid[Moments] with CommutativeMonoid[Moments] {
* Uses a more stable online algorithm which should be suitable for
* large numbers of records similar to:
* http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm
*
* we no longer use this, but we can't remove it due to binary compatibility
*/
def getCombinedMean(n: Long, an: Double, k: Long, ak: Double): Double =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we add a @deprecated as well?

@CLAassistant
Copy link

CLAassistant commented Sep 22, 2020

CLA assistant check
All committers have signed the CLA.

@johnynek
Copy link
Collaborator Author

johnynek commented Sep 22, 2020

@regadas merged your suggestion, sorry for being slow. I merged develop and lets make sure CI is green, then it would be nice to merge.

@johnynek johnynek force-pushed the oscar/add_moments_scale branch from ad0ca48 to 6bc4189 Compare September 22, 2020 22:50
@regadas
Copy link
Collaborator

regadas commented Sep 23, 2020

@johnynek awesome! yeah don't worry about it!

@regadas regadas merged commit 5409e33 into develop Sep 23, 2020
@regadas regadas deleted the oscar/add_moments_scale branch September 23, 2020 13:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants