Update mean and sum functions #643
base: develop
…correctly handle NaN values in coefficients irreg Updated mean and sum functions for FData, FDataGrid, FDataBasis and FDataIrregular to correctly handle NaN values in coefficients
skfda/representation/irregular.py
Outdated
if skipna:
    count_values = np.sum(~np.isnan(common_values), axis=0)
else:
    count_values = np.full(sum_values.shape, self.n_samples)
Isn't this just `self.n_samples`?
To operate with sum_values, it needs to be an array, so that it follows the same flow as the case where skipna is specified.
Not really? I think NumPy's broadcasting would handle it just fine. Or am I wrong in that?
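The broadcasting question can be checked in isolation. The following is a minimal sketch with made-up data, not the code from this PR; the names `common_values`, `sum_values` and `n_samples` mirror the snippet under review, but the values are assumptions for illustration only.

```python
import numpy as np

# Hypothetical data: 3 samples evaluated at 4 common points.
common_values = np.array([
    [1.0, 2.0, np.nan, 4.0],
    [3.0, np.nan, np.nan, 8.0],
    [5.0, 6.0, 7.0, 12.0],
])
n_samples = common_values.shape[0]

# skipna=False branch: NaN propagates through the plain sum.
sum_values = np.sum(common_values, axis=0)

# Dividing by an explicit array of counts...
mean_with_array = sum_values / np.full(sum_values.shape, n_samples)
# ...gives the same result as dividing by the scalar, which NumPy broadcasts.
mean_with_scalar = sum_values / n_samples

# skipna=True branch: the per-point counts are genuinely an array.
count_values = np.sum(~np.isnan(common_values), axis=0)
mean_skipna = np.nansum(common_values, axis=0) / count_values
```

Under these assumptions both branches can share the same division, whether the count is a scalar or an array.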
out: None = None,
keepdims: bool = False,
skipna: bool = False,
min_count: int = 0,
It seems to me that `min_count` is not being used here. Why is that?
It is kept for compatibility with the mean functions of FDataIrregular and FDataGrid, but it does not make sense to use it here: you do not have measurements for each observation, only the observations approximated by functions.
It would at least make sense at a global level (if some curves are NaN because they were not measured). Of course, if you only have a `FDataBasis` that does not make much sense, but it does if the `FDataBasis` is just one column among many in a DataFrame, and it was not measured in some cases.
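One way to read this "global level" suggestion is sketched below. This is an assumption about the intended semantics, not the behaviour implemented in the PR: `coefficient_mean` is a hypothetical helper that treats a row of NaN coefficients as an unmeasured curve and returns NaN when fewer than `min_count` measured curves remain.

```python
import numpy as np


def coefficient_mean(coefficients, skipna=False, min_count=0):
    """Mean of basis coefficients over samples (hypothetical helper)."""
    coefficients = np.asarray(coefficients, dtype=float)
    n_basis = coefficients.shape[1]

    if skipna:
        # Keep only curves whose coefficients are all non-NaN.
        valid = ~np.isnan(coefficients).any(axis=1)
        kept = coefficients[valid]
    else:
        kept = coefficients

    # Too few measured curves: the result is undefined.
    if kept.shape[0] == 0 or kept.shape[0] < min_count:
        return np.full(n_basis, np.nan)

    return kept.mean(axis=0)


# Three curves with two coefficients each; the second was not measured.
coeffs = [[1.0, 2.0], [np.nan, np.nan], [3.0, 4.0]]
mean_skip = coefficient_mean(coeffs, skipna=True)
mean_strict = coefficient_mean(coeffs, skipna=True, min_count=3)
mean_plain = coefficient_mean(coeffs)
```

With this reading, `min_count` counts whole curves rather than individual measurements, which is where it differs from the grid and irregular cases.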
…correctly handle NaN values in coefficients irreg Updated mean and sum functions for FData, FDataGrid, FDataBasis and FDataIrregular to correctly handle NaN values in coefficients
.all-contributorsrc
Outdated
@@ -595,15 +595,6 @@
"contributions": [
"doc"
]
},
Why are you removing a contributor??
CONTRIBUTORS.md
Outdated
@@ -1,6 +1,6 @@
Changes to contributors should be made using the bot, please remove these files from the PR.
skipna=skipna,
min_count=min_count,
)
/ np.sum(~self.isna()),
Remove the trailing comma. Otherwise, you are returning a tuple and the tests fail:
- This is a number: `(5)`
- This is a tuple: `(5,)`
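The difference is easy to reproduce outside the library. A small sketch with placeholder values (`total` and `count` are illustrative stand-ins, not names from the PR):

```python
total = 10.0
count = 2

# Parentheses alone only group the expression.
number = (total / count)

# A trailing comma inside the parentheses builds a one-element tuple.
as_tuple = (total / count,)


def mean_with_comma():
    # Same shape as the reviewed code: the comma after the division
    # turns the returned value into a tuple.
    return (
        total
        / count,
    )


def mean_without_comma():
    return (
        total
        / count
    )
```

Here `mean_with_comma()` returns a tuple wrapping the value, while `mean_without_comma()` returns the bare float.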
Update mean and sum functions for FData, FDataGrid, FDataIrregular and FDataBasis to correctly handle NaN values in coefficients.
Fixes #642
Describe the proposed changes
Edit the mean function of FData so that it only performs parameter checks, leaving the checks themselves unchanged.
Add an auxiliary function in FDataGrid that serves mean, sum and var: depending on the skipna parameter, it calls np.sum or np.nansum, np.mean or np.nanmean, and np.var or np.nanvar. The mean and sum functions now work through this auxiliary function.
Add a mean function in FDataBasis that computes the mean of the coefficients over the functions whose coefficients contain no NaN values; functions with NaN coefficients are excluded from the calculation.
Add a mean function in FDataIrregular that calculates the mean based on the min_count parameter and on whether skipna is set.
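The auxiliary function described for FDataGrid could look roughly like the sketch below. The name `reduce_samples`, the `_REDUCTIONS` table and the 2-D `data_matrix` layout are assumptions made for illustration; the actual helper in the PR may be structured differently.

```python
import numpy as np

# Pairs of (plain, NaN-aware) NumPy reductions, keyed by operation name.
_REDUCTIONS = {
    "sum": (np.sum, np.nansum),
    "mean": (np.mean, np.nanmean),
    "var": (np.var, np.nanvar),
}


def reduce_samples(data_matrix, operation, skipna=False):
    """Reduce over the sample axis, optionally ignoring NaN values."""
    plain, nan_aware = _REDUCTIONS[operation]
    func = nan_aware if skipna else plain
    return func(data_matrix, axis=0)


# Two samples at two grid points; one value is missing.
data = np.array([[1.0, np.nan], [3.0, 4.0]])
sum_skip = reduce_samples(data, "sum", skipna=True)
mean_skip = reduce_samples(data, "mean", skipna=True)
sum_plain = reduce_samples(data, "sum")
```

Keeping the dispatch in one table means mean, sum and var share the same skipna logic instead of repeating the branch in each method.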