SCML: Raise ValueError if n_features larger than n_triplets #350

nikosmichas
Contributor

Related discussion #348
cc @grudloff

@@ -240,6 +240,12 @@ def _generate_bases_dist_diff(self, triplets, X):
      raise ValueError("n_basis should be an integer, instead it is of type %s"
                       % type(self.n_basis))

    if n_features > n_triplets:
      raise ValueError(
        "Number of features (%s) is greater than the nuber of triplets(%s).\n"
Contributor

small typo

Contributor Author

fixed

        "Number of features (%s) is greater than the nuber of triplets(%s).\n"
        "Consider using a dimensionality reduction preprocessing or create "
        "a new basis generation scheme." % (n_features, n_triplets))

Contributor

@grudloff grudloff Apr 7, 2022

Maybe I would reformulate it to

"Consider using dimensionality reduction or using another basis generation scheme."

This way we consider possible future basis generation alternatives as well as just passing in a basis set directly.
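
A minimal sketch of the check with the reworded hint, for reference. It assumes a standalone helper: `check_n_features` is a hypothetical name used only for illustration, not a metric-learn function. In the PR the check is inlined in `_generate_bases_dist_diff`.

```python
def check_n_features(n_features, n_triplets):
    """Raise ValueError when there are more features than triplets.

    Hypothetical standalone helper for illustration only; in the PR the
    check is inlined in SCML's basis generation method.
    """
    if n_features > n_triplets:
        # Adjacent string literals are joined before % is applied,
        # so both placeholders are filled from the tuple.
        raise ValueError(
            "Number of features (%s) is greater than the number of "
            "triplets (%s).\n"
            "Consider using dimensionality reduction or using another "
            "basis generation scheme." % (n_features, n_triplets))


# Usage: 5 features but only 3 triplets triggers the error.
try:
    check_n_features(n_features=5, n_triplets=3)
except ValueError as exc:
    message = str(exc)
    print(message)
```

Equal counts pass the check, since the condition is a strict `>`.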

Contributor Author

thanks, updated

@nikosmichas nikosmichas force-pushed the scml_error_if_large_n_features branch from a15470e to 51c35b7 Compare April 7, 2022 21:15
Contributor

@grudloff grudloff left a comment

LGTM!

Member

@bellet bellet left a comment

Thanks for the work @nikosmichas !
This looks good to me, except that I realize that the docstring entry for the ‘triplet_diffs’ option says:

The basis set is constructed from the differences between points of n_basis positive or negative pairs taken from the triplets constrains.

Shouldn't it be something like:

Each basis is constructed from the differences between points of n_features positive or negative pairs taken randomly from the triplets constraints.
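
To illustrate why the number of triplets bounds this construction, here is a rough sketch. It assumes that each basis draws n_features distinct triplets and takes the difference of one pair from each. The helper `sample_pair_diffs` and the (anchor, positive, negative) index layout are hypothetical; they follow only the proposed docstring wording, not the actual metric-learn code.

```python
import random


def sample_pair_diffs(triplets, X, n_features, seed=0):
    """Return n_features difference vectors from randomly drawn triplets.

    Hypothetical illustration: each triplet is (anchor, positive, negative)
    row indices into X, and triplets are drawn without replacement.
    """
    rng = random.Random(seed)
    # random.sample raises ValueError if n_features > len(triplets).
    chosen = rng.sample(triplets, n_features)
    diffs = []
    for anchor, positive, negative in chosen:
        # Randomly use the positive pair or the negative pair.
        other = positive if rng.random() < 0.5 else negative
        diffs.append([a - b for a, b in zip(X[anchor], X[other])])
    return diffs


X = [[0.0, 1.0], [1.0, 1.0], [4.0, 0.0], [0.5, 2.0]]
triplets = [(0, 1, 2), (1, 3, 2), (3, 0, 2)]

diffs = sample_pair_diffs(triplets, X, n_features=2)

# With more features than triplets the draw itself is impossible.
try:
    sample_pair_diffs(triplets, X, n_features=4)
    failed = False
except ValueError:
    failed = True
```

Under that assumption, an explicit check up front gives a clearer message than the error raised by the sampling call.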

@nikosmichas
Contributor Author

@bellet I'm not sure about the exact wording. I can make the docstring change if you think it's better. Do you think the change is related to this PR?

@perimosocordiae perimosocordiae merged commit 8520418 into scikit-learn-contrib:master Jun 20, 2022
@perimosocordiae
Contributor

Merging now, we can fix the docstring separately.

@bellet
Member

bellet commented Jun 21, 2022

Opened a separate PR for #351

@bellet
Member

bellet commented Jun 21, 2022

Thanks @nikosmichas !
