Set up feature Validation #80
Proposal: you are given papers that have not previously been validated; we store these as ground truth ratings and use them in downstream performance adjustments (e.g., DSL). Highlight columns that are bad with some kind of coloring, and on hover show details about the performance metric (F1 or R^2, etc.) and score.
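A minimal sketch of how the per-column scoring behind that highlighting could work, assuming the extracted feature values and the human ground-truth ratings live in column-aligned pandas DataFrames (the names `extracted`, `ground_truth`, and the 0.7 threshold are hypothetical, not decided here):

```python
# Sketch only: assumes pandas/scikit-learn and hypothetical column-aligned
# DataFrames `extracted` (our feature values) and `ground_truth` (human ratings).
import pandas as pd
from sklearn.metrics import f1_score, r2_score


def score_columns(extracted: pd.DataFrame, ground_truth: pd.DataFrame) -> dict:
    """Return {column: (metric_name, score)} for every column with ground truth."""
    scores = {}
    for col in ground_truth.columns:
        truth = ground_truth[col].dropna()
        pred = extracted.loc[truth.index, col]
        if pd.api.types.is_numeric_dtype(truth):
            # Numeric feature: coefficient of determination.
            scores[col] = ("R^2", r2_score(truth, pred))
        else:
            # Categorical feature: macro-averaged F1 over the labels.
            scores[col] = ("F1", f1_score(truth, pred, average="macro"))
    return scores


def bad_columns(scores: dict, threshold: float = 0.7) -> list:
    """Columns whose score falls below a (placeholder) threshold; the UI could
    color these and show the metric name and score on hover."""
    return [col for col, (_, score) in scores.items() if score < threshold]
```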
Here are some more details on how the types of truth and validation might work...
Fundamentally, we only know whether a feature is good if we can compare our result with some other (presumably more trustworthy) result, e.g., a human ground truth rating.
We need to build in a system for checking and reporting quality, so that users can quickly see what to trust and think about how to improve it.
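One way the reporting side could turn raw scores into something users can act on quickly is a coarse trust-level summary; the thresholds below are illustrative placeholders, and `quality_report` is a hypothetical helper built on the `score_columns()` sketch above:

```python
# Sketch only: maps per-column scores to coarse trust levels for reporting.
# Thresholds are illustrative placeholders, not agreed values.
def quality_report(scores: dict) -> dict:
    """Return {column: trust_level} from score_columns() output."""
    report = {}
    for col, (_metric, score) in scores.items():
        if score >= 0.9:
            report[col] = "trust"
        elif score >= 0.7:
            report[col] = "review"     # worth a second look / more ground truth
        else:
            report[col] = "distrust"   # candidate for coloring and improvement
    return report
```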
Caveat: we may be able to aggregate answers across sources in some cases to validate columns.
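For that caveat, a hypothetical sketch of cross-source aggregation: when several sources answer the same question for a paper, a simple majority vote with an agreement cutoff could serve as a stand-in ground truth for that cell (names and the 0.6 cutoff are assumptions):

```python
# Sketch only: majority vote across hypothetical per-source answer columns,
# used as a stand-in ground truth when no human rating exists.
import pandas as pd


def consensus(answers: pd.DataFrame, min_agreement: float = 0.6) -> pd.Series:
    """Rows = papers, columns = sources. Returns the majority answer per paper,
    or NA when agreement falls below `min_agreement`."""
    def vote(row: pd.Series):
        counts = row.dropna().value_counts()
        if counts.empty:
            return pd.NA
        top, n = counts.index[0], counts.iloc[0]
        return top if n / counts.sum() >= min_agreement else pd.NA

    return answers.apply(vote, axis=1)
```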