Add fix quality metadata to ChangeSet spec #47

drdavella · 2025-02-18T21:06:05Z

No description provided.

drdavella · 2025-02-18T21:17:37Z

After further discussion we might be looking for three separate ratings:

Safety (can I take this change without breaking my code)
Effectiveness (does it fix the problem without introducing syntactic or semantic changes)
Cleanliness (does this code introduce stylistic issues, add/remove comments, etc.)

It's worth noting that these are not strictly independent categories so we need to be careful when scoring.
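As a rough illustration of the three ratings listed above, here is a minimal sketch of a container that requires all three. The names `Rating` and `FixQuality`, the `description` field, and the 0-100 score range are assumptions made for this example; they are not taken from the spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    """One rating dimension. The 0-100 scale is an assumption for illustration."""

    score: int
    description: str = ""

    def __post_init__(self) -> None:
        # Reject scores outside the assumed range.
        if not 0 <= self.score <= 100:
            raise ValueError(f"score must be in [0, 100], got {self.score}")


@dataclass(frozen=True)
class FixQuality:
    """Hypothetical container: all three ratings are required (no defaults)."""

    safety: Rating         # can I take this change without breaking my code?
    effectiveness: Rating  # does it fix the problem?
    cleanliness: Rating    # does it introduce stylistic issues?


quality = FixQuality(
    safety=Rating(90, "No behavior change expected outside the fixed call"),
    effectiveness=Rating(80, "Removes the reported weakness"),
    cleanliness=Rating(60, "Reformats one adjacent line"),
)
```

Because none of the three fields has a default, omitting any of them raises a `TypeError` at construction time.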

From a user perspective, I might argue that they really only care about two dimensions:

Is the fix correct?
Will it break my app?

However, those two could ultimately be derived from the first three criteria.
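One way to picture that derivation is the sketch below. The `Ratings` tuple, the `user_view` function, the integer scores, and the threshold of 70 are all hypothetical choices for illustration; the discussion does not define how the mapping would actually work.

```python
from typing import NamedTuple


class Ratings(NamedTuple):
    """Hypothetical integer scores for the three criteria."""

    safety: int
    effectiveness: int
    cleanliness: int


def user_view(ratings: Ratings, threshold: int = 70) -> dict[str, bool]:
    """Collapse the three ratings into the two user-facing questions.

    Assumption: correctness follows from effectiveness and breakage risk
    follows from safety. Cleanliness does not feed either answer here.
    """
    return {
        "fix_is_correct": ratings.effectiveness >= threshold,
        "may_break_app": ratings.safety < threshold,
    }


print(user_view(Ratings(safety=90, effectiveness=80, cleanliness=40)))
# prints {'fix_is_correct': True, 'may_break_app': False}
```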

nahsra · 2025-02-19T14:02:45Z

> From a user perspective, I might argue that they really only care about two dimensions:
> Is the fix correct?
> Will it break my app?

I'd argue these are just different ways of saying "safe" and "effective".

nahsra

Are the 3 things implementation details of the 1 thing? I'm developing more conviction about:

Safe
Effective
Palatability

I think "safe" encapsulates a whole set of concerns about off-target effects on how the code compiles and functions.

I think "effective" encapsulates everything about the fix's effectiveness, completeness, durability, etc.

I think "palatability" is about all the stylistic / formatting / bikeshedding / team norm kind of stuff that we will, frankly, only have limited insight into. If we're going to be opinionated about how implementors should feel about this change by exposing these, we must come to agreement about the factors now, and I'm happy to do that.

nahsra · 2025-02-19T14:08:14Z

Shouldn't this also be at the "change" level, rather than the changeset? I might be misunderstanding, but each individual change will have its own score, right? Or are we asking CodeTF providers to estimate a "global score"?

drdavella · 2025-02-19T15:09:27Z

> Shouldn't this also be at the "change" level, rather than the changeset? I might be misunderstanding, but each individual change will have its own score, right? Or are we asking CodeTF providers to estimate a "global score"?

My assumption so far has been that this evaluation applies per fix, and we have effectively been assuming a 1:1 mapping between each fixed finding and its changeset entry.
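To make the 1:1 assumption concrete, here is a small sketch that checks whether it holds for a set of changeset entries. `ChangeSetEntry`, its `fixed_finding_ids` field, and `findings_not_one_to_one` are hypothetical names for this example and do not reflect the actual CodeTF schema.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ChangeSetEntry:
    """Hypothetical changeset entry: a file path plus the findings it fixes."""

    path: str
    fixed_finding_ids: list[str] = field(default_factory=list)


def findings_not_one_to_one(changesets: list[ChangeSetEntry]) -> dict[str, list[str]]:
    """Report where the assumed 1:1 finding-to-entry mapping breaks down."""
    # Entries that fix zero findings or more than one finding.
    bad_entries = [c.path for c in changesets if len(c.fixed_finding_ids) != 1]
    # Findings that are claimed by more than one entry.
    counts = Counter(fid for c in changesets for fid in c.fixed_finding_ids)
    shared = sorted(fid for fid, n in counts.items() if n > 1)
    return {
        "entries_without_exactly_one_finding": bad_entries,
        "findings_in_multiple_entries": shared,
    }


report = findings_not_one_to_one(
    [
        ChangeSetEntry("a.py", ["F1"]),
        ChangeSetEntry("b.py", ["F2", "F3"]),
        ChangeSetEntry("c.py", ["F1"]),
    ]
)
```

When both lists in the report are empty, a per-entry rating and a per-fix rating coincide, which is the situation the comment above assumes.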

Add fix quality metadata to ChangeSet spec

5d8669c

drdavella requested a review from nahsra February 18, 2025 21:06

nahsra approved these changes Feb 19, 2025

nahsra requested changes Feb 19, 2025

Use three ratings: safety, effectiveness, cleanliness

4dd27cf

drdavella requested a review from nahsra February 19, 2025 15:09

drdavella added 2 commits February 19, 2025 10:12

Remove enum rating

b226b3d

Make all three ratings required

315a3ca

nahsra approved these changes Feb 19, 2025

drdavella merged commit dcc92ae into main Feb 19, 2025
2 checks passed

drdavella deleted the fix-quality-ratings branch February 19, 2025 15:15

drdavella mentioned this pull request Feb 19, 2025

Add fixQuality per CodeTF spec update pixee/codemodder-python#999

Merged
