Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSHARP-5478: Support $rankFusion stage #1636

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

adelinowona
Copy link
Contributor

No description provided.

@adelinowona adelinowona requested a review from a team as a code owner March 18, 2025 00:33
@adelinowona adelinowona requested review from JamesKovacs, rstam and BorisDog and removed request for a team and JamesKovacs March 18, 2025 00:33
/// <summary>
/// Gets the $rankFusion feature.
/// </summary>
public static Feature RankFusion => __rankFusion;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feature names for stages end in Stage. Rename to RankFusionStage?

/// <typeparam name="TNewResult">The type of the new result.</typeparam>
/// <param name="pipelines">The map of named pipelines whose results will be combined. The pipelines must operate on the same collection.</param>
/// <param name="weights">The map of pipeline names to numerical weights determining result importance during combination. Default weight is 1 when unspecified.</param>
/// <param name="scoreDetails">Flag that specifies whether to return a detailed breakdown of the score for each document in the result.</param>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this description is correct, but I'm also not confident how $rankFusion works exactly.

My question is that based on a Slack conversation I don't think setting this option to true changes the result at all. What I understood from Adelin is that setting it to true causes the $meta variable to be defined, and it's up to the user whether to project that into the result or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update the description to mention setting the options to true doesn't change the resulting POCO but makes the score available as a new metadata field that can be projected by the user using $meta.

return new RenderedPipelineStageDefinition<TOutput>(
operatorName,
new BsonDocument(operatorName, rankFusionOptions),
args.SerializerRegistry.GetSerializer<TOutput>());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the registry the right place to get this serializer from?

Sometimes we allow the user to tell us what serializer to use in an options parameter, and only fall back to the registry if none was provided.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I just noticed that some of the methods in PipelineStageDefinitionBuilder relied upon registry and I did as well but it is probably the right call to allow users to provide a serializer if they want and fall back to the registry if none if provided.

/// </summary>
/// <typeparam name="TNewResult">The type of the new result.</typeparam>
/// <param name="pipelines">The map of named pipelines whose results will be combined. The pipelines must operate on the same collection.</param>
/// <param name="weights">The map of pipeline names to numerical weights determining result importance during combination. Default weight is 1 when unspecified.</param>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"non-negative numerical weights"

renderedPipelines.Add(pipeline.Key, new BsonArray(pipeline.Value.Render(args).Documents));
}

var rankFusionOptions = new BsonDocument
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: consider renaming rankFusionOptions to document for consistency with other operators.

Dictionary<string, PipelineDefinition<TResult, TNewResult>> pipelines,
Dictionary<string, double> weights = null,
bool scoreDetails = false)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the following convenience overloads in IAggregateFluentExtensions

RankFusion<TNewResult>(PipelineDefinition<TResult, TNewResult>[] pipelines, options)
RankFusion<TNewResult>((PipelineDefinition<TResult, TNewResult>, double)[] pipelinesWithWeights, options)

cc @rstam

result[1].Y.Should().Be(7);
result[2].Y.Should().Be(5);
result[3].Y.Should().Be(8);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add some basic validation for scoreDetails, it will also serve as example on how to consume it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants