CSHARP-5505: Add $geoNear stage aggregation builders #1621

Open
wants to merge 6 commits into
base: main

adelinowona
Contributor

No description provided.

adelinowona requested a review from a team as a code owner February 22, 2025 02:46
adelinowona requested review from JamesKovacs, rstam and BorisDog and removed request for a team and JamesKovacs February 22, 2025 02:46
/// <summary>
/// Gets or sets the distance field. Required if querying a time-series collection.
/// </summary>
public string DistanceField { get; set; }
Member

We should note here that this is optional, given that we don't use nullable reference annotations in our API.

Contributor

Is DistanceField actually optional? The server documentation doesn't say that.

Contributor

Using a string here is also not completely safe.

This name MUST match the corresponding element name (usually the same as the property name unless configured differently) in the TNewResult POCO.

Contributor Author

> Is DistanceField actually optional? The server documentation doesn't say that.

In server 8.1, it'll be optional for non-time-series collections, but it's still required for time-series collections. I mentioned the time-series restriction in the property summary to guide users appropriately. If users accidentally omit the property when working with a time-series collection, they'll receive a detailed error message from the server explaining the issue, so I don't think there should be too much concern for us here.

Contributor Author

> Using a string here is also not completely safe.
>
> This name MUST match the corresponding element name (usually the same as the property name unless configured differently) in the TNewResult POCO.

Thanks for flagging this! I missed the fact that the element name isn't necessarily the same as the property name. We can address this by changing the type of the DistanceField option to FieldDefinition<TDocument> so that the appropriate element name is used during serialization if it differs from the property name. However, this means the document type passed into GeoNearOptions in the GeoNear methods should be TOutput instead of TInput. @rstam let me know what you think about this.

Contributor

rstam left a comment

Looks good! Some small changes requested.

/// <summary>
/// Represents options for the $geoNear stage.
/// </summary>
public record GeoNearOptions<TDocument>
Contributor

If we are going with a record, we should also make this class immutable.

/// <param name="options">The options.</param>
/// <returns>The fluent aggregate interface.</returns>
IAggregateFluent<TNewResult> GeoNear<TCoordinates, TNewResult>(
    TCoordinates[] near,
Contributor

Use double[] instead of TCoordinates[].

/// <param name="near">The point for which to find the closest documents.</param>
/// <param name="options">The options.</param>
/// <returns>The fluent aggregate interface.</returns>
IAggregateFluent<TNewResult> GeoNear<TNewResult>(
Contributor

We discussed in Slack that this overload could be dropped.

Contributor Author

@BorisDog Just to loop you in on the decision here: users can provide a "legacy coordinate pair" as either an array or a BsonDocument. The server docs mention that using an array is preferred, so Robert and I agreed we could provide only the array GeoNear overload and drop the BsonDocument overload. Let me know if you agree with this as well.

Contributor

Agreed.

/// <returns>A new pipeline with an additional stage.</returns>
public static PipelineDefinition<TInput, TOutput> GeoNear<TInput, TIntermediate, TCoordinates, TOutput>(
    this PipelineDefinition<TInput, TIntermediate> pipeline,
    TCoordinates[] near,
Contributor

Use double[] instead of TCoordinates[].

/// <param name="near">The point for which to find the closest documents.</param>
/// <param name="options">The options.</param>
/// <returns>A new pipeline with an additional stage.</returns>
public static PipelineDefinition<TInput, TOutput> GeoNear<TInput, TIntermediate, TOutput>(
Contributor

This overload can be dropped.

/// <returns>The stage.</returns>
internal static PipelineStageDefinition<TInput, TOutput> GeoNear<TInput, TPoint, TOutput>(
    TPoint near,
    GeoNearOptions<TInput> options = null)
Contributor

Consider adding a comment like:

            // where TPoint is either a GeoJsonPoint or a legacy coordinate array

/// <param name="options">The options.</param>
/// <returns>The stage.</returns>
public static PipelineStageDefinition<TInput, TOutput> GeoNear<TInput, TCoordinates, TOutput>(
    TCoordinates[] near,
Contributor

Use double[] instead of TCoordinates[]

/// <param name="near">The point for which to find the closest documents.</param>
/// <param name="options">The options.</param>
/// <returns>The stage.</returns>
public static PipelineStageDefinition<TInput, TOutput> GeoNear<TInput, TOutput>(
Contributor

This overload can be dropped.

{ "near", pointSerializer.ToBsonValue(near)},
{ "distanceField", options?.DistanceField, options?.DistanceField != null },
{ "maxDistance", () => options?.MaxDistance.Value, options?.MaxDistance != null },
{ "minDistance", () => options?.MinDistance.Value, options?.MinDistance != null },
Contributor

I think that for a nullable the value can still be provided directly, which avoids creating a delegate:
{ "maxDistance", options?.MaxDistance, options?.MaxDistance != null }

Could you double check that please?

};

var outputSerializer = args.SerializerRegistry.GetSerializer<TOutput>();
return new RenderedPipelineStageDefinition<TOutput>(operatorName, new BsonDocument(operatorName, geoNearOptions), outputSerializer);
Contributor

var outputSerializer = args.GetSerializer<TOutput>(); might save a serializer lookup.

4 participants