Skip to content

Enum-like properties like "refine" or enum "valueType" allow general strings, although invalid according to validator #795

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Philipp-M opened this issue Mar 2, 2025 · 6 comments

Comments

@Philipp-M
Copy link

Philipp-M commented Mar 2, 2025

I'm currently writing a Rust library to parse and validate 3d-tiles tilesets.

I'm orienting closely to the jsonschema defined in this repo, and I've seen property type definitions like:

"refine": {
  "anyOf": [
    { "const": "ADD" },
    { "const": "REPLACE" },
    { "type": "string" }
  ]
}

My first thought and interpretation according to the JSON schema specification was to allow additionally custom strings other than "ADD" and "REPLACE", but according to the validator (tests etc.) and what I can read on the 3d-tiles spec, is that this is not allowed.

I've now checked all (or most) other places like enum "valueType" or class property types that custom strings are seemingly not allowed.

So I think { "type": "string" } needs to be removed from all those "anyOf"s, to disallow any other strings being passed (and to avoid confusion of interpreting the jsonschema).

@javagl
Copy link
Contributor

javagl commented Mar 3, 2025

The general idea behind that is to allow for extensibility. (This might be inspired by glTF, where you see the same pattern, e.g. for image MIME types).

If the { "type": "string" } part was omitted, then there could never be an additional refinement strategy. If an extension wants to define a new refinement strategy, say "INSERT" or whatever, then this can become a valid value for the "refine" property by declaring it as valid via the extension.

(The validator would then have to be updated accordingly. There already are some mechanisms for that in place. The specific form of extension in this example would require moving the validation of the refine into some overridable function or so, but no larger structural changes)

I know that this does raise some questions for library implementors: When creating a library in a high-level programming language*, then you'd usually perform some validation on the library level. Something like (pseudocode)

setRefine(r) {
   if (r != "ADD" && r != "REPLACE") throw error;
   this.refine = r;
}

But this can cause trouble when the set of valid values actually are extended later.

I don't have a "silver bullet" solution for that.

(In fact, I stumbled over the exact same thing, in my glTF library, for the [case of the mimeType that I mentioned above. There, I took the shortcut of just omitting the validity check), but it would be nice to have a better solution than that...)


* No, I don't count JavaScript into the category of "high-level programming languages"

@kring
Copy link
Member

kring commented Mar 4, 2025

In case it's helpful, I've been playing around with glTF / 3D Tiles in Rust myself (strictly for fun and to learn Rust), and found this construct quite natural for the various enums:

pub enum TileRefine {
    /**
     * `ADD`
     */
    Add,
    /**
     * `REPLACE`
     */
    Replace,
    /**
     * Another value not included in the specification, such as a value from an extension.
     */
    Other(String),
}

This allows the refine property on a Tile to represent in Rust types anything that is representable in the JSON. Whether any given value is valid, given the set of extensions, is another matter, of course. There also might be more memory-efficient ways to do this than an owned String.

@Philipp-M
Copy link
Author

The general idea behind that is to allow for extensibility.

I was suspecting already that extensibility was one of the ideas.
Thanks for the detailed explanation.

Maybe it makes sense to add a note in the spec for this, i.e. something like: "Additionally an arbitrary string is allowed for future extension, but is currently not supported", so that it's more clear why the string is allowed as well in the spec.

In case it's helpful, I've been playing around with glTF / 3D Tiles in Rust

Oh cool, yeah I should probably open that repo (I plan on open-sourcing this anyway), to avoid possibly duplicated work.
Do you have something already somewhere to look at? Maybe it's possible to collaborate? My plan (not sure how far I get, as I got very limited time for this) is to support bevy and have general tooling/crates around the 3D-tiles spec for Rust, since there's basically none currently.
I've seen some repos already but they were either unmaintained or not complete enough (and unmaintained).

quite natural for the various enums:

Yep that was also how I was doing it until I opened that issue or rather have found the tests that were prohibiting this custom string.

@javagl
Copy link
Contributor

javagl commented Mar 4, 2025

I didn't know that Rust allows for "abitrary strings" in enums. The details here vary from language to language. An enum may just be an int (C/C++), or a special object (Java), or a string (JavaScript) optionally narrowing the type down to a "specific set of strings" (TypeScript). And the question about how to "map" the enum concept from JSONSchema into a particular language is only one of the many design decisions (similar to whether anyof implies inheritance etc).

In practice I think that there is hardly ever a reason to use enum at all (EDIT: In programming languages. For the JSON Schema, it makes sense). Trying to think about real-world use-cases for that, I only came up with something like enum TrafficLightColors { RED, YELLOW, GREEN } or maybe enum Weekdays { MONDAY... } or so. But something like enum State { INITIALIZED, RUNNING, ... } for a custom state machine is bound to break. The point (that is also the important one for the issue here) is: This set may never change. Using an enum will carve a certain structure in stone, and in practice, often make it impossible to add a new value later.

Here, your answer popped up...

Insofar, the fact that refine can be a string has the benefit of "forcing" developers to think about that case explicitly. It won't avoid breaking changes. There will still be places in the code that may cause an error for unhandled values. But if this was modeled with an enum at the level of the "parsing library", then such a file would in doubt break during parsing, with no chance of recovering from that...

Maybe it makes sense to add a note in the spec for this,

We can consider this. I think that there are not sooo many places where enum is used (I may have to check), but we'd have to sort out whether to add that implementation note everywhere, or at once central place (maybe in a section that talks about 'Extensions'... though people might still overlook that)

@Philipp-M
Copy link
Author

Philipp-M commented Mar 4, 2025

In practice I think that there is hardy ever a reason to use enum at all

I'd agree with you at least somewhat in other languages than in Rust, but as it's an algebraic sum type there (or in typescript terms a "discriminated/tagged union") it's not really comparable to what e.g. a C enum is, and usually an idiomatic solution for a lot of data types in Rust (like this case).

We can consider this. I think that there are not sooo many places where enum is used (I may have to check), but we'd have to sort out whether to add that implementation note everywhere, or at once central place (maybe in a section that talks about 'Extensions'... though people might still overlook that)

Yeah the amount of these enums is manageable to add a note over each of those and might be worth it so that it's more obvious when looking up those properties instead of having to jump to a specific section where this is described in general.

@kring
Copy link
Member

kring commented Mar 6, 2025

Do you have something already somewhere to look at?

Not yet. It's just been a learning exercise for me, so far, so not much worth sharing. I've been trying to follow the approach used in cesium-native, though, where the Rust code to glTF and 3D Tiles is generated from JSON Schema specs. Generating Rust code in Rust is actually pretty nice! I've also been looking at basing the 3D Tiles representation around the Bevy ECS, so that a Tile, for example, is an entity with a Tile component (might break this down further later) and extensions are additional components. It's been a challenge to deserialize tileset.json into the Bevy ECS World, though, probably mostly because I'm a Rust amateur. I'm not completely sure any of this is a good idea, but I thought it would be fun to try.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants