`json`: unified properties order across optional & required #8133

ochafik · 2024-06-26T08:46:26Z

This makes properties be generated in the order they're defined, regardless of whether they're optional or required.

(Follow-up to #7840 (comment), cc @HanClinto)

{
    "type": "object",
    "properties": {
        "a": { "type": "integer" },
        "b": { "type": "integer" },
        "c": { "type": "integer" },
        "d": { "type": "integer" }
    },
    "required": ["b", "d"]
}

Before this PR, required properties were generated first, followed by optional ones (each group following definition order):

{"b": 0, "d": 0, "a": 0, "c": 0}
{"b": 0, "d": 0}

After, all properties appear in a unified order:

{"a": 0, "b": 0, "c": 0, "d": 0}
{"b": 0, "d": 0}

This has the added benefit of reducing the max number of parallel alternatives, which should speed grammar sampling up.
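
To make the two orderings concrete, here is a minimal Python sketch that enumerates the key sequences each rule allows for the schema above. It is an illustration only, not the converter's actual code: `key_orders` is a hypothetical helper, it assumes no additional properties are allowed, and it assumes required properties keep their definition order within their group.

```python
from itertools import combinations

def key_orders(schema, required_first=False):
    """List the key sequences an object may use (hypothetical helper).

    required_first=True mimics the old rule (required group, then optional group);
    required_first=False mimics the unified rule (pure definition order).
    """
    props = list(schema["properties"])          # dicts preserve definition order
    required = set(schema.get("required", []))
    optional = [p for p in props if p not in required]
    if required_first:
        ranking = [p for p in props if p in required] + optional
    else:
        ranking = props
    orders = []
    # Every subset of the optional properties may be present or absent.
    for n in range(len(optional) + 1):
        for chosen in combinations(optional, n):
            present = required | set(chosen)
            orders.append([p for p in ranking if p in present])
    return orders

schema = {
    "type": "object",
    "properties": {k: {"type": "integer"} for k in "abcd"},
    "required": ["b", "d"],
}

for order in key_orders(schema):
    print(order)
```

Under the unified rule the longest sequence is `a, b, c, d`; with `required_first=True` it becomes `b, d, a, c`. The shortest sequence, `b, d`, is the same in both cases.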

TODO:

Merge "json: restore default additionalProperties to false, fix some pattern escapes" #8180 first (it restores the default additionalProperties to false and fixes the tsconfig.json example)

Benchmark

  hyperfine --warmup 1 --runs 3 \
  -L branch master,json-order \
    --setup 'git checkout {branch} && \
             make clean && \
             make -j LLAMA_CURL=1 llama-cli' \
    'BRANCH={branch} \
      ./llama-cli --grammar-file grammars/tsconfig.json.gbnf \
        -p "Write a tsconfig.json for a simple project with strict types incremental compiler/build options:" \
        -mu https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct-Q8_0.gguf \
        --seed 1334'

Test propOrder in Python & JS implementations
Import graph from json: fix additionalProperties, allow space after enum/const #7840 (comment)

slaren · 2024-06-27T22:03:58Z

Unrelated to this PR, but I am mentioning this here since you are working on this. The test-json-schema-to-grammar is failing regularly in the AVX512 CI due to hitting a 15 minute timeout, e.g. https://github.com/ggerganov/llama.cpp/actions/runs/9701435352/job/26775088066. This test runs under an emulator if the runner doesn't have AVX512 support, so I think it is expected that tests will take longer than normal, but it shouldn't take that long. Any ideas about how to fix it?

ochafik · 2024-06-28T01:33:04Z

> Unrelated to this PR, but I am mentioning this here since you are working on this. The test-json-schema-to-grammar is failing regularly in the AVX512 CI due to hitting a 15 minute timeout, e.g. https://github.com/ggerganov/llama.cpp/actions/runs/9701435352/job/26775088066. This test runs under an emulator if the runner doesn't have AVX512 support, so I think it is expected that tests will take longer than normal, but it shouldn't take that long. Any ideas about how to fix it?

@slaren oh that's weird, it looks like it's timing out in the Node.js portion of the test, although without timestamped logs it's not clear whether the C++ or (more likely) the Python version is (also) to blame.

If the flakiness is an issue I'd opt to modify the LLAMA_NODE_AVAILABLE & python logic in the test to just let any CI runner relying on an emulator skip both the python & js branches (+ file a bug for me to investigate further, I've certainly optimised the latest changes for ease of writing & cross-language portability rather than speed 😅)

ochafik · 2024-06-28T21:51:59Z

Currently this seems to be a bit slower than master (about 1.06 ± 0.11 times in the run below, and weirdly with some run-to-run variability despite the fixed --seed), contrary to what I expected from #7840 (comment):

# git remote add ochafik https://github.com/ochafik/llama.cpp
# git fetch ochafik

python examples/json_schema_to_grammar.py \
  https://json.schemastore.org/tsconfig.json \
  > grammars/tsconfig.json.gbnf

hyperfine --warmup 1 --runs 10 \
  -L branch master,ochafik/json-order \
    --setup 'git checkout {branch} && \
             make clean && \
             make -j LLAMA_CURL=1 llama-cli' \
    'BRANCH={branch} \
      ./llama-cli --grammar-file grammars/tsconfig.json.gbnf \
        -p "Write a tsconfig.json for a simple project with strict types incremental compiler/build options:" \
        -mu https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct-Q8_0.gguf \
        --seed 13345'

Benchmark 1: BRANCH=master \
      ./llama-cli --grammar-file grammars/tsconfig.json.gbnf \
        -p "Write a tsconfig.json for a simple project with strict types incremental compiler/build options:" \
        -mu https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct-Q8_0.gguf \
        --seed 13345
  Time (mean ± σ):      8.894 s ±  0.702 s    [User: 1.177 s, System: 0.366 s]
  Range (min … max):    8.059 s … 10.106 s    10 runs
 
Benchmark 2: BRANCH=ochafik/json-order \
      ./llama-cli --grammar-file grammars/tsconfig.json.gbnf \
        -p "Write a tsconfig.json for a simple project with strict types incremental compiler/build options:" \
        -mu https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct-Q8_0.gguf \
        --seed 13345
  Time (mean ± σ):      9.467 s ±  0.585 s    [User: 1.127 s, System: 0.342 s]
  Range (min … max):    8.363 s …  9.987 s    10 runs
 
Summary
  'BRANCH=master \
      ./llama-cli --grammar-file grammars/tsconfig.json.gbnf \
        -p "Write a tsconfig.json for a simple project with strict types incremental compiler/build options:" \
        -mu https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct-Q8_0.gguf \
        --seed 13345' ran
    1.06 ± 0.11 times faster than 'BRANCH=ochafik/json-order \
      ./llama-cli --grammar-file grammars/tsconfig.json.gbnf \
        -p "Write a tsconfig.json for a simple project with strict types incremental compiler/build options:" \
        -mu https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF/resolve/main/Meta-Llama-3-8B-Instruct-Q8_0.gguf \
        --seed 13345'
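
As a sanity check on the summary line, the sketch below recomputes the relative speed from the two reported means and standard deviations. It assumes standard first-order error propagation for a ratio of independent quantities; `relative_speed` is a hypothetical helper written for this comment, not part of hyperfine.

```python
import math

def relative_speed(mean_fast, sd_fast, mean_slow, sd_slow):
    """Ratio of two mean run times with a propagated standard deviation.

    Assumes the two measurements are independent and uses the first-order
    formula for the relative error of a quotient.
    """
    ratio = mean_slow / mean_fast
    rel_err = math.sqrt((sd_fast / mean_fast) ** 2 + (sd_slow / mean_slow) ** 2)
    return ratio, ratio * rel_err

# Means and standard deviations copied from the two benchmark results above.
ratio, sigma = relative_speed(8.894, 0.702, 9.467, 0.585)
print(f"{ratio:.2f} ± {sigma:.2f}")  # prints "1.06 ± 0.11"
```

The propagated uncertainty (0.11) is larger than the measured difference (0.06), so these 10 runs alone do not clearly separate the two branches.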

mofosyne added the "Review Complexity : Medium" label Jun 26, 2024

github-actions bot added the testing, examples, python, and server labels Jun 26, 2024

json: unified properties order across optional & required

757c4df

ochafik force-pushed the json-order branch from 06dc545 to 757c4df on June 28, 2024 08:49

ochafik mentioned this pull request Jun 28, 2024

json: skip slow tests when running under emulator #8189

Merged

Merge remote-tracking branch 'origin/master' into json-order

f286589
