Continuously fuzzing tidy-html5 with OSS-Fuzz #788

Open
stefanbucur opened this issue Dec 21, 2018 · 28 comments

Comments

@stefanbucur

OSS-Fuzz is free fuzzing infrastructure for automatically identifying security vulnerabilities and stability bugs in open source projects.

We believe tidy-html5 is an important part of the open source ecosystem (all integrated projects can be found here), and as such we have recently integrated a few fuzz targets that we developed for tidy-html5 into OSS-Fuzz. Once integrated, OSS-Fuzz will continuously fuzz this project, alert when it finds bugs, and verify the fixes.

Would any of the contributors be interested in becoming a contact person for receiving any bug reports?

Since some of these bugs may be security vulnerabilities, we have a disclosure policy where bugs are first reported to the maintainers before being publicly released after a certain deadline (see the link for the complete details).

Ideally, these fuzz targets should also reside in the main project repository, so they are updated together with API changes. Let me know if you're also interested in integrating these targets in this repository (since this is additional work on top of your volunteer time to open-source, we are also offering integration rewards, more details here).

Thank you!

@stefanbucur
Author

I just wanted to add that we're starting to see a number of crashes from fuzzing tidy-html5, and it would be very helpful if someone could take a look at them.

@geoffmcl
Contributor

@stefanbucur thank you for the comments, and information...

I am NOT interested in setting up automated or continuous fuzzing, in this repo, so in general, simply no thanks!

But I am very interested in any tidy crashes found. Where do I look? links?

Any that are valid, repeatable bugs should be added to these issues, together with a minimal sample html, setup, and configuration used... thanks...

It goes without saying, code fixes, patches or PRs are always welcome...

@stefanbucur
Author

I have just CC-ed your GitHub e-mail address on some of the most important issues (for example, here). Some of the crashes are potentially security vulnerabilities, so by default they are not publicly visible, to reduce the chance of them being exploited while we give maintainers a chance to take a look first.

Regarding how to report these bugs, I can try filing the existing bugs as issues here on GitHub too (but these will be publicly visible). Let me know if you'd like me to do this for all crashes. I can also try producing some fixes, but unfortunately I'm not familiar with the implementation so I might not be able to fix too many things.

@geoffmcl
Contributor

@stefanbucur thanks for the comment... and the CC-ed emails... which I have started to look at... will continue that...

Regarding using the sanitizer, please note we already have #791 open - the fix for this is presently only in the issue-791 branch for testing, prior to merging to next... appreciate any testing on this... thanks...

Also see similar issues like #588 and #622, which got solved and closed... there are probably other similar issues, open or closed... still checking... any pointers would help...

Before we get to ... producing some fixes, ..., we need to be able to reproduce the bug!

I built an issue-791 tidy 5.7.18.sani, and downloaded/*85920, copied the test to cases/testcase-12111.html, but get no Sanitizer errors... what am I doing wrong?

Of course, running tidy on that almost random set of bytes yields some 9067 warnings and 149 errors... but no problem...

So the most important thing here is to produce a repeatable test... where debuggers can help isolate the precise problem... etc... can do nothing without a test...

Any help really appreciated... thanks...

@geoffmcl
Contributor

@stefanbucur went through the emails.. seems there are 3 issues - please confirm?

from: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=12111
 Issue 12111: tidy-html5/tidy_fuzzer: Heap-buffer-overflow in prvTidyEncodeCharToUTF8Bytes
 Reproducer Testcase: https://oss-fuzz.com/download?testcase_id=5639351547985920

from: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=12074
 Issue 12074: tidy-html5/tidy_fuzzer: Use-of-uninitialized-value in prvTidyIsHighSurrogate
 Reproducer Testcase: https://oss-fuzz.com/download?testcase_id=5697834188275712

from: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=12309
 Issue 12309: tidy-html5/tidy_fuzzer: Crash in GetSurrogatePair
 Reproducer Testcase: https://oss-fuzz.com/download?testcase_id=5123069669015552

These are my 3 downloads... copied to respective cases...

-rw-rw-r--  1 geoff geoff     15794 Jan 15 15:26  clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920 cases/testcase-12111.html
-rw-rw-r--  1 geoff geoff      4264 Jan 16 02:29  clusterfuzz-testcase-minimized-tidy_fuzzer-5123069669015552 cases/testcase-12309.html
-rw-rw-r--  1 geoff geoff         9 Jan 16 02:45  clusterfuzz-testcase-minimized-tidy_fuzzer-5697834188275712 cases/testcase-12074.html

To do a bit of testing... quick and dirty... but should do it...

#!/bin/sh
#< test-cases.sh
BN=$(basename "$0")

TMPTIDY="./tidy"
TMPCASES="12111 12309 12074"
# left unquoted when used below, so it splits into separate arguments
TMPCFG="--force-output yes"
# show tidy version, and stop if tidy cannot be run
if ! "$TMPTIDY" -v; then
    echo "$BN: Unable to run '$TMPTIDY' - FIX ME "
    exit 1
fi

for arg in $TMPCASES; do
    #echo "$BN: $arg"
    TMPHTML="cases/testcase-$arg.html"
    TMPOUT="cases/tempout-$arg.html"
    TMPERR="cases/temperr-$arg.txt"
    echo "$BN: '$TMPTIDY $TMPCFG -o $TMPOUT -f $TMPERR $TMPHTML'"
    "$TMPTIDY" $TMPCFG -o "$TMPOUT" -f "$TMPERR" "$TMPHTML"
done

# eof

Is there any specific tidy config being used? Didn't see any, but... need to be sure...

Certainly need help to reproduce the sanitizer errors... thanks...

@stefanbucur
Author

Thanks a lot @geoffmcl for looking into these issues! You are right that the test cases in the bug reports don't seem to reproduce directly with the 'tidy' tool, so I spent some time trying to understand what's going on. I believe the source of the difference is that our fuzzer uses a special driver that relies on in-memory buffers to parse the input HTML file. This differs from what the 'tidy' tool does, which reads the data directly from a file. Perhaps reading from a file would mask any memory issues that the sanitizer would catch?

In any case, here is how one can reproduce the crashes using the fuzzer driver:

$ git clone git@github.com:google/oss-fuzz.git && cd oss-fuzz/
$ python infra/helper.py build_image tidy-html5
$ python infra/helper.py build_fuzzers --sanitizer=address tidy-html5

# You should now be able to reproduce the crash.
$ build/out/tidy-html5/tidy_fuzzer ~/Downloads/clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920

That being said, it is good to know that the tidy tool itself is not directly vulnerable, but users of the tidy API may be exposed.

@geoffmcl
Contributor

Oops, there seems to be a 4th...

from: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=12411
 Issue 12411: tidy-html5/tidy_fuzzer: Use-of-uninitialized-value in PPrintText
 Reproducer Testcase: https://oss-fuzz.com/download?testcase_id=5705060225384448
 copy to: cases/testcase-12411

@stefanbucur thanks for the information... ok, you run the tests using your own module... interesting... it exports int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size), and uses a TidyBuffer ... all great...

Just reading the source, I am a little suspicious of the attach_string_to_buffer service... can see it uses strlen, in tidyBufAttach(buffer, (byte*)data_string, strlen(data_string) + 1);, which would stop at the first null byte in binary data, which is what is in the inputs I downloaded... but need to look at that...
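
To make that concern concrete, here is a minimal sketch, in Python rather than C, that only models what strndup followed by strlen would report for a given input. The helper names c_strlen and attached_length are hypothetical and not part of the fuzzer or libTidy; the model assumes the usual C behaviour that both functions stop at the first NUL byte.

```python
def c_strlen(buf: bytes) -> int:
    """Model C strlen(): count bytes up to, not including, the first NUL."""
    nul = buf.find(b"\0")
    return len(buf) if nul == -1 else nul


def attached_length(data: bytes) -> int:
    """Model the size handed to tidyBufAttach() by attach_string_to_buffer().

    strndup(data, size) copies at most `size` bytes, stops early at a NUL,
    and always appends a terminating NUL; the fuzzer then attaches
    strlen(copy) + 1 bytes.
    """
    copy = data[:c_strlen(data)] + b"\0"  # what strndup() would produce
    return c_strlen(copy) + 1


if __name__ == "__main__":
    # Hypothetical input with an embedded NUL, as a fuzzer might generate.
    sample = b"<p>abc\0<b>never seen by tidy</b>"
    print("input bytes:", len(sample), "attached bytes:", attached_length(sample))
```

For an input with an embedded NUL the attached length comes out smaller than the real input length, so in this model everything after the first NUL never reaches the parser.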

Will certainly try the steps you outlined soonest... looks easy as pie... a python builder, looks interesting for Windows build, but... every fix begins by repeating the bug... somehow... but that will probably be tomorrow now...

Agree 100%, libTidy should not fail, even on binary data input...

Be back soonest...

@geoffmcl
Contributor

file: tidy-sanitize-04.txt

@stefanbucur when it is interesting code/coding, I can come back to my computers after dinner... to play a little more...

No particular problem following your setup, except some steps required sudo, ... but all seemed to download/build fine...

And running tidy_fuzzer on a previous download did indeed produce a sanitizer error - but what error?

-rwxr-xr-x 1 root root 16275248 Jan 16 23:35 tidy_fuzzer
~/projects/oss-fuzz$ build/out/tidy-html5/tidy_fuzzer ~/downloads/clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920
INFO: Seed: 4061903721
INFO: Loaded 1 modules   (9150 inline 8-bit counters): 9150 [0xac1028, 0xac33e6), 
INFO: Loaded 1 PC tables (9150 PCs): 9150 [0x7cbbc0,0x7ef7a0), 
build/out/tidy-html5/tidy_fuzzer: Running 1 inputs 1 time(s) each.
Running: /home/geoff/downloads/clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920
=================================================================
==5645==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x625000002100 at pc 0x000000636d07 bp 0x7ffff791de90 sp 0x7ffff791de88
WRITE of size 1 at 0x625000002100 thread T0
    #0 0x636d06  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x636d06)
    #1 0x637197  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x637197)
    #2 0x617684  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x617684)
    #3 0x6174b6  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x6174b6)
    #4 0x61bae0  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x61bae0)
    #5 0x61ba75  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x61ba75)
    #6 0x61ba75  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x61ba75)
    #7 0x61ba75  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x61ba75)
    #8 0x5c1332  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x5c1332)
    #9 0x5bd7ed  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x5bd7ed)
    #10 0x5344eb  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x5344eb)
    #11 0x5346ca  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x5346ca)
    #12 0x55f205  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x55f205)
    #13 0x535206  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x535206)
    #14 0x540ac6  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x540ac6)
    #15 0x53487c  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x53487c)
    #16 0x7fc3a57a3b96  (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)
    #17 0x41d288  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x41d288)

0x625000002100 is located 0 bytes to the right of 8192-byte region [0x625000000100,0x625000002100)
allocated by thread T0 here:
    #0 0x4ef32f  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x4ef32f)
    #1 0x62f638  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x62f638)
    #2 0x5f6061  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x5f6061)
    #3 0x5f5d91  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x5f5d91)
    #4 0x5fbae8  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x5fbae8)
    #5 0x5e8dd5  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x5e8dd5)
    #6 0x5be8ab  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x5be8ab)
    #7 0x5bce50  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x5bce50)
    #8 0x534493  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x534493)
    #9 0x5346ca  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x5346ca)
    #10 0x55f205  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x55f205)
    #11 0x535206  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x535206)
    #12 0x540ac6  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x540ac6)
    #13 0x53487c  (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x53487c)
    #14 0x7fc3a57a3b96  (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/geoff/projects/oss-fuzz/build/out/tidy-html5/tidy_fuzzer+0x636d06) 
Shadow bytes around the buggy address:
  0x0c4a7fff83d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a7fff83e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a7fff83f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a7fff8400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a7fff8410: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c4a7fff8420:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8430: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8440: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8450: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8460: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8470: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==5645==ABORTING
~/projects/oss-fuzz$ 

Wow, that looks interesting... but what am I to make of this?
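
One thing that can be read from the report even without symbols is where the bad write landed relative to the heap block. A rough sketch follows; the function name overflow_offset is made up for illustration, and the three addresses are copied from the report above.

```python
def overflow_offset(address, region_start, region_end):
    """Return (offset of the access from the region start, region size in bytes)."""
    return address - region_start, region_end - region_start


if __name__ == "__main__":
    # Values taken from the AddressSanitizer output in this comment.
    offset, size = overflow_offset(0x625000002100, 0x625000000100, 0x625000002100)
    # offset == size means the access hit the first byte past the end of the block.
    print("write at offset", offset, "of a block of", size, "bytes")
```

Here the offset equals the block size, which matches the report's "0 bytes to the right of 8192-byte region" and "WRITE of size 1": a single byte written just past the end of the allocation.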

In say, a current next memory leak case, I get a very clear report -

==6782==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 298 byte(s) in 18 object(s) allocated from:
    #0 0x7fce12b49b50 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb50)
    #1 0x562f006631f3 in stringWithFormat (/home/geoff/projects/html_tidy/tidy-html5/build/temp-sanit/tidy+0x991f3)
    #2 0x562f00663bbf in localize_option_names (/home/geoff/projects/html_tidy/tidy-html5/build/temp-sanit/tidy+0x99bbf)
    #3 0x562f0066a6cf in xml_help (/home/geoff/projects/html_tidy/tidy-html5/build/temp-sanit/tidy+0xa06cf)
    #4 0x562f0066b99d in main (/home/geoff/projects/html_tidy/tidy-html5/build/temp-sanit/tidy+0xa199d)
    #5 0x7fce1269bb96 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b96)

SUMMARY: AddressSanitizer: 298 byte(s) leaked in 18 allocation(s).

You filed this issue as Issue 12111: tidy-html5/tidy_fuzzer: Heap-buffer-overflow in prvTidyEncodeCharToUTF8Bytes... Where/how did you get that information? ...

It seems a good lead... getting closer... I hope...
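
In the meantime, the unsymbolized frames are module+offset pairs, which can be pulled out and handed to a symbolizer. The sketch below is only an illustration: parse_frames is a hypothetical helper, the pattern is based on the frame lines shown in this comment, and whether the offsets resolve to function names depends on the binary carrying symbols, which is an assumption here.

```python
import re

# Matches unsymbolized ASan frames such as:
#   #0 0x636d06  (/path/to/tidy_fuzzer+0x636d06)
FRAME_RE = re.compile(r"#(\d+)\s+0x[0-9a-fA-F]+\s+\((.+)\+(0x[0-9a-fA-F]+)\)")


def parse_frames(report):
    """Return a list of (frame number, module path, offset) tuples."""
    frames = []
    for line in report.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append((int(match.group(1)), match.group(2), int(match.group(3), 16)))
    return frames
```

Each (module, offset) pair could then be passed to a tool such as addr2line, for example `addr2line -f -e <module> <offset>`, to see whether a function name comes back.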

@stefanbucur
Author

@geoffmcl thanks for pointing out this issue! It does look like running the fuzzer binary directly prevents ASAN from resolving symbols and showing them in stack traces. I did a bit of research and realized that there is a better way to debug the issue:

  1. In the projects/tidy-html5/build.sh file, could you add, at the beginning, flags that make the build include debug symbols:

    export CFLAGS="${CFLAGS} -g"
    export CXXFLAGS="${CXXFLAGS} -g"
  2. You can then rebuild the fuzzer and run it inside a debug container, using the infra/helper.py scripts:

    $ python infra/helper.py build_image tidy-html5
    $ python infra/helper.py build_fuzzers --sanitizer=address tidy-html5 --clean
    # Copy the crashing test case in the debug environment.
    $ cp ~/Downloads/clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920 build/out/tidy-html5/
    
    # Run the debug container.
    $ python infra/helper.py shell base-runner-debug
    
    # Once inside the container, you can reproduce the crash with full debug information.
    $ gdb --args /out/tidy-html5/tidy_fuzzer /out/tidy-html5/clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920
    

I hope this setup will be more effective at pointing out how the execution leads to the crash.

@geoffmcl
Contributor

@stefanbucur thank you for the additional clues, help...

I think what you propose for build.sh is very valid, even workable, but why not use the native cmake... like -

diff --git a/projects/tidy-html5/build.sh b/projects/tidy-html5/build.sh
index 840e9a50..522713de 100644
--- a/projects/tidy-html5/build.sh
+++ b/projects/tidy-html5/build.sh
@@ -18,8 +18,9 @@
 
 mkdir -p ${WORK}/tidy-html5
 cd ${WORK}/tidy-html5
+CMAKE_FLAGS="-DCMAKE_C_FLAGS=-fsanitize=address -DTIDY_RC_NUMBER=SN01 -DCMAKE_BUILD_TYPE=Debug"
 
-cmake -GNinja ${SRC}/tidy-html5/
+cmake -GNinja ${SRC}/tidy-html5/ $CMAKE_FLAGS
 ninja
 
 for fuzzer in tidy_config_fuzzer tidy_fuzzer; do

Assuming that Ninja supports this/these... but I think it will...

It ensures the generated static libtidys.a also has the sanitizer code, and the build type, Debug, ensures -g is added to the mix...

Will try to test, try, this... soonest...

Including a tidy_fuzzer.c suggested code change... use of TidyBuffer, which has a length attribute...

diff --git a/projects/tidy-html5/tidy_fuzzer.c b/projects/tidy-html5/tidy_fuzzer.c
index 3cf2732d..23e1de39 100644
--- a/projects/tidy-html5/tidy_fuzzer.c
+++ b/projects/tidy-html5/tidy_fuzzer.c
@@ -38,6 +38,7 @@ void run_tidy_parser(TidyBuffer* data_buffer,
     tidyRelease(tdoc);
 }
 
+#if 0 /* 00000000000000000000000000000000000 */
 void attach_string_to_buffer(const uint8_t* data,
                              size_t size,
                              TidyBuffer* buffer) {
@@ -50,6 +51,7 @@ void attach_string_to_buffer(const uint8_t* data,
     }
     tidyBufAttach(buffer, (byte*)data_string, strlen(data_string) + 1);
 }
+#endif /* #if 0 - 00000000000000000000000000000000000 */
 
 int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
     TidyBuffer data_buffer;
@@ -59,7 +61,9 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
     tidyBufInit(&output_buffer);
     tidyBufInit(&error_buffer);
 
-    attach_string_to_buffer(data, size, &data_buffer);
+    /* attach_string_to_buffer(data, size, &data_buffer); can be binary data!!! ie has null's */
+    tidyBufAppend( &data_buffer, (void *)data, (uint)size ); /* move data into buffer */
+    
     run_tidy_parser(&data_buffer, &output_buffer, &error_buffer);
     
     tidyBufFree(&error_buffer);

Seems no reason to strndup the input, when a TidyBuffer offers that same safe, complete (8-bit-binary) length copy...

You could even append a "\0", if you think that would help... but input length is input length in tidy...

Still to test out the results...

Some initial trouble, that maybe sudo python infra/helper.py build_fuzzers --sanitizer=address tidy-html5 --clean might remove...

Thanks for the help, feedback... look forward to more... thanks...

@geoffmcl
Contributor

@stefanbucur - suggested changes pushed to my fork, test-tidy branch...

But still some troubles in testing... any help appreciated... thanks...

@geoffmcl
Contributor

tidy-sanitize-05.txt

@stefanbucur some steps forward, but some failures...

Was unable to get the error output, to show the addresses... do not know why... need more help... but...

Using the 4 samples - 12111 12309 12074 12411 - only 1 and 4 showed an error, but it looks like the same error! a 1 byte overrun...

Using my fork amended tidy_fuzzer.c got no errors...

Just to hopefully prove the point, made another test fix of tidy_fuzzer.c - removing strndup and strlen -

diff --git a/projects/tidy-html5/tidy_fuzzer.c b/projects/tidy-html5/tidy_fuzzer.c
index 3cf2732d..5c582b4b 100644
--- a/projects/tidy-html5/tidy_fuzzer.c
+++ b/projects/tidy-html5/tidy_fuzzer.c
@@ -42,13 +42,16 @@ void attach_string_to_buffer(const uint8_t* data,
                              size_t size,
                              TidyBuffer* buffer) {
     // Use a NULL-terminated copy to make it more likely to expose
-    // buffer overflows.
-    char *data_string = strndup((const char*)data, size);
+    // buffer overflows. Where is this documented?
+    /* char *data_string = strndup((const char*)data, size); */
+    char *data_string = (char *)malloc( size + 1 ); /* allocate desired buffer + 1 */
     if (data_string == NULL) {
         perror("Could not allocate string buffer.");
         abort();
     }
-    tidyBufAttach(buffer, (byte*)data_string, strlen(data_string) + 1);
+    memcpy( data_string, data, size ); /* copy in the data */
+    data_string[size] = 0; /* ensure zero term., not really required, but no harm... */
+    tidyBufAttach(buffer, (byte*)data_string, size + 1); /* attach buffer */
 }
 
 int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
@@ -64,6 +67,8 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
     
     tidyBufFree(&error_buffer);
     tidyBufFree(&output_buffer);
+    /* API NOTE: If user supplied the buffer, should call tidyBufDetach()
+       so caller must free... but if default malloc/free use, then no prob. with */
     tidyBufFree(&data_buffer);
     return 0;
 }

Note I respect your desire to add a null byte to the input stream... but do not think this adds anything...

And please see API NOTE: using tidyBufAttach should be paired with tidyBufDetach - that is the caller deals with allocating and freeing, if need be... but...

Prefer the direct data transfer to the TidyBuffer as suggested in the previous patch...

Starting to conclude there may be a bug in the fuzzer code... what do you think...

@stefanbucur
Author

@geoffmcl my apologies for the latency (I was away for a while). Thank you for the detailed investigation, let me take a closer look and will get back to you soon!

@stefanbucur
Author

@geoffmcl you are right that copying the original buffer to a null-terminated string is unnecessary! My original confusion stemmed from an assumption that tidy would not accept \0 characters in its input, which was not the case.

So I ended up simplifying the code even more, and now I made the following changes:

diff --git a/projects/tidy-html5/tidy_fuzzer.c b/projects/tidy-html5/tidy_fuzzer.c
index 3cf2732..3de6854 100644
--- a/projects/tidy-html5/tidy_fuzzer.c
+++ b/projects/tidy-html5/tidy_fuzzer.c
@@ -38,19 +38,6 @@ void run_tidy_parser(TidyBuffer* data_buffer,
     tidyRelease(tdoc);
 }
 
-void attach_string_to_buffer(const uint8_t* data,
-                             size_t size,
-                             TidyBuffer* buffer) {
-    // Use a NULL-terminated copy to make it more likely to expose
-    // buffer overflows.
-    char *data_string = strndup((const char*)data, size);
-    if (data_string == NULL) {
-        perror("Could not allocate string buffer.");
-        abort();
-    }
-    tidyBufAttach(buffer, (byte*)data_string, strlen(data_string) + 1);
-}
-
 int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
     TidyBuffer data_buffer;
     TidyBuffer output_buffer;
@@ -59,11 +46,11 @@ int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
     tidyBufInit(&output_buffer);
     tidyBufInit(&error_buffer);
 
-    attach_string_to_buffer(data, size, &data_buffer);
+    tidyBufAttach(&data_buffer, (byte*)data, size);
     run_tidy_parser(&data_buffer, &output_buffer, &error_buffer);
-    
+
     tidyBufFree(&error_buffer);
     tidyBufFree(&output_buffer);
-    tidyBufFree(&data_buffer);
+    tidyBufDetach(&data_buffer);
     return 0;
 }

I believe this is in line with what you were suggesting too. Note that I directly attached the original data buffer to the tidy buffer: This is because the original buffer is guaranteed to be allocated in its own block, without any extra padding in memory. This helps ASAN catch any memory operations outside the buffer.

The next step was to transform the original crashing input: for some reason, tidy crashes when the input is cut off right after its first null byte (keeping the null). So I wrote a one-liner that copies the test case up to and including that first null:

$ cat ~/Downloads/clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920 \
  | python -c 'import sys; input = sys.stdin.read(); print input[:input.find("\0")+1]' \
  >~/Downloads/clusterfuzz-testcase-minimized-tidy_fuzzer-nullterm
$ sudo cp ~/Downloads/clusterfuzz-testcase-minimized-tidy_fuzzer-nullterm \
  build/out/tidy-html5/

You can check that the two files are actually different, so the original file had a null character inside. (On a side note, I found it very interesting that the OSS-Fuzz crash minimizer wasn't able to remove everything after the first null character, and we had to do it manually here.)
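
The one-liner above is Python 2, and its print statement also appends a newline after the null byte. For anyone on Python 3, a byte-exact sketch of the same truncation might look like the following. The names truncate_after_first_nul and truncate_file are placeholders of mine, and returning the input unchanged when there is no null byte is a choice made for this sketch, not something the original one-liner does.

```python
def truncate_after_first_nul(data: bytes) -> bytes:
    """Keep everything up to and including the first NUL byte.

    If the input has no NUL byte it is returned unchanged (an assumption of
    this sketch).
    """
    nul = data.find(b"\0")
    return data if nul == -1 else data[: nul + 1]


def truncate_file(src_path: str, dst_path: str) -> int:
    """Write the truncated copy of src_path to dst_path and return its size."""
    with open(src_path, "rb") as src:
        truncated = truncate_after_first_nul(src.read())
    with open(dst_path, "wb") as dst:
        dst.write(truncated)
    return len(truncated)
```

Because this version writes no trailing newline, its output would be expected to be one byte shorter than what the Python 2 one-liner produces for the same input.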

Now we can rebuild the image, rebuild the fuzzer, etc. and run the new clusterfuzz-testcase-minimized-tidy_fuzzer-nullterm test case inside the container, which will cause an ASAN violation.

Let me know if you are now able to reproduce the issue, and thanks a lot for looking into this thus far!

@geoffmcl
Contributor

@stefanbucur thanks for the feedback... yes, definitely, using a TidyBuffer, you do not have to be concerned about a \0, or 2, or many - it works on byte size only...

Reviewing your patch, using tidyBufAttach/tidyBufDetach looks great...

But where is the code? Has it been pushed to the oss-fuzz repo, or somewhere else... can not seem to get the code...

Will try to find time to patch my fork accordingly... and test again... including running your python filter on *testcase*85920 - issue 12111...

You will note issue #798 ... is maybe related to issue 12074 - *testcase*75712 - an IsHighSurrogate indication... but we will have to see after that is fixed... shortly, I hope...

As indicated, the important issue here is to be able to replicate the bug(s)... find the code problem... then we can do something about them... thanks...

@stefanbucur
Author

@geoffmcl thanks for the reminder, I had forgotten to submit a PR for the change. This is now done in google/oss-fuzz#2125. Hopefully this time the crash will be reproducible!

@geoffmcl
Contributor

geoffmcl commented Feb 2, 2019

@stefanbucur thanks for the merged PR... updated my oss-fuzz clone, and my oss-fuzz-fork... rebuilt the fuzzer exe, x 2... docker is quite slow at re-creating the /usr/lib/libFuzzingEngine.a each time, but not a problem...

Got no error in the oss-fuzz re-tests... yuk!

In my oss-fuzz-fork, added your python rewrite of *testcase*85920, issue 12111... the file is reduced from 15794 to 15159 bytes...

But, sorry, still no errors...

Seem out of options... What to try next?

Any help appreciated... Thanks...

@geoffmcl
Contributor

geoffmcl commented Feb 2, 2019

@stefanbucur oops, belay that...

Now that I look at the test log, there is an error on the modified *testcase*85920.nt... null terminated copy... don't know how I missed it...

But it is the same as previous... heap-buffer-overflow... 1 byte... and like before, all function references are missing...

Now if we could get address references into that, we may be on the way to somewhere...

As stated, any help appreciated... thanks...

@geoffmcl
Contributor

geoffmcl commented Feb 3, 2019

20190203:oss-fuzz-02.txt

Yikes,

Had done, as mentioned -

$ cat ~/downloads/clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920 \
  | python -c 'import sys; input = sys.stdin.read(); print input[:input.find("\0")+1]' \
  >~/downloads/clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920.nt
$ sudo cp ~/downloads/clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920.nt \
  build/out/tidy-html5/

Got -

-rw-rw-r--  1 geoff geoff     15794 Jan 15 15:26  clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920
-rw-r--r--  1 geoff geoff     15159 Feb  2 03:45  clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920.nt

Docker rebuilt all -

~/projects/oss-fuzz-fork$ ./build/work/tidy-html5/tidy-html5/tidy -v
HTML Tidy for Linux version 5.7.22.SN01
~/projects/oss-fuzz-fork$ dir ./build/work/tidy-html5/tidy-html5/tidy
-rwxr-xr-x 1 root root 13949312 Feb  3 19:46 ./build/work/tidy-html5/tidy-html5/tidy

Running -

$ sudo python infra/helper.py shell base-runner-debug

then -

root@4d33795a92ff:/# gdb --args /out/tidy-html5/tidy_fuzzer /out/tidy-html5/clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920.nt
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /out/tidy-html5/tidy_fuzzer...done.
(gdb) run
Starting program: /out/tidy-html5/tidy_fuzzer /out/tidy-html5/clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920.nt
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
INFO: Seed: 1258204244
INFO: Loaded 1 modules   (6 inline 8-bit counters): 6 [0xad66a8, 0xad66ae), 
INFO: Loaded 1 PC tables (6 PCs): 6 [0x806e80,0x806ee0), 
[New Thread 0x7ffff2679700 (LWP 23)]
/out/tidy-html5/tidy_fuzzer: Running 1 inputs 1 time(s) each.
Running: /out/tidy-html5/clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920.nt
=================================================================
==19==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x625000002100 at pc 0x00000066be59 bp 0x7fffffffd450 sp 0x7fffffffd448
WRITE of size 1 at 0x625000002100 thread T0
SCARINESS: 31 (1-byte-write-heap-buffer-overflow)
    #0 0x66be58 in prvTidyEncodeCharToUTF8Bytes /src/tidy-html5/src/utf8.c:357:16
    #1 0x66cd46 in prvTidyPutUTF8 /src/tidy-html5/src/utf8.c:454:11
    #2 0x63e3f7 in prvTidyNormalizeSpaces /src/tidy-html5/src/clean.c:1816:21
    #3 0x63dfe3 in prvTidyNormalizeSpaces /src/tidy-html5/src/clean.c:1798:13
    #4 0x6449ae in prvTidyReplacePreformattedSpaces /src/tidy-html5/src/clean.c:2540:13
    #5 0x644a33 in prvTidyReplacePreformattedSpaces /src/tidy-html5/src/clean.c:2546:13
    #6 0x644a33 in prvTidyReplacePreformattedSpaces /src/tidy-html5/src/clean.c:2546:13
    #7 0x644a33 in prvTidyReplacePreformattedSpaces /src/tidy-html5/src/clean.c:2546:13
    #8 0x5c3345 in tidyDocSaveStream /src/tidy-html5/src/tidylib.c:2240:9
    #9 0x5bdb8c in tidyDocSaveBuffer /src/tidy-html5/src/tidylib.c:1383:18
    #10 0x5bdac4 in tidySaveBuffer /src/tidy-html5/src/tidylib.c:1249:12
    #11 0x5351bb in run_tidy_parser /src/tidy_fuzzer.c:36:9
    #12 0x53532a in LLVMFuzzerTestOneInput /src/tidy_fuzzer.c:50:5
    #13 0x55fba6 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:526:15
    #14 0x535e66 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /src/libfuzzer/FuzzerDriver.cpp:283:6
    #15 0x541706 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:684:9
    #16 0x5354dc in main /src/libfuzzer/FuzzerMain.cpp:19:10
    #17 0x7ffff6ee582f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #18 0x41d158 in _start (/out/tidy-html5/tidy_fuzzer+0x41d158)

0x625000002100 is located 0 bytes to the right of 8192-byte region [0x625000000100,0x625000002100)
allocated by thread T0 here:
    #0 0x4ef97f in malloc /src/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:145
    #1 0x661f1e in defaultAlloc /src/tidy-html5/src/alloc.c:64:45
    #2 0x661f8b in defaultRealloc /src/tidy-html5/src/alloc.c:81:16
    #3 0x60bd1d in AddByte /src/tidy-html5/src/lexer.c:957:24
    #4 0x60b9e2 in prvTidyAddCharToLexer /src/tidy-html5/src/lexer.c:996:9
    #5 0x614ad7 in GetTokenFromStream /src/tidy-html5/src/lexer.c:2561:9
    #6 0x6128ec in prvTidyGetToken /src/tidy-html5/src/lexer.c:2507:12
    #7 0x5fe711 in prvTidyParseDocument /src/tidy-html5/src/parser.c:4606:20
    #8 0x5bf653 in prvTidyDocParseStream /src/tidy-html5/src/tidylib.c:1500:9
    #9 0x5bcd97 in tidyDocParseBuffer /src/tidy-html5/src/tidylib.c:1195:18
    #10 0x5bcd04 in tidyParseBuffer /src/tidy-html5/src/tidylib.c:1120:12
    #11 0x535163 in run_tidy_parser /src/tidy_fuzzer.c:33:9
    #12 0x53532a in LLVMFuzzerTestOneInput /src/tidy_fuzzer.c:50:5
    #13 0x55fba6 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:526:15
    #14 0x535e66 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /src/libfuzzer/FuzzerDriver.cpp:283:6
    #15 0x541706 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:684:9
    #16 0x5354dc in main /src/libfuzzer/FuzzerMain.cpp:19:10
    #17 0x7ffff6ee582f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

SUMMARY: AddressSanitizer: heap-buffer-overflow /src/tidy-html5/src/utf8.c:357:16 in prvTidyEncodeCharToUTF8Bytes
Shadow bytes around the buggy address:
  0x0c4a7fff83d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a7fff83e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a7fff83f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a7fff8400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c4a7fff8410: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c4a7fff8420:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8430: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8440: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8450: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8460: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c4a7fff8470: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==19==ABORTING
[Thread 0x7ffff2679700 (LWP 23) exited]
[Inferior 1 (process 19) exited with code 01]
(gdb) 

Still digesting all that...

Seems a good lead, in that it happens in tidySaveBuffer, after the parse, and now a repeatable input *testcase*85920.nt...

Investigating, as time permits...
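
As a reading aid for the ASan report above, the region size and the shadow address it prints can be checked by hand. The sketch below assumes the default x86_64 Linux ASan mapping, shadow = (addr >> 3) + 0x7fff8000; the helper names are made up for illustration and are not part of ASan or libTidy.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default ASan shadow mapping on x86_64 Linux. */
#define ASAN_SHADOW_SCALE  3
#define ASAN_SHADOW_OFFSET 0x7fff8000ULL

/* Map an application address to the shadow byte that describes it. */
static uint64_t asan_shadow_addr(uint64_t app_addr)
{
    return (app_addr >> ASAN_SHADOW_SCALE) + ASAN_SHADOW_OFFSET;
}

/* Size in bytes of a half-open region [begin, end). */
static uint64_t region_size(uint64_t begin, uint64_t end)
{
    assert(end >= begin);
    return end - begin;
}
```

With the addresses from the report, asan_shadow_addr(0x625000002100) gives 0x0c4a7fff8420, which is the row marked => in the shadow dump, and region_size(0x625000000100, 0x625000002100) gives 8192. So the 1-byte write lands on the first byte past the end of the allocation.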

@geoffmcl
Contributor

geoffmcl commented Feb 4, 2019

Windows: Tried Dr. Memory - updated to DrMemory-Windows-1.11.17918-1 - using the tidy RelWithDebInfo build, on the *testcase*85920.nt, but, alas, no memory errors detected...

Begin 2019-02-04  3:16:46.12 
HTML Tidy for Windows version 5.7.22
2019-02-02  03:45            15,159 clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920.nt
         Dr. Memory version 1.11.17918
         Running "F:\Projects\tidy-html5\build\Win64\RelWithDebInfo\tidy.exe -f temperr64.txt C:\Users\user\Documents\Tidy\temp-788\clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920.nt"
         
         NO ERRORS FOUND:
               0 unique,     0 total unaddressable access(es)
               0 unique,     0 total invalid heap argument(s)
               0 unique,     0 total GDI usage error(s)
               0 unique,     0 total handle leak(s)
               0 unique,     0 total warning(s)
               0 unique,     0 total,      0 byte(s) of leak(s)
               0 unique,     0 total,      0 byte(s) of possible leak(s)
         Details: D:\DrMemory\DrMemory-Windows-1.11.17918-1\drmemory\logs\DrMemory-tidy.exe.860.000\results.txt
         WARNING: application exited with abnormal code 0x2

But this does not involve printing to a TidyBuffer, where I think the overrun may have occurred... do not know yet... just onwards...

@stefanbucur
Author

@geoffmcl Did you get the chance to look into this issue more closely? It is indeed curious that this happens when saving the buffer (one would usually expect such issues to appear at parsing time). So it must be that the parser creates some structure that keeps around pointers to the input, but the structure has the sizes of the tokens wrong in some places. However, I can't seem to pinpoint the exact moment when this happens, because parsing is already done by the time the buffer is saved...

In any case, I also wanted to ask you if you'd like to be notified of any new such findings in the future. I've realized I can configure the project to list you as the primary maintainer, so you also get access to all the issues by default. Let me know.

@geoffmcl
Contributor

geoffmcl commented Mar 1, 2019

@stefanbucur yes, a week or so back, I did run my tidy-by-buf app, sort of a special version of tidy, only using TidyBuffer, with Dr. Memory, in Windows, and got sort of a hit???

Begin 2019-02-05 21:28:38.96 
tidy-by-buf version 5.7.22, circa 2019-02-05
tidy-by-buf: Using library HTML Tidy for Windows, circa 2019/01/31, version 5.7.22
2019-02-02  03:45            15,159 clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920.nt
         Dr. Memory version 1.11.17918
         Running "F:\Projects\tidy-test\build.x64\relwithdebinfo\tidy-by-buf.exe C:\Users\user\Documents\Tidy\temp-788\clusterfuzz-testcase-minimized-tidy_fuzzer-5639351547985920.nt"
         
         Error #1: UNADDRESSABLE ACCESS beyond heap bounds: writing 1 byte(s)
         prvTidyEncodeCharToUTF8Bytes
             ??:0
         prvTidyPutUTF8 
             ??:0
         prvTidyNormalizeSpaces
             ??:0
         prvTidyReplacePreformattedSpaces
             ??:0
         prvTidyReplacePreformattedSpaces
             ??:0
         prvTidyReplacePreformattedSpaces
             ??:0
         prvTidyReplacePreformattedSpaces
             ??:0
         tidyDiscardElement
             ??:0
         prvTidyReportMarkupVersion
             ??:0
         tidySaveBuffer 
             ??:0
         run_tidy_parser
             f:\projects\tidy-test\src\tidy-by-buf.c(139):
         LLVMFuzzerTestOneInput
             f:\projects\tidy-test\src\tidy-by-buf.c(159):
         tidy_by_buffer 
             f:\projects\tidy-test\src\tidy-by-buf.c(279):
         main           
             f:\projects\tidy-test\src\tidy-by-buf.c(298):
         Note: refers to 0 byte(s) beyond last valid byte in prior malloc
         
         ERRORS FOUND:
               1 unique,     1 total unaddressable access(es)
               0 unique,     0 total invalid heap argument(s)
               0 unique,     0 total GDI usage error(s)
               0 unique,     0 total handle leak(s)
               0 unique,     0 total warning(s)
               0 unique,     0 total,      0 byte(s) of leak(s)
               0 unique,     0 total,      0 byte(s) of possible leak(s)
         Details: D:\DrMemory\DrMemory-Windows-1.11.17918-1\drmemory\logs\DrMemory-tidy-by-buf.exe.12316.000\results.txt
         WARNING: application exited with abnormal code 0x2

But in quite a number of MSVC Debug sessions, on this, have been unable to get to the actual overrun... so no fix yet... sadly...

You are very correct, the parser creates some structure that keeps around pointers to the input... it is called the lexer... all text parsed by tidy during the input phase is moved to and stored in this buffer, expanding it as required... and nodes that contain text store a start and end offset into this lexer buffer...

The input stream is then closed. After this, there is the Clean & Repair phase...

Then in the output phase, tidySaveBuffer, the node's text is accessed using these start/end offsets... I suspect, practically the last node output, in this case, is somehow overrunning this buffer... maybe, or maybe something else... very difficult to trap, since prvTidyPutUTF8/prvTidyEncodeCharToUTF8Bytes is called many times for this sample file...
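
To make the offset scheme concrete, here is a minimal sketch of a text span stored as start/end offsets into one growable buffer, plus a write helper that refuses to go past the span. It shows one way an in-place write could overrun: a re-encoded character can need more bytes than the space left before the end offset. Whether that is what happens with this testcase is not confirmed. LexBuf, TextSpan, utf8_len and put_utf8_checked are hypothetical names, not libTidy's real types or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the lexer buffer and a node's text range. */
typedef struct { unsigned char *buf; size_t size; } LexBuf;
typedef struct { size_t start, end; } TextSpan;   /* [start, end) offsets into buf */

/* Bytes UTF-8 needs for code point c; 0 if c cannot be encoded. */
static size_t utf8_len(uint32_t c)
{
    if (c < 0x80) return 1;
    if (c < 0x800) return 2;
    if (c >= 0xD800 && c <= 0xDFFF) return 0;     /* surrogates are not valid scalars */
    if (c < 0x10000) return 3;
    if (c <= 0x10FFFF) return 4;
    return 0;
}

/* Encode c at offset pos inside the span.
 * Returns bytes written, or 0 if the write would leave the span or the buffer. */
static size_t put_utf8_checked(LexBuf *lb, const TextSpan *span, size_t pos, uint32_t c)
{
    assert(lb != NULL && span != NULL);
    size_t n = utf8_len(c);
    if (n == 0 || span->end > lb->size || pos < span->start || pos > span->end)
        return 0;
    if (n > span->end - pos)                      /* not enough room left: refuse */
        return 0;

    unsigned char *p = lb->buf + pos;
    switch (n) {
    case 1:
        p[0] = (unsigned char)c;
        break;
    case 2:
        p[0] = (unsigned char)(0xC0 | (c >> 6));
        p[1] = (unsigned char)(0x80 | (c & 0x3F));
        break;
    case 3:
        p[0] = (unsigned char)(0xE0 | (c >> 12));
        p[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        p[2] = (unsigned char)(0x80 | (c & 0x3F));
        break;
    default:
        p[0] = (unsigned char)(0xF0 | (c >> 18));
        p[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
        p[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        p[3] = (unsigned char)(0x80 | (c & 0x3F));
        break;
    }
    return n;
}
```

In a 4-byte span, writing U+00A0 at offset 3 is refused because it needs two bytes and only one remains. An unchecked writer in the same position would put its second byte exactly "0 bytes to the right" of the allocation, which is the shape of the report above.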

Waiting until I can get up steam to attack this again... it is quite frustrating ...

As to whether I want to be listed, in the oss-fuzz repo, as the primary maintainer, I am not too sure...

Already this issue of fuzz testing has cost a lot of time - even to the extent of writing the above tidy-by-buf, to try to imitate the fuzzer's use of libTidy - has not yielded a great deal...
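
For anyone else trying to repeat a report outside the fuzzer, the general shape of such a driver is small: read the testcase file into a heap block of exactly that size and hand it once to a function with the same signature as LLVMFuzzerTestOneInput. This is only a sketch of the pattern, not the actual tidy-by-buf source; run_testcase_file and stub_target are hypothetical names, and stub_target stands in for the real fuzz target so the driver can be tried without libTidy.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Same signature as libFuzzer's LLVMFuzzerTestOneInput entry point. */
typedef int (*fuzz_target_fn)(const uint8_t *data, size_t size);

/* Placeholder target: only records the input size. */
static size_t g_last_size = 0;
static int stub_target(const uint8_t *data, size_t size)
{
    (void)data;
    g_last_size = size;
    return 0;
}

/* Read a whole testcase file and pass it to the target once.
 * Returns the target's result, or -1 on I/O or allocation failure. */
static int run_testcase_file(const char *path, fuzz_target_fn target)
{
    assert(path != NULL && target != NULL);

    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return -1; }
    long len = ftell(f);
    if (len < 0 || fseek(f, 0, SEEK_SET) != 0) { fclose(f); return -1; }

    /* Heap block sized to the input, so a memory checker can flag overreads. */
    uint8_t *data = malloc(len > 0 ? (size_t)len : 1);
    if (!data) { fclose(f); return -1; }

    size_t got = fread(data, 1, (size_t)len, f);
    fclose(f);

    int rc = (got == (size_t)len) ? target(data, got) : -1;
    free(data);
    return rc;
}
```

A main() would call run_testcase_file(argv[1], LLVMFuzzerTestOneInput) with the real target linked in, and the resulting binary can then be run under ASan, Dr. Memory or a debugger with the downloaded testcase as its only argument.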

Out of the 4 items raised - 12111/~85920/UTF8Bytes, 12309/~15552/GetSurrogate, 12074/~75712/IsHighSurr, 12411/~84448/pprinttext - only in the first were we able to maybe repeat the results, and even then, not until you worked on it, modified it... including the fuzzer module itself...

I certainly do not want to be overloaded with so-called fuzz failures/reports... there has to be a repeatable scenario... without that, they are just an unconfirmed report, of something!... that may in fact be a false positive...

But will always do my best to investigate if/when raised, as an issue, here, with a repeatable sample...

So we really need to do some sort of triage of oss-fuzz reports, and copy only the tested, repeatable ones here... if there are more...

Is that person you? or others? Certainly need help testing, finding, fixing... etc...

Look forward to further feedback... thanks...

@balthisar
Member

Feel free to add me. I'll close this.

@stefanbucur
Author

> Feel free to add me. I'll close this.

Thanks! I opened google/oss-fuzz#6029

@stefanbucur
Author

@balthisar you should now be CC-ed on all tidy-html5 bugs found by fuzzing. There are currently over 25 open issues, which were found over time (but no one has taken a look at them yet): https://bugs.chromium.org/p/oss-fuzz/issues/list?q=label:Proj-tidy-html5

@balthisar
Member

@stefanbucur, I'm not really sure how to "take a look at them"; I'll try to understand everything that you and @geoffmcl went through, but in general when I debug with the test cases, I sometimes see the failures, and often don't. I am reading from the file, and not doing anything in a Docker container at this point or attaching things via a buffer. tbd.

And if I catch the error in the debugger in a container, I'm not really sure how to attach Xcode's debugger into a process in the container. I have a CLion license; I suppose I could learn that tool.

I have found and corrected a couple of stack overflow issues, but I'm not sure how to re-test the failed cases in order to close them.

I'll re-open this issue until I have my end resolved.

@balthisar balthisar reopened this Jul 31, 2021
@stefanbucur
Author

Apologies for the late reply!

> @stefanbucur, I'm not really sure how to "take a look at them"; I'll try to understand everything that you and @geoffmcl went through, but in general when I debug with the test cases, I sometimes see the failures, and often don't. I am reading from the file, and not doing anything in a Docker container at this point or attaching things via a buffer. tbd.

Using Docker might be necessary to fully capture the execution environment where the crash happened, but you should not have to deal with it directly. https://google.github.io/oss-fuzz/advanced-topics/reproducing/ has crash repro instructions, if you haven't come across them already.

> And if I catch the error in the debugger in a container, I'm not really sure how to attach Xcode's debugger into a process in the container. I have a CLion license; I suppose I could learn that tool.

Unfortunately I don't have experience with Xcode or CLion, but the Python scripts in the repro instructions also produce binaries in an output directory that you might be able to execute directly (e.g., <debugger> path/to/fuzzer <testcase>).

> I have found and corrected a couple of stack overflow issues, but I'm not sure how to re-test the failed cases in order to close them.

This sounds great - you actually don't have to do anything here; the system will automatically retest periodically and auto-close the bugs if the crash no longer reproduces.

> I'll re-open this issue until I have my end resolved.

Sounds good, and please let me know if you run into any issues. I can connect you with others who might be able to help.

@ddkilzer

FYI, the issue in this comment above is still not fixed, and is tracked here:

tidy-html5:tidy_fuzzer: Heap-buffer-overflow in prvTidyEncodeCharToUTF8Bytes
https://issues.oss-fuzz.com/issues/42498297

I posted PR #1138 to fix it, but there is also a fix in PR #1008.
