Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change ident for EVFILT_USER to 0 and add a test #1582

Merged
merged 22 commits into from
Apr 29, 2024

Conversation

panjf2000
Copy link
Contributor

@panjf2000 panjf2000 commented Mar 21, 2024

Conventionally, ident for EVFILT_USER is set to 0, which is what other renowned networking frameworks like netty(java), mio(rust), gnet(go), swift-nio(swift), etc. do currently.

Conventionally, ident for EVFILT_USER is set to 0 to avoid
collision of file descriptors, which is what other renowned
networking frameworks like netty(java), mio(rust), gnet(go),
swift-nio(swift), etc. do currently.
@panjf2000
Copy link
Contributor Author

Hi @azat, got a minute for this?

@panjf2000
Copy link
Contributor Author

Kindly ping @azat

Copy link
Member

@azat azat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but does it fixes anything? Or just helps with debugging?
I mean that 0 is also an fd.

@panjf2000
Copy link
Contributor Author

OK, but does it fixes anything? Or just helps with debugging?

Yes, this change has a practical effect. You see, we added kevent.ident:42 for EVFILT_USER on FreeBSD/NetBSD/macOS (eventfd on Linux, pipe on others) during the initialization process. When 42 is allocated for a network file descriptor by the kernel later on, it will result in a file descriptor collision in kqueue because re-adding an existing event will modify the parameters of the original event, and not result in a duplicate entry.

I mean that 0 is also an fd.

True, but 0 represents stdin and it's ideal for the kevent.ident for EVFILT_USER for two reasons:

  1. 0 is usually reserved for standard input, which means that it won't be allocated for a new file descriptor by the kernel unless explicitly instructed to do so under some extremely specific circumstances such as closing standard streams and opening a new file, redirecting standard streams via dup2, or inheritance for subprocesses, etc.
  2. Networking frameworks like netty, libevent, etc. normally won't go under those aforementioned circumstances.

@azat

@panjf2000
Copy link
Contributor Author

panjf2000 commented Mar 30, 2024

Besides, I think this NOTIFY_IDENT:42 might have something to do with those intermittent failures of FreeBSD CI jobs and this change should have fixed them.

@azat
Copy link
Member

azat commented Mar 30, 2024

Besides, I think this NOTIFY_IDENT:42 might have something to do with those intermittent failures of FreeBSD CI jobs and this change should have fixed them.

Interesting, can you provide a reproducer with raw kqueue maybe?

When 42 is allocated for a network file descriptor by the kernel later on

But don't we hit the same issue if will add an event for 0 fd?

Maybe what do you think about using INT_MAX for this purpose?

@panjf2000
Copy link
Contributor Author

panjf2000 commented Mar 30, 2024

Interesting, can you provide a reproducer with raw kqueue maybe?

Actually, I can't assure you that's the root cause, just take a shot :)

But don't we hit the same issue if will add an event for 0 fd?

I don't quite get you here, why would we add a 0 fd?

Maybe what do you think about using INT_MAX for this purpose?

I don't think this is a good idea given that event_io_map is an array and using INT_MAX will cause a big allocation waste.

@azat
Copy link
Member

azat commented Mar 30, 2024

I don't quite get you here, why would we add a 0 fd?

To monitor events on STDIN? I.e. event_new(base, STDIN_FILENO, EV_READ, cb, NULL)

I don't think this is a good idea given that event_io_map is an array and using INT_MAX will cause a big allocation waste.

Correct, but this fd will not be added to to evmap, only to kqueue for monitoring

@panjf2000
Copy link
Contributor Author

Correct, but this fd will not be added to to evmap, only to kqueue for monitoring

You're suggesting that we create a special branch in evmap_io_add_ that avoids calling evmap_make_space() as well as adding the INT_MAX fd into the "map"?

@azat
Copy link
Member

azat commented Mar 30, 2024

The special treatment is not needed, NOTIFY_IDENT is added only to kqueue, it does not leave the kqueue.c, so it is not added to evmap

@azat
Copy link
Member

azat commented Mar 30, 2024

Actually, I can't assure you that's the root cause, just take a shot :)

This sounds really interesting, I would love to see at the example that will show this problem!

@panjf2000
Copy link
Contributor Author

The special treatment is not needed, NOTIFY_IDENT is added only to kqueue, it does not leave the kqueue.c, so it is not added to evmap

Oh,right. I misread your comment. I think it's feasible.

@azat
Copy link
Member

azat commented Mar 30, 2024

And I would really want to understand is it a problem or not (I'm not sure about this actually), so before, can you please try to reproduce it, and then a) we can add a test and b) write a sensible comment for this constant and commit

@panjf2000
Copy link
Contributor Author

And I would really want to understand is it a problem or not (I'm not sure about this actually), so before, can you please try to reproduce it, and then a) we can add a test and b) write a sensible comment for this constant and commit

This can be a tricky case to reproduce, but I'll try.

@panjf2000
Copy link
Contributor Author

It turns out that unlike epoll, kqueue uses the (ident, filter) pair to identify a kevent, therefore NOTIFY_IDENT can be any value for EVFILT_USER. As a result, either retaining or changing it is okay, I added a new test anyway, but it's not mandatory, it can be discarded from this PR.

Demo code
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <string.h>

#define NOTIFY_IDENT 42

void *writer_thread(void *arg) {
    int *fd = (int *)arg;  // Get file descriptor from argument

    sleep(1); // Let things settle before writing
    const char *message = "Hi!";
    write(*fd, message, strlen(message));
    printf("Wrote data to pipe...\n");

    return NULL;
}

void *trigger_thread(void *arg) {
    int *fd = (int *)arg;  // Get file descriptor from argument

    struct kevent ev;
    struct timespec timeout = { 0, 0 };
    EV_SET(&ev, NOTIFY_IDENT, EVFILT_USER, 0, NOTE_TRIGGER, 0, NULL);

    sleep(1); // Let things settle before writing
    if (kevent(*fd, &ev, 1, NULL, 0, &timeout) == -1) {
        perror("Error triggering event");
        //exit(1);
	return NULL;
    }
    printf("Triggered user-level event...\n");

    return NULL;
}

int main() {
    int kq = kqueue();
    if (kq == -1) {
        perror("kqueue");
        exit(1);
    }
    printf("kqueue=%d\n", kq);

    int pipefd[2];
    if (pipe(pipefd) == -1) {
        perror("pipe");
        exit(1);
    }

    printf("pipefd[0]=%d, pipefd[1]=%d\n", pipefd[0], pipefd[1]);

    // Overwrite pipe read-end with fd 42
    if (dup2(pipefd[0], NOTIFY_IDENT) == -1) {
        perror("dup2");
        exit(1);
    }
    close(pipefd[0]);
    pipefd[0] = NOTIFY_IDENT;

    // Add the pipe read-end for monitoring read events
    struct kevent ev2;
    EV_SET(&ev2, NOTIFY_IDENT, EVFILT_READ, EV_ADD, 0, 0, NULL);
    kevent(kq, &ev2, 1, NULL, 0, NULL);

    // Initial registration for user-level events
    struct kevent ev1;
    EV_SET(&ev1, NOTIFY_IDENT, EVFILT_USER, EV_ADD | EV_CLEAR, 0, 0, NULL);
    kevent(kq, &ev1, 1, NULL, 0, NULL);

    // Writer thread
    pthread_t writer;
    pthread_create(&writer, NULL, writer_thread, &pipefd[1]); // Pass the fd of pipe write-end

    // Trigger thread
    pthread_t trigger;
    pthread_create(&trigger, NULL, trigger_thread, &kq);

    pthread_join(trigger, NULL);
    pthread_join(writer, NULL);

    // Blocking kevent for 10s if no events happen
    struct timespec timeout = { 10, 0 };
    struct kevent events[2];
    int n = kevent(kq, NULL, 0, events, 2, &timeout);
    printf("kevent returned %d events\n", n);
    for (int i = 0; i < n; i++) {
        printf("  ident: %ld, filter: %d, flags: %d, fflags: %d\n",
        events[i].ident, events[i].filter, events[i].flags, events[i].fflags);
    }

    close(pipefd[0]);
    close(pipefd[1]);
    close(kq);
    return 0;
}

Result:

kqueue=3
pipefd[0]=4, pipefd[1]=5
Triggered user-level event...
Wrote data to pipe...
kevent returned 2 events
  ident: 42, filter: -10, flags: 33, fflags: 0
  ident: 42, filter: -1, flags: 1, fflags: 0

@panjf2000 panjf2000 requested a review from azat April 1, 2024 10:41
@panjf2000 panjf2000 changed the title Change ident for EVFILT_USER to 0 to avoid fd collision Change ident for EVFILT_USER to 0 and add a test Apr 1, 2024
Copy link
Member

@azat azat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, can you please fix the build and also address review comments. Thanks!

@panjf2000 panjf2000 requested a review from azat April 8, 2024 03:25
@panjf2000
Copy link
Contributor Author

panjf2000 commented Apr 8, 2024

I kept getting this error on macOS m1 while building the new test:

Undefined symbols for architecture arm64:
  "_evthread_use_pthreads", referenced from:
      _main in test-kq-collision.c.o
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [bin/test-kq-collision] Error 1
make[1]: *** [CMakeFiles/test-kq-collision.dir/all] Error 2
make: *** [all] Error 2

I don't know what I'm missing, maybe you do? @azat

@panjf2000
Copy link
Contributor Author

panjf2000 commented Apr 11, 2024

Any clues? @azat

@panjf2000
Copy link
Contributor Author

Got a minute for this? @azat

@azat
Copy link
Member

azat commented Apr 15, 2024

Undefined symbols for architecture arm64:
"_evthread_use_pthreads", referenced from:

You need to link event_pthreads for your example.

@azat
Copy link
Member

azat commented Apr 17, 2024

Build is broken:

make[3]: *** No rule to make target 'test-kq-collision', needed by 'CMakeFiles/verify'.  Stop.

@panjf2000 panjf2000 requested a review from azat April 17, 2024 08:30
@panjf2000
Copy link
Contributor Author

Kindly ping @azat

Copy link
Member

@azat azat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the CI will pass it will be ready to land.

@panjf2000
Copy link
Contributor Author

Once the CI will pass it will be ready to land.

Thanks, but I don't think CI linux-autotools-job is happy about this. Should we be worried about that?

@azat

@azat
Copy link
Member

azat commented Apr 20, 2024

Definitely yes

@panjf2000
Copy link
Contributor Author

I've fixed the autotools failures. @azat

@panjf2000 panjf2000 requested a review from azat April 22, 2024 08:42
azat added 4 commits April 24, 2024 09:35
* upstream/master:
  Do not set TCP keepalive on Unix sockets
  Fix some comments
  Avoid calling read(2) on eventfd on each event-loop wakeup
@azat
Copy link
Member

azat commented Apr 24, 2024

Not completely:

2024-04-23T10:02:01.2817740Z ../test/test.sh: line 80: KQUEUE: command not found

Fixed. And I hope now test.sh is more robust

@azat
Copy link
Member

azat commented Apr 24, 2024

cmake was also broken, it simply does not run new test - fixed

@panjf2000
Copy link
Contributor Author

panjf2000 commented Apr 25, 2024

The CIs of libevent are extremely slow, I noticed that the regress tests took the most time, which makes the PR progress very inefficient, are there any improvements we can do to speed the tests up? @azat

@azat
Copy link
Member

azat commented Apr 25, 2024

The CIs of libevent are extremely slow

Yes, sadly, but this is a very know issue, especially for macos/windows, since github provides less machine time for public use

are there any improvements we can do to speed the tests up?

Yes, there is at least one thing that can be done #1452

Another thing is to setup own private workers, so that we will have more resources, but this is whole another story.

@panjf2000
Copy link
Contributor Author

I think we're good to go? @azat

@azat azat merged commit aef201a into libevent:master Apr 29, 2024
54 of 66 checks passed
@panjf2000 panjf2000 deleted the kqueue-ident branch April 29, 2024 06:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants