Skip to content
This repository was archived by the owner on Aug 16, 2023. It is now read-only.

Fix issue #121 #125

Merged
merged 4 commits into from
Oct 10, 2018
Merged

Conversation

Roger-luo
Copy link
Contributor

@Roger-luo Roger-luo commented Oct 5, 2018

The original method finds the minimum not the maximum version, because if there are both GLIBCXX_3.4.1 and GLIBCXX_3.4.24, then max_version will equal to the first one.

This should fix #121

@codecov-io
Copy link

codecov-io commented Oct 5, 2018

Codecov Report

Merging #125 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@          Coverage Diff           @@
##           master    #125   +/-   ##
======================================
  Coverage    88.4%   88.4%           
======================================
  Files           8       8           
  Lines         647     647           
======================================
  Hits          572     572           
  Misses         75      75
Impacted Files Coverage Δ
src/PlatformNames.jl 95.97% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00f4910...dd352e1. Read the comment docs.

@zhouyan
Copy link

zhouyan commented Oct 6, 2018

I don't think this fixed the original issue, and there's no logical error in the original while the new ones is actually wrong, it will return 3.4.25 for arbitrary library, for example, see the following session,

julia> hdl = dlopen("/usr/lib64/libc.so.6") # Not even libstdc++ at all
Ptr{Nothing} @0x00007fe4fa47a4d0

julia> max_version = v"3.4.0"
v"3.4.0"

julia> for minor_version in 1:26
               if dlsym_e(hdl, "GLIBCXX_3.4.$(minor_version)") == C_NULL
                   global max_version = VersionNumber("3.4.$(minor_version - 1)")
               end
           end

julia> max_version
v"3.4.25"

The root problem is that you can't test if the ABI symbol exists with dlsym for some GCC builds. Even though nm shows that symbol exists but it is not in the dynamic symbol table got by Libdl. I have been searching for sometime and cannot find a solution yet. The recommended way of testing ABI compatibility is at source level with the GLIBCXX* macros. For the shared library, it will depends on the .*@GLIBCXX.* symbols, which by default is exported if GCC was built with versioning, which is the default. I can't really see a reliable way to test GCC version this way.

My suggestion is that instead of generate an error, just return the minimum version required by Julia (GCC 4.7) if it is unsure which ABI it is.

@Roger-luo
Copy link
Contributor Author

Yeah, I was wrong...

@zhouyan
Copy link

zhouyan commented Oct 6, 2018

Here is an example showing the problem. It uses C and thus by pass any possible dll issues introduced by Julia.

$ nm /opt/fairtide/tools/gcc/lib64/libstdc++.so.6.0.25 | grep 'GLIBCXX_3.4.25'
0000000000000000 A GLIBCXX_3.4.25

This is our GCC build, it is of version 8.2

$ cat test.c
#include <dlfcn.h>
#include <stdio.h>

int main()
{

    void *handle = dlopen("/opt/fairtide/tools/gcc/lib64/libstdc++.so.6.0.25",  RTLD_LAZY|RTLD_DEEPBIND|RTLD_LOCAL);
    void *sym = dlsym(handle, "GLIBCXX_3.4.25");
    printf("%p", sym);

    return 0;
}

Here's the source of the C test program, it just open the library and look for that symbol, note that it is opened with the same flags as the Julia default flags for Linux

$ cc -o test test.c -ldl
$ ./test
(nil)

Here's the results.

@Roger-luo
Copy link
Contributor Author

I did find GLIBCXX_3.4.25, but the dlsym_e(hdl, "GLIBCXX_3.4.25") returns CNULL (which I though is not).

@zhouyan
Copy link

zhouyan commented Oct 6, 2018

I did find GLIBCXX_3.4.25, but the dlsym_e(hdl, "GLIBCXX_3.4.25") returns CNULL (which I though is not).

Yeah, that's the problem. dlsym_e is not the right way to test ABI. I have no clue what's the "right way" either.

@Roger-luo
Copy link
Contributor Author

Err... probably, we will need a constant build with Julia compiler itself, then we will be able to know which ABI Julia runtime is using.

@Roger-luo
Copy link
Contributor Author

Roger-luo commented Oct 6, 2018

I deleted the error and use a warning instead, so at least this will warn people there might be something wrong. (I guess BinaryProvider are mainly used in build.jl, so a warning is OK)

And as @zhouyan suggested, returns :gcc4 now.

@staticfloat
Copy link
Member

staticfloat commented Oct 10, 2018

The root problem is that you can't test if the ABI symbol exists with dlsym for some GCC builds.

This is true, but it's because of a subtle Julia bug (fixed on master) and weird choice by the glibc developers to use NULL-valued symbols as ABI version markers. The proper way to check for the existence of a null-valued symbol is to use dlerror() after dlsym() to determine if there was an error during lookup. Check out the second paragraph in "Description" of the dlsym() man page.

For the shared library, it will depends on the .@glibcxx. symbols, which by default is exported if GCC was built with versioning, which is the default.

This symbol versioning approach is actually what I first implemented, (see the chunk of code immediately above the section changed here) but that requires more advanced ELF file introspection which would require more code (e.g. the ObjectFile package) which is no good.

Instead of returning :gcc4 here, I think we should return :gcc_any, as it's better to fail gracefully. I've made those changes and also tweaked the error message a bit; thanks for the diagnosis and I'm sorry that it took me so long to notice this, please do not be shy to ping me via @staticfloat in these issues!

@staticfloat staticfloat merged commit cb403e6 into JuliaPackaging:master Oct 10, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
4 participants