Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for bitcode generation on meson builds #103

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

darkfronza
Copy link
Collaborator

PG-0

Description

This PR adds support for generating llvm bitcode files when building PostgreSQL with meson (-Dllvm=enabled).

It handles both static C source files and generated source files as well (flex, bison, etc).

Added a new section on root meson.build to handle bitcode generation.

This commit handles bitcode generation of src/backend code, a meson
generator object is defined (`llvm_gen`) and used to generate bitcode for
backend sources listed in the `backend_sources` variable.

Generated backend sources which don't require special compilation flags
are stored in the `bc_generated_sources` variable.

For generated backend sources which require special compilation flags,
a separate variable is defined `bc_gen_custom`, it's a list storing the
results of invoking the bitcode generator for each generated backend
sources, here is an example snippet from src/backend/parser/meson.build:

```
if llvm.found()
  llvm_parser_cflags = [
    '-I@BUILD_ROOT@/src/backend/parser',
    '-I@SOURCE_ROOT@/src/backend/parser',
  ]

  llvm_gen_parser = generator(llvm_irgen_command,
    arguments: llvm_irgen_args + bitcode_cflags + llvm_parser_cflags,
    depends: parser,
    depfile: '@basename@.c.bc.d',
    output: '@Plainname@.bc',
  )

  bc_gen_custom += llvm_gen_parser.process(bc_parser_sources)
endif
```

On top we can see custom CFLAGS to be passed for this specific
generator `llvm_parser_cflags`, then the custom generator is invoked to
process the generated sources `bc_parser_sources`, the result is appended
to the `bc_gen_custom` list which will then be used by the main
meson.build file to link all the bitcode files together and create the
bitcode index file.
@it-percona-cla
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copy link
Collaborator

@dutow dutow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a detailed review yet, just easy to spot issues after a first look.

@@ -14,4 +14,8 @@ auth_delay = shared_module('auth_delay',
auth_delay_sources,
kwargs: contrib_mod_args,
)
contrib_targets += auth_delay
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and everywhere else: why only add it to contrib_tarets when llvm.found()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidentally removed, fixup: f26737a

@@ -195,4 +289,4 @@ subdir('jit/llvm')
subdir('replication/libpqwalreceiver')
subdir('replication/pgoutput')
subdir('snowball')
subdir('utils/mb/conversion_procs')
subdir('utils/mb/conversion_procs')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidental end of file endline change, in multiple files

@@ -17,6 +17,10 @@ vacuumlo = executable('vacuumlo',
)
contrib_targets += vacuumlo

# if llvm.found()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why commented out?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to remove, I noticed this target doesn't have bitcode generated in autoconf setup, it links to frontend libraries and seems not required to have bitcode.
fixup: 578caa8

@@ -20,7 +20,21 @@ postgres_fdw = shared_module('postgres_fdw',
'dependencies': contrib_mod_args['dependencies'] + [libpq],
},
)
contrib_targets += postgres_fdw
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? This time it is completely removed, even if llvm.found()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidentally removed, fixup: f26737a

@@ -17,6 +17,10 @@ oid2name = executable('oid2name',
)
contrib_targets += oid2name

# if llvm.found()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented out code

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to remove, I noticed this target doesn't have bitcode generated in autoconf setup, it links to frontend libraries and seems not required to have bitcode:

oid2name = executable('oid2name',
  oid2name_sources,
  dependencies: [frontend_code, libpq],
  kwargs: default_bin_args,
)

fixup: f59f98e

bitcode_cflags += get_option('c_args')
bitcode_cflags += cppflags

# XXX: Worth improving on the logic to find directories here
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still todo?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left all the original comments from Andres for a review with the team, any suggestions for improvements are welcome.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that all of these should be investigated so we udnerstand why he added these and ideally we should fix all of them.

'install_dir': dir_lib_pkg,
}

# XXX: Need to determine proper version of the function cflags for clang
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still todo?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left all the original comments from Andres for a review with the team, any suggestions for improvements are welcome.

pg_ext_cflags += cflags

# Directories for extensions to install into
# XXX: more needed
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still todo?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left all the original comments from Andres for a review with the team, any suggestions for improvements are welcome.

# referenced on some platforms, via mod_link_with_dir
pg_ext_vars += 'bindir=${prefix}/@0@'.format(dir_bin)

# XXX: Define variables making it easy to define tests, too
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still todo?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left all the original comments from Andres for a review with the team, any suggestions for improvements are welcome.

@@ -23,7 +23,24 @@ bool_plperl = shared_module('bool_plperl',
'build_rpath': '@0@/CORE'.format(archlibexp),
},
)
contrib_targets += bool_plperl
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like you accidentally deleted this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's adjusted/fixed on the last commit e4cb123

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not as far as I can see.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, fixup that completely restores contrib_targets even when llvm.found() == false
f26737a

@dutow
Copy link
Collaborator

dutow commented Mar 5, 2025

Thanks, now it works with a clean build!

I looked into the output of this branch, and I see two problems in it.

The second might be an already existing difference that nobody noticed yet, but it's worth a look.

And the first is actually three issues internally.

  1. I still see clear differences between make and meson:
  • Make generates 940 files
  • Meson generates 924

Also, the meson build generates one single index file (posgres.index.bc), while the make build generates 49 index.bc files, one for every module.

This also means that the meson build actually generates more bc files than the make build, because the above 940/924 also includes the index files. 940-48 is less than 900. Do we generate more than necessary, or is there a bug in the make build and it's missing something?

The directory structure is also different, meson adds an additional "x86_64-linux-gnu" directory to everything, but also still places many files directly into the top directory instead of the proper directory structure. It seems like the issue is mainly with the contrib modules. That can cause issues in the future is we have any filename collisions.

I'm not saying that the directory structure has to be exactly the same - that's a nice to have thing, but it should work without it. But we should follow some directory structure to avoid file collisions, and the separate index files are also needed.

  1. Looking randomly at a few disassembled bc files, the difference I see is that make uses absolute paths for debuginfo, while meson uses relative:
1,2c1,2
< ; ModuleID = 'genam.meson.bc'
< source_filename = "../src/backend/access/index/genam.c"
---
> ; ModuleID = 'genam.make.bc'
> source_filename = "/home/dutow/work/postgres/makebuild/../src/backend/access/index/genam.c"
150c150
< @.str.1 = private unnamed_addr constant [36 x i8] c"../src/backend/access/index/genam.c\00", align 1
---
> @.str.1 = private unnamed_addr constant [72 x i8] c"/home/dutow/work/postgres/makebuild/../src/backend/access/index/genam.c\00", align 1
180c180

And then some error handling related functions use different array sizes, but that could be related to this - absolute pathnames are longer, and require bigger arrays. This could also mean that the difference is deeper in the make/meson build somewhere, and nobody noticed it yet, but it can also be that we invoke llvm during bc generation differently.

This is a corner case, but might be important if somebody wants to debug the JIT, so it is worth a look.

@darkfronza
Copy link
Collaborator Author

@jeltz @dutow this fixup adds two bitcode files for contrib modules that were missing (dblink.bc and seg.bc)
d87d1da

Added a new variable in root meson.build for tracking the list of source
files in contrib for further bitcode generation: `llvm_contrib_sources`.

This list is used for non generated sources, which share common
compilation flags for contrib modules.

For contrib modules that require specific compilation flags, a separate
variable was added on root meson.build: `bc_gen_custom_contrib`.

It must be noted that there is no standard way of reusing meson
dependency (`dep`) objects for custom target / generator as a mean of
getting proper compilation flags, those must be included manually.

As an example, for generating bitcode for contrib/hstore_plpython, we
can't reuse the `python3_dep` object in the generator, so one way to get
proper include directory was querying the `python3_inst` object like
this: pyinc = python3_inst.get_variable('INCLUDEPY'), which extracts the
variable name using python sysinfo.
Add the bitcode generator directly in the place where fmgrtab.c is
generated, this avoids iterating the generated backend sources in the
main meson.build to convert the relative file path to absolute before
bitcode generator, thus simplifying logic.

Note that the dependency on the fmgrtab generator is implicit in the
`llvm_gen_fmgrtab.process(fmgrtab_target[2])` call, as fmgrtab_target[2]
is a custom_tgt.
Meson generators yield a warning whenever the "depends" points to a
target with more than one output, since we already specify the
dependencies in the gen.process method, the "depends" can be removed
from the generator creation.
Make `llvm_contrib_sources` variable a dictionary, which tracks contrib
module names along with their related source files, then iterate over
each module generating bitcode files from it and create an index file
using the module name along with the related bitcode files.

Care was taken to generate contrib bitcode and index files following the
same directory structure from autoconf/make (applies affer install-world
only).

This commit handles only contrib modules which don't have dynamically
generated source files, following commits should add support to it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants