|
| 1 | +# Load MIMIC-IV into a PostgreSQL database |
| 2 | + |
| 3 | +The scripts in this folder create the schema for MIMIC-IV and load the data into the appropriate tables for PostgreSQL v10+. |
| 4 | + |
| 5 | +<!-- |
| 6 | +* You can follow the tutorial to run each file individually. Windows users can follow along [here](https://mimic.physionet.org/tutorials/install-mimic-locally-windows/), while *nix/Mac OS X users can follow along [here](https://mimic.physionet.org/tutorials/install-mimic-locally-ubuntu/) |
| 7 | +
|
| 8 | +If following the tutorials, be sure to download the scripts locally and the MIMIC-III files locally. If you choose the makefile approach, see the below section. |
| 9 | +
|
| 10 | +--> |
| 11 | + |
| 12 | +First ensure that Postgres is running on your computer. For installation instructions, see: [http://www.postgresql.org/download/](http://www.postgresql.org/download/) |
| 13 | + |
| 14 | +Once Postgres is installed, clone the [mimic-iv](https://github.com/MIT-LCP/mimic-iv) repository into a local directory. We only need the contents of this directory, but it's useful to have the repository locally. You can clone the repository using the following command: |
| 15 | + |
| 16 | +``` bash |
| 17 | +git clone https://github.com/MIT-LCP/mimic-iv.git |
| 18 | +``` |
| 19 | + |
| 20 | +Change to the `buildmimic/postgres/` directory. Create the schemas and tables with the following psql command. **This will delete any data present in the schemas.** |
| 21 | + |
| 22 | +```sh |
| 23 | +psql -f create.sql |
| 24 | +``` |
| 25 | + |
| 26 | +Afterwards, we need to load the MIMIC-IV files into the database. To do so, we'll specify the location of the local CSV files (compressed or uncompressed). |
| 27 | +Note that this assumes the folder structure is as follows: |
| 28 | + |
| 29 | +``` |
| 30 | +mimic_data_dir |
| 31 | + core |
| 32 | + admissions.csv |
| 33 | + ... |
| 34 | + hosp |
| 35 | + icu |
| 36 | +``` |
| 37 | + |
| 38 | +If you have compressed files (.csv.gz), you can leave them compressed, and use the `load_gz.sql` script instead. |
| 39 | +Once you have verified your data is stored in this structure, run: |
| 40 | + |
| 41 | +```sh |
| 42 | +psql -v ON_ERROR_STOP=1 -v mimic_data_dir=<INSERT MIMIC FILE PATH HERE> -f load.sql |
| 43 | +``` |
| 44 | + |
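Before running the load step, it can help to confirm that the data directory matches the layout above and to pick the matching script. The following is a minimal sketch, not part of the repository: `choose_load_script` is a hypothetical helper, it uses `core/admissions.csv` (or its `.csv.gz` form) as a representative file, and it assumes `load_gz.sql` accepts the same `mimic_data_dir` variable as `load.sql`.

```sh
# Hypothetical helper (not part of the repository): print the load script
# that matches the files found in the given data directory.
choose_load_script() {
    dir=$1
    # All three module folders from the layout above must be present.
    for module in core hosp icu; do
        if [ ! -d "$dir/$module" ]; then
            echo "missing folder: $dir/$module" >&2
            return 1
        fi
    done
    # Use admissions in core as a representative file to detect compression.
    # If both forms exist, the compressed file takes precedence.
    if [ -f "$dir/core/admissions.csv.gz" ]; then
        echo "load_gz.sql"
    elif [ -f "$dir/core/admissions.csv" ]; then
        echo "load.sql"
    else
        echo "no admissions.csv or admissions.csv.gz in $dir/core" >&2
        return 1
    fi
}
```

One possible use is `script=$(choose_load_script "$datadir") && psql -v ON_ERROR_STOP=1 -v mimic_data_dir="$datadir" -f "$script"`, where `$datadir` is a placeholder for your own path.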
| 45 | + |
| 46 | +## Troubleshooting / FAQ |
| 47 | + |
| 48 | +### Specify a database for installation |
| 49 | + |
| 50 | +Optionally, you can specify the database name with the `-d` argument. First, you must create the database if it does not already exist: |
| 51 | + |
| 52 | +```sh |
| 53 | +createdb mimiciv |
| 54 | +``` |
| 55 | + |
| 56 | +After the database exists, the schema and tables can be created under this database as follows: |
| 57 | + |
| 58 | +```sh |
| 59 | +psql -d mimiciv -f create.sql |
| 60 | +``` |
| 61 | + |
| 62 | +Finally, loading the data into this database requires specifying the database name with `-d mimiciv` again: |
| 63 | + |
| 64 | +```sh |
| 65 | +psql -d mimiciv -v ON_ERROR_STOP=1 -v mimic_data_dir=<INSERT MIMIC FILE PATH HERE> -f load.sql |
| 66 | +``` |
| 67 | + |
| 68 | +### Error creating schema |
| 69 | + |
| 70 | +```sql |
| 71 | +psql:postgres_create_tables.sql:12: ERROR: syntax error at or near "NOT" |
| 72 | +LINE 1: CREATE SCHEMA IF NOT EXISTS mimiciii; |
| 73 | +``` |
| 74 | + |
| 75 | +The `IF NOT EXISTS` syntax was introduced in PostgreSQL 9.3. Make sure you are running a recent PostgreSQL version (the scripts in this folder target v10+). While one option is to modify the code to function under earlier versions, we highly recommend upgrading, as most of the code in this repository uses materialized views, which are also unavailable before PostgreSQL 9.3. |
| 76 | + |
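A quick way to rule out this error is to check the installed version before running the scripts. The sketch below is illustrative only: `pg_major_version` and `pg_is_supported` are hypothetical helpers that parse the text printed by `psql --version`. Note that this reports the client version; the server version can differ and can be checked with `SHOW server_version;` inside psql.

```sh
# Hypothetical helper: extract the major version from `psql --version`
# output, e.g. "psql (PostgreSQL) 10.23" gives 10.
pg_major_version() {
    printf '%s\n' "$1" | sed -E 's/^[^0-9]*([0-9]+).*/\1/'
}

# Hypothetical helper: succeed only if the version string reports
# PostgreSQL 10 or newer (the minimum targeted by these scripts).
pg_is_supported() {
    major=$(pg_major_version "$1")
    [ "$major" -ge 10 ]
}
```

For example, `pg_is_supported "$(psql --version)" || echo "PostgreSQL 10+ is required"` prints a warning on older installations.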
| 77 | +### Peer authentication failed |
| 78 | + |
| 79 | +If you encounter the following error during `make mimic-build`: |
| 80 | + |
| 81 | +```bash |
| 82 | +psql "dbname=mimic user=postgres options=--search_path=mimiciii" -v ON_ERROR_STOP=1 -f postgres_create_tables$(psql --version | perl -lne 'print "_pg10" if / 10.\d+/').sql |
| 83 | +psql: FATAL: Peer authentication failed for user "postgres" |
| 84 | +Makefile:110: recipe for target 'mimic-build' failed |
| 85 | +make: *** [mimic-build] Error 2 |
| 86 | +``` |
| 87 | + |
| 88 | +... this indicates that the database exists, but the script failed to log in as the user `postgres`. By default, PostgreSQL installs itself with a user called `postgres` and only allows "peer" authentication: logging in with a database username that matches your operating system username. Consequently, a common issue is being unable to access the database as the default `postgres` user. |
| 89 | + |
| 90 | +There are many possible solutions, but the two easiest are (1) allowing `postgres` to log in via password authentication or (2) creating the database with a username that matches your operating system username. |
| 91 | + |
| 92 | +#### (1) Allow password authentication |
| 93 | + |
| 94 | +Locate your `pg_hba.conf` file and change the access method from "peer" to "md5" (md5 is password authentication). For example, using the text editor `nano`: |
| 95 | + |
| 96 | +```bash |
| 97 | +sudo nano /etc/postgresql/10/main/pg_hba.conf |
| 98 | +``` |
| 99 | + |
| 100 | +(The path may differ depending on your PostgreSQL version.) Change `local all postgres peer` to `local all postgres md5`. |
| 101 | + |
| 102 | +Restart the PostgreSQL service with: |
| 103 | + |
| 104 | +```bash |
| 105 | +sudo service postgresql restart |
| 106 | +``` |
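If you prefer to preview the change before editing the file by hand, the sketch below prints a modified copy without touching the original. `hba_peer_to_md5` is a hypothetical helper, and it assumes the entry is written exactly as `local all postgres peer` with whitespace between the fields. Keep in mind that md5 authentication only works once the `postgres` role has a password set.

```sh
# Hypothetical helper: print the contents of a pg_hba.conf file with the
# "local all postgres peer" entry switched to md5. The file itself is
# not modified, so the result can be reviewed first.
hba_peer_to_md5() {
    sed -E 's/^(local[[:space:]]+all[[:space:]]+postgres[[:space:]]+)peer[[:space:]]*$/\1md5/' "$1"
}
```

For instance, `hba_peer_to_md5 /etc/postgresql/10/main/pg_hba.conf | diff /etc/postgresql/10/main/pg_hba.conf -` shows the line that would change. If you are unsure where the file lives, `sudo -u postgres psql -c 'SHOW hba_file;'` reports its path.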
| 107 | + |
| 108 | +#### (2) Use your operating system username |
| 109 | + |
| 110 | +Set `DBUSER` to your operating system username. For example, on Ubuntu you can use the `$USER` environment variable directly: |
| 111 | + |
| 112 | +`make create-user mimic-gz datadir="$datadir" DBUSER="$USER"` |
| 113 | + |
| 114 | +### NOTICE |
| 115 | + |
| 116 | +```sql |
| 117 | +NOTICE: materialized view "XXXXXX" does not exist, skipping |
| 118 | +``` |
| 119 | + |
| 120 | +This is normal. By default, the script attempts to drop each object (such as a table or materialized view) before rebuilding it. If it cannot find the object to drop, it outputs a notice letting the user know. |
| 121 | + |
| 122 | +### Stuck on copy |
| 123 | + |
| 124 | +Many users report that the scripts get stuck at the following point: |
| 125 | + |
| 126 | +```sh |
| 127 | +COPY 58976 |
| 128 | +COPY 34499 |
| 129 | +COPY 7567 |
| 130 | +``` |
| 131 | + |
| 132 | +This is expected. The 4th table is CHARTEVENTS, and this table can take many hours to load. Give it time, and ensure that the computer does not automatically hibernate during this time. |
| 133 | + |
| 134 | +Also note that eventually the 4th line will read `COPY 0`. This is expected; see https://github.com/MIT-LCP/mimic-code/issues/182 |
| 135 | + |
| 136 | +## Older versions of PostgreSQL |
| 137 | + |
| 138 | +If you have an older version of PostgreSQL, it is still possible to load MIMIC, but modifications to the scripts are required. In particular, the scripts use declarative partitioning for larger tables to speed up queries. To read more about [declarative partitioning, see the PostgreSQL documentation](https://www.postgresql.org/docs/10/static/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE). You can remove declarative partitioning by modifying the create script and removing it for each affected table. For example, chartevents uses declarative partitioning, and thus the create.sql script creates many partitions for chartevents: chartevents_01, chartevents_02, ..., etc. Replacing these with a single create statement for chartevents will make the script compatible with older versions of PostgreSQL. |
| 139 | + |
| 140 | +### Other |
| 141 | + |
| 142 | +Please see the [issues page](https://github.com/MIT-LCP/mimic-iv/issues) to discuss other issues you may be having. |