Implement robust data security without sacrificing performance or usability
CipherStash Proxy provides a transparent proxy to your existing Postgres database.
Proxy:
- Automatically encrypts and decrypts the columns you specify
- Supports most query types over encrypted values
- Runs in a Docker container
- Is written in Rust and uses a formal type system for SQL mapping
- Works with CipherStash ZeroKMS and offers up to 14x the performance of AWS KMS
Behind the scenes, it uses the Encrypt Query Language to index and search encrypted data.
- Getting started
- How-to
- Reference
- More info
Important
Prerequisites: Before you start you need to have this software installed:
- Docker — see Docker's documentation for installing
Get up and running in local dev in < 5 minutes:
# Clone the repo
git clone https://github.com/cipherstash/proxy
cd proxy
# Install the CipherStash CLI
## macOS
brew install cipherstash/tap/stash
## Linux
## Download from https://github.com/cipherstash/cli-releases/releases/latest
# Setup your CipherStash configuration
stash setup --proxy
# ⬆️ this outputs creds to .env.proxy.docker
# Start the containers
docker compose up
This will start a PostgreSQL database on localhost:5432
, and CipherStash Proxy on localhost:6432
.
There's an example table called users
that you can use to start inserting and querying encrypted data with.
Note
In this example table we've chosen users' email, date of birth, and salary as examples of the kind of sensitive data that you might want to protect with encryption.
Now let's connect to the Proxy via psql
and run some queries:
docker compose exec proxy psql postgres://cipherstash:3ncryp7@localhost:6432/cipherstash
This establishes an interactive session with the database, via CipherStash Proxy.
Now insert and read some data via Proxy:
INSERT INTO users (encrypted_email, encrypted_dob, encrypted_salary) VALUES ('[email protected]', '1970-01-01', '100');
SELECT encrypted_email, encrypted_dob, encrypted_salary FROM users;
The INSERT
statement inserts a record into the users
table, and the SELECT
statement reads the same record back.
Notice that it looks like nothing happened: the data in the INSERT
was unencrypted, and the data in the SELECT
is also unencrypted.
Now let's connect to the database directly via psql
and see what the data actually looks like behind the scenes:
docker compose exec proxy psql postgres://cipherstash:3ncryp7@postgres:5432/cipherstash
This establishes an interactive session directly with the database (note the change of host to postgres
and port to 5432
).
Now on this direct psql
session, query the database directly:
SELECT encrypted_email, encrypted_dob, encrypted_salary FROM users;
You'll see the output is much larger, because the SELECT
returns the raw encrypted data.
The data is transparently encrypted and decrypted by Proxy in the INSERT
and SELECT
statements.
In your psql
connection to Proxy:
docker compose exec proxy psql postgres://cipherstash:3ncryp7@localhost:6432/cipherstash
Update the data we inserted in Step 1, and read it back:
UPDATE users SET encrypted_dob = '1978-02-01' WHERE encrypted_email = '[email protected]';
SELECT encrypted_dob FROM users WHERE encrypted_email = '[email protected]';
In the UPDATE
statement, the =
comparison operation in the WHERE
clause is evaluated against encrypted data.
In the SELECT
statement, the encrypted_email
value is transparently encrypted by Proxy, and compared in the database against the stored encrypted email value.
In the SELECT
statement, the SELECT
returns 1978-02-01
.
Back on the psql
session connected directly to the database, verify the data is encrypted:
SELECT encrypted_email, encrypted_dob, encrypted_salary FROM users;
This SELECT
shows the raw encrypted data — no plaintext to see.
In your psql
connection to Proxy:
docker compose exec proxy psql postgres://cipherstash:3ncryp7@localhost:6432/cipherstash
Insert more records via Proxy, and query by email:
INSERT INTO users (encrypted_email, encrypted_dob, encrypted_salary) VALUES ('[email protected]', '1991-03-06', '10');
INSERT INTO users (encrypted_email, encrypted_dob, encrypted_salary) VALUES ('[email protected]', '2005-12-30', '1000');
SELECT encrypted_email, encrypted_dob, encrypted_salary FROM users WHERE encrypted_salary <= 100;
In the INSERT
statement, the salary value is transparently encrypted by Proxy, and stored in the database in encrypted form.
In the SELECT
statement, the encrypted_salary
value is transparently encrypted and compared in the database against the stored encrypted salary value.
In the SELECT
statement, the <=
comparison operation in the WHERE
clause is evaluated against encrypted data.
In the SELECT
statement, the SELECT
returns alice
and bob
, but not carol
.
Query users
by email:
SELECT encrypted_email, encrypted_dob, encrypted_salary FROM users WHERE encrypted_email LIKE 'alice';
The literal string alice
is transparently encrypted by Proxy, and compared in the database against the stored encrypted date value.
The LIKE
comparison operation is evaluated against encrypted data.
The SELECT
will only return alice
.
Finally, query users
by date:
SELECT encrypted_email, encrypted_dob, encrypted_salary FROM users WHERE encrypted_dob > '2000-01-01' ;
The literal date 2000-01-01
is transparently encrypted by Proxy, and compared in the database against the stored encrypted date value.
The >
comparison operation is evaluated against encrypted data.
The SELECT
will only return carol
.
Back on the psql
session connected directly to the database, verify the data is encrypted:
SELECT encrypted_email, encrypted_dob, encrypted_salary FROM users;
This SELECT
shows the raw encrypted data, no plaintext to see.
This demonstrates the power of CipherStash Proxy:
- Completely transparent encryption of sensitive data in PostgreSQL
- All data remains searchable, while being protected with non-deterministic AES-256-GCM encryption
- Zero changes required to your application's database queries
This section contains how-to documentation for installing, configuring, and running CipherStash Proxy.
CipherStash Proxy is available as a container image on Docker Hub that can be deployed locally, in CI/CD, through to production.
The easiest way to start using CipherStash Proxy with your application is by adding a container to your application's docker-compose.yml
.
The following is an example of what adding CipherStash Proxy to your app's docker-compose.yml
might look like:
services:
app:
# Your Postgres container config
db:
# Your Postgres container config
proxy:
image: cipherstash/proxy:latest
container_name: proxy
ports:
- 6432:6432
- 9930:9930
environment:
# Hostname of the Postgres database server connections will be proxied to
- CS_DATABASE__HOST=${CS_DATABASE__HOST}
# Port of the Postgres database server connections will be proxied to
- CS_DATABASE__PORT=${CS_DATABASE__PORT}
# Username of the Postgres database server connections will be proxied to
- CS_DATABASE__USERNAME=${CS_DATABASE__USERNAME}
# Password of the Postgres database server connections will be proxied to
- CS_DATABASE__PASSWORD=${CS_DATABASE__PASSWORD}
# The database name on the Postgres database server connections will be proxied to
- CS_DATABASE__NAME=${CS_DATABASE__NAME}
# The CipherStash workspace ID for making requests for encryption keys
- CS_WORKSPACE_ID=${CS_WORKSPACE_ID}
# The CipherStash client access key for making requests for encryption keys
- CS_CLIENT_ACCESS_KEY=${CS_CLIENT_ACCESS_KEY}
# The CipherStash dataset ID for generating and retrieving encryption keys
- CS_DEFAULT_KEYSET_ID=${CS_DEFAULT_KEYSET_ID}
# The CipherStash client ID used to programmatically access a dataset
- CS_CLIENT_ID=${CS_CLIENT_ID}
# The CipherStash client key used to programmatically access a dataset
- CS_CLIENT_KEY=${CS_CLIENT_KEY}
# Toggle Prometheus exporter for CipherStash Proxy operations
- CS_PROMETHEUS__ENABLED=${CS_PROMETHEUS__ENABLED:-true}
For a fully-working example, go to docker-compose.yml
.
Follow the steps in Getting started to see it in action.
Once you have set up a docker-compose.yml
, start the Proxy container:
docker compose up
Connect your PostgreSQL client to Proxy on TCP 6432. Point Prometheus to scrape metrics on TCP 9930.
To run, CipherStash Proxy needs to know:
- What port to run on
- How to connect to the target PostgreSQL database
- Secrets to authenticate to CipherStash
There are two ways to configure Proxy:
- With environment variables that Proxy looks up on startup
- With a TOML file that Proxy reads on startup
Proxy's configuration loading order of preference is:
- If
cipherstash-proxy.toml
is present in the current working directory, Proxy will read its config from that file - If
cipherstash-proxy.toml
is not present, Proxy will look up environment variables to configure itself - If both
cipherstash-proxy.toml
and environment variables are present, Proxy will usecipherstash-proxy.toml
as the base configuration, and override it with any environment variables that are set
See Proxy config options for all the available options.
If you are configuring Proxy with environment variables, these are the minimum environment variables required to run Proxy:
CS_DATABASE__NAME
CS_DATABASE__USERNAME
CS_DATABASE__PASSWORD
CS_WORKSPACE_ID
CS_CLIENT_ACCESS_KEY
CS_DEFAULT_KEYSET_ID
CS_CLIENT_ID
CS_CLIENT_KEY
Read the full list of environment variables and what they do in the reference documentation.
If you are configuring Proxy with a cipherstash-proxy.toml
file, these are the minimum values required to run Proxy:
[database]
name = "cipherstash"
username = "cipherstash"
password = "password"
[auth]
workspace_id = "cipherstash-workspace-id"
client_access_key = "cipherstash-client-access-key"
[encrypt]
default_keyset_id = "cipherstash-default-keyset-id"
client_id = "cipherstash-client-id"
client_key = "cipherstash-client-key"
Read the full list of configuration options and what they do in the reference documentation.
TODO: Add instructions for running Proxy locally
Under the hood, Proxy uses CipherStash Encrypt Query Language to index and search encrypted data.
When you start the Proxy container, you can install EQL by setting the CS_DATABASE__INSTALL_EQL
environment variable:
CS_DATABASE__INSTALL_EQL=true
This will install the version of EQL bundled with the Proxy container. The version of EQL bundled with the Proxy container is tested to work with that version of Proxy.
If you are following the getting started guide above, EQL is automatically installed for you. You can also install EQL by running the installation script as a database migration in your application.
Once you have installed EQL, you can see what version is installed by querying the database:
SELECT cs_eql_version();
This will output the version of EQL installed.
In your existing PostgreSQL database, you store your data in tables and columns.
Those columns have types like integer
, text
, timestamp
, and boolean
.
When storing encrypted data in PostgreSQL with Proxy, you use a special column type called cs_encrypted_v1
, which is provided by EQL.
cs_encrypted_v1
is a container column type that can be used for any type of encrypted data you want to store or search, whether they are numbers (int
, small_int
, big_int
), text (text
), dates and times (date
), or booleans (boolean
).
Create a table with an encrypted column for email
:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email cs_encrypted_v1
)
This creates a users
table with two columns:
id
, an autoincrementing integer column that is the primary key for the recordemail
, acs_encrypted_v1
column
There are important differences between the plaintext columns you've traditionally used in PostgreSQL and encrypted columns with CipherStash Proxy:
- Plaintext columns can be searched if they don't have an index, albeit with the performance cost of a full table scan.
- Encrypted columns cannot be searched without an encrypted index, and the encrypted indexes you define determine what kind of searches you can do on encrypted data.
In the previous step we created a table with an encrypted column, but without any encrypted indexes.
Now you can add an encrypted index for that encrypted column:
SELECT cs_add_index_v1(
'users',
'email',
'unique',
'text'
);
This statement adds a unique
index for the email
column in the users
table, which has an underlying data type of text
.
unique
indexes are used to find records with columns with unique values, like with the =
operator.
There are two other types of encrypted indexes you can use on text
data:
SELECT cs_add_index_v1(
'users',
'email',
'match',
'text'
);
SELECT cs_add_index_v1(
'users',
'email',
'ore',
'text'
);
The first SQL statement adds a match
index, which is used for partial matches with LIKE
.
The second SQL statement adds an ore
index, which is used for ordering with ORDER BY
.
Now that the indexes has been added, you must activate them:
SELECT cs_encrypt_v1();
SELECT cs_activate_v1();
This loads and activates the encrypted indexes.
You must run the cs_encrypt_v1()
and cs_activate_v1()
functions after any modifications to the encrypted indexes.
![IMPORTANT] Adding, updating, or deleting encrypted indexes on columns that already contain encrypted data will not re-index that data. To use the new indexes, you must
SELECT
the data out of the column, andUPDATE
it again.
To learn how to use encrypted indexes for other encrypted data types like text
, int
, boolean
, date
, and jsonb
, see the EQL documentation.
When deploying CipherStash Proxy into production environments with real data, we recommend that you apply these database schema changes with the normal tools and process you use for making changes to your database schema.
To see more examples of how to modify your database schema, check out the example schema from Getting started.
CipherStash Proxy includes an encrypt
tool – a CLI application to encrypt existing data, or to apply index changes after changes to the encryption configuration of a protected database.
Encrypt the source
column data in table
into the specified encrypted target
column.
The encrypt
tool connects to CipherStash Proxy using the cipherstash.toml
configuration or ENV
variables.
cipherstash-proxy encrypt [OPTIONS] --table <TABLE> --columns <SOURCE_COLUMN=TARGET_COLUMN>...
At a high-level, the process for encrypting a column in the database is:
- Add a new encrypted destination column with the appropriate encryption configuration.
- Using CipherStash Proxy to process:
- Select from the original plaintext column.
- Update the encrpted column to set the plaintext value.
- Drop the original plaintext column.
- Rename the encrypted column to the original plaintext column name.
The CipherStash Proxy encrypt
tool automates the data process to encrypt one or more columns in a table.
Updates are executed in batches of 100 records (and the batch_size
is configurable).
The process is idempotent and can be run repeatedly.
The CipherStash Proxy encrypt
tool reuses the CipherStash Proxy configuration for the Proxy connection details.
This configuration includes database server host, port, username, password, and database name.
See encrypt
tool config options for the available options.
Given a running instance of CipherStash Proxy and a users
table with:
id
– a primary key columnemail
– a source plaintext columnencrypted_email
– a destination column configured to be encrypted text.
Encrypt email
into encrypted_email
:
cipherstash-proxy encrypt --table users --columns email=encrypted_email
Specify the primary key column:
cipherstash-proxy encrypt --table users --columns email=encrypted_email --primary-key user_id
Specify multiple primary key columns (compound primary key):
cipherstash-proxy encrypt --table users --columns email=encrypted_email --primary-key user_id tenant_id
This section contains reference documentation for configuring CipherStash Proxy and its features.
You can configure CipherStash Proxy with a config file, enviroment variables, or a combination of the two – see Configuring Proxy for instructions.
The following are all the configuration options available for Proxy, with their equivalent environment variables:
[server]
# Proxy host address
# Optional
# Default: `0.0.0.0`
# Env: CS_SERVER__HOST
host = "0.0.0.0"
# Proxy host posgt
# Optional
# Default: `6432`
# Env: CS_SERVER__PORT
port = "6432"
# Enforce TLS connections
# Optional
# Default: `false`
# Env: CS_SERVER__REQUIRE_TLS
require_tls = "false"
# Shutdown timeout in ms
# Sets how long to wait for connections to drain on shutdown
# Optional
# Default: `2000`
# Env: CS_SERVER__SHUTDOWN_TIMEOUT
shutdown_timeout = "2000"
# Number of worker threads the server should use
# Optional
# Default: `NUMBER_OF_CORES/2` or `4`
# Env: CS_SERVER__WORKER_THREADS
worker_threads = "4"
# Thread stack size in bytes
# Optional
# Default: `2 * 1024 * 1024` (2MiB) or `4 * 1024 * 1024` (4MiB) if log level is DEBUG or TRACE
# Env: CS_SERVER__THREAD_STACK_SIZE
thread_stack_size = "2097152"
[database]
# Database host address
# Optional
# Default: `0.0.0.0`
# Env: CS_DATABASE__HOST
host = "0.0.0.0"
# Database host post
# Optional
# Default: `5432`
# Env: CS_DATABASE__PORT
port = "5432"
# Database name
# Env: CS_DATABASE__NAME
name = "database"
# Database username
# Env: CS_DATABASE__USERNAME
username = "username"
# Database password
# Env: CS_DATABASE__PASSWORD
password = "password"
# Connection timeout in ms
# Sets how long to hold an open idle connection
# In production environments this should be greater than the idle timeout of any connection pool in the application.
#
# Optional
# No Default (NO TIMEOUT)
# Env: CS_DATABASE__CONNECTION_TIMEOUT
connection_timeout = "300000"
# Enable TLS verification
# Optional
# Default: `false`
# Env: CS_DATABASE__WITH_TLS_VERIFICATION
with_tls_verification = "false"
# Encrypt configuration reload interval in sec
# Sets how frequently Encrypted index configuration should be reloaded
# The configuration specifies the encrypted columns in the database
# Optional
# Default: `60`
# Env: CS_DATABASE__CONFIG_RELOAD_INTERVAL
config_reload_interval = "60"
# Schema configuration reload interval in sec
# Sets how frequently the database schema should be reloaded
# The schema is used to analyse SQL statements
# Optional
# Default: `60`
# Env: CS_DATABASE__SCHEMA_RELOAD_INTERVAL
schema_reload_interval = "60"
[tls]
# Certificate path
# Env: CS_TLS__CERTIFICATE_PATH
certificate_path = "./server.cert"
# Private Key path
# Env: CS_TLS__PRIVATE_KEY_PATH
private_key_path = "./server.key"
# Certificate path
# Env: CS_TLS__CERTIFICATE_PEM
certificate_pem = "..."
# Private Key path
# Env: CS_TLS__PRIVATE_KEY_PEM
private_key_pem = "..."
[auth]
# CipherStash Workspace ID
# Env: CS_WORKSPACE_ID
workspace_id = "cipherstash-workspace-id"
# CipherStash Client Access Key
# Env: CS_CLIENT_ACCESS_KEY
client_access_key = "cipherstash-client-access-key"
[encrypt]
# CipherStash Dataset ID
# Env: CS_DEFAULT_KEYSET_ID
default_keyset_id = "cipherstash-dataset-id"
# CipherStash Client ID
# Env: CS_CLIENT_ID
client_id = "cipherstash-client-id"
# CipherStash Client Key
# Env: CS_CLIENT_KEY
client_key = "cipherstash-client-key"
[log]
# Log level
# Optional
# Valid values: `error | warn | info | debug | trace`
# Default: `info`
# Env: CS_LOG__LEVEL
level = "info"
# Log format
# Optional
# Valid values: `pretty | text | structured (json)`
# Default: `pretty` if Proxy detects during startup that a terminal is attached, otherwise `structured`
# Env: CS_LOG__FORMAT
format = "pretty"
# Log format
# Optional
# Valid values: `stdout | stderr`
# Default: `info`
# Env: CS_LOG__OUTPUT
output = "stdout"
# Enable ansi (colored) output
# Optional
# Default: `true` if Proxy detects during startup that a terminal is attached, otherwise `false`
# Env: CS_LOG__ANSI_ENABLED
ansi_enabled = "true"
[prometheus]
# Enable prometheus stats
# Optional
# Default: `false`
# Env: CS_PROMETHEUS__ENABLED
enabled = "false"
# Prometheus exporter post
# Optional
# Default: `9930`
# Env: CS_PROMETHEUS__PORT
port = "9930"
The default configuration settings are biased toward running in production environments. Although Proxy attempts to detect the environment and set a sensible default for logging, your mileage may vary.
To turn on human-friendly logging:
CS_LOG__FORMAT = "pretty"
CS_LOG__ANSI_ENABLED = "true"
If you are frequently changing the database schema or making updates to the column encryption configuration, it can be useful to reload the config and schema more frequently:
CS_DATABASE__CONFIG_RELOAD_INTERVAL = "10"
CS_DATABASE__SCHEMA_RELOAD_INTERVAL = "10"
To enable a Prometheus exporter on the default port (9930
) use either:
[prometheus]
enabled = "true"
CS_PROMETHEUS__ENABLED = "true"
When enabled, metrics can be accessed via http://localhost:9930/metrics
.
If the proxy is running on a host other than localhost, access on that host.
Name | Target | Description |
---|---|---|
cipherstash_proxy_clients_active_connections |
Gauge | Current number of connections to CipherStash Proxy from clients |
cipherstash_proxy_clients_bytes_received_total |
Counter | Number of bytes received by CipherStash Proxy from clients |
cipherstash_proxy_clients_bytes_sent_total |
Counter | Number of bytes sent from CipherStash Proxy to clients |
cipherstash_proxy_decrypted_values_total |
Counter | Number of individual values that have been decrypted |
cipherstash_proxy_decryption_duration_seconds |
Histogram | Duration of time CipherStash Proxy spent performing decryption operations |
cipherstash_proxy_decryption_duration_seconds_count |
Counter | Number of observations of requests to CipherStash ZeroKMS to decrypt values |
cipherstash_proxy_decryption_duration_seconds_sum |
Counter | Total time CipherStash Proxy spent performing decryption operations |
cipherstash_proxy_decryption_error_total |
Counter | Number of decryption operations that were unsuccessful |
cipherstash_proxy_decryption_requests_total |
Counter | Number of requests to CipherStash ZeroKMS to decrypt values |
cipherstash_proxy_encrypted_values_total |
Counter | Number of individual values that have been encrypted |
cipherstash_proxy_encryption_duration_seconds |
Histogram | Duration of time CipherStash Proxy spent performing encryption operations |
cipherstash_proxy_encryption_duration_seconds_count |
Counter | Number of observations of requests to CipherStash ZeroKMS to encrypt values |
cipherstash_proxy_encryption_duration_seconds_sum |
Counter | Total time CipherStash Proxy spent performing encryption operations |
cipherstash_proxy_encryption_error_total |
Counter | Number of encryption operations that were unsuccessful |
cipherstash_proxy_encryption_requests_total |
Counter | Number of requests to CipherStash ZeroKMS to encrypt values |
cipherstash_proxy_rows_encrypted_total |
Counter | Number of encrypted rows returned to clients |
cipherstash_proxy_rows_passthrough_total |
Counter | Number of non-encrypted rows returned to clients |
cipherstash_proxy_rows_total |
Counter | Total number of rows returned |
cipherstash_proxy_server_bytes_received_total |
Counter | Number of bytes CipherStash Proxy received from the PostgreSQL server |
cipherstash_proxy_server_bytes_sent_total |
Counter | Number of bytes CipherStash Proxy sent to the PostgreSQL server |
cipherstash_proxy_statements_duration_seconds |
Histogram | Duration of time CipherStash Proxy spent executing SQL statements |
cipherstash_proxy_statements_duration_seconds_count |
Count | Number of observations of CipherStash Proxy statement duration |
cipherstash_proxy_statements_duration_seconds_sum |
Count | Total time CipherStash Proxy spent executing SQL statements |
cipherstash_proxy_statements_encrypted_total |
Counter | Number of SQL statements that required encryption |
cipherstash_proxy_statements_passthrough_total |
Counter | Number of SQL statements that did not require encryption |
cipherstash_proxy_statements_total |
Counter | Total number of SQL statements processed by CipherStash Proxy |
cipherstash_proxy_statements_unmappable_total |
Counter | Total number of unmappable SQL statements processed by CipherStash Proxy |
Option | Description | Default |
---|---|---|
-t , --table |
Specifies the table to migrate | None (Required) |
-c , --columns |
List of columns to migrate (space-delimited key=value pairs) | None (Required) |
-k , --primary-key |
List of primary key columns (space-delimited) | id |
-b , --batch-size |
Number of records to process at once | 100 |
-d , --dry-run |
Runs without updating. Loads data but does not perform updates | None (Optional) |
-v , --verbose |
Turn on additional logging output | None (Optional) |
-h , --help |
Displays this help message | - |
Check out the Proxy development guide.