Configuring Redshift / PostgreSQL Access
This article will describe how to configure a Redshift or Data Warehouse credentials for use by Census, and why those permissions are needed.
Census reads data from one or more tables (possibly across different schemata) in your database and publishes it to the corresponding objects in external systems such as Salesforce. To limit the load on your database as well as to other apps' APIs, Census computes a “diff” to determine changes between each update. In order to compute these diffs, Census creates and writes to a set of tables to a private bookkeeping schema (2 or 3 tables for each sync job configured).
We recommend you create a dedicated
CENSUS user account with a strong, unique password. Census uses this account to connect to your Redshift or PostgreSQL database. In order for the Census connection to work correctly, the
CENSUS account must have these permissions:
- The ability to create the
CENSUSschema and full admin access to all tables within that schema (including creating tables, deleting tables, and reading and writing to all tables).
- Read-only access to any tables and views in any schemata that you would like Census to publish to your service destinations.
- If you are using Census to load service data into your warehouse, read-write access to the schema where Census should load data (note that this is not included in the sample script below).
PostgreSQL and Redshift permissions are complex and there are many ways to configure access for Census. The script below has been tested with Redshift and recent PostgreSQL versions and is known to work correctly:
-- Give the census user the ability to sign in with a password CREATE USER CENSUS WITH PASSWORD '<strong, unique password>'; -- Create a private bookkeeping schema where Census can store sync state CREATE SCHEMA CENSUS; -- Give the census user full access to the bookkeeping schema GRANT ALL ON SCHEMA CENSUS TO CENSUS; -- Ensure the census user has access to any objects that may have already existed in the bookkeeping schema GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA CENSUS TO CENSUS; -- Let the census user see this schema GRANT USAGE ON SCHEMA "<your schema>" TO CENSUS; -- Let the census user read all existing tables in this schema GRANT SELECT ON ALL TABLES IN SCHEMA "<your schema>" TO CENSUS; -- Let the census user read any new tables added to this schema ALTER DEFAULT PRIVILEGES IN SCHEMA "<your schema>" GRANT SELECT ON TABLES TO CENSUS; -- Let the census user execute any existing functions in this schema GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA "<your schema>" TO CENSUS; -- Let the census user execute any new functions added to this schema ALTER DEFAULT PRIVILEGES IN SCHEMA "<your schema>" GRANT EXECUTE ON FUNCTIONS TO CENSUS;
- If you have multiple schemata that you would like Census to read from, repeat the steps for "<your schema>" for each of them
- In Redshift and older versions of PostgreSQL, if there are views in your schema that reference tables in other schemata, you will also need to give Census read access to those other schemata. In later versions of PostgreSQL this extra read access is not required.
- If you are using Census models to execute stored procedures (this is rare and not recommended for most users) you may also need to give Census access to those procedures
All connections from the Census Data Warehouse Service to your database, as well as connections from your Redshift database to S3, are protected by TLS encryption - Census will refuse to connect to a warehouse that does not support TLS. All Census data stored in S3 is encrypted with AWS Server-Side Encryption (SSE). We recommend configuring your PostgreSQL instance to use TLS v1.2 or later for all connections.
Allowed IP Addresses
Census will always connect to your data warehouse from of these static IP addresses located within AWS:
Redshift by default prevents any external IP address from accessing your data warehouse so if you will need to add these IP addresses to your security groups. For more information, visit AWS Redshift Help Center. For Postgres data warehouses, you'll need to add these IPs in your firewall, and/or add rules to your
pg_hba.conf file to only allow the Census user to connect to your database when using these IP addresses.
Deploying Redshift within an AWS VPC
Advanced methods of Redshift deployment include deploying Redshift within an AWS VPC or private submite and limiting database access to a separate, controlled database proxy which allows external IP address access. In this configuration, you'll need to separately allow Redshift to communicate directly with S3. For more information on this process, you will need to add an S3 VPC endpoint. This is definitely an obscure feature of AWS but this article does a good job of explaining S3 VPC endpoints, why they're needed and how to set one up.