Snowflake Overview

Snowflake is a data warehouse built for the cloud. Read more about Snowflake.

Integrator includes several pre-built flows optimized for Snowflake.

Connecting to Snowflake

To connect to Snowflake:

Step 1. Open the Connections window and click the + button.

Step 2. Type in snowflake in the Search field.

Snowflake

Step 3. Select the Snowflake connection and continue by defining the connection parameters.

Snowflake connection properties

Read more about configuring database connections.

Read more about configuring connection to Snowflake using the JDBC driver.

Snowflake flows

Using Snowflake-specific flows, you can extract data from any of the supported sources and load it directly into Snowflake.

A typical Snowflake flow performs the following operations:

  1. Extracts data from the source.
  2. Creates CSV or JSON files.
  3. Compresses the files using the gzip algorithm.
  4. Copies the files into the Amazon S3 bucket used as a staging area for Snowflake.
  5. Checks whether the destination Snowflake table exists and, if it does not, creates the table using metadata from the source.
  6. Executes the Snowflake COPY INTO command (see the sketch after this list).
  7. Cleans up the remaining files, if needed.
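
The statement below is a minimal sketch of the COPY INTO command from step 6 (with the cleanup from step 7 handled by PURGE). The table REPORT, the bucket my-staging-bucket, the file report.csv.gz, and the credentials are all hypothetical; Integrator generates and executes the actual statements for you:

    -- Sketch only: REPORT, my-staging-bucket, report.csv.gz and the credentials are hypothetical.
    COPY INTO REPORT
      FROM 's3://my-staging-bucket/report.csv.gz'
      CREDENTIALS = (AWS_KEY_ID = '***' AWS_SECRET_KEY = '***')
      FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1 COMPRESSION = GZIP)
      PURGE = TRUE;  -- delete the staged file after a successful load (step 7)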

Start creating Snowflake flows by opening the Flow Builder, clicking the + button, and typing in snowflake in the search field:

Snowflake flows

Continue by selecting the flow type and entering flow parameters:

Snowflake flow

Sources and Destinations

Depending on the flow type, you can select one of the following sources (FROM) for the Snowflake flow:

For most of the Snowflake flows, the destination (TO) is an Amazon S3 connection. Integrator uses Amazon S3 as a staging area for the files created during the extract from the source (FROM).

Important: the destination (TO) name will be the name of the file copied to the staging Amazon S3 bucket.

Note: when configuring a connection for Amazon S3, which will be used as a staging area for the Snowflake flows, it is recommended to select GZip as the value for the Archive file before copying to S3 field:

Snowflake S3
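
On the Snowflake side, this simply means the staged files arrive gzip-compressed (for example, report.csv becomes report.csv.gz), so the load must declare GZIP compression. A minimal sketch of a matching named file format; the name CSV_GZIP is hypothetical:

    -- Hypothetical reusable file format for gzip-compressed CSV staging files.
    CREATE OR REPLACE FILE FORMAT CSV_GZIP
      TYPE = CSV
      COMPRESSION = GZIP
      SKIP_HEADER = 1;

Such a format can then be referenced in a COPY INTO command with FILE_FORMAT = (FORMAT_NAME = 'CSV_GZIP').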

Final Destination (Snowflake Connection)

For all Snowflake flows the final destination is Snowflake.

To configure the final destination, click the Connections tab and select the connection created in step 1.

Flow Snowflake connection

Mapping

If needed, you can create a mapping between the source and destination (Snowflake) fields.

You create the mapping between the source and the destination just as you would for any other flow type.

Note: mapping is not required, but keep in mind that if a source field name is not supported by Snowflake, Snowflake will return an error and the data will not be loaded into the database. For example, if you are loading data from Google Analytics, the output (source) will include fields with the prefix ga: (ga:user, ga:browser, etc.). Unfortunately, Snowflake does not support field names containing :, so the data will be rejected. In this case, you can use mapping to rename the destination fields:

Snowflake mapping
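
For illustration, the destination table below uses the mapped names; the table and column names are hypothetical, and in practice the renaming is done in the mapping editor rather than by hand:

    -- ga:user, ga:browser, ga:sessions are not valid unquoted Snowflake identifiers,
    -- so the mapping renames them to names Snowflake accepts (hypothetical example).
    CREATE TABLE IF NOT EXISTS GA_REPORT (
      GA_USER     VARCHAR,
      GA_BROWSER  VARCHAR,
      GA_SESSIONS NUMBER
    );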

Parameters

When creating a Snowflake flow, you must specify the required parameters, such as the Snowflake Table Name. You can also add optional parameters, transformations, and options:

Snowflake flow parameters

Depending on the flow type, other flow parameters can be added as explained below:

Change Replication

Just like for any other flow type, it is possible to configure change replication.

When change replication is enabled, only the changed records are loaded into Snowflake.

Basically, all you need to do is set the high watermark field and enable change replication for the transformation:

Snowflake change replication
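
Conceptually, the high watermark turns the extract into an incremental query. A sketch, assuming a hypothetical source table report with an updated_at column used as the high watermark field (Integrator tracks the last watermark value between runs):

    -- Only rows changed since the previous run are extracted and loaded into Snowflake.
    SELECT *
    FROM report
    WHERE updated_at > '2019-01-01 00:00:00';  -- last recorded high watermark (hypothetical value)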

Flow examples

Extract from database - load into Snowflake

This flow extracts data from a relational database table and loads it directly into a Snowflake table.

Step 1. Create and test a Snowflake connection as explained here.

Step 2. Create a connection to the Amazon S3 bucket used as a staging area as explained here. When configuring the S3 connection, select GZip as the value for the Archive file before copying to S3 field.

Step 3. Create a connection to the source relational database (index.md#relational-databases).

Step 4. Start creating Snowflake flow as explained here.

Step 5. Select "Extract data from database, load into Snowflake".

Step 6. Continue by defining source and destination as explained here.

Step 7. For the source (FROM), select the connection created in step 3 and enter the table name as the source (FROM), for example report.

Step 8. For the destination (TO), select the connection created in step 2 and enter the name of the CSV file as the destination (TO), for example report.csv.

Step 9. Select the Snowflake connection created in step 1 for the final destination, as explained here.

Step 10. Continue by defining the mapping as explained here.

Step 11. Continue by defining parameters as explained here. The only field you will need to enter data into is Snowflake Table Name (see the sketch after step 13).

Step 12. Configure change replication if needed.

Step 13. Save and test the flow.
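
For this example, the table creation the flow performs behind the scenes (step 5 of the typical flow) would look roughly like the sketch below; the columns are hypothetical and are actually derived from the metadata of the source report table:

    -- Created automatically from the source table's metadata if it does not exist yet.
    CREATE TABLE IF NOT EXISTS REPORT (
      ID         NUMBER,
      CUSTOMER   VARCHAR,
      UPDATED_AT TIMESTAMP
    );

The staged report.csv.gz file is then loaded with a COPY INTO command like the sketch shown earlier in this article.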

Extract from Google Analytics - load into Snowflake

This flow extracts data from the Google Analytics API and loads it directly into a Snowflake table.

Step 1. Create and test a Snowflake connection as explained here.

Step 2. Create a connection to the Amazon S3 bucket used as a staging area as explained here. When configuring the S3 connection, select GZip as the value for the Archive file before copying to S3 field.

Step 3. Create a CSV file format as explained here.

Step 4. Create a connection to the Google Analytics API as explained here.

Step 5. Start creating Snowflake flow as explained here.

Step 6. Select "Extract data from API, load into Snowflake".

Step 7. Continue by defining the source and destination as explained here.

Step 8. For the source (FROM), select the connection created in step 4 and enter report as the source (FROM).

Step 9. For the destination (TO), select the connection created in step 2 and enter report.csv as the destination (TO).

Step 10. Select the Snowflake connection created in step 1 for the final destination, as explained here.

Step 11. Continue by defining the mapping as explained here. In this particular case, you will need to rename the destination fields.

Step 12. Continue by defining parameters as explained here. The only field you will need to enter data into is Snowflake Table Name.

Step 13. Save and test the flow.

Load files into Snowflake

This flow loads CSV or JSON files directly into Snowflake. The files can be anywhere Integrator can read them: server storage, FTP, SFTP, cloud storage, etc.

Important: this is a bulk load only; the files must already exist.

Step 1. Create a source connection, which can be either a file storage or a cloud storage connection.

Step 2. Create and test a Snowflake connection as explained here.

Step 3. Create a connection to the Amazon S3 bucket used as a staging area as explained here. When configuring the S3 connection, select GZip as the value for the Archive file before copying to S3 field.

Step 4. Start creating Snowflake flow as explained here.

Step 5. Select "Load files into Snowflake".

Load files into Snowflake

Step 6. Continue by defining source and destination.

Step 7. For the source (FROM), select the connection created in step 1 and enter a file name or a wildcard file name, for example *.csv.

Step 8. For the destination (TO), select the connection created in step 3.

Step 9. Select the Snowflake connection created in step 2 for the final destination, as explained here.

Step 10. Continue by defining parameters. The only required parameter is Copy INTO SQL:

Copy INTO SQL
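
A minimal sketch of what the Copy INTO SQL parameter might contain, assuming gzip-compressed CSV files staged under a hypothetical bucket my-staging-bucket and a hypothetical destination table REPORT:

    -- Loads every staged *.csv.gz file into REPORT; all names and credentials are hypothetical.
    COPY INTO REPORT
      FROM 's3://my-staging-bucket/'
      CREDENTIALS = (AWS_KEY_ID = '***' AWS_SECRET_KEY = '***')
      PATTERN = '.*[.]csv[.]gz'
      FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1 COMPRESSION = GZIP);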

Step 11. Define other parameters (all optional):

Step 12. Save and test the flow.