Snowflake is a cloud-based data storage and analytics service providing relational database support for both structured data, such as CSV files and tables, and semi-structured data, such as JSON, Avro, and Parquet.

Connector Configuration

Snowflake requires the following parameters:

  • host - a Snowflake endpoint based on the account name;
  • user-name - specifies the name of the user's account (provided by Snowflake);
  • password - specifies the password for the specified user;
  • db - specifies the default database to use once connected[*];
  • warehouse - specifies the virtual warehouse to use once connected.

Example:

EXEC SYSADMIN.createConnection(name => 'snowflake', jbossCLITemplateName => 'snowflake', connectionOrResourceAdapterProperties => 'host=<your_account_name>.snowflakecomputing.com,db=<database>,user-name=<user_name>,password=<user_password>,warehouse=<warehouse>') ;;
EXEC SYSADMIN.createDataSource(name => 'snowflake', translator => 'snowflake', modelProperties => 'importer.defaultSchema=PUBLIC,importer.schemaPattern=PUBLIC,importer.useFullSchemaName=FALSE,importer.tableTypes="TABLE,VIEW"', translatorProperties => 'varcharReserveAdditionalSpacePercent=300,supportsNativeQueries=true') ;;
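
Once both statements have succeeded, the new data source can be queried like any other schema. A minimal sketch, assuming a table MY_TABLE exists in the imported PUBLIC schema (the table name is hypothetical):

SELECT * FROM snowflake.MY_TABLE ;;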

Translator Properties

Amazon AWS S3 Upload

Loading data using Amazon AWS S3 (S3LOAD) can be configured and is highly recommended for any production use, as inserting data into Snowflake using the standard JDBC protocol can be extremely slow.

The following translator parameters are used to configure S3LOAD:

  • uploadMode=s3load - explicitly specifies the S3LOAD mode;
  • region - AWS S3 region endpoint;
  • bucketName - specifies a bucket name to upload data files to;
  • bucketPrefix - optional; specifies a prefix for the temporary bucket to upload data files to if bucketName is not specified; must comply with the Amazon S3 bucket naming convention (nb: 36 characters are added to the bucket prefix when creating a temporary bucket); see the variant example at the end of this subsection;
  • createBucket - optional; specifies whether the bucket set in the bucketName parameter should be created if it does not exist; the default value is false;
  • keyId - AWS S3 key ID;
  • secretKey - AWS S3 secret key.

Example:

EXEC SYSADMIN.createConnection(name => 'snowflake', jbossCLITemplateName => 'snowflake', connectionOrResourceAdapterProperties => 'host=<your_account_name>.snowflakecomputing.com,db=<database>,user-name=<user_name>,password=<user_password>,warehouse=<warehouse>') ;;
EXEC SYSADMIN.createDataSource(name => 'snowflake', translator => 'snowflake', modelProperties => 'importer.defaultSchema=PUBLIC,importer.schemaPattern=PUBLIC,importer.useFullSchemaName=FALSE,importer.tableTypes="TABLE,VIEW"', translatorProperties => 'varcharReserveAdditionalSpacePercent=300,supportsNativeQueries=true,uploadMode=s3load,region=<region>,bucketName=<bucket_name>,createBucket=true,keyId=<key_ID>,secretKey="<secret_key>"') ;;

To read data from an S3 bucket, the security and access management policies on the bucket must allow Snowflake to access it, which requires an IAM user with the required permissions for the specified S3 bucket. This one-time setup involves establishing access permissions on the bucket and associating the required permissions with an IAM user. Access management instructions can be found in Configuring AWS IAM User Credentials.

Proxy settings for Amazon S3 upload can be configured via System Properties.

Proxy settings for Amazon S3 upload are available since v2.1.7.

The bucketPrefix translator property is available since v2.1.7.

The createBucket translator property is available since v2.1.15.
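
For illustration, here is a minimal sketch of the temporary-bucket variant mentioned above: bucketName is omitted and bucketPrefix is supplied instead, so a temporary bucket is created from the given prefix (all placeholder values are assumptions):

EXEC SYSADMIN.createDataSource(name => 'snowflake', translator => 'snowflake', modelProperties => 'importer.defaultSchema=PUBLIC,importer.schemaPattern=PUBLIC,importer.useFullSchemaName=FALSE,importer.tableTypes="TABLE,VIEW"', translatorProperties => 'varcharReserveAdditionalSpacePercent=300,supportsNativeQueries=true,uploadMode=s3load,region=<region>,bucketPrefix=<bucket_prefix>,keyId=<key_ID>,secretKey="<secret_key>"') ;;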

Microsoft Azure Blob Storage Upload

As an alternative to Amazon AWS S3, it is possible to configure data upload via Microsoft Azure Blob Storage.

The following translator parameters are used to configure Microsoft Azure Blob Storage upload:

  • uploadMode=blob_azure - specifies the Microsoft Azure Blob Storage upload mode;
  • storageAccountName - Microsoft Azure Blob Storage account name;
  • storageAccountKey - Microsoft Azure Blob Storage account key;
  • sasToken - the SAS token used for accessing the Blob Storage from Snowflake;
  • defaultEndpointsProtocol - endpoint protocol;
  • tempFolder - specifies a path to a folder for temporary files;
  • keepTempFiles - if set to true, temporary files are not deleted after uploading.

Example:

EXEC SYSADMIN.createConnection(name => 'snowflake', jbossCLITemplateName => 'snowflake', connectionOrResourceAdapterProperties => 'host=<your_account_name>.snowflakecomputing.com,db=<database>,user-name=<user_name>,password=<user_password>,warehouse=<warehouse>') ;;
EXEC SYSADMIN.createDataSource(name => 'snowflake', translator => 'snowflake', modelProperties => 'importer.defaultSchema=PUBLIC,importer.schemaPattern=PUBLIC,importer.useFullSchemaName=FALSE,importer.tableTypes="TABLE,VIEW"', translatorProperties => 'varcharReserveAdditionalSpacePercent=300,uploadMode=blob_azure,supportsNativeQueries=true,storageAccountName=<storageAccountName>,storageAccountKey=<storageAccountKey>,sasToken="<sasToken>",defaultEndpointsProtocol=https,tempFolder=d:\tmp\azure\,keepTempFiles=false') ;;

Microsoft Azure Blob Storage upload is available since v2.1.20.

Miscellanea

Indexes and native statistics are not supported by Snowflake data sources.

[*] The importer.catalog data source parameter defaults to the same value as the database set in the connection (the db parameter). It allows loading only the needed catalog in order to reduce the time needed to initialize the data source.

The importer.catalog parameter is available since v2.1.16.
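
For illustration, a minimal sketch of setting importer.catalog explicitly in the data source definition, following the examples above (the catalog value is a placeholder):

EXEC SYSADMIN.createDataSource(name => 'snowflake', translator => 'snowflake', modelProperties => 'importer.catalog=<database>,importer.defaultSchema=PUBLIC,importer.schemaPattern=PUBLIC,importer.useFullSchemaName=FALSE,importer.tableTypes="TABLE,VIEW"', translatorProperties => 'varcharReserveAdditionalSpacePercent=300,supportsNativeQueries=true') ;;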