awswrangler.db.write_redshift_copy_manifest

awswrangler.db.write_redshift_copy_manifest(manifest_path: str, paths: List[str], use_threads: bool = True, boto3_session: Optional[boto3.session.Session] = None, s3_additional_kwargs: Optional[Dict[str, str]] = None) → Dict[str, List[Dict[str, Union[str, bool, Dict[str, int]]]]]

Write Redshift copy manifest and return its structure.

Only Parquet files are supported.

Note

If use_threads=True, the number of threads spawned is taken from os.cpu_count().

Parameters
  • manifest_path (str) – Amazon S3 manifest path (e.g. s3://…)

  • paths (List[str]) – List of S3 paths (Parquet Files) to be copied.

  • use_threads (bool) – True to enable concurrent requests, False to disable multiple threads. If enabled, os.cpu_count() is used as the maximum number of threads.

  • boto3_session (boto3.Session(), optional) – Boto3 Session. The default boto3 session is used if boto3_session is None.

  • s3_additional_kwargs (Dict[str, str], optional) – Forwarded to botocore requests. Valid parameters: "ACL", "Metadata", "ServerSideEncryption", "StorageClass", "SSECustomerAlgorithm", "SSECustomerKey", "SSEKMSKeyId", "SSEKMSEncryptionContext", "Tagging". e.g. s3_additional_kwargs={'ServerSideEncryption': 'aws:kms', 'SSEKMSKeyId': 'YOUR_KMS_KEY_ARN'}
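
The use_threads behaviour described above can be illustrated with a minimal sketch. This is not the library's implementation: resolve_max_workers and map_paths are hypothetical helpers, and the fallback to a single worker when os.cpu_count() returns None is an assumption.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

T = TypeVar("T")


def resolve_max_workers(use_threads: bool) -> int:
    """Hypothetical helper: map the use_threads flag to a worker count."""
    if not use_threads:
        return 1
    # os.cpu_count() can return None; falling back to 1 is an assumption.
    return os.cpu_count() or 1


def map_paths(func: Callable[[str], T], paths: List[str], use_threads: bool = True) -> List[T]:
    """Apply func to every path, concurrently when more than one worker is available."""
    workers = resolve_max_workers(use_threads)
    if workers == 1:
        return [func(path) for path in paths]
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # executor.map preserves the input order of paths.
        return list(executor.map(func, paths))
```

With use_threads=False the paths are processed sequentially in the calling thread, which can be easier to debug.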

Returns

Manifest content.

Return type

Dict[str, List[Dict[str, Union[str, bool, Dict[str, int]]]]]
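
To make the return type concrete, the sketch below builds a dictionary with the same shape. It assumes the layout of an Amazon Redshift COPY manifest for columnar files: an "entries" list whose items carry "url", "mandatory", and a "meta" mapping with "content_length". Whether the function emits exactly these keys is an assumption; build_manifest, the bucket name, and the sizes are hypothetical and no AWS calls are made.

```python
from typing import Dict, List, Union

ManifestEntry = Dict[str, Union[str, bool, Dict[str, int]]]


def build_manifest(sizes_by_path: Dict[str, int]) -> Dict[str, List[ManifestEntry]]:
    """Hypothetical helper: build a manifest-shaped dict from {s3_path: size_in_bytes}."""
    entries: List[ManifestEntry] = [
        {"url": path, "mandatory": True, "meta": {"content_length": size}}
        for path, size in sizes_by_path.items()
    ]
    return {"entries": entries}


# Illustrative paths and sizes only.
manifest = build_manifest(
    {
        "s3://example-bucket/part-0.parquet": 1024,
        "s3://example-bucket/part-1.parquet": 2048,
    }
)
```

In a real call the object sizes would have to be looked up in S3, which is presumably where the concurrent requests controlled by use_threads come in.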

Examples

Writing a manifest for two Parquet files to be copied to a Redshift cluster.

>>> import awswrangler as wr
>>> wr.db.write_redshift_copy_manifest(
...     manifest_path="s3://bucket/my.manifest",
...     paths=["s3://...parquet", "s3://...parquet"]
... )