awswrangler.catalog.add_csv_partitions
awswrangler.catalog.add_csv_partitions(database: str, table: str, partitions_values: Dict[str, List[str]], catalog_id: Optional[str] = None, compression: Optional[str] = None, sep: str = ',', boto3_session: Optional[boto3.session.Session] = None, columns_types: Optional[Dict[str, str]] = None) → Any
Add partitions (metadata) to a CSV Table in the AWS Glue Catalog.
Note
This function has arguments that can have default values configured globally through wr.config or environment variables:
catalog_id
database
Check out the Global Configurations Tutorial for details.
- Parameters
database (str) – Database name.
table (str) – Table name.
partitions_values (Dict[str, List[str]]) – Dictionary with S3 path locations as keys and lists of partition values (as str) as values (e.g. {'s3://bucket/prefix/y=2020/m=10/': ['2020', '10']}).
catalog_id (str, optional) – The ID of the Data Catalog from which to retrieve Databases. If none is provided, the AWS account ID is used by default.
compression (str, optional) – Compression style (None, gzip, etc).
sep (str) – String of length 1. Field delimiter for the output file.
boto3_session (boto3.Session(), optional) – Boto3 Session. The default boto3 session is used if boto3_session is None.
columns_types (Optional[Dict[str, str]]) – Only required for Hive compatibility. Dictionary with column names as keys and data types as values (e.g. {'col0': 'bigint', 'col1': 'double'}). Include only materialized columns, not partition columns.
- Returns
None.
- Return type
None
Examples
>>> import awswrangler as wr
>>> wr.catalog.add_csv_partitions(
...     database='default',
...     table='my_table',
...     partitions_values={
...         's3://bucket/prefix/y=2020/m=10/': ['2020', '10'],
...         's3://bucket/prefix/y=2020/m=11/': ['2020', '11'],
...         's3://bucket/prefix/y=2020/m=12/': ['2020', '12']
...     }
... )
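When the S3 prefixes follow the Hive-style key=value layout used in the example above, the partitions_values dictionary can be derived from the paths instead of being typed by hand. The sketch below is one possible way to do that. build_partitions_values is a hypothetical helper, not part of awswrangler, and it assumes that every path segment containing "=" is a partition column and that the segments appear in the same order as the table's partition keys. The bucket and prefix names are the placeholders from the example.

```python
from typing import Dict, List


def build_partitions_values(prefixes: List[str]) -> Dict[str, List[str]]:
    """Hypothetical helper: map each Hive-style S3 prefix to its partition values."""
    partitions_values: Dict[str, List[str]] = {}
    for prefix in prefixes:
        # Keep only path segments shaped like key=value, in path order.
        segments = [s for s in prefix.rstrip("/").split("/") if "=" in s]
        # Drop the key and keep the value as a string, as the function expects.
        partitions_values[prefix] = [s.split("=", 1)[1] for s in segments]
    return partitions_values


# Placeholder prefixes matching the example above.
prefixes = [f"s3://bucket/prefix/y=2020/m={month}/" for month in (10, 11, 12)]
partitions_values = build_partitions_values(prefixes)

# The result can then be passed to the documented function
# (requires AWS credentials and an existing Glue table, so it is left commented out):
# import awswrangler as wr
# wr.catalog.add_csv_partitions(
#     database="default",
#     table="my_table",
#     partitions_values=partitions_values,
# )
```

If the paths are not Hive-style (for example s3://bucket/prefix/2020/10/), this parsing does not apply and the values have to be supplied explicitly.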