awswrangler.catalog.add_parquet_partitions

awswrangler.catalog.add_parquet_partitions(database: str, table: str, partitions_values: Dict[str, List[str]], catalog_id: Optional[str] = None, compression: Optional[str] = None, boto3_session: Optional[boto3.session.Session] = None) → Any

Add partitions (metadata) to a Parquet Table in the AWS Glue Catalog.

Note

This functions has arguments that can has default values configured globally through wr.config or environment variables:

  • catalog_id

  • database

Check out the Global Configurations Tutorial for details.

Parameters
  • database (str) – Database name.

  • table (str) – Table name.

  • partitions_values (Dict[str, List[str]]) – Dictionary with keys as S3 path locations and values as a list of partitions values as str (e.g. {‘s3://bucket/prefix/y=2020/m=10/’: [‘2020’, ‘10’]}).

  • catalog_id (str, optional) – The ID of the Data Catalog from which to retrieve Databases. If none is provided, the AWS account ID is used by default.

  • compression (str, optional) – Compression style (None, snappy, gzip, etc).

  • boto3_session (boto3.Session(), optional) – Boto3 Session. The default boto3 session will be used if boto3_session receive None.

Returns

None.

Return type

None

Examples

>>> import awswrangler as wr
>>> wr.catalog.add_parquet_partitions(
...     database='default',
...     table='my_table',
...     partitions_values={
...         's3://bucket/prefix/y=2020/m=10/': ['2020', '10'],
...         's3://bucket/prefix/y=2020/m=11/': ['2020', '11'],
...         's3://bucket/prefix/y=2020/m=12/': ['2020', '12']
...     }
... )