awswrangler.catalog.create_csv_table
awswrangler.catalog.create_csv_table(database: str, table: str, path: str, columns_types: Dict[str, str], partitions_types: Optional[Dict[str, str]] = None, compression: Optional[str] = None, description: Optional[str] = None, parameters: Optional[Dict[str, str]] = None, columns_comments: Optional[Dict[str, str]] = None, mode: str = 'overwrite', catalog_versioning: bool = False, sep: str = ',', skip_header_line_count: Optional[int] = None, boto3_session: Optional[boto3.session.Session] = None, projection_enabled: bool = False, projection_types: Optional[Dict[str, str]] = None, projection_ranges: Optional[Dict[str, str]] = None, projection_values: Optional[Dict[str, str]] = None, projection_intervals: Optional[Dict[str, str]] = None, projection_digits: Optional[Dict[str, str]] = None, catalog_id: Optional[str] = None) → Any
Create a CSV Table (Metadata Only) in the AWS Glue Catalog.
Athena data types: https://docs.aws.amazon.com/athena/latest/ug/data-types.html
Note
This function has arguments whose default values can be configured globally through wr.config or environment variables:
catalog_id
database
Check out the Global Configurations Tutorial for details.
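The note above can be sketched as follows. This is a minimal illustration, assuming the environment variables follow the WR_<ARGUMENT> naming (WR_DATABASE, WR_CATALOG_ID) and are read when awswrangler is imported; set_wrangler_defaults is a hypothetical helper, not part of the library.

```python
import os
from typing import MutableMapping, Optional


def set_wrangler_defaults(
    database: str,
    catalog_id: Optional[str] = None,
    environ: MutableMapping[str, str] = os.environ,
) -> MutableMapping[str, str]:
    """Export global defaults through environment variables.

    Assumption: awswrangler picks up WR_DATABASE and WR_CATALOG_ID at
    import time, so this should run before `import awswrangler`.
    """
    environ["WR_DATABASE"] = database
    if catalog_id is not None:
        environ["WR_CATALOG_ID"] = catalog_id
    return environ


def configure_after_import(database: str) -> None:
    """Alternative: set the default on wr.config after the library is imported."""
    import awswrangler as wr  # lazy import; requires awswrangler to be installed

    wr.config.database = database
```

With either default in place, the database argument can be left out of the create_csv_table call.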
- Parameters
database (str) – Database name.
table (str) – Table name.
path (str) – Amazon S3 path (e.g. s3://bucket/prefix/).
columns_types (Dict[str, str]) – Dictionary with keys as column names and values as data types (e.g. {‘col0’: ‘bigint’, ‘col1’: ‘double’}).
partitions_types (Dict[str, str], optional) – Dictionary with keys as partition names and values as data types (e.g. {‘col2’: ‘date’}).
compression (str, optional) – Compression style (None, gzip, etc.).
description (str, optional) – Table description.
parameters (Dict[str, str], optional) – Key/value pairs to tag the table.
columns_comments (Dict[str, str], optional) – Columns names and the related comments (e.g. {‘col0’: ‘Column 0.’, ‘col1’: ‘Column 1.’, ‘col2’: ‘Partition.’}).
mode (str) – ‘overwrite’ to recreate any existing table, or ‘append’ to keep any existing table.
catalog_versioning (bool) – If True and mode=‘overwrite’, creates an archived version of the table catalog before updating it.
sep (str) – String of length 1. Field delimiter used in the CSV files.
skip_header_line_count (Optional[int]) – Number of header lines to skip at the top of each file.
projection_enabled (bool) – Enable Partition Projection on Athena (https://docs.aws.amazon.com/athena/latest/ug/partition-projection.html)
projection_types (Optional[Dict[str, str]]) – Dictionary of partition names and Athena projection types. Valid types: “enum”, “integer”, “date”, “injected” https://docs.aws.amazon.com/athena/latest/ug/partition-projection-supported-types.html (e.g. {‘col_name’: ‘enum’, ‘col2_name’: ‘integer’})
projection_ranges (Optional[Dict[str, str]]) – Dictionary of partition names and Athena projection ranges. https://docs.aws.amazon.com/athena/latest/ug/partition-projection-supported-types.html (e.g. {‘col_name’: ‘0,10’, ‘col2_name’: ‘-1,8675309’})
projection_values (Optional[Dict[str, str]]) – Dictionary of partition names and Athena projection values. https://docs.aws.amazon.com/athena/latest/ug/partition-projection-supported-types.html (e.g. {‘col_name’: ‘A,B,Unknown’, ‘col2_name’: ‘foo,boo,bar’})
projection_intervals (Optional[Dict[str, str]]) – Dictionary of partition names and Athena projection intervals. https://docs.aws.amazon.com/athena/latest/ug/partition-projection-supported-types.html (e.g. {‘col_name’: ‘1’, ‘col2_name’: ‘5’})
projection_digits (Optional[Dict[str, str]]) – Dictionary of partition names and Athena projection digits. https://docs.aws.amazon.com/athena/latest/ug/partition-projection-supported-types.html (e.g. {‘col_name’: ‘1’, ‘col2_name’: ‘2’})
boto3_session (boto3.Session(), optional) – Boto3 Session. The default boto3 session will be used if boto3_session is None.
catalog_id (str, optional) – The ID of the Data Catalog from which to retrieve Databases. If none is provided, the AWS account ID is used by default.
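The projection_* parameters are easier to read side by side. The sketch below assembles them for a table with one integer partition and one enum partition. The partition names (year, region), their ranges and values, and both helper functions are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict


def build_projected_table_args(database: str, table: str, path: str) -> Dict[str, Any]:
    """Assemble keyword arguments for a CSV table with two projected partitions."""
    return {
        "database": database,
        "table": table,
        "path": path,
        "columns_types": {"col0": "bigint", "col1": "double"},
        # Each projected column is declared as a partition first.
        "partitions_types": {"year": "int", "region": "string"},
        "projection_enabled": True,
        # One projection type per partition column.
        "projection_types": {"year": "integer", "region": "enum"},
        # Ranges apply to the "integer" partition, values to the "enum" one.
        "projection_ranges": {"year": "2020,2030"},
        "projection_values": {"region": "us,eu,apac"},
    }


def create_projected_table(database: str, table: str, path: str) -> None:
    """Register the table in the Glue Catalog (needs AWS credentials)."""
    import awswrangler as wr  # lazy import keeps the builder free of AWS dependencies

    wr.catalog.create_csv_table(**build_projected_table_args(database, table, path))
```

Keeping the argument dictionary separate from the call makes it possible to inspect or unit test the projection settings without touching AWS.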
- Returns
None.
- Return type
None
Examples
>>> import awswrangler as wr
>>> wr.catalog.create_csv_table(
...     database='default',
...     table='my_table',
...     path='s3://bucket/prefix/',
...     columns_types={'col0': 'bigint', 'col1': 'double'},
...     partitions_types={'col2': 'date'},
...     compression='gzip',
...     description='My own table!',
...     parameters={'source': 'postgresql'},
...     columns_comments={'col0': 'Column 0.', 'col1': 'Column 1.', 'col2': 'Partition.'}
... )
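A further sketch covers files that use a different delimiter and carry a header row, via the sep and skip_header_line_count parameters. The helper csv_format_args is hypothetical; its length check only mirrors the documented "String of length 1" requirement and is not taken from the library.

```python
from typing import Any, Dict, Optional


def csv_format_args(sep: str = ",", header_lines: Optional[int] = None) -> Dict[str, Any]:
    """Validate and package the CSV format arguments."""
    if len(sep) != 1:
        raise ValueError("sep must be a string of length 1")
    if header_lines is not None and header_lines < 0:
        raise ValueError("header_lines must be non-negative")
    args: Dict[str, Any] = {"sep": sep}
    if header_lines is not None:
        args["skip_header_line_count"] = header_lines
    return args


def register_semicolon_csv() -> None:
    """Register a semicolon-delimited table with one header line (needs AWS credentials)."""
    import awswrangler as wr  # lazy import; requires awswrangler to be installed

    wr.catalog.create_csv_table(
        database="default",
        table="my_table",
        path="s3://bucket/prefix/",
        columns_types={"col0": "bigint", "col1": "double"},
        **csv_format_args(sep=";", header_lines=1),
    )
```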