awswrangler.postgresql.to_sql

awswrangler.postgresql.to_sql(df: DataFrame, con: Connection, table: str, schema: str, mode: str = 'append', index: bool = False, dtype: Optional[Dict[str, str]] = None, varchar_lengths: Optional[Dict[str, int]] = None, use_column_names: bool = False, chunksize: int = 200, upsert_conflict_columns: Optional[List[str]] = None, insert_conflict_columns: Optional[List[str]] = None) Any

Write records stored in a DataFrame into PostgreSQL.

Note

This function has arguments which can be configured globally through wr.config or environment variables:

  • chunksize

Check out the Global Configurations Tutorial for details.

Parameters
  • df (pandas.DataFrame) – Pandas DataFrame https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html

  • con (pg8000.Connection) – Use pg8000.connect() to use credentials directly or wr.postgresql.connect() to fetch it from the Glue Catalog.

  • table (str) – Table name

  • schema (str) – Schema name

  • mode (str) –

    Append, overwrite or upsert.

    append: Inserts new records into table. overwrite: Drops table and recreates. upsert: Perform an upsert which checks for conflicts on columns given by upsert_conflict_columns and sets the new values on conflicts. Note that upsert_conflict_columns is required for this mode.

  • index (bool) – True to store the DataFrame index as a column in the table, otherwise False to ignore it.

  • dtype (Dict[str, str], optional) – Dictionary of columns names and PostgreSQL types to be casted. Useful when you have columns with undetermined or mixed data types. (e.g. {‘col name’: ‘TEXT’, ‘col2 name’: ‘FLOAT’})

  • varchar_lengths (Dict[str, int], optional) – Dict of VARCHAR length by columns. (e.g. {“col1”: 10, “col5”: 200}).

  • use_column_names (bool) – If set to True, will use the column names of the DataFrame for generating the INSERT SQL Query. E.g. If the DataFrame has two columns col1 and col3 and use_column_names is True, data will only be inserted into the database columns col1 and col3.

  • chunksize (int) – Number of rows which are inserted with each SQL query. Defaults to inserting 200 rows per query.

  • upsert_conflict_columns (List[str], optional) – This parameter is only supported if mode is set top upsert. In this case conflicts for the given columns are checked for evaluating the upsert.

  • insert_conflict_columns (List[str], optional) – This parameter is only supported if mode is set top append. In this case conflicts for the given columns are checked for evaluating the insert ‘ON CONFLICT DO NOTHING’.

Returns

None.

Return type

None

Examples

Writing to PostgreSQL using a Glue Catalog Connections

>>> import awswrangler as wr
>>> con = wr.postgresql.connect("MY_GLUE_CONNECTION")
>>> wr.postgresql.to_sql(
...     df=df,
...     table="my_table",
...     schema="public",
...     con=con
... )
>>> con.close()