AWS SDK for pandas

9 - Redshift - Append, Overwrite and Upsert

awswrangler’s copy/to_sql function has three different mode options for Redshift.

1 - append

2 - overwrite

3 - upsert

[ ]:
# Install the optional modules first
!pip install 'awswrangler[redshift]'
[2]:
from datetime import date

import pandas as pd

import awswrangler as wr

con = wr.redshift.connect("aws-sdk-pandas-redshift")

Enter your bucket name:

[3]:
import getpass

bucket = getpass.getpass()
path = f"s3://{bucket}/stage/"
 ···········································

Enter your IAM ROLE ARN:

[4]:
iam_role = getpass.getpass()
 ····················································································

Creating the table (Overwriting if it exists)

[10]:
df = pd.DataFrame({"id": [1, 2], "value": ["foo", "boo"], "date": [date(2020, 1, 1), date(2020, 1, 2)]})

wr.redshift.copy(
    df=df,
    path=path,
    con=con,
    schema="public",
    table="my_table",
    mode="overwrite",
    iam_role=iam_role,
    primary_keys=["id"],
)

wr.redshift.read_sql_table(table="my_table", schema="public", con=con)
[10]:
id value date
0 2 boo 2020-01-02
1 1 foo 2020-01-01

Appending

[11]:
df = pd.DataFrame({"id": [3], "value": ["bar"], "date": [date(2020, 1, 3)]})

wr.redshift.copy(
    df=df, path=path, con=con, schema="public", table="my_table", mode="append", iam_role=iam_role, primary_keys=["id"]
)

wr.redshift.read_sql_table(table="my_table", schema="public", con=con)
[11]:
id value date
0 1 foo 2020-01-01
1 2 boo 2020-01-02
2 3 bar 2020-01-03

Upserting

[12]:
df = pd.DataFrame({"id": [2, 3], "value": ["xoo", "bar"], "date": [date(2020, 1, 2), date(2020, 1, 3)]})

wr.redshift.copy(
    df=df, path=path, con=con, schema="public", table="my_table", mode="upsert", iam_role=iam_role, primary_keys=["id"]
)

wr.redshift.read_sql_table(table="my_table", schema="public", con=con)
[12]:
id value date
0 1 foo 2020-01-01
1 2 xoo 2020-01-02
2 3 bar 2020-01-03

Cleaning Up

[13]:
with con.cursor() as cursor:
    cursor.execute("DROP TABLE public.my_table")
con.close()