AWS SDK for pandas

24 - Athena Query Metadata

For wr.athena.read_sql_query() and wr.athena.read_sql_table(), the resulting DataFrame (or every DataFrame in the returned Iterator for chunked queries) has a query_metadata attribute, which carries the query result metadata returned by Boto3/Athena.
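For chunked queries, the attribute can be read from each DataFrame as it is consumed. The following is a minimal sketch, assuming a hypothetical helper named iter_with_metadata (not part of awswrangler) that pairs each chunk with its metadata and falls back to None when the attribute is absent:

```python
from typing import Any, Iterable, Iterator, Optional, Tuple


def iter_with_metadata(chunks: Iterable[Any]) -> Iterator[Tuple[Any, Optional[dict]]]:
    """Yield (chunk, metadata) pairs for an iterable of DataFrames.

    Hypothetical helper, not part of awswrangler. It only relies on each
    chunk exposing a `query_metadata` attribute, as described above.
    """
    for chunk in chunks:
        # getattr with a default avoids an AttributeError if a chunk
        # was produced by something other than wr.athena.read_sql_*.
        yield chunk, getattr(chunk, "query_metadata", None)
```

With a chunked query it might be used as `for df, meta in iter_with_metadata(wr.athena.read_sql_query("SELECT 1 AS foo", chunksize=1000)): ...`, where the chunk size is an arbitrary illustrative value.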

The query_metadata format is the same as the one returned by:

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/athena.html#Athena.Client.get_query_execution
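Because the attribute follows that response shape, fields other than statistics can be read the same way. Since Statistics is read directly from query_metadata later in this tutorial, the sketch below assumes the attribute holds the QueryExecution element of that response. The helper describe_execution is hypothetical, the sample values are placeholders, and the key names (QueryExecutionId, Status.State, ResultConfiguration.OutputLocation) should be checked against the Boto3 documentation linked above.

```python
from typing import Any, Dict, Optional


def describe_execution(query_metadata: Dict[str, Any]) -> Dict[str, Optional[str]]:
    """Pick a few identifying fields out of a query_metadata-style dict."""
    # .get() with defaults keeps the helper safe when a section is missing.
    status = query_metadata.get("Status", {})
    result_config = query_metadata.get("ResultConfiguration", {})
    return {
        "id": query_metadata.get("QueryExecutionId"),
        "state": status.get("State"),
        "output_location": result_config.get("OutputLocation"),
    }


# Placeholder values for illustration only, not real query output.
sample_metadata = {
    "QueryExecutionId": "00000000-0000-0000-0000-000000000000",
    "Status": {"State": "SUCCEEDED"},
    "ResultConfiguration": {"OutputLocation": "s3://example-bucket/results/"},
}

print(describe_execution(sample_metadata))
```

With a real result, df.query_metadata would take the place of sample_metadata.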

Environment Variables

[1]:
%env WR_DATABASE=default
env: WR_DATABASE=default
[2]:
import awswrangler as wr
[5]:
df = wr.athena.read_sql_query("SELECT 1 AS foo")

df
[5]:
   foo
0    1

Getting statistics from query metadata

[6]:
print(f'DataScannedInBytes:            {df.query_metadata["Statistics"]["DataScannedInBytes"]}')
print(f'TotalExecutionTimeInMillis:    {df.query_metadata["Statistics"]["TotalExecutionTimeInMillis"]}')
print(f'QueryQueueTimeInMillis:        {df.query_metadata["Statistics"]["QueryQueueTimeInMillis"]}')
print(f'QueryPlanningTimeInMillis:     {df.query_metadata["Statistics"]["QueryPlanningTimeInMillis"]}')
print(f'ServiceProcessingTimeInMillis: {df.query_metadata["Statistics"]["ServiceProcessingTimeInMillis"]}')
DataScannedInBytes:            0
TotalExecutionTimeInMillis:    2311
QueryQueueTimeInMillis:        121
QueryPlanningTimeInMillis:     250
ServiceProcessingTimeInMillis: 37
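
The same lookups can be wrapped in a small function when statistics are collected for many queries. This is a minimal sketch, assuming a hypothetical helper named summarize_statistics (not part of awswrangler) that returns None for any key Athena did not report. The sample dictionary reuses the values printed above.

```python
from typing import Any, Dict

STAT_KEYS = (
    "DataScannedInBytes",
    "TotalExecutionTimeInMillis",
    "QueryQueueTimeInMillis",
    "QueryPlanningTimeInMillis",
    "ServiceProcessingTimeInMillis",
)


def summarize_statistics(query_metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Return the statistics shown in this tutorial, with None for absent keys."""
    stats = query_metadata.get("Statistics", {})
    return {key: stats.get(key) for key in STAT_KEYS}


# Values copied from the output printed above.
sample_metadata = {
    "Statistics": {
        "DataScannedInBytes": 0,
        "TotalExecutionTimeInMillis": 2311,
        "QueryQueueTimeInMillis": 121,
        "QueryPlanningTimeInMillis": 250,
        "ServiceProcessingTimeInMillis": 37,
    }
}

for name, value in summarize_statistics(sample_metadata).items():
    # Pad the label so the values line up in one column.
    print(f"{name + ':':<31}{value}")
```

With a real result, summarize_statistics(df.query_metadata) would take the place of the sample dictionary.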