awswrangler.opensearch.index_csv

awswrangler.opensearch.index_csv(client: opensearchpy.client.OpenSearch, path: str, index: str, doc_type: Optional[str] = None, pandas_kwargs: Optional[Dict[str, Any]] = None, **kwargs: Any) → Dict[str, Any]

Index all documents from a CSV file into an OpenSearch index.

Parameters
  • client (OpenSearch) – instance of opensearchpy.OpenSearch to use.

  • path (str) – S3 or local path to the CSV file that contains the documents.

  • index (str) – Name of the index.

  • doc_type (str, optional) – Name of the document type (for Elasticsearch versions 5.x and earlier).

  • pandas_kwargs (Dict[str, Any], optional) – Dictionary of arguments forwarded to pandas.read_csv(), e.g. pandas_kwargs={'sep': '|', 'na_values': ['null', 'none']}. See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html for the available arguments. Note: the following parameter value is enforced: skip_blank_lines=True

  • **kwargs – Keyword arguments forwarded to index_documents(), which executes the operation.

Returns

Response payload; see https://opensearch.org/docs/opensearch/rest-api/document-apis/bulk/#response

Return type

Dict[str, Any]

Examples

Writing contents of CSV file

>>> import awswrangler as wr
>>> client = wr.opensearch.connect(host='DOMAIN-ENDPOINT')
>>> wr.opensearch.index_csv(
...     client=client,
...     path='docs.csv',
...     index='sample-index1'
... )

Writing contents of CSV file using pandas_kwargs

>>> import awswrangler as wr
>>> client = wr.opensearch.connect(host='DOMAIN-ENDPOINT')
>>> wr.opensearch.index_csv(
...     client=client,
...     path='docs.csv',
...     index='sample-index1',
...     pandas_kwargs={'sep': '|', 'na_values': ['null', 'none']}
... )
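
Previewing how pandas_kwargs will parse the file

Because pandas_kwargs is forwarded to pandas.read_csv(), it can help to check the parsing locally before indexing anything. The sketch below is only an illustration: it calls pandas.read_csv() directly with the same arguments used in the example above plus the enforced skip_blank_lines=True, and it reads from a small in-memory sample (the column names and rows are hypothetical) instead of a real docs.csv. It does not call index_csv and makes no claim about how the rows are later turned into documents.

```python
import io

import pandas as pd

# Hypothetical pipe-delimited sample standing in for docs.csv.
# It contains one blank line and two placeholder values ("null", "none").
raw = "id|title|year\n1|Alpha|2001\n\n2|null|none\n"

# Same arguments as in the pandas_kwargs example above.
pandas_kwargs = {"sep": "|", "na_values": ["null", "none"]}

# The documentation states that skip_blank_lines=True is enforced,
# so mirror that setting in the local preview.
df = pd.read_csv(io.StringIO(raw), skip_blank_lines=True, **pandas_kwargs)

# The blank line is dropped, and "null" / "none" are read as missing values.
print(df.shape)
print(df.isna().sum().to_dict())
```

If the preview shows unexpected columns or missing values, adjusting pandas_kwargs here is cheaper than re-indexing after the fact.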