Read JSON File

Teradata® pyspark2teradataml User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: December 2024
Product Category: Teradata Vantage

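When reading a JSON file in teradatamlspk, the file must be in cloud storage.

The output differs between the two. PySpark parses each JSON record into columns, while teradatamlspk returns one row per record, with object metadata and the raw JSON text in the Payload column. See the following examples.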

PySpark

>>> spark.read.options(header=True).json('path.json').show()
+-----+----+-----+----+-----+-----+
| col1|col2|col22|col3|col32| col4|
+-----+----+-----+----+-----+-----+
| val1|val2| null|val3| null| val4|
|val12|null|val22|null|val32|val42|
+-----+----+-----+----+-----+-----+

teradatamlspk

>>> spark.read.options(authorization = {"Access_ID": id, "Access_Key": key}).json(path = "/connector/bucket.endpoint/[key_prefix]").show()
+--------------------+---------------+---------------+----------------+------------+----------+--------------------+
|            Location|ObjectVersionId|ObjectTimeStamp|OffsetIntoObject|ObjectLength|ExtraField|             Payload|
+--------------------+---------------+---------------+----------------+------------+----------+--------------------+
|/S3/s3.amazonaws.com|           None|           None|              67|          70|      None|{"col1": "val12", "c|
|/S3/s3.amazonaws.com|           None|           None|               1|          64|      None|{"col1": "val1", "co|
+--------------------+---------------+---------------+----------------+------------+----------+--------------------+
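To relate the two outputs, the following is a minimal sketch in plain Python (no Vantage connection) of how the raw JSON text in the Payload column could be expanded into the columns PySpark prints. It is not part of teradatamlspk. The helper name `payloads_to_table` is hypothetical, and the full payload strings are an assumption reconstructed from the PySpark output above, because the teradatamlspk output shows only the first 20 characters of each payload.

```python
import json

# (OffsetIntoObject, Payload) pairs, in the order the teradatamlspk output
# lists them. Assumption: the full payload text is reconstructed from the
# PySpark output; the teradatamlspk output truncates it.
payloads = [
    (67, '{"col1": "val12", "col22": "val22", "col32": "val32", "col4": "val42"}'),
    (1, '{"col1": "val1", "col2": "val2", "col3": "val3", "col4": "val4"}'),
]


def payloads_to_table(offset_payload_pairs):
    """Parse JSON payloads into (columns, rows), ordered by file offset."""
    # Sorting by offset restores the order of the records in the file.
    records = [json.loads(text) for _, text in sorted(offset_payload_pairs)]
    # The sorted union of all keys gives the same column order PySpark printed.
    columns = sorted({key for record in records for key in record})
    # A key missing from a record becomes None, the counterpart of PySpark's null.
    rows = [[record.get(col) for col in columns] for record in records]
    return columns, rows


columns, rows = payloads_to_table(payloads)
print(columns)
for row in rows:
    print(row)
```

This sketch runs on the client after the payloads have been fetched, so it is only suitable for small results.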

Using format:

>>> spark.read.options(authorization = {"Access_ID": id, "Access_Key": key}).format("json").load(path = "/connector/bucket.endpoint/[key_prefix]").show()
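Both calls above take a location of the form /connector/bucket.endpoint/[key_prefix]. As an illustration only, this small sketch assembles such a string from its parts. The helper `build_location` is hypothetical, not a teradatamlspk function, and the bucket, endpoint, and prefix values are placeholders.

```python
def build_location(connector, bucket_endpoint, key_prefix=""):
    """Join the parts into '/connector/bucket.endpoint/[key_prefix]'."""
    parts = [connector.strip("/"), bucket_endpoint.strip("/")]
    if key_prefix:
        # Keep any trailing slash on the prefix; drop only a leading one.
        parts.append(key_prefix.lstrip("/"))
    return "/" + "/".join(parts)


# Hypothetical values, for illustration only.
location = build_location("S3", "my-bucket.s3.amazonaws.com", "data/path.json")
print(location)
```

The key prefix is optional, matching the square brackets in the path template.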

If the JSON file is in the local file system, load it to Vantage as shown in the following example.

>>> spark.read.options(header=True).json('path.json').show()