User can access actual columns and path variables by creating a regular table on a SELECT query (Create Table as SELECT, CTAS) with each column from JSON or CSV data selected from foreign table. Each column must be typecast to a valid type and then aliased to the appropriate column name. This allows user to access actual columns and keys in the JSON or CSV data. It is up to the user on what must be selected in SELECT query passed to "DataFrame.from_query()": columns, attributes, keys from JSON or CSV data and path variables.
Example for JSON data
Create a regular table.
# Here is how we created a regular table from foreign table. """ CREATE TABLE riverflowprecip AS ( SELECT CAST($path.$year AS CHAR(4)) TheYear, CAST($path.$month AS CHAR(2)) TheMonth, CAST($path.$day AS CHAR(2)) TheDay, CAST(payload.site_no AS CHAR(8)) SiteNo, CAST(payload.Flow AS FLOAT) Flow, CAST(payload.GageHeight AS FLOAT) GageHeight FROM riverflow WHERE payload.Precipitation IS NOT NULL ) WITH DATA PRIMARY INDEX(SiteNo) """
Create a DataFrame on the table and display the head of the DataFrame.
# Create a DataFrame on a table. >>> wrk2dfview = DataFrame("riverflowprecip") >>> wrk2dfview.head().to_pandas()
TheMonth TheDay SiteNo Flow GageHeight TheYear 2018 07 25 09400815 0.00 0.00 2018 07 20 09400815 0.00 0.42 2018 07 20 09400815 0.00 0.00 2018 07 20 09400815 0.00 0.41 2018 07 20 09400815 0.27 0.54 2018 07 20 09400815 0.19 0.51 2018 07 20 09400815 0.05 0.45 2018 07 25 09400815 0.00 0.36 2018 07 25 09400815 0.00 0.37 2018 07 01 09400815 0.00 -0.01
Example for CSV data
Create a regular table.
# Here is how we created a regular table from foreign table. """ CREATE TABLE riverflowcsvprecip AS ( SELECT CAST($path.$year AS CHAR(4)) TheYear, CAST($path.$month AS CHAR(2)) TheMonth, CAST($path.$day AS CHAR(2)) TheDay, CAST(payload..site_no AS CHAR(8)) SiteNo, CAST(payload..Flow AS FLOAT) Flow, CAST(payload..GageHeight AS FLOAT) GageHeight FROM riverflowcsv WHERE payload..Precipitation IS NOT NULL ) WITH DATA PRIMARY INDEX(SiteNo) """
Create a DataFrame on the table and display the head of the DataFrame.
# Create a DataFrame on a table. >>> wrk2dfview = DataFrame("riverflowcsvprecip") >>> wrk2dfview.head().to_pandas()
TheYear TheMonth TheDay Flow GageHeight SiteNo 09380000 2018 07 25 13900.0 9.52 09380000 2018 07 25 12800.0 9.30 09380000 2018 07 25 12500.0 9.24 09380000 2018 07 25 12300.0 9.18 09380000 2018 07 25 11600.0 9.04 09380000 2018 07 25 11500.0 9.00 09380000 2018 07 25 11800.0 9.08 09380000 2018 07 25 13500.0 9.44 09380000 2018 07 25 14300.0 9.61 09380000 2018 07 25 15500.0 9.85