Using SageMaker PyTorch Estimator with tdapiclient

Teradata Vantage™ - API Integration Guide for Cloud Machine Learning

September 2023

This use case shows how to use the SageMaker PyTorch Estimator with tdapiclient.

For reference, you can download the attached zip file. Its pytorch folder contains a Jupyter notebook (.ipynb) and the Python file (.py) that the notebook requires.

  1. Import necessary packages.
    import os
    import getpass
    from tdapiclient import create_tdapi_context, remove_tdapi_context, TDApiClient
    from teradataml import create_context, DataFrame, load_example_data
    import pandas as pd
    from teradatasqlalchemy.types import *
  2. Create the connection.
    host = input("Host: ")
    username = input("Username: ")
    password = getpass.getpass("Password: ")
    td_context = create_context(host=host, username=username, password=password)
  3. Create the TDAPI context and the TDApiClient object.
    s3_bucket = input("S3 Bucket (provide just the bucket name, for example: test-bucket): ")
    access_id = input("Access ID: ")
    access_key = getpass.getpass("Access Key: ")
    region = input("AWS Region: ")
    os.environ["AWS_ACCESS_KEY_ID"] = access_id
    os.environ["AWS_SECRET_ACCESS_KEY"] = access_key
    os.environ["AWS_REGION"] = region
    tdapi_context = create_tdapi_context("aws", bucket_name=s3_bucket)
    td_apiclient = TDApiClient(tdapi_context)
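
The bucket prompt in this step asks for the bare bucket name only. The following is a minimal sketch of how pasted input such as `s3://test-bucket/` could be reduced to that form before it is passed as `bucket_name`. The helper `normalize_bucket_name` is hypothetical (it is not part of tdapiclient), and the pattern is a simplified version of the S3 naming rules, not the complete set.

```python
import re

# Simplified check (assumption): 3-63 characters, lowercase letters, digits,
# dots, and hyphens, starting and ending with a letter or digit.
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def normalize_bucket_name(raw: str) -> str:
    """Return a bare bucket name from user input such as 's3://test-bucket/'."""
    name = raw.strip()
    if name.startswith("s3://"):
        name = name[len("s3://"):]
    # Drop any key prefix or trailing slash after the bucket name.
    name = name.split("/", 1)[0]
    if not _BUCKET_RE.match(name):
        raise ValueError(f"not a valid bucket name: {raw!r}")
    return name
```

With such a helper, the prompt result could be wrapped as `normalize_bucket_name(input(...))` so that a malformed name fails early rather than at upload time.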
  4. Set up data to be used for this workflow.
    feature = ['crim', 'zn', 'indus', 'chas', 'nox', 'rm',
               'age', 'dis', 'rad', 'tax', 'ptratio', 'black', 'lstat']
    target = 'medv'
    # Load the data to run the example
    load_example_data("decisionforest", "boston")
    # Create teradataml DataFrame.
    boston = DataFrame.from_table("boston")
    boston
    The output:
    id	crim	zn	indus	chas	nox	rm	age	dis	rad	tax	ptratio	black	lstat	medv
    326	0.19186	0	7.38	0	0.493	6.431	14	5.4159	5	287	19.6	393.68	5.08	24.6
    40	0.02763	75	2.95	0	0.428	6.595	21	5.4011	3	252	18.3	395.63	4.32	30.8
    366	4.55587	0	18.1	0	0.718	3.561	87	1.6132	24	666	20.2	354.7	7.12	27.5
    265	0.55007	20	3.97	0	0.647	7.206	91	1.9301	5	264	13.0	387.89	8.1	36.5
    387	24.3938	0	18.1	0	0.7	4.652	100	1.4672	24	666	20.2	396.9	28.28	10.5
    448	9.92485	0	18.1	0	0.74	6.251	96	2.198	24	666	20.2	388.52	16.44	12.6
    244	0.12757	30	4.93	0	0.428	6.393	7	7.0355	6	300	16.6	374.71	5.19	23.7
    305	0.05515	33	2.18	0	0.472	7.236	41	4.022	7	222	18.4	393.68	6.93	36.1
    122	0.07165	0	25.65	0	0.581	6.004	84	2.1974	2	188	19.1	377.67	14.27	20.3
    183	0.09103	0	2.46	0	0.488	7.155	92	2.7006	3	193	17.8	394.12	4.82	37.9
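
The `feature` list defined in this step names the same columns as the space-separated `features` hyperparameter passed to the estimator in the next step. A small sketch of deriving the string from the list, so the two cannot drift apart, is shown here. How the training script parses the string is an assumption (splitting on whitespace); the attached script may do it differently.

```python
feature = ['crim', 'zn', 'indus', 'chas', 'nox', 'rm',
           'age', 'dis', 'rad', 'tax', 'ptratio', 'black', 'lstat']

# Build the hyperparameter string from the list instead of typing it twice.
features_arg = " ".join(feature)

# Assumption: the training script recovers the names by splitting on whitespace.
parsed = features_arg.split()
```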
  5. Create a SageMaker PyTorch estimator instance through tdapiclient.
    exec_role_arn = "arn:aws:iam::076782961461:role/service-role/AmazonSageMaker-ExecutionRole-20210112T215668"
    FRAMEWORK_VERSION = "1.10.0"
    # Create an estimator object based on PyTorch sagemaker class
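    pyTorch_estimator = td_apiclient.PyTorch(
        # Arguments not shown in this listing, such as entry_point (the training script
        # from the attached zip file), instance_count, instance_type, and py_version,
        # must also be supplied.
        role=exec_role_arn,
        framework_version=FRAMEWORK_VERSION,
        metric_definitions=[{"Name": "median-AE",
                             "Regex": "AE-at-50th-percentile: ([0-9.]+).*$"}],
        hyperparameters={
            "epochs": 10,
            "seed": 42,
            "batch_size": 10,
            "features": "crim zn indus chas nox rm age dis rad tax ptratio black lstat",
            "target": target,
        },
    )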
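
SageMaker applies each regex in `metric_definitions` to the training job's log output and records the captured group as the metric value. The sketch below applies the regex from this step to a hypothetical log line to show what it captures. The line `AE-at-50th-percentile: 3.25` is made up for illustration; the training script would have to print text in this form for the `median-AE` metric to be recorded.

```python
import re

# Regex copied from metric_definitions in this step.
METRIC_REGEX = r"AE-at-50th-percentile: ([0-9.]+).*$"

# Hypothetical log line, not taken from a real training run.
log_line = "AE-at-50th-percentile: 3.25"

match = re.search(METRIC_REGEX, log_line)
median_ae = float(match.group(1)) if match else None

# A line without the expected text yields no metric value.
no_match = re.search(METRIC_REGEX, "Mean Squared Error => 59.184")
```

Note that a log line of the form `Mean Squared Error => 59.184` is not matched by this regex, so it would not populate the metric.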
  6. Create test and training DataFrames, and start training.
    1. Create two samples of input data - sample 1 has 80% of total rows and sample 2 has 20% of total rows.
      boston_sample = boston.sample(frac=[0.8, 0.2])
      train_df = boston_sample[boston_sample.sampleid == "1"].drop("sampleid", axis=1)
      test_df = boston_sample[boston_sample.sampleid == "2"].drop("sampleid", axis=1)
    2. Show the test DataFrame.
      test_df
      The output:
      id	crim	zn	indus	chas	nox	rm	age	dis	rad	tax	ptratio	black	lstat	medv
      120	0.14476	0	10.01	0	0.547	5.731	65	2.7592	6	432	17.8	391.5	13.61	19.3
      261	0.54011	20	3.97	0	0.647	7.203	81	2.1121	5	264	13.0	392.8	9.59	33.8
      118	0.15098	0	10.01	0	0.547	6.021	82	2.7474	6	432	17.8	394.51	10.3	19.2
      162	1.46336	0	19.58	0	0.605	7.489	90	1.9709	5	403	14.7	374.43	1.73	50.0
      80	0.08387	0	12.83	0	0.437	5.874	36	4.5026	5	398	18.7	396.06	9.1	20.3
      341	0.06151	0	5.19	0	0.515	5.968	58	4.8122	5	224	20.2	396.9	9.29	18.7
      259	0.66351	20	3.97	0	0.647	7.333	100	1.8946	5	264	13.0	383.29	7.79	36.0
      55	0.0136	75	4.0	0	0.41	5.888	47	7.3197	3	469	21.1	396.9	14.8	18.9
      19	0.80271	0	8.14	0	0.538	5.456	36	3.7965	4	307	21.0	288.99	11.69	20.2
      282	0.03705	20	3.33	0	0.4429	6.968	37	5.2447	5	216	14.9	392.23	4.59	35.4
    3. Show the training DataFrame.
      train_df
      The output:
      id	crim	zn	indus	chas	nox	rm	age	dis	rad	tax	ptratio	black	lstat	medv
      244	0.12757	30	4.93	0	0.428	6.393	7	7.0355	6	300	16.6	374.71	5.19	23.7
      101	0.14866	0	8.56	0	0.52	6.727	79	2.7778	5	384	20.9	394.76	9.42	27.5
      427	12.2472	0	18.1	0	0.584	5.837	59	1.9976	24	666	20.2	24.65	15.69	10.2
      469	15.5757	0	18.1	0	0.58	5.926	71	2.9084	24	666	20.2	368.74	18.13	19.1
      40	0.02763	75	2.95	0	0.428	6.595	21	5.4011	3	252	18.3	395.63	4.32	30.8
      366	4.55587	0	18.1	0	0.718	3.561	87	1.6132	24	666	20.2	354.7	7.12	27.5
      162	1.46336	0	19.58	0	0.605	7.489	90	1.9709	5	403	14.7	374.43	1.73	50.0
      223	0.62356	0	6.2	1	0.507	6.879	77	3.2721	8	307	17.4	390.39	9.93	27.5
      326	0.19186	0	7.38	0	0.493	6.431	14	5.4159	5	287	19.6	393.68	5.08	24.6
      305	0.05515	33	2.18	0	0.472	7.236	41	4.022	7	222	18.4	393.68	6.93	36.1
    4. Train the model using teradataml DataFrames.
      pyTorch_estimator.fit({"train": train_df, "test": test_df}, content_type="csv", wait=True)
      The output:
      Updated input is : {'train': 's3://pttest-bucket/tdsg-vtkwcsmurk-csv', 'test': 's3://pttest-bucket/tdsg-darwkimmvs-csv'}
      2022-03-23 08:50:00 Starting - Starting the training job...
      2022-03-23 08:50:23 Starting - Preparing the instances for trainingProfilerReport-1648025398: InProgress
      [2022-03-23 08:53:45.626 algo-1:29 INFO] name:layers.0.weight count_params:832
      [2022-03-23 08:53:45.627 algo-1:29 INFO] name:layers.0.bias count_params:64
      [2022-03-23 08:53:45.627 algo-1:29 INFO] name:layers.2.weight count_params:2048
      [2022-03-23 08:53:45.627 algo-1:29 INFO] name:layers.2.bias count_params:32
      [2022-03-23 08:53:45.627 algo-1:29 INFO] name:layers.4.weight count_params:32
      [2022-03-23 08:53:45.627 algo-1:29 INFO] name:layers.4.bias count_params:1
      [2022-03-23 08:53:45.627 algo-1:29 INFO] Total Trainable Params: 3009
      [2022-03-23 08:53:45.627 algo-1:29 INFO] Monitoring the collections: losses
      [2022-03-23 08:53:45.631 algo-1:29 INFO] Hook is writing from the hook with pid: 29
      Epoch : 1 ,  Loss : 22.646
      Epoch : 2 ,  Loss : 22.552
      Epoch : 3 ,  Loss : 22.577
      Epoch : 4 ,  Loss : 22.304
      Epoch : 5 ,  Loss : 22.167
      Epoch : 6 ,  Loss : 21.928
      Epoch : 7 ,  Loss : 21.615
      Epoch : 8 ,  Loss : 21.226
      Epoch : 9 ,  Loss : 20.747
      Epoch : 10 ,  Loss : 20.271
      Training process has finished.
      validating model
      /opt/conda/lib/python3.8/site-packages/torch/utils/ UserWarning: Using a target size (torch.Size([10])) that is different to the input size (torch.Size([10, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
        return_value = function(*args, **kwargs)
      /opt/conda/lib/python3.8/site-packages/torch/utils/ UserWarning: Using a target size (torch.Size([1])) that is different to the input size (torch.Size([1, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
        return_value = function(*args, **kwargs)
      Mean Squared Error => 59.184
      model persisted at /opt/ml/model/model.pth
      2022-03-23 08:53:48,209 sagemaker-training-toolkit INFO     Reporting training SUCCESS
      2022-03-23 08:54:04 Uploading - Uploading generated training model
      2022-03-23 08:54:04 Completed - Training job completed
      Training seconds: 152
      Billable seconds: 152
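
The `UserWarning` in the log above reports a target of size `[10]` compared against an input of size `[10, 1]`. Under broadcasting, that subtraction pairs every prediction with every target, giving a 10-by-10 grid of differences instead of 10. The following plain-Python sketch emulates the effect on a tiny example without PyTorch; the numbers are made up, and the functions are illustrative stand-ins, not code from the attached training script.

```python
def mse_elementwise(preds, targets):
    """Mean squared error pairing prediction i with target i."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def mse_broadcast(preds_column, targets):
    """Emulate an (n, 1) column minus an (n,) vector: an n-by-n grid of differences."""
    squared = [(row[0] - t) ** 2 for row in preds_column for t in targets]
    return sum(squared) / len(squared)


# Made-up values for illustration only.
targets = [1.0, 2.0, 3.0]
preds = [1.0, 2.0, 3.0]                # perfect predictions
preds_column = [[p] for p in preds]    # same values, shaped like (3, 1)

matched = mse_elementwise(preds, targets)          # 0.0
mismatched = mse_broadcast(preds_column, targets)  # 12 / 9, despite perfect predictions
```

In PyTorch, reshaping the target with `target.view(-1, 1)` or the prediction with `output.squeeze(1)` before computing the loss makes the shapes agree. Whether the attached script needs such a change depends on its code, which is not shown here, but the reported Mean Squared Error may be affected by this warning.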
  7. Create a serializer and a deserializer so that the predictor can handle CSV input and output, then deploy the model.
    from sagemaker.serializers import CSVSerializer
    from sagemaker.deserializers import CSVDeserializer
    csv_ser = CSVSerializer()
    csv_dser = CSVDeserializer()
    predictor = pyTorch_estimator.deploy("aws-endpoint",
                                         sagemaker_kw_args={"instance_type": "ml.m5.large", "initial_instance_count": 1, "serializer": csv_ser, "deserializer": csv_dser})
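
To make the role of the serializer and deserializer concrete, here is a rough sketch of the string round trip using only the standard library `csv` module. It is not the SageMaker implementation (the real classes also handle content types and several input types); `to_csv_line` and `from_csv_text` are hypothetical helper names.

```python
import csv
import io


def to_csv_line(values):
    """Join one row of values into a single CSV line, without a trailing newline."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="").writerow(values)
    return buffer.getvalue()


def from_csv_text(text):
    """Parse CSV text into a list of rows, each a list of strings."""
    return [row for row in csv.reader(io.StringIO(text))]


payload = to_csv_line([0.12204, 0, 2.89])           # request body sent to the endpoint
rows = from_csv_text("270.57440,328.61087\n")       # response body parsed back
```

The parsed response is a list of rows whose elements are strings, which is why predictions returned in CSV form need a numeric conversion before further use.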
  8. Score the model using a teradataml DataFrame and the predictor object created in the previous step.
    1. Try the predictor with simple CSV data to see if it works as expected.
      item = '-0.4114,  1.4437, -1.1230, -0.2726, -1.0167,  0.7086, -0.9770, -0.0031,-0.5230, -0.0608, -1.5052,  0.4078, -0.8373'
      The output:
    2. Try prediction with UDF and Client options.
      input = test_df.sample(n=5).select(feature)
      crim	zn	indus	chas	nox	rm	age	dis	rad	tax	ptratio	black	lstat
      0.12204	0	2.89	0	0.445	6.625	57	3.4952	2	276	18.0	357.98	6.65
      0.06211	40	1.25	0	0.429	6.49	44	8.7921	1	335	19.7	396.9	5.98
      11.1081	0	18.1	0	0.668	4.906	100	1.1742	24	666	20.2	396.9	34.77
      0.11329	30	4.93	0	0.428	6.897	54	6.3361	6	300	16.6	391.25	11.38
      0.0578	0	2.46	0	0.488	6.98	58	2.829	3	193	17.8	396.9	5.04
      Prediction with UDF option:
      output = predictor.predict(input, mode="UDF", content_type='csv')
      The output:
      crim	zn	indus	chas	nox	rm	age	dis	rad	tax	ptratio	black	lstat	Output
      0.19186	0	7.38	0	0.493	6.431	14	5.4159	5	287	19.6	393.68	5.08	275.26831
      0.14455	12	7.87	0	0.524	6.172	96	5.9505	5	311	15.2	396.9	19.15	274.58148
      0.06888	0	2.46	0	0.488	6.144	62	2.5979	3	193	17.8	396.9	9.45	283.88132
      0.12579	45	3.44	0	0.437	6.556	29	4.5667	5	398	15.2	382.84	4.56	275.25662
      0.06664	0	4.05	0	0.51	6.546	33	3.1323	5	296	16.6	390.96	5.33	273.52322
      Prediction with Client option:
      output = predictor.predict(input, mode="Client", content_type='csv')
      The output:
      [['270.57440', '328.61087', '273.84720', '282.11841', '270.84802']]
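
The Client-mode result above is a list of rows containing strings. A short sketch of turning it into a flat list of floats for further use follows; the `output` value is copied from the result shown above.

```python
# Client-mode result copied from the output above.
output = [['270.57440', '328.61087', '273.84720', '282.11841', '270.84802']]

# Flatten the rows and convert each string to a float.
predictions = [float(value) for row in output for value in row]
```

The five values correspond to the five rows selected by `test_df.sample(n=5)`.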
  9. Clean up.