Using SageMaker PyTorch Estimator with tdapiclient

Teradata Vantage™ - API Integration Guide for Cloud Machine Learning

Deployment: VantageCloud (Enterprise), VantageCore (IntelliFlex, VMware)
Product: Teradata Vantage
Release Number: 1.4
Published: September 2023
Language: English (United States)
Last Update: 2023-09-28

This use case shows the steps to use SageMaker PyTorch Estimator with tdapiclient.

You can download the aws-usecases.zip file from the attachment as a reference. The pytorch folder in the zip file contains the Jupyter notebook file (.ipynb) and the Python script (.py) required to run the notebook.

  1. Import necessary packages.
    import os
    import getpass
    from tdapiclient import create_tdapi_context, remove_tdapi_context, TDApiClient
    from teradataml import create_context, DataFrame, load_example_data
    import pandas as pd
    from teradatasqlalchemy.types import *
  2. Create the connection.
    host = input("Host: ")
    username = input("Username: ")
    password = getpass.getpass("Password: ")
    td_context = create_context(host=host, username=username, password=password)
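    Optionally, confirm that the connection works before continuing. A minimal check using teradataml's db_list_tables (not required by the workflow):
    # Optional sanity check: list tables visible to the connected user
    from teradataml import db_list_tables
    print(db_list_tables())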
  3. Create TDAPI context and TDApiClient object.
    s3_bucket = input("S3 Bucket (provide just the bucket name, for example: test-bucket): ")
    access_id = input("Access ID: ")
    access_key = getpass.getpass("Access Key: ")
    region = input("AWS Region: ")
    os.environ["AWS_ACCESS_KEY_ID"] = access_id
    os.environ["AWS_SECRET_ACCESS_KEY"] = access_key
    os.environ["AWS_REGION"] = region
    tdapi_context = create_tdapi_context("aws", bucket_name=s3_bucket)
    td_apiclient = TDApiClient(tdapi_context)
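    Optionally, verify that the credentials resolve to a valid AWS identity before starting any SageMaker work. A minimal sketch using boto3, which the SageMaker SDK already depends on:
    # Optional: confirm the AWS credentials are valid
    import boto3
    sts = boto3.client("sts", region_name=region)
    print(sts.get_caller_identity()["Account"])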
  4. Set up data to be used for this workflow.
    feature = ['crim', 'zn', 'indus', 'chas', 'nox', 'rm',
               'age', 'dis', 'rad', 'tax', 'ptratio', 'black', 'lstat']
    target = 'medv'
    # Load the data to run the example
    load_example_data("decisionforest", "boston")
    # Create teradataml DataFrame.
    boston = DataFrame.from_table("boston")
    boston
    The output:
    id	crim	zn	indus	chas	nox	rm	age	dis	rad	tax	ptratio	black	lstat	medv
    326	0.19186	0	7.38	0	0.493	6.431	14	5.4159	5	287	19.6	393.68	5.08	24.6
    40	0.02763	75	2.95	0	0.428	6.595	21	5.4011	3	252	18.3	395.63	4.32	30.8
    366	4.55587	0	18.1	0	0.718	3.561	87	1.6132	24	666	20.2	354.7	7.12	27.5
    265	0.55007	20	3.97	0	0.647	7.206	91	1.9301	5	264	13.0	387.89	8.1	36.5
    387	24.3938	0	18.1	0	0.7	4.652	100	1.4672	24	666	20.2	396.9	28.28	10.5
    448	9.92485	0	18.1	0	0.74	6.251	96	2.198	24	666	20.2	388.52	16.44	12.6
    244	0.12757	30	4.93	0	0.428	6.393	7	7.0355	6	300	16.6	374.71	5.19	23.7
    305	0.05515	33	2.18	0	0.472	7.236	41	4.022	7	222	18.4	393.68	6.93	36.1
    122	0.07165	0	25.65	0	0.581	6.004	84	2.1974	2	188	19.1	377.67	14.27	20.3
    183	0.09103	0	2.46	0	0.488	7.155	92	2.7006	3	193	17.8	394.12	4.82	37.9
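    As a quick sanity check, you can confirm that the table loaded and that every feature column exists; shape and columns are standard teradataml DataFrame attributes:
    # Optional: verify the row/column counts and the feature list
    print(boston.shape)
    assert set(feature).issubset(set(boston.columns))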
  5. Create a SageMaker PyTorch estimator instance through tdapiclient.
    exec_role_arn = "arn:aws:iam::076782961461:role/service-role/AmazonSageMaker-ExecutionRole-20210112T215668"
    FRAMEWORK_VERSION = "1.10.0"
    # Create an estimator object based on PyTorch sagemaker class
    pyTorch_estimator = td_apiclient.PyTorch(
        entry_point="pytorch-script.py",
        role=exec_role_arn,
        instance_count=1,
        instance_type="ml.m5.large",
        py_version='py38',
        framework_version=FRAMEWORK_VERSION,
        metric_definitions=[{"Name": "median-AE",
                             "Regex": "AE-at-50th-percentile: ([0-9.]+).*$"}],
        hyperparameters={
            "epochs": 10,
            "seed": 42,
            "batch_size": 10,
            "features": "crim zn indus chas nox rm age dis rad tax ptratio black lstat",
            "target": target,
        },
    )
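    The entry point pytorch-script.py is included in the pytorch folder of aws-usecases.zip; use the shipped script as the reference. As a hypothetical skeleton only, a SageMaker PyTorch entry point typically reads the hyperparameters from command-line arguments and the channel data from SM_CHANNEL_* environment variables:
    # Hypothetical skeleton of a SageMaker PyTorch entry point
    import argparse
    import os

    if __name__ == "__main__":
        parser = argparse.ArgumentParser()
        parser.add_argument("--epochs", type=int, default=10)
        parser.add_argument("--seed", type=int, default=42)
        parser.add_argument("--batch_size", type=int, default=10)
        parser.add_argument("--features", type=str)
        parser.add_argument("--target", type=str)
        args = parser.parse_args()

        # SageMaker mounts each fit() channel under /opt/ml/input/data/<channel>
        train_dir = os.environ["SM_CHANNEL_TRAIN"]
        test_dir = os.environ["SM_CHANNEL_TEST"]
        # ... build the model, train, and save it under os.environ["SM_MODEL_DIR"]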
  6. Create test and training DataFrames, and start training.
    1. Create two samples of the input data: sample 1 has 80% of the total rows and sample 2 has 20% of the total rows.
      boston_sample = boston.sample(frac=[0.8, 0.2])
      train_df = boston_sample[boston_sample.sampleid == "1"].drop("sampleid", axis=1)
      test_df = boston_sample[boston_sample.sampleid == "2"].drop("sampleid", axis=1)
    2. Show the test DataFrame.
      test_df
      The output:
      id	crim	zn	indus	chas	nox	rm	age	dis	rad	tax	ptratio	black	lstat	medv
      120	0.14476	0	10.01	0	0.547	5.731	65	2.7592	6	432	17.8	391.5	13.61	19.3
      261	0.54011	20	3.97	0	0.647	7.203	81	2.1121	5	264	13.0	392.8	9.59	33.8
      118	0.15098	0	10.01	0	0.547	6.021	82	2.7474	6	432	17.8	394.51	10.3	19.2
      162	1.46336	0	19.58	0	0.605	7.489	90	1.9709	5	403	14.7	374.43	1.73	50.0
      80	0.08387	0	12.83	0	0.437	5.874	36	4.5026	5	398	18.7	396.06	9.1	20.3
      341	0.06151	0	5.19	0	0.515	5.968	58	4.8122	5	224	20.2	396.9	9.29	18.7
      259	0.66351	20	3.97	0	0.647	7.333	100	1.8946	5	264	13.0	383.29	7.79	36.0
      55	0.0136	75	4.0	0	0.41	5.888	47	7.3197	3	469	21.1	396.9	14.8	18.9
      19	0.80271	0	8.14	0	0.538	5.456	36	3.7965	4	307	21.0	288.99	11.69	20.2
      282	0.03705	20	3.33	0	0.4429	6.968	37	5.2447	5	216	14.9	392.23	4.59	35.4
    3. Show the training DataFrame.
      train_df
      The output:
      id	crim	zn	indus	chas	nox	rm	age	dis	rad	tax	ptratio	black	lstat	medv
      244	0.12757	30	4.93	0	0.428	6.393	7	7.0355	6	300	16.6	374.71	5.19	23.7
      101	0.14866	0	8.56	0	0.52	6.727	79	2.7778	5	384	20.9	394.76	9.42	27.5
      427	12.2472	0	18.1	0	0.584	5.837	59	1.9976	24	666	20.2	24.65	15.69	10.2
      469	15.5757	0	18.1	0	0.58	5.926	71	2.9084	24	666	20.2	368.74	18.13	19.1
      40	0.02763	75	2.95	0	0.428	6.595	21	5.4011	3	252	18.3	395.63	4.32	30.8
      366	4.55587	0	18.1	0	0.718	3.561	87	1.6132	24	666	20.2	354.7	7.12	27.5
      162	1.46336	0	19.58	0	0.605	7.489	90	1.9709	5	403	14.7	374.43	1.73	50.0
      223	0.62356	0	6.2	1	0.507	6.879	77	3.2721	8	307	17.4	390.39	9.93	27.5
      326	0.19186	0	7.38	0	0.493	6.431	14	5.4159	5	287	19.6	393.68	5.08	24.6
      305	0.05515	33	2.18	0	0.472	7.236	41	4.022	7	222	18.4	393.68	6.93	36.1
      
    4. Train the model using teradataml DataFrames.
      pyTorch_estimator.fit({"train": train_df, "test": test_df}, content_type="csv", wait=True)
      The output:
      Updated input is : {'train': 's3://pttest-bucket/tdsg-vtkwcsmurk-csv', 'test': 's3://pttest-bucket/tdsg-darwkimmvs-csv'}
      2022-03-23 08:50:00 Starting - Starting the training job...
      2022-03-23 08:50:23 Starting - Preparing the instances for trainingProfilerReport-1648025398: InProgress
      ......
      ......
      ......
      [2022-03-23 08:53:45.626 algo-1:29 INFO hook.py:591] name:layers.0.weight count_params:832
      [2022-03-23 08:53:45.627 algo-1:29 INFO hook.py:591] name:layers.0.bias count_params:64
      [2022-03-23 08:53:45.627 algo-1:29 INFO hook.py:591] name:layers.2.weight count_params:2048
      [2022-03-23 08:53:45.627 algo-1:29 INFO hook.py:591] name:layers.2.bias count_params:32
      [2022-03-23 08:53:45.627 algo-1:29 INFO hook.py:591] name:layers.4.weight count_params:32
      [2022-03-23 08:53:45.627 algo-1:29 INFO hook.py:591] name:layers.4.bias count_params:1
      [2022-03-23 08:53:45.627 algo-1:29 INFO hook.py:593] Total Trainable Params: 3009
      [2022-03-23 08:53:45.627 algo-1:29 INFO hook.py:424] Monitoring the collections: losses
      [2022-03-23 08:53:45.631 algo-1:29 INFO hook.py:488] Hook is writing from the hook with pid: 29
      Epoch : 1 ,  Loss : 22.646
      Epoch : 2 ,  Loss : 22.552
      Epoch : 3 ,  Loss : 22.577
      Epoch : 4 ,  Loss : 22.304
      Epoch : 5 ,  Loss : 22.167
      Epoch : 6 ,  Loss : 21.928
      Epoch : 7 ,  Loss : 21.615
      Epoch : 8 ,  Loss : 21.226
      Epoch : 9 ,  Loss : 20.747
      Epoch : 10 ,  Loss : 20.271
      Training process has finished.
      validating model
      /opt/conda/lib/python3.8/site-packages/torch/utils/smdebug.py:68: UserWarning: Using a target size (torch.Size([10])) that is different to the input size (torch.Size([10, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
        return_value = function(*args, **kwargs)
      /opt/conda/lib/python3.8/site-packages/torch/utils/smdebug.py:68: UserWarning: Using a target size (torch.Size([1])) that is different to the input size (torch.Size([1, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
        return_value = function(*args, **kwargs)
      Mean Squared Error => 59.184
      model persisted at /opt/ml/model/model.pth
      2022-03-23 08:53:48,209 sagemaker-training-toolkit INFO     Reporting training SUCCESS
      
      2022-03-23 08:54:04 Uploading - Uploading generated training model
      2022-03-23 08:54:04 Completed - Training job completed
      Training seconds: 152
      Billable seconds: 152
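      The metric_definitions regex from step 5 tells SageMaker how to extract the median-AE metric from the training log. You can sanity-check such a regex locally; the sample log line below is illustrative only:
      # Illustrative check of the metric regex against a sample log line
      import re
      pattern = r"AE-at-50th-percentile: ([0-9.]+).*$"
      sample_line = "AE-at-50th-percentile: 3.2971"  # hypothetical log output
      match = re.search(pattern, sample_line)
      if match:
          print(match.group(1))  # prints 3.2971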
  7. Create a serializer and a deserializer so that the predictor can handle CSV input and output.
    from sagemaker.serializers import CSVSerializer
    from sagemaker.deserializers import CSVDeserializer
    csv_ser = CSVSerializer()
    csv_dser = CSVDeserializer()
    predictor = pyTorch_estimator.deploy("aws-endpoint",
                                         sagemaker_kw_args={"instance_type": "ml.m5.large", "initial_instance_count": 1, "serializer": csv_ser, "deserializer": csv_dser})
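    deploy returns a tdapiclient predictor that wraps the SageMaker endpoint. If you want to confirm which endpoint was created, a minimal check, assuming the wrapped SageMaker Python SDK v2 Predictor is exposed as cloudObj (as it is in the scoring step below):
    # Assumption: cloudObj is a SageMaker Predictor exposing endpoint_name
    print(predictor.cloudObj.endpoint_name)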
  8. Score the model using a teradataml DataFrame and the predictor object created in the previous step.
    1. Try the predictor with simple CSV data to see if it works as expected.
      item = '-0.4114,  1.4437, -1.1230, -0.2726, -1.0167,  0.7086, -0.9770, -0.0031,-0.5230, -0.0608, -1.5052,  0.4078, -0.8373'
      print(predictor.cloudObj.accept)
      print(predictor.cloudObj.predict(item))
      The output:
      ('text/csv',)
      [['4.00261']]
    2. Try prediction with the UDF and Client options. With mode="UDF", scoring runs inside Vantage and returns a teradataml DataFrame; with mode="Client", the input rows are transferred to the client, which invokes the endpoint directly.
      Input:
      input = test_df.sample(n=5).select(feature)
      input
      The output:
      crim	zn	indus	chas	nox	rm	age	dis	rad	tax	ptratio	black	lstat
      0.12204	0	2.89	0	0.445	6.625	57	3.4952	2	276	18.0	357.98	6.65
      0.06211	40	1.25	0	0.429	6.49	44	8.7921	1	335	19.7	396.9	5.98
      11.1081	0	18.1	0	0.668	4.906	100	1.1742	24	666	20.2	396.9	34.77
      0.11329	30	4.93	0	0.428	6.897	54	6.3361	6	300	16.6	391.25	11.38
      0.0578	0	2.46	0	0.488	6.98	58	2.829	3	193	17.8	396.9	5.04
      
      Prediction with UDF option:
      output = predictor.predict(input, mode="UDF", content_type='csv')
      output
      The output:
      crim	zn	indus	chas	nox	rm	age	dis	rad	tax	ptratio	black	lstat	Output
      0.19186	0	7.38	0	0.493	6.431	14	5.4159	5	287	19.6	393.68	5.08	275.26831
      0.14455	12	7.87	0	0.524	6.172	96	5.9505	5	311	15.2	396.9	19.15	274.58148
      0.06888	0	2.46	0	0.488	6.144	62	2.5979	3	193	17.8	396.9	9.45	283.88132
      0.12579	45	3.44	0	0.437	6.556	29	4.5667	5	398	15.2	382.84	4.56	275.25662
      0.06664	0	4.05	0	0.51	6.546	33	3.1323	5	296	16.6	390.96	5.33	273.52322
      
      Prediction with Client option:
      output = predictor.predict(input, mode="Client", content_type='csv')
      output
      The output:
      [['270.57440', '328.61087', '273.84720', '282.11841', '270.84802']]
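      CSVDeserializer returns the payload as nested lists of strings; converting the values to floats is straightforward:
      # Convert the CSV-deserialized strings to floats
      predictions = [float(v) for v in output[0]]
      print(predictions)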
  9. Clean up.
    predictor.cloudObj.delete_model()
    predictor.cloudObj.delete_endpoint()
    remove_tdapi_context(tdapi_context)
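    If you are also done with the Vantage session, close the teradataml connection; remove_context is part of teradataml:
    # Optional: close the Vantage connection created in step 2
    from teradataml import remove_context
    remove_context()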