Load ONNX Embedding Models

Teradata Vantage™ - Bring Your Own Model User Guide

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
Lake
VMware
Product
Teradata Vantage
Release Number
6.0
Published
March 2025
Before you begin, select a pre-trained model from Hugging Face. The following example uses the BAAI/bge-small-en-v1.5 model and the Optimum utility to convert the model into the ONNX (Open Neural Network Exchange) format.
optimum-cli export onnx --opset 16 --trust-remote-code -m BAAI/bge-small-en-v1.5 bge-small-en-v1.5-onnx
After the conversion to ONNX, perform the following:
  • Update the dynamic dimensions on the input and output to ensure compatibility with different input sizes.

    ONNXEmbeddings is not compatible with symbolic dimensions on input, so they must be replaced with fixed sizes.

  • Fix the opset in the ONNX file for compatibility with ONNX runtime.
  • Remove the token_embeddings output to save I/O during processing and improve the model's run efficiency.
You can use the following Python code to perform the updates:
import onnx
from onnxruntime.tools.onnx_model_utils import make_dim_param_fixed

op = onnx.OperatorSetIdProto()
op.version = 16

model = onnx.load('bge-small-en-v1.5-onnx/model.onnx')

# Rebuild the model to ensure a compatible opset and IR version
model_ir8 = onnx.helper.make_model(model.graph, ir_version=8, opset_imports=[op])

# Fix the symbolic (dynamic) dimension sizes in the model
make_dim_param_fixed(model_ir8.graph, "batch_size", 1)
make_dim_param_fixed(model_ir8.graph, "sequence_length", 512)
make_dim_param_fixed(model_ir8.graph, "Divsentence_embedding_dim_1", 384)

# Remove the unneeded token_embeddings output from the model;
# collect matches first to avoid mutating the list while iterating over it
for node in [n for n in model_ir8.graph.output if n.name == "token_embeddings"]:
    model_ir8.graph.output.remove(node)

# Save the fixed model
onnx.save(model_ir8, 'bge-small-en-v1.5-onnx/model_fixed.onnx')
Once you have saved your model, you can load it into the database as you would any other model. The conversion with optimum-cli also produces a tokenizer.json file, which must be loaded into the database alongside the model, as in the previous example, using the following table definition:
CREATE SET TABLE embedding_tokenizers (
tokenizer_id VARCHAR (30),
tokenizer BLOB 
)
PRIMARY INDEX (tokenizer_id); 
Once both the model and tokenizer are loaded, you can start running queries against your text input.
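The upload itself can be scripted from Python. Below is a minimal sketch using teradataml's save_byom; the host, credentials, model id, and the embedding_models table name are placeholders, and note that save_byom writes to its default model_id/model column layout, so it may need adapting if you keep the tokenizer_id/tokenizer columns from the table definition above:

```python
import getpass

def upload_artifacts(host: str, user: str) -> None:
    """Upload the fixed ONNX model and its tokenizer into Vantage.

    Connection details and table names are placeholders for your environment.
    """
    # Imported lazily so the sketch can be read without teradataml installed
    import teradataml as tdml

    tdml.create_context(host=host, username=user, password=getpass.getpass())
    try:
        # The model and the tokenizer.json produced by optimum-cli are
        # uploaded as BLOBs keyed by an id column.
        tdml.save_byom("bge-small-en-v1.5",
                       "bge-small-en-v1.5-onnx/model_fixed.onnx",
                       "embedding_models")
        tdml.save_byom("bge-small-en-v1.5",
                       "bge-small-en-v1.5-onnx/tokenizer.json",
                       "embedding_tokenizers")
    finally:
        tdml.remove_context()
```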