Using a sparse map table can reduce the permanent space allocated to a model's table. This approach is effective for large models, but it may reduce system performance, depending on the size of the sparse map. If the performance impact is acceptable, you can create a model table on an 8-AMP sparse map as follows:
-- 8 AMPs table
CREATE MAP sparse_map_8Amps FROM td_map1 SPARSE AMPCOUNT = 8;
DROP TABLE onnxembeddings_models_8Amps;
CREATE TABLE onnxembeddings_models_8Amps, MAP = sparse_map_8Amps (
  model_id VARCHAR(30)
  ,model BLOB(2097088000)
) PRIMARY INDEX (model_id);
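To confirm that the map and the table were created as intended, a quick check along these lines may help. This is a sketch, not a required step: it assumes your release exposes the DBC.MapsV dictionary view with a MapName column and that SHOW TABLE output includes the MAP clause, so verify both on your system.

```sql
-- Look up the sparse map definition.
-- Assumption: DBC.MapsV is available and you have SELECT access to it.
SELECT *
FROM DBC.MapsV
WHERE MapName = 'sparse_map_8Amps';

-- Display the table DDL to check which map the table uses.
SHOW TABLE onnxembeddings_models_8Amps;
```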
Insert the model into the newly created sparse map table, then verify that it was added correctly:
.import vartext file /var/opt/teradata/ONNXEmbeddings/load_onnxembeddings_model.txt
.repeat *
USING (c1 VARCHAR(40), c2 BLOB AS DEFERRED BY NAME)
INSERT INTO onnxembeddings_models_8Amps(:c1, :c2);
select * from onnxembeddings_models_8Amps;
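Because selecting every column returns the full model BLOB, a lighter check can be useful. The sketch below lists each model with its size in bytes, then reports the permanent space the table occupies on each AMP. It assumes the table is in your current default database (DATABASE) and that you have SELECT access to DBC.TableSizeV; adjust the filter if either assumption does not hold.

```sql
-- Confirm the model row exists and report the BLOB size in bytes.
SELECT model_id, BYTES(model) AS model_bytes
FROM onnxembeddings_models_8Amps;

-- Per-AMP permanent space used by the table.
-- Assumption: the table is in the current default database.
SELECT Vproc, CurrentPerm
FROM DBC.TableSizeV
WHERE DataBaseName = DATABASE
  AND TableName = 'onnxembeddings_models_8Amps'
ORDER BY Vproc;
```

Comparing the second result with the same query against a non-sparse model table is one way to gauge the space saved.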
You query a model table on a sparse map the same way as a non-sparse model table, with one additional clause:
EXECUTE MAP = sparse_map_8Amps
In the BYOM query, add the EXECUTE MAP clause before USING, as follows (the EXECUTE MAP value depends on the size of the sparse map used for the model's table):
select count(*) from td_mldb.ONNXEmbeddings(
on (SELECT top 100 CAST(id AS VARCHAR(8)) AS id, CAST(text AS VARCHAR(100)) AS txt FROM amazon_reviews500000)
on (select model_id, model from onnxembeddings_models_8Amps where model_id = 'bge-small-en-v1.5') DIMENSION
on (select tokenizer from embeddings_tokenizers where tokenizer_id = 'bge-small-en-v1.5') DIMENSION
EXECUTE MAP = sparse_map_8Amps
USING
Accumulate('*')
ModelOutputTensor('sentence_embedding')
isdebug('true')
EnableMemoryCheck('false')
) as td order by 1;