SQL Statements to Compute the Shapley Value - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
9.02
9.01
2.0
1.3
Published
February 2022
Language
English (United States)
Last Update
2022-02-10
dita:mapPath
rnn1580259159235.ditamap
dita:ditavalPath
ybt1582220416951.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantageā„¢

Assume that the ShapleyAddOnePlayer input and output tables have the names InputTable and OutputTable, respectively.

To compute a table that contains the Shapley value of each player:

  1. Create a table that contains the weighted payoff produced by adding a player to each combination. For example:
    CREATE MULTISET TABLE stratum AS (
      SELECT player,
        (InputTable.value - OutputTable.value) / divisor AS partial_value,
          OutputTable.size AS size
      FROM inputTable INNER JOIN
      OutputTable ON (InputTable.comb = OutputTable.comb2)
    ) WITH DATA PARTITION BY player;
  2. Create a table that contains the partial Shapley value produced when each player is added to a combination of a given size. For example:
    CREATE MULTISET TABLE stratum_avg AS (
      SELECT player, size, SUM(partial_value) AS partial_avg
      FROM stratum GROUP BY player, size
    ) WITH DATA PARTITION BY player;
  3. Create a table that lists the Shapley value for each player. For example:
    CREATE MULTISET TABLE shapley_values AS (
      SELECT player, SUM(partial_avg) / numberOfPlayers AS shapley_value
      FROM stratum_avg GROUP BY player
    ) WITH DATA PARTITION BY player;

Alternatively, combine the preceding statements into this statement:

CREATE MULTISET TABLE shapley_values AS (
  SELECT player, SUM(partial_avg) / numberOfPlayers AS shapley_value
  FROM (
    SELECT player, size, SUM(partial_value) AS partial_avg
    FROM (
      SELECT player,
        (inputTable.value - outputTable.value) / divisor AS partial_value,
        outputTable.size AS size
      FROM inputTable INNER JOIN
        outputTable ON (inputTable.comb=outputTable.comb2)
      ) AS stratum
      GROUP BY player, size
    ) AS stratum_avg
  GROUP BY player
) WITH DATA PARTITION BY player;

To normalize the Shapley values, so that their sum is 1:

SELECT player, shapley_value / (
  SELECT SUM(shapley_value)
  FROM shapley_values
  ) AS normalized__shapley_values
FROM shapley_values;