7.00.02 - Part 1 of the Example Template File - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product: Aster Analytics
Release Number: 7.00.02
Published: September 2017
Content Type: Programming Reference, User Guide
Publication ID: B700-1022-700K
Language: English (United States)
Last Update: 2018-04-17

Part 1 of the example template file declares three extractor classes: Defaul_Token, Begin_with_Uppercase, and com.asterdata.ner.SuffixExtractor, with serial numbers 0, 1, and 2, respectively. (Serial numbers must start at 0 and increase by 1 for each subsequent class.)

Defaul_Token and Begin_with_Uppercase are predefined extractor classes.

The third class, com.asterdata.ner.SuffixExtractor, is an example of a user-defined class. User-defined classes must be created in Java and must implement the Extractor interface, which is:

package com.asterdata.sqlmr.text_analysis.ner;
import java.io.Serializable;
import java.util.List;

/**
 * Implement this interface to define a
 * function that generates features from a sequence
 */

public interface Extractor extends Serializable
{
  /**
   * Extracts the feature of one token.
   * @param sequence the token sequence
   * @param i the index of the current token in the sequence
   * @return the feature flag
   */
  String extract(List<String> sequence, int i);
}

The Java class SuffixExtractor in this example returns the last character of the current token. The code for this class is:

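package com.asterdata.ner;

import com.asterdata.sqlmr.text_analysis.ner.Extractor;
import java.util.List;

// Returns the last character of the token at index i.
public class SuffixExtractor implements Extractor
{
  @Override
  public String extract(List<String> sequence, int i)
  {
    String token = sequence.get(i);
    return token.substring(token.length() - 1);
  }
}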
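
As written, SuffixExtractor throws a StringIndexOutOfBoundsException if it is given an empty token. This page does not say whether the function can pass one. The following is a minimal sketch of a guarded variant, not part of the product: the names SafeSuffixDemo and SafeSuffixExtractor are hypothetical, the Extractor interface is redeclared locally only so that the sketch compiles on its own, and returning an empty string for an empty token is an assumption.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;

// Illustrative harness only; not part of the Aster Analytics library.
final class SafeSuffixDemo {

  // Local copy of the Extractor interface, so the sketch is self-contained.
  interface Extractor extends Serializable {
    String extract(List<String> sequence, int i);
  }

  // Hypothetical guarded variant of SuffixExtractor.
  static final class SafeSuffixExtractor implements Extractor {
    @Override
    public String extract(List<String> sequence, int i) {
      String token = sequence.get(i);
      // ASSUMPTION: an empty string is an acceptable feature for a null or empty token.
      if (token == null || token.isEmpty()) {
        return "";
      }
      return token.substring(token.length() - 1);
    }
  }

  public static void main(String[] args) {
    Extractor extractor = new SafeSuffixExtractor();
    List<String> tokens = Arrays.asList("More", "");
    System.out.println(extractor.extract(tokens, 0));             // prints "e"
    System.out.println("[" + extractor.extract(tokens, 1) + "]"); // prints "[]"
  }
}
```

For non-empty tokens the guarded variant behaves exactly like the original class.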

Suppose that the function applies the extractor classes in the example template file to the input text "More restaurants open in San Diego." For the token "More":

  • Defaul_Token returns the token itself, "More".
  • Begin_with_Uppercase returns "T" because the token begins with an uppercase letter.
  • com.asterdata.ner.SuffixExtractor returns "e", the last character of the token.

This table shows the features that each extractor class returns for the entire input text:

Defaul_Token    Begin_with_Uppercase    com.asterdata.ner.SuffixExtractor
More            T                       e
restaurants     F                       s
open            F                       n
in              F                       n
San             T                       n
Diego           T                       o
.               F                       .
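
The feature rows above can be reproduced with a small standalone sketch. It is illustrative only. TokenExtractor and UppercaseExtractor are hypothetical stand-ins that mimic the described behavior of the predefined Defaul_Token and Begin_with_Uppercase classes, whose actual implementations are not shown here. The Extractor interface is redeclared locally so that the sketch compiles on its own, and the input text is tokenized by hand rather than by the function.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative harness only; not part of the Aster Analytics library.
final class FeatureTableDemo {

  // Local copy of the Extractor interface, so the sketch is self-contained.
  interface Extractor extends Serializable {
    String extract(List<String> sequence, int i);
  }

  // ASSUMPTION: stand-in for Defaul_Token; returns the token itself.
  static final class TokenExtractor implements Extractor {
    @Override
    public String extract(List<String> sequence, int i) {
      return sequence.get(i);
    }
  }

  // ASSUMPTION: stand-in for Begin_with_Uppercase; "T" if the first
  // character is an uppercase letter, otherwise "F".
  static final class UppercaseExtractor implements Extractor {
    @Override
    public String extract(List<String> sequence, int i) {
      String token = sequence.get(i);
      boolean upper = !token.isEmpty() && Character.isUpperCase(token.charAt(0));
      return upper ? "T" : "F";
    }
  }

  // Same logic as the SuffixExtractor example: last character of the token.
  static final class SuffixExtractor implements Extractor {
    @Override
    public String extract(List<String> sequence, int i) {
      String token = sequence.get(i);
      return token.substring(token.length() - 1);
    }
  }

  // Applies the extractors in serial-number order (0, 1, 2) to every token
  // and returns one row of features per token.
  static List<List<String>> features(List<String> tokens) {
    List<Extractor> extractors = Arrays.asList(
        new TokenExtractor(), new UppercaseExtractor(), new SuffixExtractor());
    List<List<String>> rows = new ArrayList<>();
    for (int i = 0; i < tokens.size(); i++) {
      List<String> row = new ArrayList<>();
      for (Extractor extractor : extractors) {
        row.add(extractor.extract(tokens, i));
      }
      rows.add(row);
    }
    return rows;
  }

  public static void main(String[] args) {
    // ASSUMPTION: the sentence is split into tokens by hand.
    List<String> tokens =
        Arrays.asList("More", "restaurants", "open", "in", "San", "Diego", ".");
    for (List<String> row : features(tokens)) {
      System.out.println(String.join(" ", row));
    }
  }
}
```

Running main prints one line per token, for example "More T e" for the first token and ". F ." for the last.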