Use the following functions to archive or delete a component in FeatureStore.
- You cannot archive a component that is part of a Feature Group. First remove the component from the Feature Group using FeatureGroup.remove(), upload the modified Feature Group using FeatureStore.apply(), and then archive the component.
- The delete functions remove only archived components. They do not act on components that have not been archived.
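The two rules above can be illustrated with a small in-memory model. This is only a sketch: `ToyStore` and its methods are hypothetical stand-ins written for illustration, not teradataml classes, and returning `False` when deleting a non-archived component is an assumption (the library may report this case differently).

```python
class ToyStore:
    """Hypothetical in-memory stand-in; NOT teradataml's FeatureStore."""

    def __init__(self):
        self.active = set()    # components available for processing
        self.archived = set()  # components marked as unavailable
        self.groups = {}       # group name -> set of member component names

    def apply(self, name, group=None):
        """Register a component, optionally as a member of a group."""
        self.active.add(name)
        if group is not None:
            self.groups.setdefault(group, set()).add(name)

    def remove_from_group(self, name, group):
        """Detach a component from a group (stands in for remove + apply)."""
        self.groups[group].discard(name)

    def archive(self, name):
        """Archive a component; refused while it still belongs to a group."""
        owners = sorted(g for g, members in self.groups.items() if name in members)
        if owners:
            raise ValueError(f"{name!r} is part of group(s) {owners}; remove it first")
        if name not in self.active:
            raise KeyError(name)
        self.active.remove(name)
        self.archived.add(name)
        return True

    def delete(self, name):
        """Delete only archived components (assumed: False when not archived)."""
        if name not in self.archived:
            return False
        self.archived.remove(name)
        return True


store = ToyStore()
store.apply("sales_data_Feb", group="sales")

blocked = None
try:
    store.archive("sales_data_Feb")  # still in group 'sales'
except ValueError as err:
    blocked = str(err)

store.remove_from_group("sales_data_Feb", "sales")
deleted_early = store.delete("sales_data_Feb")  # not archived yet
archived = store.archive("sales_data_Feb")
deleted = store.delete("sales_data_Feb")
```

In this model the first archive attempt is refused, the early delete has no effect, and the component disappears only after it has been archived and then deleted.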
archive_feature() archives a Feature from FeatureStore.
Archiving a Feature does not remove it completely from FeatureStore. The Feature is marked as unavailable for any further processing. You can still see archived Features using list_features() by setting the additional argument archived to True.
    >>> from teradataml import DataFrame, Feature, FeatureStore
    >>> df = DataFrame("sales")
    # Create Feature for Column 'Feb'.
    >>> feature = Feature(name="sales_data_Feb", column=df.Feb)
    # Create FeatureStore for the repo 'staging_repo'.
    >>> fs = FeatureStore("staging_repo")
    # Apply the Feature to FeatureStore.
    >>> fs.apply(feature)
    True
    # List the available Features.
    # Note that group_name is None. So 'sales_data_Feb' feature is not associated with any group.
    # Since it is not part of any group, it can be archived.
    >>> fs.list_features()
                   column_name description               creation_time modified_time  tags data_type feature_type  status group_name
    name
    sales_data_Feb         Feb        None  2024-10-03 18:21:03.720464          None  None     FLOAT   CONTINUOUS  ACTIVE       None
    # Archive Feature with name "sales_data_Feb".
    >>> fs.archive_feature(feature=feature)
    Feature 'sales_data_Feb' is archived.
    True
    # List the available Features after archive.
    >>> fs.list_features()
    Empty DataFrame
    Columns: [column_name, description, creation_time, modified_time, tags, data_type, feature_type, status, group_name]
    Index: []
    # List all the archived Features.
    >>> fs.list_features(archived=True)
                 name column_name description               creation_time modified_time  tags data_type feature_type  status               archived_time group_name
    0  sales_data_Feb         Feb        None  2024-10-03 18:21:03.720464          None  None     FLOAT   CONTINUOUS  ACTIVE  2024-09-30 11:30:49.160000      sales
delete_feature() deletes an archived Feature from FeatureStore.
    >>> fs.delete_feature(feature=feature)
    Feature 'sales_data_Feb' is deleted.
    True
archive_entity() archives an Entity from FeatureStore.
Archiving an Entity does not remove it completely from FeatureStore. The Entity is marked as unavailable for any further processing. You can still see archived Entities using list_entities() by setting the additional argument archived to True.
The following example archives an Entity and lists all archived Entities in the repo 'vfs_v1'.
An Entity cannot be archived while it is part of a FeatureGroup. You must first create another Entity, update the FeatureGroup to use the new Entity, apply the FeatureGroup to FeatureStore, and then archive Entity 'sales'.
    >>> from teradataml import Entity
    # Create a replacement Entity. 'df', 'fg' and 'fs' are the DataFrame,
    # FeatureGroup and FeatureStore objects created earlier.
    >>> entity = Entity('store_sales', columns=df.accounts)
    # Update new entity to FeatureGroup.
    >>> fg.apply(entity)
    # Update FeatureGroup to FeatureStore. This will update Entity
    # from 'sales' to 'store_sales' for FeatureGroup 'sales'.
    >>> fs.apply(fg)
    True
    # Let's archive Entity 'sales' since it is not part of any FeatureGroup.
    >>> fs.archive_entity('sales')
    Entity 'sales' is archived.
    True
    # List the archived entities.
    >>> fs.list_entities(archived=True)
        name description               creation_time modified_time               archived_time entity_column
    0  sales        None  2024-10-18 05:41:36.932856          None  2024-10-18 05:50:00.930000      accounts
delete_entity() deletes an archived Entity from FeatureStore.
    >>> fs.delete_entity('sales_data')
    Entity 'sales_data' is deleted.
    True
archive_data_source() archives a Data Source from FeatureStore.
Archiving a Data Source does not remove it completely from FeatureStore. The Data Source is marked as unavailable for any further processing. You can still see archived Data Sources using list_data_sources() by setting the additional argument archived to True.
    # Archive a Data Source and list all the archived DataSources in the repo 'vfs_v1'.
    # Let's first archive the DataSource.
    >>> fs.archive_data_source('admissions')
    DataSource 'admissions' is archived.
    True
    # List archived DataSources.
    >>> fs.list_data_sources(archived=True)
               description timestamp_col_name                            source               archived_time
    name
    admissions        None               None  select * from "admissions_train"  2024-09-30 12:05:39.220000
delete_data_source() deletes an archived Data Source from FeatureStore.
    >>> fs.delete_data_source("sales_data")
    DataSource 'sales_data' is deleted.
    True
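Features, Entities, and Data Sources all follow the same two-step pattern: archive first, then delete. The helper below is a minimal sketch of that pattern. `archive_then_delete` is a hypothetical name, not part of teradataml. It assumes that each archive_*() and delete_*() method returns True on success (as the outputs above show) and accepts the component as its first positional argument; the examples above pass `feature=` by keyword, so the positional form is an assumption for Features.

```python
def archive_then_delete(fs, kind, component):
    """Archive a component and then delete it (hypothetical helper).

    fs        -- any object exposing archive_<kind>() and delete_<kind>()
                 methods, such as the FeatureStore methods named in this section.
    kind      -- one of "feature", "entity", "data_source".
    component -- the component name or object, passed positionally (assumption).
    """
    kinds = ("feature", "entity", "data_source")
    if kind not in kinds:
        raise ValueError(f"kind must be one of {kinds}, got {kind!r}")

    archive = getattr(fs, f"archive_{kind}")
    delete = getattr(fs, f"delete_{kind}")

    # Delete acts only on archived components, so stop if archiving failed.
    if not archive(component):
        return False
    return bool(delete(component))
```

With a connected FeatureStore `fs`, a call such as `archive_then_delete(fs, "data_source", "sales_data")` would run archive_data_source() followed by delete_data_source(). The component must not belong to a Feature Group, otherwise the archive step is expected to fail.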
archive_feature_group() archives a Feature Group from FeatureStore.
Archiving a Feature Group does not remove it completely from FeatureStore. The Feature Group is marked as unavailable for any further processing. You can still see archived Feature Groups using list_feature_groups() by setting the additional argument archived to True.
    >>> from teradataml import DataFrame, FeatureGroup, FeatureStore, load_example_data
    >>> admissions = DataFrame("admissions_train")
    # Create FeatureStore for repo 'vfs_v1'.
    >>> fs = FeatureStore("vfs_v1")
    # Create a FeatureGroup from DataFrame.
    >>> fg = FeatureGroup.from_DataFrame("admissions", df=admissions, entity_columns='id')
    # Apply FeatureGroup to FeatureStore.
    >>> fs.apply(fg)
    True
    # List all the effective FeatureGroups in the repo 'vfs_v1'.
    >>> fs.list_feature_groups()
               description data_source_name entity_name
    name
    admissions        None       admissions  admissions
    # Let's archive the FeatureGroup.
    >>> fs.archive_feature_group("admissions")
    True
    # List archived FeatureGroups.
    >>> fs.list_feature_groups(archived=True)
             name description data_source_name entity_name               archived_time
    0  admissions        None       admissions  admissions  2024-09-30 12:05:39.220000
    # List archived Features.
    >>> fs.list_features(archived=True)
              name  column_name description  tags data_type feature_type  status               creation_time modified_time               archived_time  group_name
    0          gpa          gpa        None  None     FLOAT   CONTINUOUS  ACTIVE  2024-11-05 06:45:10.368228          None  2024-11-05 06:47:05.480000  admissions
    1     admitted     admitted        None  None   INTEGER   CONTINUOUS  ACTIVE  2024-11-05 06:45:10.431708          None  2024-11-05 06:47:05.480000  admissions
    2      masters      masters        None  None   VARCHAR   CONTINUOUS  ACTIVE  2024-11-05 06:45:10.306466          None  2024-11-05 06:47:05.480000  admissions
    3        stats        stats        None  None   VARCHAR   CONTINUOUS  ACTIVE  2024-11-05 06:45:10.389375          None  2024-11-05 06:47:05.480000  admissions
    4  programming  programming        None  None   VARCHAR   CONTINUOUS  ACTIVE  2024-11-05 06:45:10.410277          None  2024-11-05 06:47:05.480000  admissions
    # List archived Entities.
    >>> fs.list_entities(archived=True)
             name description               creation_time modified_time               archived_time entity_column
    0  admissions        None  2024-11-05 06:45:10.453333          None  2024-11-05 06:47:05.550000            id
    # List archived Data Sources.
    >>> fs.list_data_sources(archived=True)
             name description timestamp_col_name                            source               creation_time modified_time               archived_time
    0  admissions        None               None  select * from "admissions_train"  2024-11-05 06:45:10.578087          None  2024-11-05 06:47:05.600000
delete_feature_group() deletes an archived Feature Group from FeatureStore.
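archive_feature_group() also archives the underlying Features, Data Source, and Entity of the Feature Group. delete_feature_group() does not cascade in the same way: it deletes only the archived Feature Group and does not delete the underlying Features, Data Source, and Entity.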
    >>> from teradataml import DataFrame, FeatureGroup, FeatureStore
    >>> df = DataFrame("sales")
    # Create FeatureGroup from teradataml DataFrame.
    >>> fg = FeatureGroup.from_DataFrame(name="sales", entity_columns="accounts", df=df, timestamp_col_name="datetime")
    # Create FeatureStore for the repo 'staging_repo'.
    >>> fs = FeatureStore("staging_repo")
    # Apply FeatureGroup to FeatureStore.
    >>> fs.apply(fg)
    True
    # Let's first archive FeatureGroup with name "sales".
    >>> fs.archive_feature_group(feature_group='sales')
    FeatureGroup 'sales' is archived.
    True
    # Delete FeatureGroup with name "sales".
    >>> fs.delete_feature_group(feature_group='sales')
    FeatureGroup 'sales' is deleted.
    True
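The difference in scope between archiving and deleting a Feature Group can be pictured with a toy model. The sketch below is an assumption-laden illustration only: `ToyGroupStore` is a hypothetical class, not teradataml code, and it simply encodes the behavior described in this section (archive reaches the group's Features, Entity, and Data Source; delete removes only the group record).

```python
from dataclasses import dataclass, field


@dataclass
class ToyGroupStore:
    """Hypothetical in-memory model; NOT teradataml's implementation."""
    groups: dict = field(default_factory=dict)      # group name -> status
    members: dict = field(default_factory=dict)     # group name -> component keys
    components: dict = field(default_factory=dict)  # (kind, name) -> status

    def apply_group(self, group, features, entity, data_source):
        """Register a group together with its underlying components."""
        keys = [("feature", f) for f in features]
        keys += [("entity", entity), ("data_source", data_source)]
        self.groups[group] = "active"
        self.members[group] = keys
        for key in keys:
            self.components[key] = "active"

    def archive_feature_group(self, group):
        """Archive the group AND its Features, Entity, and Data Source."""
        self.groups[group] = "archived"
        for key in self.members[group]:
            self.components[key] = "archived"
        return True

    def delete_feature_group(self, group):
        """Delete only the archived group record; components are untouched."""
        if self.groups.get(group) != "archived":
            return False
        del self.groups[group]
        del self.members[group]
        return True


store = ToyGroupStore()
store.apply_group("admissions", features=["gpa", "admitted"],
                  entity="admissions", data_source="admissions")

store.archive_feature_group("admissions")
after_archive = dict(store.components)  # every component is now 'archived'

deleted = store.delete_feature_group("admissions")
after_delete = dict(store.components)   # unchanged by the delete
```

In this model the underlying components remain in the archived state after the group is deleted, so each one would still need its own delete call to be removed.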