A [[Feature Generation Algorithm]] is a [[data processing algorithm]] that can be implemented by a [[feature generation system]] to solve a [[feature generation task]] (to create new [[ML feature]]s).
* <B>AKA:</B> [[Feature Creation Algorithm|Feature Extraction Algorithm]].
* <B>Context:</B>
** It can support [[ML Predictive Quality Improvement]]s and [[Data Preprocessing Improvement]]s.
** It can involve techniques such as [[Feature Extraction]], [[Feature Selection]], and [[Dimensionality Reduction]].
** It can be critical in domains where raw data is complex and high-dimensional.
** It can use domain knowledge to create features more relevant to specific [[Machine Learning Task]]s.
** It can range from being a [[Heuristic Feature Creation Algorithm]] to being a [[Data-Driven Feature Creation Algorithm]].
** It can range from being a [[Low-Level Feature Creation Algorithm]] to being a [[High-Level Feature Creation Algorithm]].
** ...
* <B>Example(s):</B>
** a [[Text Classification Feature Generation Algorithm]] that extracts [[n-gram]]s or [[sentiment score]]s.
** In [[finance]], it might generate features such as moving averages or volatility measures from stock price data.
** In [[image recognition]], algorithms could generate features by identifying edges, textures, or color histograms in images.
** ...
* <B>Counter-Example(s):</B>
** a [[Feature Normalization Algorithm]].
** a [[Data Cleaning Algorithm]].
** a [[Feature Dimensionality-Reduction Algorithm]], such as a [[feature selection algorithm]].
* <B>See:</B> [[Feature Engineering]], [[Machine Learning Pipeline]], [[Data Preprocessing]], [[Feature Weighting Algorithm]], [[Text Item Feature]].
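The text-classification and finance examples above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not any particular system's implementation; the function names (`ngram_features`, `price_features`) and parameter choices are hypothetical.

```python
from statistics import mean, stdev


def ngram_features(text, n=2):
    """Generate word n-gram count features from raw text,
    as a simple text-classification feature generator might."""
    words = text.lower().split()
    counts = {}
    for i in range(len(words) - n + 1):
        gram = " ".join(words[i:i + n])
        counts[gram] = counts.get(gram, 0) + 1
    return counts


def price_features(prices, window=3):
    """Generate moving-average and rolling-volatility features
    from a price series, one feature dict per full window."""
    features = []
    for i in range(window - 1, len(prices)):
        w = prices[i - window + 1:i + 1]
        features.append({"moving_avg": mean(w), "volatility": stdev(w)})
    return features
```

For instance, `ngram_features("the cat sat on the mat")` returns bigram counts usable as sparse classifier inputs, and `price_features([10.0, 11.0, 12.0, 13.0])` yields one feature dict per three-day window.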
----
----
== References ==

=== 2003 ===
* ([[2003_FeatureExtractionbyNonParametri|Torkkola, 2003]]) ⇒ [[Kari Torkkola]]. ([[2003]]). “[http://machinelearning.wustl.edu/mlpapers/paper_files/Torkkola03.pdf Feature Extraction by Non Parametric Mutual Information Maximization].” In: The Journal of Machine Learning Research, 3.
** QUOTE: [[We]] present a [[method for learning discriminative feature transforms]] using as criterion the [[mutual information]] between [[class label]]s and [[transformed feature]]s. </s> Instead of a commonly used [[mutual information measure]] based on [[Kullback-Leibler divergence]], [[we]] use a [[quadratic divergence measure]], which allows us to make an [[efficient]] [[non-parametric implementation]] and requires no [[prior assumption]]s about [[class densiti]]es. </s>

----
__NOTOC__
[[Category:Concept]]
[[Category:Machine Learning]]
[[Category:Quality Silver]]