A [[Feature Generation Algorithm]] is a [[data processing algorithm]] that can be implemented by a [[feature generation system]] to solve a [[feature generation task]] (to create new [[ML feature]]s).
* <B>AKA:</B> [[Feature Creation Algorithm|Feature Extraction Algorithm]].
* <B>Context:</B>
** It can support [[ML Predictive Quality Improvement]]s and [[Data Preprocessing Improvement]]s.
** It can involve techniques such as [[Feature Extraction]], [[Feature Selection]], and [[Dimensionality Reduction]].
** It can be critical in domains where raw data is complex and high-dimensional.
** It can use domain knowledge to create features more relevant to specific [[Machine Learning Task]]s.
** It can range from being a [[Heuristic Feature Creation Algorithm]] to being a [[Data-Driven Feature Creation Algorithm]].
** It can range from being a [[Low-Level Feature Creation Algorithm]] to being a [[High-Level Feature Creation Algorithm]].
** ...
* <B>Example(s):</B>
** a [[Text Classification Feature Generation Algorithm]] that extracts [[n-gram]]s or [[sentiment score]]s.
** In [[finance]], it might generate features such as moving averages or volatility measures from stock price data.
** In [[image recognition]], algorithms could generate features by identifying edges, textures, or color histograms in images.
** ...
* <B>Counter-Example(s):</B>
** a [[Feature Normalization Algorithm]].
** a [[Data Cleaning Algorithm]].
** a [[Feature Dimensionality-Reduction Algorithm]], such as a [[feature selection algorithm]].
* <B>See:</B> [[Feature Engineering]], [[Machine Learning Pipeline]], [[Data Preprocessing]], [[Feature Weighting Algorithm]], [[Text Item Feature]].
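The text-classification and finance examples above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not any particular system's implementation; the function names (`ngram_features`, `price_features`) and parameter choices are hypothetical.

```python
from statistics import mean, stdev


def ngram_features(text, n=2):
    """Generate word n-gram count features from raw text,
    as a simple text-classification feature generator might."""
    words = text.lower().split()
    counts = {}
    for i in range(len(words) - n + 1):
        gram = " ".join(words[i:i + n])
        counts[gram] = counts.get(gram, 0) + 1
    return counts


def price_features(prices, window=3):
    """Generate moving-average and rolling-volatility features
    from a price series, one feature dict per full window."""
    features = []
    for i in range(window - 1, len(prices)):
        w = prices[i - window + 1:i + 1]
        features.append({"moving_avg": mean(w), "volatility": stdev(w)})
    return features
```

For instance, `ngram_features("the cat sat on the mat")` returns bigram counts usable as sparse classifier inputs, and `price_features([10.0, 11.0, 12.0, 13.0])` yields one feature dict per three-day window.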
----
----
== References ==

=== 2003 ===
* ([[2003_FeatureExtractionbyNonParametri|Torkkola, 2003]]) ⇒ [[Kari Torkkola]]. ([[2003]]). “[http://machinelearning.wustl.edu/mlpapers/paper_files/Torkkola03.pdf Feature Extraction by Non Parametric Mutual Information Maximization].” In: The Journal of Machine Learning Research, 3.
** QUOTE: [[We]] present a [[method for learning discriminative feature transforms]] using as criterion the [[mutual information]] between [[class label]]s and [[transformed feature]]s. </s> Instead of a commonly used [[mutual information measure]] based on [[Kullback-Leibler divergence]], [[we]] use a [[quadratic divergence measure]], which allows us to make an [[efficient]] [[non-parametric implementation]] and requires no [[prior assumption]]s about [[class densiti]]es. </s>

----
__NOTOC__
[[Category:Concept]]
[[Category:Machine Learning]]
[[Category:Quality Silver]]