Relevance learning for redundant features

Pfannschmidt L (2021)
Bielefeld: Universität Bielefeld.

Bielefeld E-Dissertation | English
 
Abstract
Feature selection is a widely used strategy in machine learning that reduces a feature set to its relevant essence in order to improve predictions and performance. It is also employed for knowledge discovery in applied disciplines such as biology and medicine to find potentially causal factors. However, machine learning models often do not represent a unique solution to a given problem, especially in high-dimensional settings where redundant factors are likely and spurious correlations exist.

Basing decisions about causal elements on feature selection is therefore inaccurate or wrong when the presence of redundant but still relevant features is not considered. Most existing selection algorithms specifically remove redundancies and are thus not suitable for all-relevant feature selection, or they require careful parametrization and are hard to interpret, which makes them difficult to use.

This thesis focuses on feature selection methods for the analytical use case, facilitating the understanding of potential causal factors in linear and non-linear problems. We propose several new algorithms and methods for all-relevant feature selection to improve knowledge discovery. They rely on statistical methods to improve on the accuracy of existing solutions and to differentiate between different types of relevance. Furthermore, we offer a new heuristic to automatically group related features, and we analyse the definition of relevance in the context of privileged information, where some data is available only during training.

We also introduce software implementations designed to be modular, efficient and parallelizable for application to high-dimensional problems. The methods and implementations were evaluated on a wide range of synthetic and real datasets to compare their performance with that of existing algorithms.
Year
2021
Page(s)
120
Page URI
https://pub.uni-bielefeld.de/record/2959861

Cite

Pfannschmidt, L. (2021). Relevance learning for redundant features. Bielefeld: Universität Bielefeld. https://doi.org/10.4119/unibi/2959861
All files are available under the following license(s):
Creative Commons Attribution 4.0 International Public License (CC BY 4.0)
Full text(s)
Name
960.59 KB
Access Level
OA Open Access
Last uploaded
2021-12-12T18:47:01Z
MD5 checksum
5f9416f168e0ebca02f9f0bbe51df3d0
