Active data selection in supervised and unsupervised learning
Hasenjäger M (2000)
Bielefeld: Bielefeld University.
Bielefeld E-Dissertation | English
Author
Hasenjäger, Martina
Reviewer / Supervisor
Ritter, Helge
Abstract / Note
In the context of computer science, learning is applied in situations that are so complex that conventional programming techniques are either unavailable or impractical. Often, however, empirical knowledge about the phenomenon or process under consideration is available in the form of data from repeated measurements. Learning then means extracting the basic regularities from these empirical data and thus can be seen as building an abstraction of the phenomenon that yields a complete and robust description of its interesting aspects.
The success of the learning process depends on a number of factors, such as the concrete form of the learning system, the procedure used for learning and, finally, the data that are at the learner's disposal. It is this last point, data selection, that is the main focus of this thesis. In general, the data are selected at random in such a way that they capture the interesting aspects of the phenomenon. This is not necessarily the most efficient way of acquiring data, since the data selection procedure does not receive feedback from the learner. The data may therefore not be in tune with the learner's current state of knowledge, and the learning process is not as efficient as it could be.
In this thesis, we discuss a new paradigm for learning that aims at improving the efficiency of neural network training procedures: active learning. Here, the learner is enabled to use the information that is already available to select those training data that it expects to be most informative. In this case, the learner is no longer a passive recipient of information, but takes an active role in the selection of the training data.
After a review of the state of the art in active learning, we turn to active learning in binary classification tasks. Here, we study in detail an approach to the problem that is based on concepts from information theory. We then develop a new heuristic algorithm for data selection in local models, a class of learners that up to now has not been considered in this context. Finally, we extend the area of application of active learning techniques to unsupervised learning: we propose an algorithm for active data selection in topographic pairwise clustering that is founded on statistical decision theory.
Our results show that active learning may be computationally expensive but that, in comparison to random data selection, these active strategies lead to a considerable reduction in the number of necessary training samples. This makes active data selection a viable alternative, especially when the cost of data acquisition is high.
Keywords
Machine learning;
Knowledge acquisition;
Neural network;
Stochastic model;
Neural networks;
Active data selection;
Query-based learning
Year
2000
Page URI
https://pub.uni-bielefeld.de/record/2302013
Cite
Hasenjäger M. Active data selection in supervised and unsupervised learning. Bielefeld: Bielefeld University; 2000.
Hasenjäger, M. (2000). Active data selection in supervised and unsupervised learning. Bielefeld: Bielefeld University.
Hasenjäger, Martina. 2000. Active data selection in supervised and unsupervised learning. Bielefeld: Bielefeld University.
Hasenjäger, M., 2000. Active data selection in supervised and unsupervised learning, Bielefeld: Bielefeld University.
M. Hasenjäger, Active data selection in supervised and unsupervised learning, Bielefeld: Bielefeld University, 2000.
Hasenjäger, M.: Active data selection in supervised and unsupervised learning. Bielefeld University, Bielefeld (2000).
Hasenjäger, Martina. Active data selection in supervised and unsupervised learning. Bielefeld: Bielefeld University, 2000.
All files available under the following license(s):
Copyright Statement:
This object is protected by copyright and/or related rights. [...]
Full text(s)
Name
Access Level
Open Access
Last uploaded
2019-09-06T08:57:39Z
MD5 checksum
d947b36c075f3a4bfc646c0a32f25943
PDF generated automatically from the original file
Name
diss.pdf
1.79 MB
Access Level
Open Access
Last uploaded
2023-08-03T15:15:17Z
MD5 checksum
189dd3e5eb93e1922e37cf7554297a6a