Widely used semi-supervised learning methods include:
1. EM with generative mixture models
2. self-training
3. co-training
4. transductive support vector machines
5. graph-based methods
self-training:
A classifier is first trained with the small amount of labeled data. The classifier is then used to classify the unlabeled
data. Typically the most confident unlabeled data points, together with their predicted labels, are added to the
training set. The classifier is re-trained and the procedure is repeated.
When the existing supervised classifier is complicated and hard to modify, self-training is a practical wrapper method.
It has been applied to several natural language processing tasks, such as word sense disambiguation, parsing and
machine translation, as well as to object detection in images.
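The self-training loop described above can be sketched in a few lines of Python. This is only an illustration under
stated assumptions: the nearest-centroid classifier, the use of a distance margin as the confidence score, and the names
fit_centroids, predict_with_confidence, self_train, per_round and max_rounds are all hypothetical choices made for this
sketch, not part of the referenced work. Any classifier that returns a confidence score could be wrapped the same way.

```python
import math


def fit_centroids(points, labels):
    """Train a nearest-centroid classifier: one mean vector per class."""
    groups = {}
    for x, y in zip(points, labels):
        groups.setdefault(y, []).append(x)
    return {
        y: [sum(col) / len(xs) for col in zip(*xs)]
        for y, xs in groups.items()
    }


def predict_with_confidence(centroids, x):
    """Return (label, confidence).

    Confidence is an assumed heuristic: the gap between the distances
    to the nearest and second-nearest class centroids.
    """
    ranked = sorted((math.dist(x, c), y) for y, c in centroids.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else 0.0
    return ranked[0][1], margin


def self_train(labeled_x, labeled_y, unlabeled_x, per_round=2, max_rounds=10):
    """Wrap the classifier in a self-training loop."""
    train_x, train_y = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        if not pool:
            break
        # 1. Train on the current (growing) training set.
        model = fit_centroids(train_x, train_y)
        # 2. Classify every unlabeled point and rank by confidence.
        scored = [(predict_with_confidence(model, x), x) for x in pool]
        scored.sort(key=lambda item: item[0][1], reverse=True)
        # 3. Move the most confident points, with predicted labels,
        #    into the training set; the rest stay in the pool.
        for (label, _), x in scored[:per_round]:
            train_x.append(x)
            train_y.append(label)
        pool = [x for _, x in scored[per_round:]]
    # Final re-training on everything collected so far.
    return fit_centroids(train_x, train_y)


# Toy usage: two labeled points, six unlabeled ones.
labeled_x = [(0.0, 0.0), (10.0, 10.0)]
labeled_y = ["a", "b"]
unlabeled_x = [(1.0, 1.0), (2.0, 1.0), (9.0, 9.0),
               (8.0, 9.0), (4.0, 4.0), (6.0, 6.0)]
model = self_train(labeled_x, labeled_y, unlabeled_x)
print(predict_with_confidence(model, (0.5, 0.5))[0])
```

Note that the wrapper never looks inside the classifier; it only calls train and predict. This is why the method suits
classifiers that are hard to modify.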
co-training:
Co-training assumes that the features can be split into two sets, that each feature subset is sufficient to train a good
classifier, and that the two sets are conditionally independent given the class. Initially, two separate classifiers are
trained on the labeled data, one on each feature subset. Each classifier then classifies the unlabeled data and
'teaches' the other classifier with the few unlabeled examples (and their predicted labels) about which it is most
confident. Each classifier is retrained with the additional training examples given by the other classifier, and the
process repeats.
When the features naturally split into two sets, co-training may be appropriate.
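A minimal Python sketch of the co-training loop follows. It rests on several assumptions made purely for illustration:
each example is a pair (view1, view2) of single numeric features, the classifier is a one-feature nearest-mean rule, and
confidence is the gap between the two nearest class means. The names fit_means, predict, co_train, per_round and
max_rounds are hypothetical. Here each classifier's confident picks go into the other classifier's training set, as
described above; other formulations may add them to a shared labeled set instead.

```python
def fit_means(values, labels):
    """Train a one-feature nearest-mean classifier: one mean per class."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    return {y: sum(vs) / len(vs) for y, vs in groups.items()}


def predict(means, v):
    """Return (label, confidence); confidence is the gap between the
    distances to the nearest and second-nearest class means."""
    ranked = sorted((abs(v - m), y) for y, m in means.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else 0.0
    return ranked[0][1], margin


def co_train(labeled, unlabeled, per_round=1, max_rounds=10):
    """labeled: list of ((view1, view2), label).
    unlabeled: list of (view1, view2).
    Returns the two trained models, one per view.
    """
    # Each classifier keeps its own training set over its own view.
    train = [[(x[view], y) for x, y in labeled] for view in (0, 1)]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        if not pool:
            break
        models = [fit_means(*zip(*train[view])) for view in (0, 1)]
        picked = set()
        for view in (0, 1):
            other = 1 - view
            # Rank the pool by this classifier's confidence on its view.
            scored = sorted(
                ((predict(models[view], x[view]), i)
                 for i, x in enumerate(pool)),
                key=lambda item: item[0][1],
                reverse=True,
            )
            # Teach the other classifier: hand over the other view of the
            # most confident examples together with the predicted labels.
            for (label, _), i in scored[:per_round]:
                train[other].append((pool[i][other], label))
                picked.add(i)
        pool = [x for i, x in enumerate(pool) if i not in picked]
    # Final re-training of both classifiers.
    return [fit_means(*zip(*train[view])) for view in (0, 1)]


# Toy usage: each unlabeled example is clear in one view, ambiguous in the other.
labeled = [((0.0, 0.0), "neg"), ((10.0, 10.0), "pos")]
unlabeled = [(1.0, 4.0), (9.0, 6.0), (4.0, 1.0), (6.0, 9.0)]
model_1, model_2 = co_train(labeled, unlabeled)
print(predict(model_1, 1.0)[0], predict(model_2, 9.0)[0])
```

In the toy data, an example that is easy for one view is hard for the other, which is the situation where the two
classifiers have something to teach each other.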
Reference:
Xiaojin Zhu. Semi-Supervised Learning with Graphs.
