Self-training and co-training

    Technology, 2022-05-19

    Widely used semi-supervised learning methods include:

     

    1. EM with generative mixture models
    2. Self-training
    3. Co-training
    4. Transductive support vector machines
    5. Graph-based methods

     

    Self-training:

     

    A classifier is first trained on the small amount of labeled data and then used to classify the unlabeled data.
    Typically, the most confident unlabeled data points, together with their predicted labels, are added to the
    training set. The classifier is then retrained, and the procedure is repeated.

     

    When the existing supervised classifier is complicated and hard to modify, self-training is a practical wrapper
    method. It has been applied to several natural language processing tasks, including word sense disambiguation,
    parsing, and machine translation, as well as to object detection in images.
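
    The self-training loop can be sketched in a few lines of Python. This is only a minimal illustration under
    stated assumptions: the nearest-centroid classifier and its softmax-style confidence are stand-ins chosen for
    brevity, not part of the original description, and the names fit_centroids, predict_with_confidence,
    self_train, per_round and max_rounds are hypothetical.

```python
import math


def fit_centroids(X, y):
    """Stand-in base classifier (an assumption): one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    neg = {c: -math.dist(x, m) for c, m in centroids.items()}
    top = max(neg.values())  # shift by the maximum for numerical stability
    weights = {c: math.exp(s - top) for c, s in neg.items()}
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())


def self_train(X_lab, y_lab, X_unlab, per_round=2, max_rounds=10):
    """Wrapper loop: train, label the pool, keep the most confident, retrain."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        if not pool:
            break
        model = fit_centroids(X_lab, y_lab)
        # Classify every unlabeled point with the current classifier.
        scored = [(predict_with_confidence(model, x), x) for x in pool]
        scored.sort(key=lambda t: t[0][1], reverse=True)
        # Add the most confident points, with their predicted labels, to the training set.
        for (label, _conf), x in scored[:per_round]:
            X_lab.append(x)
            y_lab.append(label)
            pool.remove(x)
    return fit_centroids(X_lab, y_lab)
```

    Because the loop only calls a fit function and a predict function, any classifier that reports a confidence
    could be substituted for the centroid model, which is what makes self-training usable as a wrapper.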

     

    Co-training:

    Co-training assumes that the features can be split into two sets, that each sub-feature set is sufficient to
    train a good classifier, and that the two sets are conditionally independent given the class. Initially, two
    separate classifiers are trained on the labeled data, one on each sub-feature set. Each classifier then
    classifies the unlabeled data and "teaches" the other classifier with the few unlabeled examples (and their
    predicted labels) about which it is most confident. Each classifier is retrained with the additional training
    examples given by the other classifier, and the process repeats.

     

    When the features naturally split into two sets, co-training may be appropriate.
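
    The co-training procedure can be sketched in the same spirit. This is a minimal illustration, not a
    definitive implementation: each example is assumed to be a pair (view1, view2) of feature tuples, the
    nearest-centroid classifier is a stand-in, and the names co_train, per_round and max_rounds are
    hypothetical. It follows the variant described above, in which each classifier's confident predictions
    become training data for the other classifier.

```python
import math


def fit_centroids(X, y):
    """Stand-in base classifier (an assumption): one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    neg = {c: -math.dist(x, m) for c, m in centroids.items()}
    top = max(neg.values())  # shift by the maximum for numerical stability
    weights = {c: math.exp(s - top) for c, s in neg.items()}
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())


def co_train(labeled, unlabeled, per_round=1, max_rounds=10):
    """labeled: list of ((view1, view2), label); unlabeled: list of (view1, view2)."""
    # Each classifier keeps its own training set, restricted to its own view.
    train = [
        ([views[0] for views, _ in labeled], [lab for _, lab in labeled]),
        ([views[1] for views, _ in labeled], [lab for _, lab in labeled]),
    ]
    pool = list(unlabeled)
    for _ in range(max_rounds):
        if not pool:
            break
        models = [fit_centroids(X, y) for X, y in train]
        for teacher in (0, 1):
            student = 1 - teacher
            # The teacher scores the pool using only its own view.
            scored = [(predict_with_confidence(models[teacher], ex[teacher]), ex)
                      for ex in pool]
            scored.sort(key=lambda t: t[0][1], reverse=True)
            for (label, _conf), ex in scored[:per_round]:
                # The teacher's confident label is added to the student's
                # training set, expressed in the student's view.
                train[student][0].append(ex[student])
                train[student][1].append(label)
                pool.remove(ex)
    return [fit_centroids(X, y) for X, y in train]
```

    The function returns the two retrained classifiers, one per view. How their predictions are combined at
    test time is left open here.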

     

    Reference:

     

    Xiaojin Zhu. Semi-Supervised Learning with Graphs.

