A Framework for Modeling Positive Class Expansion with Single Snapshot
Yang Yu and Zhi-Hua Zhou
LAMDA Group, National Key Laboratory for Novel Software Technology
Nanjing University, China
http://lamda.nju.edu.cn

Motivating task
[figure not recovered: evolution of the mobile telecom network, 1G to 2G to 3G]
We are at the moment of moving towards 3G. The task: predict the 2G users that will turn to use 3G.

Analysis of the task
[figure not recovered: a time line of events, 2G starts, 2G dominates, 3G starts, 3G dominates, with the class distribution shifting along it]
The model is trained at one point on the time line, but it must predict labels as of a later point. This is the positive class expansion with single snapshot (PCES) problem.

Outline
• A new data mining problem: PCES
• Why we need the PCES problem
• A solution to the PCES problem
• Results
• Conclusion

Formulation of classical learning
• i.i.d. instances
• training set drawn from a distribution
• fixed labeling function
A learning algorithm outputs a function that minimizes the expected loss under the fixed labeling function. This formulation cannot model a changing labeling function.

Formulation of PCES
• a labeling function at training time
• a labeling function at testing time
A learning algorithm outputs a function that minimizes the expected loss under the testing-time labeling function, with the constraint that the positive class only expands: an instance that is positive at training time remains positive at testing time. For convenience, a simplifying assumption is also made [formula not recovered].

Another example
[figure not recovered] Positive class: hot items; negative class: not hot items. This is the PCES problem: only one snapshot is available, and the positive class is expanding.

Further example
[figure not recovered] Again the positive class (hot items) versus the negative class (not hot items): one snapshot, with an expanding positive class.

Related learning frameworks
• PU-learning (learning with positive and unlabeled data)
• concept drift
• covariate shift

PU-learning
Setting: only positive instances and unlabeled instances are in the training data.
Assumption: the positive instances are representative of the positive class concept [Liu et al., ICML'02][Yu et al., KDD'02].
In PCES the positive class is in expansion, so PU-learning cannot capture the expanded class concept.

Concept drift
Setting: instances arrive sequentially, batch by batch, and the target concept may change in the coming batch.
Assumption: a series of data samples is available for drift detection [Klinkenberg & Joachims, ICDM'00][Kolter & Maloof, ICML'03].
In PCES only a single snapshot is available, so concept drift approaches do not apply.

Covariate shift (or sample selection bias [Shimodaira, JSPI'00])
Setting: training and test instances are drawn from different distributions, i.e., the input distribution P(x) changes.
Assumption: the labeling function P(y|x) is fixed.
In PCES, P(x) is fixed but the labeling function changes, so covariate shift approaches do not apply.

The proposed approach
Learn from pure data and incorporate preference bias; the two yield a combined objective, which is optimized by SGBDota.

Learn from pure data
Observation: a desired learner ranks positive training instances higher than negative training instances. This is exactly expressed by the AUC (area under the ROC curve) criterion [formula not recovered], which is relaxed into a smoothed loss function and then an instance-wise loss function.

Incorporate preference bias
The user can provide preferences by
• indicating preferences on randomly sampled instance pairs, or
• applying a priori rules that indicate the preferences.
In either way, we obtain a preference function, from which a loss function is defined, again with smoothed and instance-wise versions [formulas not recovered].

Combine the two objectives
The combined loss function joins the two terms, and the learning problem is to minimize this combined loss [formula not recovered].

Optimization
Gradient Boosting [Friedman, AnnStat'01, CSDA'02] fits a single target y, but here both y and the preference target k must be fit. The proposed optimizer is SGBDota (Stochastic Gradient Boosting with DOuble TArgets).

SGBDota
Summary: learn from pure data, incorporate preference bias, combine the two objectives, and optimize by SGBDota.

Outline
• A new data mining problem: PCES
• Why we need the PCES problem
• A solution to the PCES problem …
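The two formulation slides refer to objectives ("a learning algorithm outputs a function to minimize:") whose formula images did not survive extraction. The following is a plausible reconstruction, not the authors' exact notation: here $\mathcal{D}$ is the instance distribution, $\ell$ a loss function, and $y_{\mathrm{tr}}$, $y_{\mathrm{te}}$ the training- and testing-time labeling functions.

```latex
% Classical learning: one fixed labeling function y
f^* \;=\; \arg\min_{f}\; \mathbb{E}_{x \sim \mathcal{D}}\!\left[ \ell\big(f(x),\, y(x)\big) \right]

% PCES: train on a snapshot labeled by y_tr, but minimize risk under y_te,
% where the positive class can only expand from y_tr to y_te
f^* \;=\; \arg\min_{f}\; \mathbb{E}_{x \sim \mathcal{D}}\!\left[ \ell\big(f(x),\, y_{\mathrm{te}}(x)\big) \right]
\qquad \text{s.t.} \quad y_{\mathrm{tr}}(x) = +1 \;\Rightarrow\; y_{\mathrm{te}}(x) = +1 .
```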
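The "Learn from pure data" slides state that a desired learner ranks positive training instances above negative ones, which is exactly the AUC criterion, and that a smoothed loss function is used. Below is a minimal Python sketch of the empirical AUC and of a sigmoid-smoothed surrogate for 1 - AUC; the function names, the `beta` parameter, and the choice of sigmoid smoothing are my assumptions, since the slides' exact surrogate is not recoverable.

```python
import numpy as np

def auc(scores_pos, scores_neg):
    """Empirical AUC: the fraction of (positive, negative) score pairs
    ranked correctly, counting ties as half."""
    diffs = scores_pos[:, None] - scores_neg[None, :]
    return float(np.mean((diffs > 0) + 0.5 * (diffs == 0)))

def smoothed_auc_loss(scores_pos, scores_neg, beta=1.0):
    """Smooth surrogate for 1 - AUC: a sigmoid of each pairwise score gap,
    differentiable in the scores (assumed smoothing, not the paper's)."""
    diffs = scores_pos[:, None] - scores_neg[None, :]
    return float(np.mean(1.0 / (1.0 + np.exp(beta * diffs))))
```

With perfectly separated scores the AUC is 1 and the surrogate is small; being differentiable in the scores is what makes the surrogate usable inside gradient boosting.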
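The "Incorporate preference bias" and "Combine the two objectives" slides describe a preference function, obtained from sampled instance pairs or a priori rules, folded into the AUC-based objective. A sketch under the assumption that preferences arrive as ordered index pairs (i, j), meaning instance i should outrank instance j; the trade-off weight `lam` and the sigmoid smoothing are illustrative choices, not the paper's.

```python
import numpy as np

def preference_loss(scores, pref_pairs, beta=1.0):
    """Smoothed loss over preference pairs (i, j): i should outrank j."""
    diffs = np.array([scores[i] - scores[j] for i, j in pref_pairs])
    return float(np.mean(1.0 / (1.0 + np.exp(beta * diffs))))

def combined_loss(scores, y, pref_pairs, lam=0.5, beta=1.0):
    """AUC surrogate on the labeled snapshot plus a lam-weighted
    preference term (assumed combination scheme)."""
    pos, neg = scores[y == 1], scores[y == 0]
    diffs = pos[:, None] - neg[None, :]
    auc_term = float(np.mean(1.0 / (1.0 + np.exp(beta * diffs))))
    return auc_term + lam * preference_loss(scores, pref_pairs, beta)
```

The combined value is small when both the labeled ranking and the user preferences are respected, and grows when either is violated.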
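The optimization slides note that Friedman's Gradient Boosting fits a single target, while the combined objective carries two targets (the labels y and the preference target k), which motivates SGBDota (Stochastic Gradient Boosting with DOuble TArgets). The paper's exact procedure is not in the slides; the sketch below only illustrates the general idea: each round computes a pseudo-gradient of the combined pairwise objective, mixing the labeled term with a lam-weighted preference term, subsamples instances (the stochastic part), and fits a regression stump to it. The stump base learner and all parameter names are my assumptions.

```python
import numpy as np

def fit_stump(X, target):
    """Fit a depth-1 regression stump (the base learner) to `target`."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            left = X[:, j] <= thr
            if left.all() or not left.any():
                continue
            lv, rv = target[left].mean(), target[~left].mean()
            err = np.sum((target - np.where(left, lv, rv)) ** 2)
            if best is None or err < best[0]:
                best = (err, j, thr, lv, rv)
    _, j, thr, lv, rv = best
    return lambda Z, j=j, t=thr, a=lv, b=rv: np.where(Z[:, j] <= t, a, b)

def sgb_double_targets(X, y, pref_pairs, rounds=20, lr=0.3, lam=0.5,
                       subsample=1.0, seed=0):
    """Boost an additive score F along the pseudo-gradient of a combined
    pairwise objective: positives should outrank negatives (labeled
    target y), and pair (i, j) in pref_pairs means instance i should
    outrank instance j (preference target)."""
    rng = np.random.default_rng(seed)
    F = np.zeros(len(X))
    models = []
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    for _ in range(rounds):
        g = np.zeros(len(X))
        # negative gradient of the sigmoid-smoothed pairwise losses
        for i in pos:
            for k in neg:
                s = 1.0 / (1.0 + np.exp(F[i] - F[k]))
                w = s * (1.0 - s)
                g[i] += w
                g[k] -= w
        for i, k in pref_pairs:
            s = 1.0 / (1.0 + np.exp(F[i] - F[k]))
            w = lam * s * (1.0 - s)
            g[i] += w
            g[k] -= w
        mask = rng.random(len(X)) < subsample  # stochastic subsampling
        if not mask.any():
            continue
        h = fit_stump(X[mask], g[mask])        # fit base learner to gradient
        F += lr * h(X)
        models.append(h)
    return F, models
```

On a toy one-dimensional dataset this separates positive from negative scores, while the preference pair biases the ranking within the negative group.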