PRIVACY PRESERVING BAG PREPARATION FOR LEARNING FROM LABEL PROPORTION

Creator: Yan, Xinzhou
Advisor: Culotta, Aron
Date: Fall 2018
Type: Thesis
Institution: Illinois Institute of Technology
Department: CS / Computer Science
Subjects: Computer science; Data Security; Differential Privacy; K-anonymous; Label Proportion; Machine Learning; Privacy-preserving
Language: en

Abstract: We apply privacy-preserving data mining (PPDM) standards to the learning from label proportion (LLP) model to create a privacy-preserving machine learning framework. We design the data preparation step of the LLP framework to meet the PPDM standards. In this step, we develop a bag selection method that boosts the accuracy of the LLP model by more than 7%. We also propose three K-anonymous aggregation methods for the datasets that are very robust and incur almost zero accuracy loss. After the K-anonymous step, we apply differential privacy to the LLP model while keeping its accuracy loss low. Because of the LLP model's special loss function, not only is it possible to replace all the feature vectors within each bag with the bag's mean feature vector, but the accuracy loss caused by differential privacy can also be bounded by a small number. The loss function thus ensures low accuracy loss when training the LLP model on a PPDM dataset. We evaluate the PPDM LLP model on two datasets: the Adult dataset and the Instagram comment dataset. Both give empirical evidence of low accuracy loss after applying the PPDM LLP model.

Format: born digital; application/pdf
Rights: In Copyright http://rightsstatements.org/page/InC/1.0/
Access: Restricted Access
Handle: http://hdl.handle.net/10560/islandora:1001314
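
The data preparation idea in the abstract (bags of at least K records, each bag reduced to its mean feature vector, then noise added for differential privacy) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `prepare_bags`, its parameters, the rule of dropping undersized bags, the assumption that every feature is scaled to [0, 1], and the use of the Laplace mechanism are all assumptions made here for illustration. The thesis's own three aggregation methods and its bag selection method are not reproduced.

```python
import numpy as np

def prepare_bags(X, y, bag_ids, k=5, epsilon=1.0, rng=None):
    """Hypothetical helper: reduce each bag to (mean feature vector, label proportion).

    Assumptions (not from the thesis): features lie in [0, 1], labels are
    binary 0/1, bag sizes are public, and bags smaller than k are dropped.
    Pass epsilon=None to skip the noise step.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = X.shape[1]
    means, proportions = [], []
    for b in np.unique(bag_ids):
        mask = bag_ids == b
        n = int(mask.sum())
        if n < k:
            continue  # a bag with fewer than k records is not released
        mean_vec = X[mask].mean(axis=0)
        if epsilon is not None:
            # Replacing one record moves each coordinate of the mean by at
            # most 1/n, so the L1 sensitivity is d/n; the Laplace mechanism
            # uses scale = sensitivity / epsilon.
            mean_vec = mean_vec + rng.laplace(0.0, d / (n * epsilon), size=d)
        means.append(mean_vec)
        # The label proportion is left un-noised in this sketch.
        proportions.append(y[mask].mean())
    return np.array(means), np.array(proportions)
```

Because the noise scale shrinks as 1/n, larger bags receive less perturbation for the same epsilon, which is one intuition for why aggregation and differential privacy can combine with small accuracy loss.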