<?xml version='1.0' encoding='utf-8'?>
<mods xmlns="http://www.loc.gov/mods/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="3.7" xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-7.xsd">
   <name>
      <role>
         <roleTerm type="text" authority="marcrelator" authorityURI="http://id.loc.gov/vocabulary/relators" valueURI="http://id.loc.gov/vocabulary/relators/cre">creator</roleTerm>
      </role>
      <namePart>Yan, Xinzhou</namePart>
   </name>
   <titleInfo>
      <title>PRIVACY PRESERVING BAG PREPARATION FOR LEARNING FROM LABEL PROPORTION</title>
   </titleInfo>
   <originInfo>
      <dateCreated keyDate="yes">2018</dateCreated>
   </originInfo>
   <note displayLabel="Degree Awarded">Fall 2018</note>
   <typeOfResource authority="aat" valueURI="http://vocab.getty.edu/page/aat/300028029">Thesis</typeOfResource>
   <name type="corporate">
      <affiliation>Illinois Institute of Technology</affiliation>
   </name>
   <name type="corporate">
      <namePart>CS / Computer Science</namePart>
   </name>
   <name authority="wikidata" authorityURI="https://www.wikidata.org" valueURI="https://www.wikidata.org/wiki/Q77831200">
      <role>
         <roleTerm type="text" authority="marcrelator" authorityURI="http://id.loc.gov/vocabulary/relators" valueURI="http://id.loc.gov/vocabulary/relators/ths">advisor</roleTerm>
      </role>
      <namePart>Culotta, Aron</namePart>
   </name>
   <subject>
      <topic>Computer science</topic>
   </subject>
   <subject>
      <topic>Data Security</topic>
   </subject>
   <subject>
      <topic>Differential Privacy</topic>
   </subject>
   <subject>
      <topic>K-anonymous</topic>
   </subject>
   <subject>
      <topic>Label Proportion</topic>
   </subject>
   <subject>
      <topic>Machine Learning</topic>
   </subject>
   <subject>
      <topic>Privacy-preserving</topic>
   </subject>
   <language>
      <languageTerm type="code" authority="rfc3066">en</languageTerm>
   </language>
   <abstract>We apply privacy-preserving data mining (PPDM) standards to the learning from label proportion (LLP) model to create a privacy-preserving machine learning framework. We design the data preparation step of the LLP framework to meet the PPDM standards. In the data preparation step, we develop a bag selection method that boosts the accuracy of the LLP model by more than 7%. In addition, we propose three K-anonymous aggregation methods for the datasets, which incur almost zero accuracy loss and are very robust. After the K-anonymous step, we apply differential privacy to the LLP model and ensure a low accuracy loss for the LLP model. Because of the LLP model’s special loss function, not only is it possible to replace all the feature vectors within each bag with the mean feature vector, but the accuracy loss caused by differential privacy can also be bounded by a small number. The loss function ensures low accuracy loss when training the LLP model on a PPDM dataset. We evaluate the PPDM LLP model on two datasets: the Adult dataset and the Instagram comment dataset. Both give empirical evidence of the low accuracy loss after applying the PPDM LLP model.</abstract>
   <physicalDescription>
      <digitalOrigin>born digital</digitalOrigin>
      <internetMediaType>application/pdf</internetMediaType>
   </physicalDescription>
   <accessCondition type="useAndReproduction" displayLabel="rightsstatements.org">In Copyright</accessCondition>
   <accessCondition type="useAndReproduction" displayLabel="rightsstatements.orgURI">http://rightsstatements.org/page/InC/1.0/</accessCondition>
   <accessCondition type="restrictionOnAccess">Restricted Access</accessCondition>
   <identifier type="hdl">http://hdl.handle.net/10560/islandora:1001314</identifier>
</mods>