A Generalized Model of PAC Learning and its Applicability
Institut für Informatik, Georg-August-Universität Göttingen, Goldschmidtstr. 7, 37077 Göttingen, Germany.
e-mail: firstname.lastname@example.org, email@example.com, firstname.lastname@example.org
Accepted: 5 February 2014
We combine a new data model, in which the random classification is subject to rather weak restrictions based on the Mammen–Tsybakov small margin conditions [E. Mammen and A.B. Tsybakov, Ann. Statist. 27 (1999) 1808–1829; A.B. Tsybakov, Ann. Statist. 32 (2004) 135–166], with the statistical query (SQ) model due to Kearns [M.J. Kearns, J. ACM 45 (1998) 983–1006] into what we call the PAC + SQ model. We generalize the class conditional constant noise (CCCN) model introduced by Decatur [S.E. Decatur, in ICML ’97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan Kaufmann Publishers Inc. San Francisco, CA, USA (1997) 83–91] to the noise model orthogonal to a set of query functions. We show that every polynomial time PAC + SQ learning algorithm can be efficiently simulated, provided that, given the target concept, the random noise rate is orthogonal to the query functions used by the algorithm. Furthermore, we extend the constant-partition classification noise (CPCN) model due to Decatur [S.E. Decatur, in ICML ’97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan Kaufmann Publishers Inc. San Francisco, CA, USA (1997) 83–91] to what we call the constant-partition piecewise orthogonal (CPPO) noise model. We show how statistical queries can be simulated in the CPPO scenario, provided that the partition is known to the learner. Finally, we show how PAC + SQ simulators can be used in practice in the noise model orthogonal to the query space by presenting two examples, one from bioinformatics and one from software engineering. These examples demonstrate that our new noise model is realistic.
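As background for the simulation results summarized above, the following minimal sketch illustrates the classical special case only: a statistical query answered from examples whose binary labels were flipped independently with a known constant rate η < 1/2. It is not the orthogonal or CPPO noise model of this paper. The function names `simulate_sq` and `make_noisy_sample`, the threshold target, the query, and all numeric values are hypothetical choices made for illustration.

```python
import random


def simulate_sq(chi, noisy_examples, eta):
    """Estimate E[chi(x, f(x))] from examples with labels flipped at rate eta.

    Assumes labels in {0, 1} and a KNOWN constant noise rate eta < 1/2
    (the classical setting, used here purely as an illustration).
    """
    if not 0.0 <= eta < 0.5:
        raise ValueError("eta must lie in [0, 0.5)")
    n = len(noisy_examples)
    # Mean of the query on the observed (noisy) labels.
    e_same = sum(chi(x, y) for x, y in noisy_examples) / n
    # Mean of the query on the complemented noisy labels.
    e_flip = sum(chi(x, 1 - y) for x, y in noisy_examples) / n
    # Solve the two linear equations for the noise-free expectation.
    return ((1 - eta) * e_same - eta * e_flip) / (1 - 2 * eta)


def make_noisy_sample(target, n, eta, rng):
    """Draw x uniformly from [0, 1) and flip the true label with probability eta."""
    sample = []
    for _ in range(n):
        x = rng.random()
        y = target(x)
        if rng.random() < eta:
            y = 1 - y
        sample.append((x, y))
    return sample


# Hypothetical setup: threshold target and a query measuring how often
# the hypothesis h(x) = [x > 0.5] agrees with the label.
rng = random.Random(0)
eta = 0.2


def target(x):
    return int(x > 0.3)


def chi(x, y):
    return int(int(x > 0.5) == y)


sample = make_noisy_sample(target, 100_000, eta, rng)
naive = sum(chi(x, y) for x, y in sample) / len(sample)  # biased by the noise
corrected = simulate_sq(chi, sample, eta)  # noise-free agreement is 0.8 by construction
print(f"naive={naive:.3f} corrected={corrected:.3f}")
```

In this setup the hypothesis and the target disagree exactly on the interval (0.3, 0.5], so the noise-free agreement is 0.8, while the uncorrected average over noisy labels concentrates near 0.8 · 0.8 + 0.2 · 0.2 = 0.68. The correction relies on the noise rate being known and independent of the instance, which is precisely the kind of assumption the noise models in this paper relax.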
Mathematics Subject Classification: 68Q32 / 62P10 / 68N30
Key words: PAC learning with classification noise / Mammen–Tsybakov small margin conditions / statistical queries / noise model orthogonal to a set of query functions / bioinformatics / software engineering
© EDP Sciences 2014