TY - JOUR
T1 - Group-based active query selection for rapid diagnosis in time-critical situations
AU - Bellala, Gowtham
AU - Bhavnani, Suresh K.
AU - Scott, Clayton
N1 - Funding Information:
Manuscript received November 21, 2009; revised April 01, 2011; accepted July 06, 2011. Date of current version January 06, 2012. This work was supported in part by NSF Awards 0830490 and 0953135, in part by NIH Grant UL1RR024986, and in part by CDC/NIOSH Grant R21 OH009441-01A2. The material in this paper was presented in part at the Neural Information Processing Systems Conference, Vancouver, BC, Canada, December 2010.
PY - 2012/1
Y1 - 2012/1
N2 - In applications such as active learning and disease/fault diagnosis, one often encounters the problem of identifying an unknown object through a minimal number of queries. This problem has been referred to as query learning or object/entity identification. We consider three extensions of this fundamental problem that are motivated by practical considerations in real-world,time- critical identification tasks such as emergency response. First, we consider the problem where the objects are partitioned into groups, and the goal is to identify only the group to which the object belongs. Second, we address the situation where the queries are partitioned into groups, and an algorithm may suggest a group of queries to a human user, who then selects the actual query. Third, we consider the problem of object identification in the presence of persistent query noise, and relate it to group identification. To address these problems we show that a standard algorithm for object identification, known as generalized binary search, may be viewed as a generalization of Shannon-Fano coding. We then extend this result to the group-based settings, leading to new algorithms, whose performance is demonstrated through a logarithmic approximation bound, and through experiments on simulated data and a database used for toxic chemical identification.
AB - In applications such as active learning and disease/fault diagnosis, one often encounters the problem of identifying an unknown object through a minimal number of queries. This problem has been referred to as query learning or object/entity identification. We consider three extensions of this fundamental problem that are motivated by practical considerations in real-world,time- critical identification tasks such as emergency response. First, we consider the problem where the objects are partitioned into groups, and the goal is to identify only the group to which the object belongs. Second, we address the situation where the queries are partitioned into groups, and an algorithm may suggest a group of queries to a human user, who then selects the actual query. Third, we consider the problem of object identification in the presence of persistent query noise, and relate it to group identification. To address these problems we show that a standard algorithm for object identification, known as generalized binary search, may be viewed as a generalization of Shannon-Fano coding. We then extend this result to the group-based settings, leading to new algorithms, whose performance is demonstrated through a logarithmic approximation bound, and through experiments on simulated data and a database used for toxic chemical identification.
KW - Active learning
KW - Shannon-Fano coding
KW - decision trees
KW - generalized binary search
KW - persistent noise
KW - submodularity
UR - http://www.scopus.com/inward/record.url?scp=84855672674&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84855672674&partnerID=8YFLogxK
U2 - 10.1109/TIT.2011.2169296
DO - 10.1109/TIT.2011.2169296
M3 - Article
AN - SCOPUS:84855672674
SN - 0018-9448
VL - 58
SP - 459
EP - 478
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 1
M1 - 6121981
ER -