TY - JOUR
T1 - Mahalanobis encodings for visual categorization
AU - Matsuzawa, Tomoki
AU - Relator, Raissa
AU - Takei, Wataru
AU - Omachi, Shinichiro
AU - Kato, Tsuyoshi
N1 - Publisher Copyright: © 2015 Information Processing Society of Japan.
PY - 2015
Y1 - 2015
N2 - The design of the image representation is one of the most crucial factors in the performance of visual categorization. A common pipeline employed in most recent research for obtaining an image representation consists of two steps: the encoding step and the pooling step. In this paper, we introduce the Mahalanobis metric into two popular image patch encoding modules, Histogram Encoding and Fisher Encoding, which are used in the Bag-of-Visual-Word method and the Fisher Vector method, respectively. Moreover, for the proposed Fisher Vector method, a closed-form approximation of the Fisher Vector can be derived under the same assumption used in the original Fisher Vector, and the codebook is built without resorting to time-consuming EM (Expectation-Maximization) steps. Experimental evaluation on multi-class classification demonstrates the effectiveness of the proposed encoding methods.
AB - The design of the image representation is one of the most crucial factors in the performance of visual categorization. A common pipeline employed in most recent research for obtaining an image representation consists of two steps: the encoding step and the pooling step. In this paper, we introduce the Mahalanobis metric into two popular image patch encoding modules, Histogram Encoding and Fisher Encoding, which are used in the Bag-of-Visual-Word method and the Fisher Vector method, respectively. Moreover, for the proposed Fisher Vector method, a closed-form approximation of the Fisher Vector can be derived under the same assumption used in the original Fisher Vector, and the codebook is built without resorting to time-consuming EM (Expectation-Maximization) steps. Experimental evaluation on multi-class classification demonstrates the effectiveness of the proposed encoding methods.
KW - Bag-of-Visual-Word
KW - Fisher vector
KW - Mahalanobis metric
KW - Visual categorization
UR - http://www.scopus.com/inward/record.url?scp=84982821541&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84982821541&partnerID=8YFLogxK
U2 - 10.2197/ipsjtcva.7.69
DO - 10.2197/ipsjtcva.7.69
M3 - Article
AN - SCOPUS:84982821541
SN - 1882-6695
VL - 7
SP - 69
EP - 73
JO - IPSJ Transactions on Computer Vision and Applications
JF - IPSJ Transactions on Computer Vision and Applications
ER -