Text Detection by Faster R-CNN with Multiple Region Proposal Networks

Yoshito Nagaoka, Tomo Miyazaki, Yoshihiro Sugaya, Shinichiro Omachi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

18 Citations (Scopus)

Abstract

We propose a text detection method based on Faster R-CNN that can be trained consistently end to end. The original Faster R-CNN is an end-to-end CNN for fast and accurate object detection. We propose a novel architecture that adapts its object detection ability to the characteristics of text. Whereas the original Faster R-CNN generates regions of interest (RoIs) with a single region proposal network (RPN) using the feature map of the last convolutional layer, the proposed method generates RoIs with multiple RPNs using the feature maps of multiple convolutional layers. These multiresolution feature maps allow the method to detect text of various sizes simultaneously. To aggregate the RoIs, we introduce an RoI-merge layer, which effectively selects valid RoIs from the multiple RPNs. In addition, we propose a training strategy that realizes end-to-end training and specializes each RPN in a particular text region size. Experimental results on the ICDAR2013/2015 RRC test datasets show that the proposed Multi-RPN method improved detection scores while keeping almost the same detection speed, compared with the original Faster R-CNN and recent methods.
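
The abstract does not say how the RoI-merge layer chooses valid RoIs, so the following is only a minimal sketch of one plausible way to aggregate proposals from several RPNs. It assumes that the proposals are pooled, sorted by score, and filtered with greedy non-maximum suppression, which is a common choice in Faster R-CNN pipelines but is not confirmed here as the authors' rule. The names `merge_rois`, `iou_threshold`, and `top_k`, as well as the default values, are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
RoI = Tuple[Box, float]                  # (box, text score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_rois(rois_per_rpn: List[List[RoI]],
               iou_threshold: float = 0.7,
               top_k: int = 300) -> List[RoI]:
    """Hypothetical merge step: pool RoIs from all RPNs, then apply greedy NMS.

    This is an assumed selection rule, not the one from the paper.
    """
    # Pool the proposals of every RPN into one list, best score first.
    pooled = [roi for rois in rois_per_rpn for roi in rois]
    pooled.sort(key=lambda r: r[1], reverse=True)

    kept: List[RoI] = []
    for box, score in pooled:
        if len(kept) >= top_k:
            break
        # Keep a box only if it does not overlap a better box too strongly.
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept


# Toy proposals from two RPNs attached to different feature-map resolutions.
rpn_fine = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.6)]
rpn_coarse = [((1, 1, 11, 11), 0.8), ((0, 0, 100, 40), 0.7)]
merged = merge_rois([rpn_fine, rpn_coarse], iou_threshold=0.5)
```

In this toy input the two small boxes near the origin overlap heavily, so only the higher-scoring one survives, while the wide box from the coarser RPN is kept alongside it.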

Original language: English
Title of host publication: Proceedings - 7th International Workshop on Camera-Based Document Analysis and Recognition, CBDAR 2017
Publisher: IEEE Computer Society
Pages: 15-20
Number of pages: 6
ISBN (Electronic): 9781538635865
Publication status: Published - 2018 Jan 25
Event: 7th International Workshop on Camera-Based Document Analysis and Recognition, CBDAR 2017 - Kyoto, Japan
Duration: 2017 Nov 11 → …

Publication series

Name: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
Volume: 6
ISSN (Print): 1520-5363

Other

Other: 7th International Workshop on Camera-Based Document Analysis and Recognition, CBDAR 2017
Country/Territory: Japan
City: Kyoto
Period: 17/11/11 → …

Keywords

  • Faster R-CNN
  • Region Proposal Network
  • Text detection

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
