Changqian Yu is a second-year graduate student majoring in Computer Vision at Huazhong University of Science and Technology. He is pursuing his Master's degree under the supervision of Assoc. Prof. Changxin Gao and Prof. Nong Sang (2016-2019). He works as a research intern at Megvii (Face++) Inc., mentored by Dr. Gang Yu and supervised by Dr. Jian Sun. His research interests lie in Computer Vision and Artificial Intelligence, with a focus on semantic segmentation.
[ Résumé ]
News
[Jul. 2018] One paper accepted to ECCV 2018 in München, Germany!
[Apr. 2018] I presented a poster at the Vision and Learning Seminar (VALSE 2018) in Dalian, Liaoning Province, China.
[Feb. 2018] Our paper DFN is accepted to CVPR 2018 in Salt Lake City, Utah!
[Jul. 2017] I joined Megvii (Face++) as a Research Intern. During this time, my research has focused mainly on semantic segmentation.
Most existing methods of semantic segmentation still suffer from two kinds of challenges: intra-class inconsistency and inter-class indistinction. To tackle these two problems, we propose a Discriminative Feature Network (DFN), which contains two sub-networks: a Smooth Network and a Border Network. Specifically, to handle the intra-class inconsistency problem, we design the Smooth Network with a Channel Attention Block and global average pooling to select the more discriminative features. Furthermore, we propose the Border Network to make the bilateral features of a boundary distinguishable via deep semantic boundary supervision. Based on our proposed DFN, we achieve state-of-the-art performance: 86.2% mean IoU on PASCAL VOC 2012 and 80.3% mean IoU on the Cityscapes dataset.
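The core idea behind the Channel Attention Block can be illustrated with a minimal pure-Python sketch: global average pooling summarizes each channel, and a sigmoid gate derived from that summary reweights the channel's responses. The function name and the toy input below are illustrative, not the paper's actual implementation.

```python
import math

def channel_attention(features):
    """Reweight each channel of a C x H x W feature map (nested lists)
    by a sigmoid gate computed from its global average (a toy sketch
    of global average pooling + channel-wise gating)."""
    gated = []
    for channel in features:                      # channel: H x W grid
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)          # global average pooling
        gate = 1.0 / (1.0 + math.exp(-mean))      # sigmoid gate in (0, 1)
        gated.append([[v * gate for v in row] for row in channel])
    return gated

# Toy 2-channel, 2x2 feature map: a strongly activated channel keeps
# most of its response, while a weakly activated one is suppressed.
feats = [[[4.0, 4.0], [4.0, 4.0]],
         [[-4.0, -4.0], [-4.0, -4.0]]]
out = channel_attention(feats)
```

In the paper the gate is learned from features rather than being a plain sigmoid of the mean, but the reweighting mechanism sketched here is the same: channels judged discriminative are kept, the rest are attenuated.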
Semantic segmentation requires both rich spatial information and a sizeable receptive field. However, modern approaches usually compromise spatial resolution to achieve real-time inference speed, which leads to poor performance. In this paper, we address this dilemma with a novel Bilateral Segmentation Network (BiSeNet). We first design a Spatial Path with a small stride to preserve spatial information and generate high-resolution features. Meanwhile, a Context Path with a fast downsampling strategy is employed to obtain a sufficient receptive field. On top of the two paths, we introduce a new Feature Fusion Module to combine their features efficiently. The proposed architecture strikes a balance between speed and segmentation performance on the Cityscapes, CamVid, and COCO-Stuff datasets. Specifically, for a 2048×1024 input, we achieve 68.4% mean IoU on the Cityscapes test set at 105 FPS on one NVIDIA Titan XP card, which is significantly faster than existing methods with comparable performance.
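The resolution/receptive-field tradeoff behind the two paths can be made concrete with a little stride arithmetic. The strides below are illustrative assumptions (a small stride such as 1/8 for the Spatial Path, an aggressive one such as 1/32 at the bottom of the Context Path), not necessarily the paper's exact configuration.

```python
def feature_resolution(width, height, stride):
    """Spatial size of a feature map after downsampling by `stride`."""
    return width // stride, height // stride

# For a 2048x1024 input: the Spatial Path keeps high-resolution features,
# while the Context Path downsamples fast for a large receptive field.
# (Strides here are assumed for illustration.)
spatial = feature_resolution(2048, 1024, 8)    # high-resolution features
context = feature_resolution(2048, 1024, 32)   # large receptive field
```

The Feature Fusion Module then combines the two streams, so the network pays the cost of high resolution only on the shallow Spatial Path, which is what makes real-time inference feasible.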