Pedestrian detection, which has wide applications in surveillance, autonomous driving, and robotics, plays a significant role in computer vision. Among pedestrian detection methods, stereo-based methods achieve accurate and efficient detection by exploiting depth and color information. However, many stereo-based systems fail to consider motion information, which is important for locating and detecting an object. For many pedestrian detection systems, adding extra data such as motion is one of the most effective ways to improve performance. Therefore, this thesis proposes a multi-cue pedestrian detection system that integrates optical-flow-based and stereo-based modules to combine motion, depth, and color information. In the proposed system, optical flow and disparity values are estimated from the frames obtained from a stereo camera. To obtain accurate pedestrian motion, ego-motion is compensated using motion clustering, an affine model, and RANSAC. The motion and depth information are then exploited for ROI generation. Finally, an SVM is trained on a combination of motion features and HOG features. Experimental results show that using high-accuracy optical flow along with depth and color information improves the performance of the multi-cue pedestrian detection system.
M.S. in Electrical Engineering, December 2015
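The ego-motion compensation step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the camera-induced background motion is approximated by one global 2D affine model, fitted with a basic RANSAC loop over sparse flow vectors, and it leaves out the motion clustering stage. All function names, the inlier threshold, and the iteration count are hypothetical choices made for illustration.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine fit: dst ~= [src, 1] @ params, params has shape (3, 2)."""
    A = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return params

def apply_affine(params, pts):
    """Map (N, 2) points through the affine model."""
    return np.hstack([pts, np.ones((len(pts), 1))]) @ params

def ransac_affine(src, dst, n_iters=200, thresh=1.0, seed=0):
    """Estimate the dominant affine motion between src and dst points.

    thresh is the inlier distance in pixels (an assumed value).
    Returns the refined model and a boolean inlier mask.
    """
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # Three correspondences are the minimal sample for an affine model.
        idx = rng.choice(len(src), size=3, replace=False)
        candidate = fit_affine(src[idx], dst[idx])
        err = np.linalg.norm(apply_affine(candidate, src) - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("RANSAC found too few inliers to fit an affine model")
    # Refit on all inliers of the best hypothesis.
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers

def residual_flow(points, flow, params):
    """Subtract the flow predicted by the ego-motion model from the measured flow."""
    ego_flow = apply_affine(params, points) - points
    return flow - ego_flow
```

Under these assumptions, points whose residual flow stays large after compensation are candidates for independently moving objects such as pedestrians, while background points end up with a residual close to zero.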
Recently, monocular depth estimation based on self-supervised learning has developed rapidly. However, existing self-supervised methods handle moving objects, occlusions, and large static areas poorly, and uncertain or vanishing depth estimates easily occur during inference. To address this problem, the model proposed in this thesis first explores consistency across video frames and builds a multi-frame model for depth estimation; second, it uses optical flow to generate a motion mask and applies an additional photometric loss to the masked regions. Experiments are carried out on the KITTI dataset. The proposed model outperforms the baseline model in quantitative results, and the depth maps show that scale uncertainty and incomplete depth are visibly reduced for moving objects and occlusions.
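The idea of a flow-derived motion mask with an extra photometric term, as described in the abstract above, can be sketched as follows. The abstract does not say how the mask is computed, so this sketch assumes one common heuristic: a pixel counts as moving when the estimated optical flow differs from the rigid flow implied by depth and camera motion by more than a threshold. The function names, the threshold, and the plain L1 photometric error are hypothetical and are not taken from the thesis.

```python
import numpy as np

def motion_mask(optical_flow, rigid_flow, thresh=1.0):
    """Boolean (H, W) mask of pixels whose flow disagrees with the rigid flow.

    Both inputs have shape (H, W, 2); thresh is in pixels (an assumed value).
    """
    diff = np.linalg.norm(optical_flow - rigid_flow, axis=-1)
    return diff > thresh

def masked_photometric_loss(target, warped, mask, eps=1e-8):
    """Mean absolute photometric error over the masked pixels only.

    target and warped are (H, W, C) images; mask is (H, W) boolean.
    """
    per_pixel = np.abs(target - warped).mean(axis=-1)
    return float((per_pixel * mask).sum() / (mask.sum() + eps))
```

In a training setup of this kind, the masked term would be added to the usual full-image photometric loss with some weight, so that moving regions receive extra supervision; the weighting is left out here because the abstract gives no values.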