Search results
(1 - 5 of 5)
- Title
- REAL-TIME FACE DETECTION AND RECOGNITION SYSTEM IN COMPLEX BACKGROUNDS
- Creator
- Zhang, Xin
- Date
- 2015, 2015-07
- Description
- This report presents a fast and reliable system for real-time face detection and recognition in complex backgrounds. Most current face recognition systems identify faces under constrained conditions, such as constant lighting and a fixed background. In the real world, people need to be recognized in complex backgrounds under varying conditions, such as tilted head poses, different facial expressions, and dark or strong lighting. Meanwhile, because of the large number of real-time applications of face recognition, such as intelligent robots, unmanned vehicles, and security monitoring, recognition must be fast enough to satisfy real-time requirements. In this project, a fast and reliable system is designed to detect and recognize faces in real time under various conditions. Frames are obtained directly from a VGA camera, and image preprocessing, face detection, collection, and recognition are applied to the frames sequentially. Local binary patterns (LBP) and Haar features are used for face detection and eye detection. The local binary pattern encodes every pixel of the image for texture extraction, which is several times faster than Haar feature detection. The adaptive boosting (AdaBoost) algorithm is used to select the best weak classifiers, and a cascading method divides the selected classifiers into several stages to enhance the detection rate. An affine transformation is applied to unify the size of the detected facial images and align the two eyes to the desired positions, improving recognition accuracy. A 3×3 Gaussian filter is designed to remove noise from the preprocessed facial images. Principal component analysis (PCA) is used for face recognition, since it can quickly identify high-dimensional face images using only a few principal components.
M.S. in Electrical Engineering, July 2015
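The LBP texture encoding mentioned in this abstract can be sketched in a few lines: each pixel's 3×3 neighborhood is thresholded against its center to form an 8-bit code. This is a generic illustration, not the report's implementation; the clockwise neighbor ordering is an assumption.

```python
def lbp_code(patch):
    """Compute the 8-bit local binary pattern code for a 3x3 patch.

    Each of the 8 neighbors is compared against the center pixel;
    neighbors >= center contribute a set bit. The clockwise ordering
    starting at the top-left corner is an illustrative choice.
    """
    center = patch[1][1]
    # Neighbors read clockwise around the center pixel.
    neighbors = [patch[0][0], patch[0][1], patch[0][2],
                 patch[1][2], patch[2][2], patch[2][1],
                 patch[2][0], patch[1][0]]
    code = 0
    for bit, value in enumerate(neighbors):
        if value >= center:
            code |= 1 << bit
    return code
```

Because only the ordering of intensities matters, the code is invariant to monotonic brightness changes, which is what makes LBP robust to lighting variation.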
- Title
- EMBEDDED SYSTEM DESIGN FOR TRAFFIC SIGN RECOGNITION USING MACHINE LEARNING ALGORITHMS
- Creator
- Han, Yan
- Date
- 2016, 2016-12
- Description
- Traffic sign recognition (TSR), an important component of intelligent vehicle systems, has been an active research area and has been investigated vigorously over the last decade. It is an important step toward introducing intelligent vehicles into current road transportation systems. Based on image processing and machine learning technologies, TSR systems are being developed by many manufacturers and have been installed on vehicles as part of driving assistance systems in recent years. Traffic signs are designed and placed so as to be easily distinguished from their surroundings by the human eye. Hence, an intelligent system that can identify these signs as well as a human must address many challenges; here, "good" means both accurate and fast. Developing a reliable, real-time, and robust TSR system is therefore the main motivation for this dissertation. Multiple TSR approaches based on computer vision and machine learning are introduced and implemented on different hardware platforms. The proposed TSR algorithms comprise two parts: sign detection based on color and shape analysis, and sign classification based on machine learning techniques including nearest neighbor search, support vector machines, and deep neural networks. Target hardware platforms include the Xilinx ZedBoard FPGA and the NVIDIA Jetson TX1, which provides GPU acceleration. Overall, on a well-known benchmark suite, 96% detection accuracy is achieved while executing at 1.6 frames per second on the GPU board.
Ph.D. in Computer Engineering, December 2016
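Of the three classifier families named in this abstract, nearest neighbor search is the simplest to sketch. The snippet below is a generic 1-NN classifier over feature vectors, not the dissertation's code; the labels and feature layout are invented for illustration.

```python
import math

def nearest_neighbor_classify(sample, templates):
    """Classify a feature vector by 1-nearest-neighbor search.

    templates: list of (label, feature_vector) pairs, e.g. features
    extracted from reference sign images (hypothetical data here).
    Returns the label of the template at minimum Euclidean distance.
    """
    best_label, best_dist = None, math.inf
    for label, vec in templates:
        dist = math.dist(sample, vec)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```

A real TSR pipeline would classify features produced by the detection stage; linear-scan 1-NN like this is easy to implement on embedded targets but scales poorly with template count, which is one reason SVMs and neural networks are also considered.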
- Title
- HARDWARE/SOFTWARE CO-DESIGN PARTITIONING ALGORITHM FOR MACHINE VISION APPLICATIONS
- Creator
- Gonnot, Thomas
- Date
- 2017, 2017-05
- Description
- Advancements in FPGA technology now allow machine vision to be implemented in hardware components rather than processors for increased efficiency. Combining hardware and software implementations, however, can yield even more efficient results by exploiting the advantages of both. This leads to the problem of partitioning machine vision algorithms between hardware and software. The hardware/software partitioning problem is NP-hard: a candidate solution can be checked in polynomial time, but the time needed to find an optimal solution is not predictable. Automated methods based on genetic algorithms or discrete particle swarm optimization allow a designer to implement computer vision algorithms without worrying about the hardware/software partitioning. Their reliance on randomness to explore different partitions, however, means that the optimum may not be reached and that the processing time cannot be predicted. This dissertation introduces a model that decomposes image processing and computer vision algorithms into a set of elementary blocks, each of which is assigned one or more configurations. A configuration can be either hardware or software and is linked to the corresponding resource utilization and performance. A procedure is also introduced to allocate the blocks to either hardware or software, and a cost function is defined to evaluate the quality of the generated design. The implementation of the model and procedure allows any image processing algorithm to be partitioned in polynomial time by checking various implementations and selecting the optimal solution. This thesis includes two test cases used to evaluate the efficiency of the method. The scale-invariant feature transform (SIFT) is used to demonstrate the viability of the partitioning results on an algorithm containing multiple image convolution operations in parallel. A neural network, on the other hand, is used to demonstrate the performance of the procedure when a machine vision algorithm contains many blocks. Finally, this dissertation presents a set of machine vision applications, such as object tracking, object recognition, optical character recognition, facial recognition, and assistance for the visually impaired. The proposed model and procedure could be included in the design flow of hardware/software co-design tools and provide a library of image processing blocks ready to be implemented. This would allow image processing and computer vision designers to implement any algorithm efficiently as a hardware/software co-design without needing to know how to partition it.
Ph.D. in Electrical Engineering, May 2017
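The cost-function idea in this abstract can be sketched as follows. This toy version evaluates each block's hardware-or-software assignment against an FPGA area budget and, for illustration only, finds the best feasible assignment by brute force; it does not reproduce the dissertation's polynomial-time procedure, and the block attributes (`hw_time`, `hw_area`, `sw_time`) are invented names.

```python
from itertools import product

def partition_cost(blocks, assignment, max_area):
    """Evaluate one hardware/software partition of processing blocks.

    blocks: dict name -> {"hw_time": t, "hw_area": a, "sw_time": t}
    assignment: dict name -> "hw" or "sw"
    Returns total execution time, or None when the area budget is exceeded.
    """
    total_time, total_area = 0.0, 0
    for name, cfg in blocks.items():
        if assignment[name] == "hw":
            total_time += cfg["hw_time"]
            total_area += cfg["hw_area"]
        else:
            total_time += cfg["sw_time"]
    if total_area > max_area:
        return None  # infeasible: FPGA resources exhausted
    return total_time

def best_partition(blocks, max_area):
    """Brute-force search over all 2^n assignments (toy scale only)."""
    best = None
    names = list(blocks)
    for choices in product(("hw", "sw"), repeat=len(names)):
        assignment = dict(zip(names, choices))
        cost = partition_cost(blocks, assignment, max_area)
        if cost is not None and (best is None or cost < best[0]):
            best = (cost, assignment)
    return best
```

The exhaustive loop is exponential in the number of blocks, which is exactly the scaling problem the dissertation's block model and allocation procedure are meant to avoid.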
- Title
- DEPTH MAP ENHANCEMENT FOR REAL-TIME 3D RECONSTRUCTION
- Creator
- Lee, Kitae
- Date
- 2015, 2015-07
- Description
- In this paper, we present a novel depth map enhancement method for real-time 3D reconstruction with the Microsoft Kinect. The Kinect sensor is relatively affordable and capable of generating high-resolution color images and depth maps of a scene at real-time rates. However, owing to its low cost, the output exhibits several artifacts: the generated depth map contains many holes (missing information around object boundaries) and is misaligned with the color image. The objective of 3D reconstruction is to recreate a real scene as accurately as possible within a virtual three-dimensional space on a computer, and 3D reconstruction algorithms depend heavily on the quality of the depth map; such a poor depth map cannot be used directly for real-time 3D reconstruction. We present a novel multi-step, upsampling-based anisotropic diffusion algorithm that operates on the depth map and color image generated by the Kinect. This method outperforms existing bilateral filtering and the original anisotropic diffusion filter in terms of filling holes, sharpening object boundaries, and aligning the depth map with the color image. We compare the performance of these filters. A meaningful comparison of two algorithms is difficult using the Kinect sensor output directly, since each observation of the same scene yields different sensed values. To circumvent this problem and achieve an accurate comparison, we used a dataset from the Computer Vision Group at the Technical University of Munich (TUM). The dataset and the scripts for the quantitative error metrics are available at http://vision.in.tum.de/data/datasets/rgbd-dataset. We also parallelize our implementation using GPU computing to satisfy real-time constraints.
M.S. in Electrical Engineering, July 2015
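The anisotropic diffusion at the heart of the proposed enhancement can be sketched with one classic Perona-Malik iteration, which smooths depth values within surfaces while an edge-stopping function suppresses diffusion across strong depth discontinuities. This is the textbook filter, not the thesis's multi-step upsampling variant; the `kappa` and `lam` defaults are illustrative.

```python
import math

def diffuse_step(depth, kappa=30.0, lam=0.2):
    """One Perona-Malik anisotropic diffusion step on a depth map.

    depth: 2D list of floats. Interior pixels move toward their
    4-neighbors, weighted by the edge-stopping conductance g(), so
    small depth differences (noise, holes being filled from nearby
    values) are smoothed while large jumps (object boundaries) are
    preserved. Borders are left unchanged for simplicity.
    """
    h, w = len(depth), len(depth[0])
    g = lambda d: math.exp(-(d / kappa) ** 2)  # edge-stopping function
    out = [row[:] for row in depth]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            c = depth[y][x]
            flux = 0.0
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                d = depth[ny][nx] - c
                flux += g(abs(d)) * d
            out[y][x] = c + lam * flux
    return out
```

With `lam <= 0.25` the explicit update is stable; in practice the step is iterated several times, and hole pixels are seeded from valid neighbors before diffusion.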
- Title
- STEREO-BASED DEPTH MAP PROCESSING: ESTIMATION AND REFINEMENT
- Creator
- Loghman, Maziar
- Date
- 2016, 2016-12
- Description
- During the past decade, research in 3D video has become a hot topic owing to advancements in both hardware and software. Among the different methods proposed for representing 3D data, the multi-view video plus depth (MVD) format has gained a lot of attention. Most 3D algorithms rely on a per-pixel depth representation of the scene called a depth map. Depth maps are very useful for rendering virtual views and have led to advancements in 3D compression algorithms. Generating an accurate and dense depth map is an important prerequisite for many 3D video applications. In this thesis, we address the following major problems in MVD:
* Depth map estimation
* Depth map refinement
* Depth map coding
To generate an accurate depth map, we propose a method based on the Census transform with adaptive window patterns and semi-global optimization. A modified cross-based cost aggregation technique is proposed, which helps to compute a more reliable depth map. To further enhance the quality of the generated depth map, a novel multi-resolution anisotropic diffusion based algorithm is presented. The proposed depth refinement algorithm computes a dense depth map in which the holes have been filled and the object boundaries sharpened. The final part of the research concerns depth map coding. In depth map coding, a considerable amount of time is spent on the mode decision process for every block of depth pixels; for real-time purposes, however, the mode selection step can be partially skipped. In this thesis, we propose a novel depth intra-coding scheme for 3D video coding based on the HEVC standard. The core idea of the proposed method is motivated by the fact that depth maps have specific characteristics that distinguish them from color images. By analyzing the reference depth maps based on the homogeneity of different regions, the DMM full-RD search is skipped for certain blocks and the mode is selected based on previous similar tree blocks. In this way, the time complexity of the encoding process is significantly reduced.
Ph.D. in Electrical Engineering, December 2016
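The Census transform used for depth estimation can be sketched minimally: each pixel in a window is compared to the center pixel to build a bit signature, and the matching cost between two views is the Hamming distance between signatures. This is a generic fixed-window sketch, not the thesis's adaptive-window version.

```python
def census(patch):
    """Census transform of a square patch (odd side length).

    Produces a bit signature with one bit per non-center pixel,
    set when that pixel is darker than the center.
    """
    n = len(patch)
    cy = cx = n // 2
    center = patch[cy][cx]
    bits = 0
    for y in range(n):
        for x in range(n):
            if (y, x) == (cy, cx):
                continue  # the center pixel contributes no bit
            bits = (bits << 1) | (1 if patch[y][x] < center else 0)
    return bits

def hamming(a, b):
    """Stereo matching cost: bits that differ between two signatures."""
    return bin(a ^ b).count("1")
```

Because only intensity orderings matter, two patches differing by a constant brightness offset produce identical signatures, which makes the cost robust to radiometric differences between the two cameras of a stereo pair.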