Search results
(1 - 2 of 2)
- Title
- Distribution-aware Visual Semantic Understanding
- Creator
- Chen, Ying
- Date
- 2021
- Description
-
Understanding visual semantics, including change detection and semantic segmentation, is an essential task in many computer vision and image processing applications. Examples of visual semantic understanding in images include land cover monitoring, urban expansion evaluation, autonomous driving, and scene understanding. The goal is to assign appropriate pixel-wise semantic labels to images. Classical computer vision algorithms involve sophisticated semi-heuristic pre-processing steps and potentially manual interaction. In this thesis, I propose and evaluate end-to-end deep neural approaches for processing images that achieve better performance than existing approaches.

Supervised semantic segmentation has been widely studied and has achieved great success with deep learning. However, existing deep learning methods typically suffer from generalization issues, where a well-trained model may not work well on unseen samples from a different dataset. This is due to a distribution change, or domain shift, between the training and test sets that can degrade performance. Providing more labeled samples covering many possible variations can further improve the generalization of models, but acquiring labeled data is typically time-consuming, labor-intensive, and requires domain knowledge. To tackle this label scarcity bottleneck for supervised learning, we propose to apply unsupervised domain adaptation, semi-supervised learning, and semi-supervised domain adaptation to neural semantic segmentation.

The motivation behind unsupervised domain adaptation for semantic segmentation is to transfer learned knowledge from one or more source domains with sufficient labeled samples to a different but related target domain where labeled data is sparse or non-existent. The adaptation algorithm aims to learn a common representation space in which the distributions of the source and target domains are matched. In this way, we expect a classifier that works well in the source domain to also generalize well to the target domain. More specifically, we learn class-aware source-target distribution differences and transfer knowledge learned from labeled synthetic data in the source domain to unlabeled real data in the target domain.

In contrast to domain adaptation, semi-supervised semantic segmentation aims to utilize a large amount of unlabeled data to improve a semantic classifier trained on a small amount of labeled data from the same distribution. Specifically, the supervised segmentation model is trained together with an unsupervised objective, either by applying perturbations to the encoded states of the network instead of the input, or by using mask-based data augmentation to encourage consistent predictions on mixed samples. In this way, learned representations that capture many kinds of unseen variations in the unlabeled data benefit the supervised semantic classifier. We propose a mask-based data augmentation semi-supervised learning network that utilizes structural information from a variety of unlabeled examples to improve learning on a limited number of labeled examples.

Both unsupervised domain adaptation (UDA), with full source supervision but no target supervision, and semi-supervised learning (SSL), with partial supervision, have been shown to address the generalization problem to some extent.
While such methods are effective at aligning different feature distributions, their inability to exploit unlabeled data efficiently leads to an intra-domain discrepancy: the target domain is separated into two unaligned sub-distributions of source-aligned and target-aligned data. That is, enforcing partial alignment between the fully labeled source data and a few labeled target samples does not guarantee that the remaining unlabeled target samples will be aligned with the source feature clusters. Hence, I propose methods that combine the advantages of both UDA and SSL, termed semi-supervised domain adaptation (SSDA), with the goal of aligning cross-domain features as well as addressing the intra-domain discrepancy within the target domain. Specifically, I propose a simple yet effective semi-supervised domain adaptation approach that uses two-step domain adaptation to address both cross-domain and intra-domain shifts.
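As an aside on the mask-based consistency idea described in this abstract, the sketch below shows one common way such a scheme is implemented for semi-supervised segmentation. It is a minimal, hypothetical PyTorch example, not the thesis's actual network: `rand_box_mask`, `semi_supervised_step`, and the single-model pseudo-labeling are assumptions for illustration only, and only the mixing variant (not the encoded-state perturbation variant) is shown.

```python
# Minimal sketch of mask-based mixing consistency for semi-supervised
# segmentation (CutMix-style). Names and details are illustrative, not the
# thesis's actual implementation.
import torch
import torch.nn.functional as F

def rand_box_mask(h, w, ratio=0.5, device="cpu"):
    """Binary mask with a random rectangle of roughly `ratio` area set to 1."""
    bh, bw = int(h * ratio ** 0.5), int(w * ratio ** 0.5)
    top = torch.randint(0, h - bh + 1, (1,)).item()
    left = torch.randint(0, w - bw + 1, (1,)).item()
    mask = torch.zeros(1, 1, h, w, device=device)
    mask[..., top:top + bh, left:left + bw] = 1.0
    return mask

def semi_supervised_step(model, labeled, labels, unlabeled_a, unlabeled_b, lam=1.0):
    """One training step: supervised cross-entropy on labeled data plus a
    consistency loss asking the model to predict, on a mixed image, the mix
    of its own predictions on the two unmixed images."""
    # Supervised term on the small labeled set.
    sup_loss = F.cross_entropy(model(labeled), labels)

    # Pseudo-labels for the two unlabeled images (no gradient through them).
    with torch.no_grad():
        pseudo_a = model(unlabeled_a).argmax(dim=1)
        pseudo_b = model(unlabeled_b).argmax(dim=1)

    # Mix the two unlabeled images and their pseudo-labels with one mask.
    _, _, h, w = unlabeled_a.shape
    m = rand_box_mask(h, w, device=unlabeled_a.device)
    mixed = m * unlabeled_a + (1 - m) * unlabeled_b
    mixed_target = torch.where(m[:, 0].bool(), pseudo_a, pseudo_b)

    # Consistency term: prediction on the mixed image should match the
    # mixed pseudo-labels.
    cons_loss = F.cross_entropy(model(mixed), mixed_target)
    return sup_loss + lam * cons_loss
```

A teacher-student variant, in which an EMA copy of the model produces the pseudo-labels, is a common refinement of the same idea.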
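The two-step cross-domain and intra-domain adaptation described above could plausibly take a shape like the following, where the target set is ranked by prediction entropy and split into an "aligned" and an "unaligned" subset before a second alignment step. Everything here (the discriminator interface, the entropy ranking, `keep_ratio`) is an assumed sketch, not the dissertation's actual method.

```python
# Schematic two-step adaptation: step 1 aligns source and target features
# with a domain discriminator; step 2 splits the target set by prediction
# entropy and aligns the uncertain subset to the confident one.
import torch
import torch.nn.functional as F

def prediction_entropy(logits):
    """Per-image mean pixel-wise entropy of the softmax output."""
    p = F.softmax(logits, dim=1)
    ent = -(p * torch.log(p + 1e-8)).sum(dim=1)   # [B, H, W]
    return ent.mean(dim=(1, 2))                   # [B]

def adversarial_alignment_loss(discriminator, feats_a, feats_b):
    """GAN-style alignment: the discriminator separates domain A from B;
    the segmentation network is trained to fool it. (In practice the two
    losses are optimized in alternating steps with detached features.)"""
    d_a = discriminator(feats_a)
    d_b = discriminator(feats_b)
    d_loss = F.binary_cross_entropy_with_logits(d_a, torch.ones_like(d_a)) + \
             F.binary_cross_entropy_with_logits(d_b, torch.zeros_like(d_b))
    g_loss = F.binary_cross_entropy_with_logits(d_b, torch.ones_like(d_b))
    return d_loss, g_loss

def split_target_by_entropy(model, target_loader, keep_ratio=0.5):
    """Step 2 preparation: rank target images by entropy; the low-entropy
    portion acts as the 'aligned' pseudo-source, the rest as 'unaligned'."""
    scores, items = [], []
    with torch.no_grad():
        for images in target_loader:
            scores.append(prediction_entropy(model(images)))
            items.append(images)
    scores = torch.cat(scores)
    order = torch.argsort(scores)                 # low entropy first
    cut = int(len(order) * keep_ratio)
    images = torch.cat(items)
    return images[order[:cut]], images[order[cut:]]
```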
- Title
- Deep Learning and Computer Vision for Industrial Applications: Cellular Microscopic Image Analysis and Ultrasound Nondestructive Testing
- Creator
- Yuan, Yu
- Date
- 2022
- Description
-
For decades, researchers have sought to develop artificial intelligence (AI) systems that can help human beings with decision making, data analysis, and pattern recognition in applications where analytical methods are ineffective. In recent years, Deep Learning (DL) has proven to be an effective AI technique that can outperform other methods in applications such as computer vision, natural language processing, and autonomous driving. Realizing the potential of deep learning techniques, researchers have also started to apply deep learning to other industrial applications. Today, deep learning based models are used to innovate and accelerate automation, guidance, and decision making in various industries, including the automotive and pharmaceutical industries, finance, agriculture, and more. In this research, several important industrial applications of deep learning algorithms, in biomedicine and non-destructive testing, are introduced and analyzed.

The first biopharmaceutical application focuses on developing a deep learning based model to automate the visual inspection process in the Median Tissue Culture Infectious Dose (TCID50) assay. TCID50 is one of the most popular methods for viral quantification. An important step of TCID50 is to visually inspect each sample and decide whether it exhibits a cytopathic effect (CPE). Two novel models have been developed to detect CPE in microscopic images of cell cultures in 96-well plates. The first model consists of a convolutional neural network (CNN) and a support vector machine (SVM). The second model is a fully convolutional network (FCN) followed by morphological post-processing steps. The models are tested on four cell lines and achieve very high accuracy.

Another biopharmaceutical application developed for cellular microscopic images is clonal selection. Clonal selection is one of the mandatory steps in the cell line development process; it focuses on verifying the clonality of the cell culture. Researchers traditionally verify clonality by visually inspecting the microscopic images. In this work, a novel deep learning based model and workflow are developed to accelerate the process. The algorithm consists of multiple steps, including image analysis after incubation to detect cell colonies and verification of their clonality in the day-0 image. The results and common misclassification cases are shown in this thesis.

Image analysis methods are not the only technology advancing cellular image analysis in the biopharmaceutical industry. A new class of instruments is now used in the biopharmaceutical industry, enabling more opportunities for image analysis. To make the most of these new instruments, a convolutional neural network based architecture is used to perform accurate cell counting and cell morphology based segmentation. This analysis provides more insight into the cells at a very early stage of the characterization process in cell line development. The architecture and testing results are presented in this work. The proposed algorithm achieves very high accuracy in both applications, and the cell morphology based segmentation gives scientists a brand new way to predict the potential productivity of the cells.

The next part of this dissertation focuses on hardware implementations of deep learning based Ultrasonic Non-Destructive Testing (NDT) methods, which can be highly useful in flaw detection and classification applications.
With the help of a smart, mobile Non-Destructive Testing device, engineers can accurately detect and locate flaws inside materials without relying on high-performance computing resources. The first NDT application presents a hardware implementation of a deep learning algorithm on a field-programmable gate array (FPGA) for ultrasound flaw detection. The ultrasound flaw detection algorithm consists of a wavelet transform followed by a LeNet-inspired convolutional neural network called Ultra-LeNet. This work focuses on implementing the computationally demanding part of the algorithm, Ultra-LeNet, so that it can be used in the field, where high-performance computing resources (e.g., AWS) are not accessible. The implementation uses resource partitioning to design two dedicated pipelined accelerators for the convolutional layers and the fully connected layers, respectively. Both accelerators use loop unrolling, loop pipelining, and batch processing to maximize throughput. Comparison with other work shows that the implementation achieves higher hardware utilization efficiency.

The second NDT application also implements a deep learning based ultrasound flaw detection algorithm on an FPGA. Instead of Ultra-LeNet, the deep learning model used in this application is a meta-learning based Siamese network, which is capable of multi-class classification and can also classify a new class that does not appear in the training dataset with the help of automatically learned features. The hardware implementation is significantly different from that of the previous algorithm. To improve inference efficiency, the model is compressed with both pruning and quantization, and the FPGA implementation is specifically designed to accelerate the compressed CNN with high efficiency. The CNN model compression method and the hardware design are novel contributions of this work. A comparison against other compressed CNN accelerators is also presented.
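The CNN-plus-SVM model mentioned above for CPE detection can be sketched as follows. The pretrained ResNet-18 backbone, the RBF kernel, and the function names are illustrative assumptions, since the abstract does not specify the architecture or training details.

```python
# Minimal sketch of a CNN-feature + SVM classifier of the kind described for
# CPE detection: a convolutional backbone produces a feature vector per well
# image, and an SVM makes the CPE / no-CPE decision.
import torch
import torchvision
from sklearn.svm import SVC

# Backbone: a pretrained CNN with its classification head removed (assumed).
backbone = torchvision.models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = torch.nn.Identity()
backbone.eval()

@torch.no_grad()
def extract_features(images):
    """images: [N, 3, H, W] float tensor -> [N, 512] numpy feature matrix."""
    return backbone(images).cpu().numpy()

def train_cpe_classifier(train_images, train_labels):
    """Fit an SVM (RBF kernel) on CNN features; labels are 1 = CPE, 0 = no CPE."""
    feats = extract_features(train_images)
    clf = SVC(kernel="rbf", C=1.0)
    clf.fit(feats, train_labels)
    return clf

def predict_cpe(clf, images):
    return clf.predict(extract_features(images))
```

In practice the backbone would be trained or fine-tuned on the microscopy images themselves; the point here is only the split between CNN feature extraction and the SVM decision.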
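The pruning-and-quantization compression mentioned for the second FPGA accelerator is, in its generic form, roughly as follows; the sparsity target, bit width, and per-tensor scaling are placeholders rather than the dissertation's actual settings.

```python
# Illustrative sketch of the two compression steps commonly applied before an
# FPGA deployment: magnitude-based weight pruning followed by uniform
# fixed-point quantization.
import numpy as np

def prune_by_magnitude(weights, sparsity=0.75):
    """Zero out the smallest-magnitude weights until ~`sparsity` of them are 0."""
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)
    threshold = np.partition(flat, k)[k] if k < len(flat) else flat.max()
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

def quantize_uniform(weights, bits=8):
    """Symmetric uniform quantization to signed `bits`-bit integers plus a
    per-tensor scale, the form typically unpacked inside the accelerator."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(weights)) / qmax if np.any(weights) else 1.0
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

# Example: compress one convolutional kernel tensor.
w = np.random.randn(32, 16, 3, 3).astype(np.float32)
w_pruned, mask = prune_by_magnitude(w, sparsity=0.75)
w_q, scale = quantize_uniform(w_pruned, bits=8)
w_restored = w_q.astype(np.float32) * scale   # dequantized approximation
```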