Semantic segmentation is an important but challenging task in computer vision because it aims to assign a category label to every pixel accurately. Applications such as autonomous driving, path navigation, image search engines, and augmented reality require accurate semantic analysis and efficient segmentation mechanisms. In this thesis, we propose multiple models to improve the performance of semantic segmentation. In the first part, we focus on single-task networks dedicated to semantic segmentation itself. We exploit context information by using mixed spatial pyramid pooling to extract dense context-embedded features in FCN-based semantic segmentation. We also propose a GAF module that generates a global context-based attention map to guide the shallow-layer feature maps toward better pixel localization. In the second part, we focus on a multi-task network that incorporates semantic segmentation to improve other computer vision tasks such as object detection. Specifically, we design a multi-task network, along with a learning strategy, that lets semantic segmentation and object detection assist each other, since the two tasks are highly correlated. We also include weakly-supervised multi-label semantic segmentation learning to deal with the shortage of high-quality training examples and to improve the performance of cross-domain object detection. In the third part, we focus on improving the performance of video panoptic segmentation, a task that unifies semantic segmentation and instance segmentation on video streams. We design a new ConvLSTM pyramid to transmit spatio-temporal contextual information in our video panoptic segmentation network. Specifically, we propose a modified ConvLSTM to generate temporal contextual information.
We also design an MSTPP module to obtain mixed spatio-temporal context-embedded feature maps. Experimental results on different datasets show that our proposed methods achieve better performance than state-of-the-art methods.