DFANet

Testing the DFANet semantic segmentation network, based on the paper DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation; its main selling point is real-time performance.

Testing uses the Cityscapes dataset, which can be downloaded here.

Server and data preparation

Assume there is already a remote Docker server reachable at root@0.0.0.0 -p 9999.

Dependencies

pytorch==1.0.0
python==3.6
numpy
torchvision
matplotlib
opencv-python
tensorflow
tensorboardX

Install the missing system and Python packages:

apt install -y libsm6 libxext6
pip3 install opencv-python
pip3 install pyyaml
pip3 install tensorboardX

Check the PyTorch version:

import torch
print(torch.__version__)

If needed, install or upgrade PyTorch (Linux, pip, Python 3.6, CUDA 9):

pip3 install --upgrade pip
pip3 install --upgrade torch torchvision
Use the scp command to upload the local code and dataset to the server:
scp -P 9999 local_file root@0.0.0.0:remote_directory
Unzip the zip file:
apt-get update
apt-get install zip -y
unzip local_file
Download DFANet:
git clone https://github.com/huaifeng1993/DFANet.git
cd DFANet
Pretrained model

Open utils/preprocess_data.py and modify the dataset paths:

cityscapes_data_path = "/home/luohanjie/Documents/SLAM/data/cityscapes"
cityscapes_meta_path = "/home/luohanjie/Documents/SLAM/data/cityscapes/gtFine"

Run the script to generate the labels:

python3 utils/preprocess_data.py
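What this preprocessing step produces is a remapping of the raw Cityscapes label IDs onto the 19 training classes. The exact code lives in utils/preprocess_data.py; the sketch below shows the standard Cityscapes labelId-to-trainId convention only. The helper name `remap` is mine, and the choice of 19 as the ignore/background index is an assumption based on this repo using num_classes=20 (official Cityscapes tooling uses 255 for ignored pixels instead):

```python
# Sketch of the standard Cityscapes labelId -> trainId remapping that
# preprocessing scripts like utils/preprocess_data.py implement.
# Only the 19 valid classes keep an id in [0, 18].
ID_TO_TRAIN_ID = {
    7: 0,    # road
    8: 1,    # sidewalk
    11: 2,   # building
    12: 3,   # wall
    13: 4,   # fence
    17: 5,   # pole
    19: 6,   # traffic light
    20: 7,   # traffic sign
    21: 8,   # vegetation
    22: 9,   # terrain
    23: 10,  # sky
    24: 11,  # person
    25: 12,  # rider
    26: 13,  # car
    27: 14,  # truck
    28: 15,  # bus
    31: 16,  # train
    32: 17,  # motorcycle
    33: 18,  # bicycle
}
# Assumed here: with num_classes=20, everything else maps to class 19.
IGNORE_ID = 19

def remap(label_id):
    """Map a raw Cityscapes labelId to its train id, or the ignore index."""
    return ID_TO_TRAIN_ID.get(label_id, IGNORE_ID)
```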
main.py

Open main.py and modify the dataset paths:

train_dataset = DatasetTrain(cityscapes_data_path="/home/luohanjie/Documents/SLAM/data/cityscapes",
                             cityscapes_meta_path="/home/luohanjie/Documents/SLAM/data/cityscapes/gtFine/")

val_dataset = DatasetVal(cityscapes_data_path="/home/luohanjie/Documents/SLAM/data/cityscapes",
                         cityscapes_meta_path="/home/luohanjie/Documents/SLAM/data/cityscapes/gtFine/")
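Wrong paths here only surface as a confusing error deep inside the Dataset class, well after launch. It can be worth failing fast instead; a minimal sketch (`check_dataset_paths` is a hypothetical helper, not part of the repo):

```python
import os

def check_dataset_paths(*paths):
    """Raise immediately if any Cityscapes directory is missing,
    instead of hitting a cryptic error inside the Dataset class."""
    for p in paths:
        if not os.path.isdir(p):
            raise FileNotFoundError("Cityscapes path not found: %s" % p)
```

Call it with the same two directories passed to DatasetTrain/DatasetVal before constructing the datasets.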

2019.4.24: A function has been written to load a pretrained model trained on ImageNet-1k. The project for training the backbone can be downloaded from https://github.com/huaifeng1993/ILSVRC2012. Limited by my computing resources (only one RTX 2080), I trained the backbone on ILSVRC2012 for only 22 epochs, but this has a large impact on the results.

Since we do not have the ILSVRC2012 pretrained model, we need to turn the flag off:

net = dfanet(pretrained=False, num_classes=20)
ERROR: TypeError: __init__() got an unexpected keyword argument 'log_dir'

Open train.py and change the writer construction to:

writer = SummaryWriter(logdir=self.log_dir)
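The TypeError above comes from tensorboardX changing the spelling of this argument (log_dir vs logdir) between releases. Rather than hard-coding one spelling, a version-agnostic sketch can pick whichever keyword the installed class accepts (`make_writer` is my own helper, not part of the repo or of tensorboardX):

```python
import inspect

def make_writer(writer_cls, path):
    """Instantiate a SummaryWriter-like class with whichever of
    `logdir` / `log_dir` its __init__ actually accepts."""
    params = inspect.signature(writer_cls.__init__).parameters
    kwarg = "logdir" if "logdir" in params else "log_dir"
    return writer_cls(**{kwarg: path})
```

With this, train.py could use `writer = make_writer(SummaryWriter, self.log_dir)` and survive a tensorboardX upgrade.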
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm)

This error appears when the training code runs inside Docker on the server and the batch size is set too large: the DataLoader workers run out of shared memory, because Docker limits /dev/shm (64 MB by default). One fix is to set the DataLoader's num_workers to 0 [1]; alternatively, restart the container with a larger --shm-size.

Open main.py and modify:

train_loader = DataLoader(dataset=train_dataset,
                          batch_size=10, shuffle=True,
                          num_workers=0)
val_loader = DataLoader(dataset=val_dataset,
                        batch_size=10, shuffle=False,
                        num_workers=0)
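To see how much shared memory the container actually has before deciding between num_workers=0 and a bigger --shm-size, the /dev/shm mount can be inspected from Python with the standard library; a minimal Linux-only sketch:

```python
import shutil

def shm_usage(path="/dev/shm"):
    """Return (total, used, free) bytes of the shared-memory mount that
    multi-worker DataLoaders use to pass batches between processes."""
    return shutil.disk_usage(path)
```

On a default Docker container, total will report roughly 64 MB unless the container was started with --shm-size.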

Train

python3 main.py

  1. https://blog.csdn.net/hyk_1996/article/details/80824747