SSD Tensorflow Implementation
An SSD implementation in TensorFlow has long been open-sourced on GitHub, but it does not explain in detail how to train on your own data, so now that I am familiar with it I am writing down some notes here for future reference.
After cloning the project, first run the demo to see how SSD detection performs. Unzip the checkpoint of the pre-trained SSD model (in the ./checkpoints folder):
unzip ssd_300_vgg.ckpt.zip
Then go back to the project root and launch the demo notebook:
jupyter notebook notebooks/ssd_notebook.ipynb
If you get an error saying datasets cannot be found, make sure the datasets folder contains an __init__.py file and that the notebook adds the project root to sys.path.
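A minimal sketch of that sys fix, assuming the notebook's working directory is notebooks/ (as it is when opened via the command above), so the project root is one level up:
import os
import sys
# make the project root (which contains the datasets package) importable
sys.path.append(os.path.abspath('..'))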
In the notebook you can see the detection results: given a photo, the targets are located fairly accurately. Of course, this is only inference; how do we train a corresponding model with our own data?
Data Preparation
First, download the Pascal VOC 2007 data from the official website and convert it into the TF-Records format that TensorFlow can consume. Create a tfrecords folder in the project root and point DATASET_DIR at the directory holding the downloaded VOC2007 dataset:
DATASET_DIR=./VOC2007/
OUTPUT_DIR=./tfrecords
python tf_convert_data.py \
--dataset_name=pascalvoc \
--dataset_dir=${DATASET_DIR} \
--output_name=voc_2007_train \
--output_dir=${OUTPUT_DIR}
It is worth noting that, to train on our own dataset, we first need to organize the data into the Pascal VOC 2007 directory structure (a quick way to create this skeleton with Python is sketched right after the list):
- Annotations
- ImageSets
- JPEGImages
- SegmentationClass
- SegmentationObject
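A quick way to create this skeleton for your own data; the MYDATA root name is only an illustration:
import os

root = './MYDATA'  # illustrative root folder for your own dataset
for sub in ['Annotations', 'ImageSets/Main', 'JPEGImages',
            'SegmentationClass', 'SegmentationObject']:
    os.makedirs(os.path.join(root, sub))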
The train and val splits contain 5012 images in total; looking through them, the images come in different sizes and are already annotated. After the conversion, all the jpg images are packed into TensorFlow tf-records files; a quick sanity check is shown below:
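A small check that the conversion actually produced the shards the later scripts expect (the voc_2007_train_*.tfrecord pattern matches the error message the evaluation script prints further below):
import glob
# shards are written into OUTPUT_DIR with names matching voc_2007_train_*.tfrecord
print(sorted(glob.glob('./tfrecords/voc_2007_train_*.tfrecord')))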
Use the following command to evaluate the model pre-trained on Pascal VOC 2007:
DATASET_DIR=./tfrecords
EVAL_DIR=./logs/
CHECKPOINT_PATH=./checkpoints/VGG_VOC0712_SSD_300x300_ft_iter_120000.ckpt
python eval_ssd_network.py \
--eval_dir=${EVAL_DIR} \
--dataset_dir=${DATASET_DIR} \
--dataset_name=pascalvoc_2007 \
--dataset_split_name=test \
--model_name=ssd_300_vgg \
--checkpoint_path=${CHECKPOINT_PATH} \
--batch_size=1
The following error was reported:
ValueError: No data files found in ./tfrecords/voc_2007_test_*.tfrecord
So we need to download the test data from the Pascal VOC website as well and convert it to tf-records following the same procedure:
DATASET_DIR=./VOC2007/test/
OUTPUT_DIR=./tfrecords
python tf_convert_data.py \
--dataset_name=pascalvoc \
--dataset_dir=${DATASET_DIR} \
--output_name=voc_2007_test \
--output_dir=${OUTPUT_DIR}
Then came another error:
Traceback (most recent call last):
File "eval_ssd_network.py", line 346, in <module>
tf.app.run()
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "eval_ssd_network.py", line 240, in main
tp_fp_metric = tfe.streaming_tp_fp_arrays(num_gbboxes, tp, fp, rscores)
File "/home/binli/SSD-Tensorflow/tf_extended/metrics.py", line 151, in streaming_tp_fp_arrays
name=scope)
File "/home/binli/SSD-Tensorflow/tf_extended/metrics.py", line 178, in streaming_tp_fp_arrays
v_nobjects = _create_local('v_num_gbboxes', shape=[], dtype=tf.int64)
File "/home/binli/SSD-Tensorflow/tf_extended/metrics.py", line 56, in _create_local
validate_shape=validate_shape)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 185, in __call__
return cls._variable_v2_call(*args, **kwargs)
TypeError: _variable_v2_call() got an unexpected keyword argument 'collections'
After downgrading TensorFlow in the conda environment to version 1.6:
conda install tensorflow-1.6.0-py27_0.tar.bz2
another error popped up:
Traceback (most recent call last):
File "eval_ssd_network.py", line 346, in <module>
tf.app.run()
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "eval_ssd_network.py", line 320, in main
session_config=config)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/evaluation.py", line 212, in evaluate_once
config=session_config)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/training/evaluation.py", line 188, in _evaluate_once
eval_step_value = _get_latest_eval_step_value(eval_ops)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/training/evaluation.py", line 76, in _get_latest_eval_step_value
with ops.control_dependencies(update_ops):
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 4746, in control_dependencies
return get_default_graph().control_dependencies(control_inputs)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 4446, in control_dependencies
c = self.as_graph_element(c)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3459, in as_graph_element
return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3548, in _as_graph_element_locked
types_str))
TypeError: Can not convert a tuple into a Tensor or Operation.
The fix is to change eval_ssd_network.py as follows:
# add this import near the top (compiler.ast is a Python 2 only module)
from compiler.ast import flatten
# and change the two eval_op arguments from
eval_op=list(names_to_updates.values()),
# to
eval_op=flatten(list(names_to_updates.values())),
Then another error appeared...
NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./checkpoints/VGG_VOC0712_SSD_300x300_ft_iter_120000.ckpt
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_INT64, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
This requires us to put the two files extracted from VGG_VOC0712_SSD_300x300_ft_iter_120000.ckpt.zip into a folder named VGG_VOC0712_SSD_300x300_ft_iter_120000.ckpt and place that folder under checkpoints.
CHECKPOINT_PATH=./checkpoints/VGG_VOC0712_SSD_300x300_ft_iter_120000.ckpt
python eval_ssd_network.py \
--eval_dir=${EVAL_DIR} \
--dataset_dir=${DATASET_DIR} \
--dataset_name=pascalvoc_2007 \
--dataset_split_name=test \
--model_name=ssd_300_vgg \
--checkpoint_path=${CHECKPOINT_PATH} \
--batch_size=1
Then the evaluation finally runs 🤣.
However, that was too naive: after the mAP was printed it crashed again, as expected:
Traceback (most recent call last):
File "eval_ssd_network.py", line 346, in <module>
tf.app.run()
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "eval_ssd_network.py", line 320, in main
session_config=config)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/evaluation.py", line 212, in evaluate_once
config=session_config)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/training/evaluation.py", line 212, in _evaluate_once
session.run(eval_ops, feed_dict)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 658, in __exit__
self._close_internal(exception_type)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/training/monitored_session.py", line 690, in _close_internal
h.end(self._coordinated_creator.tf_sess)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/contrib/training/python/training/evaluation.py", line 311, in end
summary_str = session.run(self._summary_op, self._feed_dict)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 905, in run
run_metadata_ptr)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1137, in _run
feed_dict_tensor, options, run_metadata)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1355, in _do_run
options, run_metadata)
File "/home/binli/anaconda2/envs/py2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1374, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Duplicate tag ssd_losses/cross_entropy_pos/value_1 found in summary inputs
[[Node: Merge/MergeSummary = MergeSummary[N=93, _device="/job:localhost/replica:0/task:0/device:CPU:0"](pascalvoc_2007_data_provider/parallel_read/filenames/fraction_of_32_full, pascalvoc_2007_data_provider/parallel_read/fraction_of_2_full, batch/fraction_of_5_full, ssd_losses/cross_entropy_pos/value_1, ssd_losses/cross_entropy_pos/value_1, ssd_losses/localization/value_1, ssd_losses/localization/value_1, ssd_losses/cross_entropy_neg/value_1, ssd_losses/cross_entropy_neg/value_1, AP_VOC07/1, AP_VOC07/1, AP_VOC12/1, AP_VOC12/1, AP_VOC07/2, AP_VOC07/2, AP_VOC12/2, AP_VOC12/2, AP_VOC07/3, AP_VOC07/3, AP_VOC12/3, AP_VOC12/3, AP_VOC07/4, AP_VOC07/4, AP_VOC12/4, AP_VOC12/4, AP_VOC07/5, AP_VOC07/5, AP_VOC12/5, AP_VOC12/5, AP_VOC07/6, AP_VOC07/6, AP_VOC12/6, AP_VOC12/6, AP_VOC07/7, AP_VOC07/7, AP_VOC12/7, AP_VOC12/7, AP_VOC07/8, AP_VOC07/8, AP_VOC12/8, AP_VOC12/8, AP_VOC07/9, AP_VOC07/9, AP_VOC12/9, AP_VOC12/9, AP_VOC07/10, AP_VOC07/10, AP_VOC12/10, AP_VOC12/10, AP_VOC07/11, AP_VOC07/11, AP_VOC12/11, AP_VOC12/11, AP_VOC07/12, AP_VOC07/12, AP_VOC12/12, AP_VOC12/12, AP_VOC07/13, AP_VOC07/13, AP_VOC12/13, AP_VOC12/13, AP_VOC07/14, AP_VOC07/14, AP_VOC12/14, AP_VOC12/14, AP_VOC07/15, AP_VOC07/15, AP_VOC12/15, AP_VOC12/15, AP_VOC07/16, AP_VOC07/16, AP_VOC12/16, AP_VOC12/16, AP_VOC07/17, AP_VOC07/17, AP_VOC12/17, AP_VOC12/17, AP_VOC07/18, AP_VOC07/18, AP_VOC12/18, AP_VOC12/18, AP_VOC07/19, AP_VOC07/19, AP_VOC12/19, AP_VOC12/19, AP_VOC07/20, AP_VOC07/20, AP_VOC12/20, AP_VOC12/20, AP_VOC07/mAP, Print, AP_VOC12/mAP, Print_1)]]
Let's set this aside for now and deal with it if it shows up again with my own data... 😓
Model Training
Fine-tuning existing SSD checkpoints
DATASET_DIR=./tfrecords
TRAIN_DIR=./logs/
CHECKPOINT_PATH=./checkpoints/ssd_300_vgg.ckpt
python train_ssd_network.py \
--train_dir=${TRAIN_DIR} \
--dataset_dir=${DATASET_DIR} \
--dataset_name=pascalvoc_2012 \
--dataset_split_name=train \
--model_name=ssd_300_vgg \
--checkpoint_path=${CHECKPOINT_PATH} \
--save_summaries_secs=60 \
--save_interval_secs=600 \
--weight_decay=0.0005 \
--optimizer=adam \
--learning_rate=0.001 \
--batch_size=32
If you want to train a model of your own from scratch, just drop the CHECKPOINT_PATH setting:
DATASET_DIR=./tfrecords
TRAIN_DIR=./logs/
python train_ssd_network.py \
--train_dir=${TRAIN_DIR} \
--dataset_dir=${DATASET_DIR} \
--dataset_name=pascalvoc_2007 \
--dataset_split_name=train \
--model_name=ssd_300_vgg \
--save_summaries_secs=60 \
--save_interval_secs=600 \
--weight_decay=0.0005 \
--optimizer=adam \
--learning_rate=0.001 \
--batch_size=32
After getting nvcc and nvidia-smi sorted out, running the training command still produced an error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Default MaxPoolingOp only supports NHWC on device type CPU
Training on My Own Data
Converting the Data
The data lives in DATASET_DIR=/mnt/archive/binli/oil_thermostat/. The first steps are to rename the files and then split them into training, validation, and test sets.
As an aside, how do you kill whatever is occupying a port on a Mac?
sudo lsof -nPi :yourPortNumber
# then
sudo kill -9 yourPIDnumber
nvidia-smi is provided by CUDA and the NVIDIA driver, so I tracked down and tried installing cuda_10.0.130_410.48_linux.run and NVIDIA-Linux-x86_64-410.78.run.
That failed again, complaining that gcc was missing, so I downloaded gcc-4.8.5-7.tar.bz2 and installed it offline with conda, which produced the following error:
Can't install the gcc package unless your system has crtXXX.o.
Because the server had no internet access I kept installing from offline packages, which kept going wrong; once I finally got it online and reinstalled the system gcc, everything was fine.
Then install CUDA and the driver. The second installer is actually unnecessary, since the first one already bundles the driver. After installation, configure the paths manually:
export PATH=$PATH:/usr/local/cuda-10.0/bin
export CUDADIR=/usr/local/cuda-10.0
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-10.0/lib64
Then run nvcc and nvidia-smi to check that the environment and the driver are installed correctly.
Training a Model on My Own Data
Converting the Data to Pascal VOC Format
The first step is to set up the directory structure:
- Annotations
- ImageSets
- JPEGImages
- SegmentationClass
- SegmentationObject
1. Renaming
Then rename the files so that the xml and jpg names match. To save time I used some existing code:
import os

class BatchRename(object):
    """Rename every jpg in self.path to a zero-padded sequential name (000001.jpg, ...)."""
    def __init__(self):
        self.path = '../'      # folder containing the images
        self.type = '.jpg'

    def rename(self):
        filelist = os.listdir(self.path)
        total_num = len(filelist)
        i = 1
        for item in filelist:
            if item.endswith(self.type):
                n = 6 - len(str(i))  # zero-pad the index to 6 digits
                src = os.path.join(os.path.abspath(self.path), item)
                dst = os.path.join(os.path.abspath(self.path), '0' * n + str(i) + self.type)
                try:
                    os.rename(src, dst)
                    print 'converting %s to %s ...' % (src, dst)
                    i = i + 1
                except OSError:
                    continue
        print 'total %d files, converted %d jpgs' % (total_num, i - 1)

if __name__ == '__main__':
    demo = BatchRename()
    demo.rename()
When renaming, it is best to do it while all the xml and jpg files are still in the same folder, so their names stay in sync.
2. Splitting the Data
Since the data is already annotated, the next step is to generate the train/val/test list files so the model can find the corresponding samples:
import os
import random

# paths to the VOC-style Annotations and ImageSets folders
xmlfilepath = r'/Users/Bin/Downloads/oil/Annotations'
ImageSetsPath = r"/Users/Bin/Downloads/oil/ImageSets/"

trainval_percent = 0.7   # fraction of all samples used for train + val
train_percent = 0.7      # fraction of trainval used for train

total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

print("train and val size", tv)
print("train size", tr)

ftrainval = open(os.path.join(ImageSetsPath, 'Main/trainval.txt'), 'w')
ftest = open(os.path.join(ImageSetsPath, 'Main/test.txt'), 'w')
ftrain = open(os.path.join(ImageSetsPath, 'Main/train.txt'), 'w')
fval = open(os.path.join(ImageSetsPath, 'Main/val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'   # strip the .xml extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
3. Modifying the Source Files
Modify datasets/pascalvoc_common.py
Change the class labels to those of your own data:
# before
# VOC_LABELS = {
# 'none': (0, 'Background'),
# 'aeroplane': (1, 'Vehicle'),
# 'bicycle': (2, 'Vehicle'),
# 'bird': (3, 'Animal'),
# 'boat': (4, 'Vehicle'),
# 'bottle': (5, 'Indoor'),
# 'bus': (6, 'Vehicle'),
# 'car': (7, 'Vehicle'),
# 'cat': (8, 'Animal'),
# 'chair': (9, 'Indoor'),
# 'cow': (10, 'Animal'),
# 'diningtable': (11, 'Indoor'),
# 'dog': (12, 'Animal'),
# 'horse': (13, 'Animal'),
# 'motorbike': (14, 'Vehicle'),
# 'person': (15, 'Person'),
# 'pottedplant': (16, 'Indoor'),
# 'sheep': (17, 'Animal'),
# 'sofa': (18, 'Indoor'),
# 'train': (19, 'Vehicle'),
# 'tvmonitor': (20, 'Indoor'),
# }
# after changing
VOC_LABELS = {
    'none': (0, 'Background'),
    'oil_thermostat': (1, 'oil_thermostat'),
    'white_indicator': (2, 'indicator'),
    'red_indicator': (3, 'indicator'),
}
Modify datasets/pascalvoc_to_tfrecords.py
Change lines 83-84 to match your image type and read mode:
# Read the image file.
filename = directory + DIRECTORY_IMAGES + name + '.jpg'
image_data = tf.gfile.FastGFile(filename, 'rb').read()
At line 67 you can also change how many images go into each tfrecord file:
# TFRecords convertion parameters.
RANDOM_SEED = 4242
SAMPLES_PER_FILES = 100
Modify nets/ssd_vgg_300.py
Change the number of classes on lines 95-96 to match your data; the value is your class count plus one (for background):
# before
# num_classes=21,
# no_annotation_label=21,
# after: change both to 4 (3 classes + background)
num_classes=4,
no_annotation_label=4,
Modify ./train_ssd_network.py
Adjust the training configuration accordingly, including the number of iterations (line 154), the batch size, GPU memory usage, and so on.
Modify the number of classes in ./eval_ssd_network.py as well.
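For reference, a rough sketch of the flags involved in both scripts; I am recalling the flag names from the repo, so double-check them in your copy of the files, and treat the values below as illustrative only (they can also simply be overridden on the command line):
import tensorflow as tf

# train_ssd_network.py -- adjust to your data and GPU budget
tf.app.flags.DEFINE_integer('max_number_of_steps', 50000, 'The maximum number of training steps.')
tf.app.flags.DEFINE_integer('batch_size', 16, 'The number of samples in each batch.')
tf.app.flags.DEFINE_float('gpu_memory_fraction', 0.8, 'GPU memory fraction to use.')

# eval_ssd_network.py -- number of classes = your classes + background
tf.app.flags.DEFINE_integer('num_classes', 4, 'Number of classes to use in the dataset.')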
Modify the dataset statistics in datasets/pascalvoc_2007.py:
TRAIN_STATISTICS = {
    'none': (0, 0),
    'pantograph': (1000, 1000),
}
TEST_STATISTICS = {
    'none': (0, 0),
    'pantograph': (1000, 1000),
}
SPLITS_TO_SIZES = {
    'train': 500,
    'test': 500,
}
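For the oil_thermostat labels defined earlier, the entries would look roughly like the sketch below. The per-class (images, objects) counts and the split sizes are placeholders that must be replaced with your actual numbers; if your copy of the file also defines NUM_CLASSES, it counts foreground classes only (here 3):
TRAIN_STATISTICS = {
    'none': (0, 0),
    'oil_thermostat': (350, 350),    # placeholder (images, objects) counts
    'white_indicator': (350, 400),
    'red_indicator': (350, 400),
}
# TEST_STATISTICS follows the same pattern
SPLITS_TO_SIZES = {
    'train': 350,   # must match the number of names in train.txt
    'test': 150,
}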
4. Training
Leave out the checkpoint to train from scratch:
DATASET_DIR=./tfrecords
TRAIN_DIR=./logs/
python train_ssd_network.py \
--train_dir=${TRAIN_DIR} \
--dataset_dir=${DATASET_DIR} \
--dataset_name=pascalvoc_2007 \
--dataset_split_name=train \
--model_name=ssd_300_vgg \
--save_summaries_secs=60 \
--save_interval_secs=600 \
--weight_decay=0.0005 \
--optimizer=adam \
--learning_rate=0.001 \
--batch_size=32
5. Testing
DataLossError: Unable to open table file ../checkpoints/model.ckpt-2938.ckpt: Failed precondition: ../checkpoints/model.ckpt-2938.ckpt: perhaps your file is in a different file format and you need to use a different restore operator?
The fix for this is embarrassingly simple: just drop the trailing .ckpt from the checkpoint path; there is no need to create a new folder this time.
However, running it further brought up yet another problem:
Cannot feed value of shape (1080, 1920, 4) for Tensor u'Placeholder_5:0', which has shape '(?, ?, 3)'
It turned out the png images being read have four channels, RGB plus an alpha channel, so make the following change in the notebook:
if len(img.shape) > 2 and img.shape[2] == 4:
    # convert the image from RGBA to RGB
    img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)
With that, the errors were gone, but the output contained no detections at all 🤷♀️! My guess was that training was simply insufficient, after all there were only 5000 iterations, so I wanted to try 50000. When I was about to, an OOM error appeared: the Jupyter notebook I had open for testing the trained model was already straining the memory of my single GPU, so having only one card is painful. Reducing the batch size a bit is also an option, at the cost of longer training.
When nothing is detected, increasing the number of iterations while lowering the learning rate is worth a try. Last night I raised the iterations to 50000; the loss oscillated quite a bit and still nothing was detected. This morning I tried lowering the learning rate, still nothing. In the end the real problem turned out to be in the test call itself: lowering select_threshold in process_image did the trick.
rclasses, rscores, rbboxes = process_image(img, select_threshold=0.2)
You can watch the loss curve in TensorBoard:
tensorboard --logdir=./logs --port=9998
When retraining a model to recognize the dial class, testing the trained model raised another error:
InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [8] rhs shape= [12]
Surprisingly, the fix was to add the num_classes parameter when training; I suspect some of the source edits above were not applied everywhere:
python train_ssd_network.py \
--train_dir=./logs/ \
--dataset_dir=./tfrecords \
--dataset_name=pascalvoc_2007 \
--dataset_split_name=train \
--model_name=ssd_300_vgg \
--save_summaries_secs=60 \
--save_interval_secs=600 \
--weight_decay=0.0005 \
--optimizer=adam \
--learning_rate=0.001 \
--batch_size=32 \
--ignore_missing_vars=True \
--num_classes=2 \
--checkpoint_exclude_scopes=ssd_300_vgg/conv6,ssd_300_vgg/conv7,ssd_300_vgg/block8,ssd_300_vgg/block9,ssd_300_vgg/block10,ssd_300_vgg/block11,ssd_300_vgg/block4_box,ssd_300_vgg/block7_box,ssd_300_vgg/block8_box,ssd_300_vgg/block9_box,ssd_300_vgg/block10_box,ssd_300_vgg/block11_box
With batch_size = 2 the loss seems to oscillate a lot?! Yet the results turned out unexpectedly good; whether it really oscillates should be judged in TensorBoard. When the dataset is small, a small batch_size still seems to be the way to go.
Optimization
Detecting the dial and the pointer directly from the annotated data worked very poorly: with relatively few images, the predicted boxes had very low confidence and were tiny, essentially drawn at random.
So I retrained the model to detect the dial alone, and its accuracy was extremely high, essentially close to 1. I then retrained another model to detect the pointer alone, and pure pointer detection was very poor. The idea, then, was to crop the dial region out of the image using the near-perfect dial detections and train the pointer detector on those crops. The result: the confidence is not very high, but the boxes mostly do capture the pointer.
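A minimal sketch of this two-stage idea, assuming the demo notebook's helpers: process_image stands for the dial model's detection call, process_pointer_image is a hypothetical equivalent bound to the pointer model, boxes come back sorted by score, and coordinates follow the notebook's relative (ymin, xmin, ymax, xmax) convention:
# stage 1: detect the dial on the full frame
rclasses, rscores, rbboxes = process_image(img, select_threshold=0.5)

# crop the top-scoring dial box (relative coordinates -> pixels)
h, w = img.shape[:2]
ymin, xmin, ymax, xmax = rbboxes[0]
dial_crop = img[int(ymin * h):int(ymax * h), int(xmin * w):int(xmax * w)]

# stage 2: run the pointer detector on the cropped dial only
p_classes, p_scores, p_bboxes = process_pointer_image(dial_crop, select_threshold=0.2)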
Overall result: the absolute error stays within ±4, with an RMS error of roughly 1.87.
Recording
With import matplotlib.image as mpimg, saving images via mpimg.imsave looks much better than cv2.imwrite!
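One caveat from my side (an assumption about the pipeline, not something from the repo): cv2 loads images in BGR order, so if the array came from cv2.imread, flip the channels before handing it to mpimg.imsave. The file names here are only examples:
import cv2
import matplotlib.image as mpimg

img = cv2.imread('frame.jpg')                   # OpenCV returns BGR
mpimg.imsave('frame_out.jpg', img[:, :, ::-1])  # reverse channels to RGB before saving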
Processing the xml Files
Modifying the xml:
# coding: utf8
import os
import xml.etree.ElementTree as ET

class ChangeXML(object):
    """Remove every <object> from the VOC annotation xmls except the class we want to keep."""
    def __init__(self):
        super(ChangeXML, self).__init__()
        self.path = '../Annotations/'
        self.type = '.xml'

    def drop(self, except_item=''):
        file_list = os.listdir(self.path)
        for file in file_list:
            if not file.endswith(self.type):
                continue
            tree = ET.parse(self.path + file)
            root = tree.getroot()
            # findall() returns a list, so removing children while looping is safe
            # (iterating root.iter('object') while removing can skip elements)
            for obj in root.findall('object'):
                if obj.find('name').text == 'oil_thermostat':
                    # if obj.find('name').text == '变压器用油面温控器'.decode('utf-8'):
                    continue
                root.remove(obj)
            tree.write(self.path + file, encoding='UTF-8')

if __name__ == '__main__':
    cXML = ChangeXML()
    cXML.drop()
To batch-rename .jpeg extensions to .jpg:
rename 's/.jpeg/.jpg/' *.jpeg