Training Keras Faster RCNN on WIDER FACE for face detection

Published: 2023-11-14 14:26:13

Dataset download

WIDER FACE dataset download link: http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/

Keras Faster RCNN training source code

https://github.com/jiaka/faster_rcnn_keras_wider_face.git

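Converting the labels to VOC format

Unzip the downloaded label archive, wider_face_split, and open wider_face_train_bbx_gt.txt. The labels have the following form:

0--Parade/0_Parade_marchingband_1_799.jpg (image path)
21 (number of faces in this image)
78 221 7 8 2 0 0 0 0 0 (bounding box of the first face)
78 238 14 17 2 0 0 0 0 0 (bounding box of the second face)

Bounding box fields:
x1, y1, w, h, blur, expression, illumination, invalid, occlusion, pose (x coordinate of the top-left corner, y coordinate of the top-left corner, box width, box height, blur level, ...; see the readme file for the remaining fields)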
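The conversion from this box format to VOC corner coordinates can be sketched as follows. This is only an illustration: the helper name parse_wider_box is hypothetical and not part of the original script, and it assumes each box line holds the ten space-separated integers listed above.

```python
def parse_wider_box(line):
    """Parse one WIDER FACE box line into VOC-style corner coordinates.

    Hypothetical helper for illustration; assumes the field order
    x1, y1, w, h, blur, expression, illumination, invalid, occlusion, pose.
    """
    values = [int(v) for v in line.split()]
    x1, y1, w, h = values[:4]
    return {
        "xmin": x1,
        "ymin": y1,
        "xmax": x1 + w,   # VOC stores the bottom-right corner, not width
        "ymax": y1 + h,   # and not height
        "invalid": bool(values[7]),  # boxes flagged invalid can be skipped
    }


box = parse_wider_box("78 221 7 8 2 0 0 0 0 0")
print(box)
```

For the first sample box above, x1=78 and w=7 give xmax=85, and y1=221 and h=8 give ymax=229.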

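The VOC2012 dataset layout (excluding the image segmentation parts) uses three directories: Annotations (one XML file per image), ImageSets/Main (text files listing image IDs), and JPEGImages (the images themselves).

Code for converting the WIDER FACE labels to VOC2012 format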

from skimage import io
import shutil
import os

# Header of a VOC-style annotation: image index, width, height, depth.
headstr = """\
<annotation>
    <folder>VOC2007</folder>
    <filename>%06d.jpg</filename>
    <source>
        <database>My Database</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>flickr</image>
        <flickrid>NULL</flickrid>
    </source>
    <owner>
        <flickrid>NULL</flickrid>
        <name>company</name>
    </owner>
    <size>
        <width>%d</width>
        <height>%d</height>
        <depth>%d</depth>
    </size>
    <segmented>0</segmented>
"""

# One <object> entry per face: class name, xmin, ymin, xmax, ymax.
objstr = """\
    <object>
        <name>%s</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>%d</xmin>
            <ymin>%d</ymin>
            <xmax>%d</xmax>
            <ymax>%d</ymax>
        </bndbox>
    </object>
"""

tailstr = """\
</annotation>
"""


def all_path(filename):
    return os.path.join('widerface', filename)


def writexml(idx, head, bbxes, tail):
    filename = all_path("Annotations/%06d.xml" % idx)
    with open(filename, "w") as f:
        f.write(head)
        for bbx in bbxes:
            # Convert (x1, y1, w, h) to (xmin, ymin, xmax, ymax).
            f.write(objstr % ('face', bbx[0], bbx[1], bbx[0] + bbx[2], bbx[1] + bbx[3]))
        f.write(tail)


def clear_dir():
    # Remove any previous output, then recreate the VOC directory layout.
    for sub in ('Annotations', 'ImageSets', 'JPEGImages'):
        if os.path.exists(all_path(sub)):
            shutil.rmtree(all_path(sub))
    os.makedirs(all_path('Annotations'))
    os.makedirs(all_path('ImageSets/Main'))
    os.makedirs(all_path('JPEGImages'))


def excute_datasets(idx, datatype):
    list_path = all_path('ImageSets/Main/' + datatype + '.txt')
    gt_path = 'wider_face_split/wider_face_' + datatype + '_bbx_gt.txt'
    with open(list_path, 'a') as f, open(gt_path, 'r') as f_bbx:
        while True:
            filename = f_bbx.readline().strip()
            if not filename:
                break
            print(filename)
            src = 'widerface/WIDER_' + datatype + '/images/' + filename
            im = io.imread(src)
            # Grayscale images have no channel axis, so fall back to depth 1.
            depth = im.shape[2] if im.ndim == 3 else 1
            head = headstr % (idx, im.shape[1], im.shape[0], depth)
            nums = int(f_bbx.readline().strip())
            bbxes = []
            # Assumption: an image with 0 faces is still followed by one
            # all-zero box line, so at least one line is always consumed.
            for ind in range(max(nums, 1)):
                bbx = [int(v) for v in f_bbx.readline().split()]
                # Keep only boxes whose "invalid" flag (index 7) is 0.
                if nums > 0 and bbx[7] == 0:
                    bbxes.append(bbx)
            writexml(idx, head, bbxes, tailstr)
            shutil.copyfile(src, all_path('JPEGImages/%06d.jpg' % idx))
            f.write('%06d\n' % idx)
            idx += 1
    return idx


if __name__ == '__main__':
    clear_dir()
    idx = 1
    idx = excute_datasets(idx, 'train')
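After running the conversion script above, it can be useful to read an annotation back and confirm the boxes look right. The following is a minimal sketch using the standard library's xml.etree.ElementTree; the helper name read_voc_boxes and the sample XML (including its image size) are made up for illustration and are not part of the original script.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_text):
    """Return a list of (name, xmin, ymin, xmax, ymax) from VOC-style XML text.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall("object"):
        bnd = obj.find("bndbox")
        coords = tuple(int(bnd.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.findtext("name"),) + coords)
    return boxes


# Made-up sample in the same shape as the files written by writexml.
sample = """<annotation>
    <filename>000001.jpg</filename>
    <size><width>640</width><height>480</height><depth>3</depth></size>
    <object>
        <name>face</name>
        <bndbox><xmin>78</xmin><ymin>221</ymin><xmax>85</xmax><ymax>229</ymax></bndbox>
    </object>
</annotation>"""

print(read_voc_boxes(sample))
```

To check a real file, the same function can be given the contents of a file under widerface/Annotations, for example open(path).read().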

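Attached: the processed WIDER FACE dataset in VOC2012 format can be downloaded from https://pan.baidu.com/s/1n1NKd0DCWTqhmDrVAMJfCw (extraction code: q2dj)

Training results

Because the model was trained for only about 30 epochs, the predictions are not yet very good. One of the better results was shown here (image not included).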
