Problem Background
Source code: https://github.com/BLKStone/EasyPyPR
Using the edge-detection-based (Sobel operator) plate localization module PlateLocater from the previous post, we can obtain a series of 136×36 candidate images that may be license plates. The next task is to train an SVM model that tells plate images apart from non-plate images.
Training Data
In the project's train/svm directory, the has subdirectory contains a large number of images that do contain a plate (blue and yellow plates), while the no subdirectory contains a large number of non-plate images, as shown below:
(Figure: sample images from the has directory)
(Figure: sample images from the no directory)
Feature Extraction
Here we use EasyPR's histogram feature. Concretely: read the 136×36 image, convert it to grayscale (cvtColor), binarize it with Otsu's method (OTSU), then count the non-zero pixels along each column and each row. Concatenating the resulting 136×1 column-count matrix with the 36×1 row-count matrix gives a 172×1 matrix, which is finally normalized by its maximum value to produce the feature vector.
I suspect this kind of feature does not necessarily capture the difference between plate and non-plate images very well; if you know a better feature extraction scheme, feel free to leave a comment.
The concrete calls are as follows. First, grayscale conversion and Otsu binarization:
```python
import cv2

def getHistogramFeatures(img):
    # grayscale, then Otsu binarization
    imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    retval, imgThres = cv2.threshold(imgGray, 0, 255,
                                     cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    features = getProjectFeatures(imgThres)
    return features
```
Next, count the pixels whose grayscale value exceeds a threshold along each column and each row, then normalize:
```python
import numpy as np

def ProjectedHistogram(img, project_type):
    threshold = 20
    # project_type == 0: one count per column; otherwise one count per row
    if project_type == 0:
        size = img.shape[1]
    else:
        size = img.shape[0]
    mhist = np.zeros((1, size))  # must be allocated after size is known

    for i in range(0, size):
        if project_type == 0:
            oneLine = img[:, i]
        else:
            oneLine = img[i, :]
        mhist[0, i] = countOfBigValue(oneLine, project_type, threshold)

    # normalize by the maximum count
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(mhist)
    mhist = np.float64(mhist)
    mhist = mhist / max_val
    return mhist
```
The helper that counts, in a single row or column, the pixels whose grayscale value exceeds the threshold:
```python
def countOfBigValue(oneLine, project_type, threshold):
    # the original branched on project_type, but both branches were identical,
    # so a single loop suffices
    count = 0
    for pixel in oneLine:
        if pixel > threshold:
            count += 1
    return count
```
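The getProjectFeatures helper called in getHistogramFeatures is not listed in this post. Based on the feature description above (136 per-column counts plus 36 per-row counts, concatenated into one 172-dimensional vector), it presumably looks something like the sketch below; the project_type constants are my assumption, not confirmed by the source:

```python
import numpy as np

# Assumed convention: 0 selects the per-column projection, 1 the per-row one.
VERTICAL, HORIZONTAL = 0, 1

def getProjectFeatures(imgThres):
    vhist = ProjectedHistogram(imgThres, VERTICAL)    # shape (1, 136)
    hhist = ProjectedHistogram(imgThres, HORIZONTAL)  # shape (1, 36)
    # concatenate into a single (1, 172) feature vector
    return np.hstack([vhist, hhist])
```

One small difference from the prose: since ProjectedHistogram normalizes each histogram by its own maximum, the concatenated vector here is normalized per direction rather than over all 172 values at once.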
Data Preprocessing
We read the data from the directories described under Training Data above, labeling plate images 1 and non-plate images 0.
```python
import os
import cv2
import numpy as np
import PlateJudger

# platePath = 'train/svm/has'
# loadTrainData(platePath) -> traindata, label
def loadPositiveTrainData(platePath):
    traindata = np.ones((1, 172))
    label = np.array([1])
    isFirst = True
    # walk every file under the directory
    for root, dirs, files in os.walk(platePath):
        for file in files:
            path = os.path.join(root, file)
            print 'loading....', path
            imgPlate = cv2.imread(path, cv2.IMREAD_COLOR)
            feature = PlateJudger.getHistogramFeatures(imgPlate)
            if isFirst:
                traindata = feature
                isFirst = False
            else:
                traindata = np.vstack([traindata, feature])
                label = np.hstack([label, 1])
    return traindata, label
```
Loading the non-plate samples works almost identically; only the label handling changes:
```python
label = np.hstack([label, 0])
```
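Filling that in, here is a sketch of the negative-sample loader under the stated assumption that it mirrors loadPositiveTrainData:

```python
# hypothetical mirror of loadPositiveTrainData, with label 0 throughout
def loadNegativeTrainData(platePath):
    traindata = np.ones((1, 172))
    label = np.array([0])
    isFirst = True
    for root, dirs, files in os.walk(platePath):
        for file in files:
            path = os.path.join(root, file)
            imgPlate = cv2.imread(path, cv2.IMREAD_COLOR)
            feature = PlateJudger.getHistogramFeatures(imgPlate)
            if isFirst:
                traindata = feature
                isFirst = False
            else:
                traindata = np.vstack([traindata, feature])
                label = np.hstack([label, 0])
    return traindata, label
```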
First pass: read all the sample data and cache it to disk:
```python
# joblib caches numpy arrays to disk
# (older scikit-learn: from sklearn.externals import joblib; newer: import joblib)
import joblib

def preLoadPicture():
    platePath = 'train/svm/has'
    traindata, label = loadPositiveTrainData(platePath)
    joblib.dump(traindata, 'data/data_has.pkl')
    joblib.dump(label, 'data/label_has.pkl')

    platePath = 'train/svm/no'
    traindata, label = loadNegativeTrainData(platePath)
    joblib.dump(traindata, 'data/data_no.pkl')
    joblib.dump(label, 'data/label_no.pkl')
```
Splitting into Training and Test Sets
```python
def dataPreProcess():
    # 70% of each class for training, the remaining 30% for testing
    rate = 0.7
    label_has = joblib.load('data/label_has.pkl')
    data_has = joblib.load('data/data_has.pkl')
    label_no = joblib.load('data/label_no.pkl')
    data_no = joblib.load('data/data_no.pkl')

    slice_index = int(data_has.shape[0] * rate)
    data_has_train = data_has[0:slice_index, :]
    label_has_train = label_has[0:slice_index]
    data_has_test = data_has[slice_index:, :]
    label_has_test = label_has[slice_index:]

    slice_index = int(data_no.shape[0] * rate)
    data_no_train = data_no[0:slice_index, :]
    label_no_train = label_no[0:slice_index]
    data_no_test = data_no[slice_index:, :]
    label_no_test = label_no[slice_index:]

    data_train = np.vstack([data_has_train, data_no_train])
    label_train = np.hstack([label_has_train, label_no_train])
    data_test = np.vstack([data_has_test, data_no_test])
    label_test = np.hstack([label_has_test, label_no_test])

    print 'test set size:', data_test.shape[0]

    joblib.dump(data_train, 'data/data_train.pkl')
    joblib.dump(label_train, 'data/label_train.pkl')
    joblib.dump(data_test, 'data/data_test.pkl')
    joblib.dump(label_test, 'data/label_test.pkl')
```
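As an aside, current scikit-learn can produce an equivalent shuffled split in a few lines. A minimal sketch, assuming the cached arrays from preLoadPicture have been loaded (stratify keeps the class ratio identical in both splits):

```python
from sklearn.model_selection import train_test_split

data_all = np.vstack([data_has, data_no])
label_all = np.hstack([label_has, label_no])
data_train, data_test, label_train, label_test = train_test_split(
    data_all, label_all, test_size=0.3, stratify=label_all, random_state=0)
```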
K-fold Cross-Validation
```python
from sklearn import svm

def KFolderCrossValidation():
    clf = svm.SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.1,
                  degree=0.1, gamma=1.0, kernel='rbf', max_iter=-1,
                  probability=False, random_state=None, shrinking=True,
                  tol=0.001, verbose=True)
    K = 5
    data_train = joblib.load('data/data_train.pkl')
    label_train = joblib.load('data/label_train.pkl')

    # shuffle samples and labels with the same permutation
    np.random.seed(0)
    indices = np.random.permutation(data_train.shape[0])
    data_train = data_train[indices, :]
    label_train = label_train[indices]

    data_folds = np.array_split(data_train, K)
    label_folds = np.array_split(label_train, K)

    scores = list()
    for i in range(0, K):
        # fold i is held out for validation; the other K-1 folds train the model
        x_train = list(data_folds)
        x_test = x_train.pop(i)
        x_train = np.concatenate(x_train)
        y_train = list(label_folds)
        y_test = y_train.pop(i)
        y_train = np.concatenate(y_train)

        clf.fit(x_train, y_train)
        evaluateModel(clf)
        joblib.dump(clf, 'model/svm' + str(i) + '.pkl')
        scores.append(clf.score(x_test, y_test))
    print scores
```
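The manual fold loop can also be delegated to scikit-learn. A sketch using cross_val_score from current versions, which clones and refits the estimator once per fold and returns the five validation scores:

```python
from sklearn.model_selection import cross_val_score

clf = svm.SVC(C=1.0, gamma=1.0, kernel='rbf')
scores = cross_val_score(clf, data_train, label_train, cv=5)
print(scores)
```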
Model Evaluation Metrics
The naming convention used here:
ptrue_rtrue means predicted label 1, real label 1 (a true positive);
ptrue_rfalse means predicted label 1, real label 0 (a false positive);
pfalse_rtrue means predicted label 0, real label 1 (a false negative);
pfalse_rfalse means predicted label 0, real label 0 (a true negative).
The problem with my training run is that pfalse_rtrue (false negatives) is far too large on the test set.
```python
def evaluateModel(clf):
    data_test = joblib.load('data/data_test.pkl')
    label_test = joblib.load('data/label_test.pkl')
    predict = clf.predict(data_test)
    testset_size = predict.shape[0]

    # confusion-matrix counts: p = predicted label, r = real label
    ptrue_rtrue = 0.
    ptrue_rfalse = 0.
    pfalse_rtrue = 0.
    pfalse_rfalse = 0.
    for i in range(0, testset_size):
        if label_test[i] == 1 and predict[i] == 1:
            ptrue_rtrue += 1
        elif label_test[i] == 1 and predict[i] == 0:
            pfalse_rtrue += 1
        elif label_test[i] == 0 and predict[i] == 1:
            ptrue_rfalse += 1
        elif label_test[i] == 0 and predict[i] == 0:
            pfalse_rfalse += 1

    print 'ptrue_rtrue:', int(ptrue_rtrue)
    print 'ptrue_rfalse:', int(ptrue_rfalse)
    print 'pfalse_rtrue:', int(pfalse_rtrue)
    print 'pfalse_rfalse', int(pfalse_rfalse)

    precise = ptrue_rtrue / (ptrue_rtrue + ptrue_rfalse)
    recall = ptrue_rtrue / (ptrue_rtrue + pfalse_rtrue)
    print 'precise:', precise
    print 'recall:', recall
    Fscore = 2 * (precise * recall) / (precise + recall)
    print 'Fscore:', Fscore
    return
```
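These hand-rolled counters are exactly a 2×2 confusion matrix, so sklearn.metrics can produce the same numbers directly; a minimal sketch:

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

predict = clf.predict(data_test)
# rows are real labels, columns are predicted labels, ordered [0, 1]
print(confusion_matrix(label_test, predict))
print(precision_score(label_test, predict))  # precise above
print(recall_score(label_test, predict))     # recall above
print(f1_score(label_test, predict))         # Fscore above
```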
Log

```
[LibSVM]...*.*
optimization finished, obj = -884.703345, rho = -0.322080
nSV = 1997, nBSV = 764
Total nSV = 1997
ptrue_rtrue: 5
ptrue_rfalse: 0
pfalse_rtrue: 416
pfalse_rfalse 653
precise: 1.0
recall: 0.0118764845606
Fscore: 0.0234741784038
[LibSVM]...*
optimization finished, obj = -901.123734, rho = -0.286911
nSV = 1999, nBSV = 790
Total nSV = 1999
ptrue_rtrue: 5
ptrue_rfalse: 0
pfalse_rtrue: 416
pfalse_rfalse 653
precise: 1.0
recall: 0.0118764845606
Fscore: 0.0234741784038
[LibSVM]...*.*
optimization finished, obj = -896.907917, rho = -0.292929
nSV = 1994, nBSV = 784
Total nSV = 1994
ptrue_rtrue: 4
ptrue_rfalse: 0
pfalse_rtrue: 417
pfalse_rfalse 653
precise: 1.0
recall: 0.00950118764846
Fscore: 0.0188235294118
[LibSVM]...*.*
optimization finished, obj = -899.204787, rho = -0.283542
nSV = 1994, nBSV = 785
Total nSV = 1994
ptrue_rtrue: 5
ptrue_rfalse: 0
pfalse_rtrue: 416
pfalse_rfalse 653
precise: 1.0
recall: 0.0118764845606
Fscore: 0.0234741784038
[LibSVM]...*.*
optimization finished, obj = -883.073401, rho = -0.316142
nSV = 1994, nBSV = 767
Total nSV = 1994
ptrue_rtrue: 4
ptrue_rfalse: 0
pfalse_rtrue: 417
pfalse_rfalse 653
precise: 1.0
recall: 0.00950118764846
Fscore: 0.0188235294118
[0.57799999999999996, 0.63600000000000001, 0.622, 0.63200000000000001, 0.58199999999999996]
```
Tips on Support Vector Machines
The scikit-learn documentation says that NuSVC and SVC are essentially the same formulation with different parameterizations: one is tuned through nu, the other through C.
How many support vectors you end up with generally depends on the nature of the data and on the kernel, and is also related to C/nu; see the sketch below.
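For illustration, the two classifiers are constructed identically apart from that one parameter; the values below are placeholders, not tuned settings:

```python
from sklearn import svm

# C-SVM: C > 0 penalizes misclassified training points
clf_c = svm.SVC(C=1.0, kernel='rbf', gamma=1.0)
# nu-SVM: nu in (0, 1] bounds the fraction of margin errors from above
# and the fraction of support vectors from below
clf_nu = svm.NuSVC(nu=0.5, kernel='rbf', gamma=1.0)
```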