[Python] Digit Recognition with Keras: Recognizing Digits in Multiple Computer Fonts, Part 1 (Data Preparation, Image Processing) ~ 芬妮跑跳誌

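I needed to recognize invoice numbers at work, so I did my homework and finally got it working!! QQ

Since I'm not very familiar with Python (I've only written a few projects in it),

it really took some effort to dig up material...

Searching online, almost everything uses the famous handwritten digit dataset (MNIST),

which has already been cut into individual digits.

So how do you run recognition on pictures you took yourself!?

I combed through the web and books, and very few posts cover recognition with a self-made dataset QQ

Then it occurred to me that I could convert my own images into the same size and format as MNIST and try that,

so I started experimenting... (luckily it didn't take too long...)

This is my first test. To look back on it later, and maybe to help someone else who is also searching for a method,

I'll try to write it up in as much detail as I can.

Here is my workflow ↓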

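1. Preliminary data preparation

First I typed the digits 1~9 in various fonts in Word (because I wanted to simulate a row of digits like the one on an invoice); 0 is included as well, as the folders and code below show.

I cropped the result into individual rows, then split them into training data and test data.

Part of the result is shown in the image below

Next, set the data path to D:/2019/numbers/data/ and create a train folder and a test folder.

Each folder is divided into subfolders 0-9, one per digit, holding that digit in the various fonts

Then the rows are cut into single digits. The cutting step is covered in a separate post:
[Python] Cropping Digit Images from Invoices

The naming rule is "correct digit-image number", as in the example image above. "9-3.png" is the 3rd image of the digit 9.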
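
The folder layout and naming rule above can be sketched in a few lines. This is only an illustration: the helper names `digit_path` and `parse_name` are made up for this example and do not appear in the original code.

```python
import os

def digit_path(root, split, digit, index):
    """Build the path of one digit image, e.g. <root>/train/9/9-3.png."""
    return os.path.join(root, split, str(digit), "%d-%d.png" % (digit, index))

def parse_name(filename):
    """Split a name such as '9-3.png' into (label, image number)."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    label, index = stem.split("-")
    return int(label), int(index)

# The third image of the digit 9 in the training set
example = digit_path("data", "train", 9, 3)
label, index = parse_name(example)
```

Because the label is encoded in both the folder name and the file name, the loading loops only need the digit and a running image number to find every file.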

2. Image preprocessing (processing the data the way the MNIST dataset is prepared)

First, import the required packages

import numpy as np
import matplotlib.pyplot as plt
import os
import pandas as pd
from PIL import Image
from keras.utils import np_utils
from keras.datasets import mnist
from keras.layers import Dense,Dropout,Flatten,Conv2D,MaxPooling2D
from keras.models import Sequential

After cutting, the digit images are brought to the same 28x28 size as the MNIST dataset

# Set the paths
os.chdir('D:\\2019\\numbers\\data\\train')
# Images per digit (assumes every digit folder holds as many images as folder "1")
mypic_train=len(os.listdir('D:\\2019\\numbers\\data\\train\\1'))
targets=[1,2,3,4,5,6,7,8,9,0]

x_train=np.array([])
y_train=np.array([])
for n in range(0,len(targets)):
    for p in range(0,mypic_train):
        # Convert the image to grayscale
        img=np.array(Image.open('%s/%s-%s.png'%(targets[n],targets[n],p+1)).convert('L'))
        
        # Binarize: dark pixels (<=128) become 1, light pixels become 0
        img=np.where(img<=128,1,0)
        
        # Flatten to 784 values (this requires the image to already be 28x28)
        recordx_nor=img.reshape(784,)
        x_train=np.hstack([x_train,recordx_nor])
        y_train=np.hstack([y_train,targets[n]])

# Reshape into (number of samples, 28, 28)
x_train=x_train.reshape(len(targets)*mypic_train,28,28).astype('float32')
y_train=y_train.astype('float32')
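
The loop above flattens each image to 784 values, so it only works if the cut images are already 28x28. If they are not, one possible way to resize and binarize in a single step is sketched below. The helper name `to_mnist_like` and its parameters are assumptions made for this example, and resizing with PIL here is my suggestion rather than something the original code does.

```python
import numpy as np
from PIL import Image

def to_mnist_like(img, size=28, threshold=128):
    """Grayscale, resize to size x size, and binarize a PIL image (dark ink -> 1)."""
    gray = img.convert('L').resize((size, size))
    arr = np.array(gray)
    # Same rule as the loop above: pixels <= threshold become 1, the rest 0
    return (arr <= threshold).astype('float32')

# Synthetic stand-in for a cut digit: white background with a black rectangle
demo = Image.new('RGB', (60, 40), 'white')
demo.paste((0, 0, 0), (20, 10, 40, 30))
sample = to_mnist_like(demo)
```

With a real file, `to_mnist_like(Image.open(path))` would take the place of the open, convert, and binarize steps inside the loop.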

The test data gets the same treatment: after cutting, each digit is brought to 28x28

os.chdir('D:\\2019\\numbers\\data\\test')
# Images per digit (assumes every digit folder holds as many images as folder "1")
mypic_test=len(os.listdir('D:\\2019\\numbers\\data\\test\\1'))

x_test=np.array([])
y_test=np.array([])

for n in range(0,len(targets)):
    for p in range(0,mypic_test):
        # Convert the image to grayscale
        img=np.array(Image.open('%s/%s-%s.png'%(targets[n],targets[n],p+1)).convert('L'))
        
        # Binarize: dark pixels (<=128) become 1, light pixels become 0
        img=np.where(img<=128,1,0)
        
        # Flatten to 784 values (this requires the image to already be 28x28)
        recordy_nor=img.reshape(784,)
        x_test=np.hstack([x_test,recordy_nor])
        y_test=np.hstack([y_test,targets[n]])

x_test=x_test.reshape(len(targets)*mypic_test,28,28).astype('float32')
y_test=y_test.astype('float32')

# Subsample the test data by keeping every fourth sample (a fixed stride, not a truly random draw)
x_test=x_test[::4]
y_test=y_test[::4]
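
The `[::4]` slice keeps every fourth sample, so the selection is the same on every run. If a genuinely random subset is wanted instead, something like the following could be used. The function name `random_subset`, the `fraction` and `seed` parameters, and the demo arrays are all assumptions for this sketch.

```python
import numpy as np

def random_subset(x, y, fraction=0.25, seed=0):
    """Pick a random subset of samples, keeping images and labels paired."""
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(len(x) * fraction))
    # Draw distinct indices so no sample is picked twice
    idx = rng.choice(len(x), size=n_keep, replace=False)
    return x[idx], y[idx]

# Demo data: 40 fake "images" whose first pixel encodes their label
x_demo = np.arange(40 * 28 * 28, dtype='float32').reshape(40, 28, 28)
y_demo = np.arange(40, dtype='float32')
x_sub, y_sub = random_subset(x_demo, y_demo)
```

Indexing both arrays with the same `idx` is what keeps each image aligned with its label.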

Once both the training and test data have been cut and resized, the last step is to reshape the arrays

x_train4=x_train.reshape(x_train.shape[0],28,28,1).astype('float32')
x_test4=x_test.reshape(x_test.shape[0],28,28,1).astype('float32')

# Here the data is simply kept under another name for the later steps; I did not normalize it
x_train4_nor = x_train4
x_test_nor=x_test4
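
As a small illustration of the reshape above: Keras `Conv2D` layers with the default channels-last format expect each sample as (rows, columns, channels), so a trailing axis of size 1 is added for the single grayscale channel. The array below is dummy data made up for this example. It also shows why skipping normalization is plausibly fine here: after binarization the values are already only 0 and 1.

```python
import numpy as np

# Dummy batch of 5 binarized 28x28 images with a filled square in the middle
batch = np.zeros((5, 28, 28), dtype='float32')
batch[:, 10:18, 10:18] = 1.0

# Add the channel axis: (samples, 28, 28) -> (samples, 28, 28, 1)
batch4 = batch.reshape(batch.shape[0], 28, 28, 1)

# Pixel values are already within [0, 1], unlike raw 0-255 grayscale
value_range = (float(batch4.min()), float(batch4.max()))
```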

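Finally, convert the labels of both the training and test data to one-hot form.

Put simply, each label is represented with 0s and 1s. For example, if the label is 3,

start from an array of ten zeros, one position for each digit 0-9,

and set the position for 3 to 1, giving [0,0,0,1,0,0,0,0,0,0].

That is what converting a label to one-hot form means.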

y_train_one=np_utils.to_categorical(y_train)
y_test_one=np_utils.to_categorical(y_test)
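
To make the one-hot idea concrete, here is a NumPy-only sketch of the same conversion. The helper name `one_hot` is made up for this example; it is meant to mirror what `to_categorical` does for labels 0-9, not to replace it.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Turn digit labels into one-hot rows, e.g. 3 -> [0,0,0,1,0,0,0,0,0,0]."""
    labels = np.asarray(labels).astype(int)
    # Row k of the identity matrix is exactly the one-hot vector for label k
    return np.eye(num_classes, dtype='float32')[labels]

# Labels are stored as floats in this post, so the demo uses floats too
encoded = one_hot([3.0, 0.0, 9.0])
```

Each row contains a single 1 at the position of its label, so `encoded.argmax(axis=1)` recovers the original digits.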

With all the data preprocessed, we can build the model and start training it to recognize digits!

Please see the next post
