Reshape(-1) images in an h5 dataset

2 years ago

#288858

lee en

I am taking a pattern recognition course this semester. My project is to build a face detection system from more than 3000 images, and I am using Python for it.

So far I have converted each image into a NumPy array and stored the arrays in a list, using the code below:

import cv2
import h5py

# img_path (list of image file paths) and save_path (output .h5 path) are defined elsewhere
imagesData = []

# read each image as a numpy array, convert to grayscale, resize, vectorize,
# and finally store it in a list
for file in sorted(img_path):
    img = cv2.imread(file)
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_gray = cv2.resize(img_gray, dsize=(150, 150), interpolation=cv2.INTER_CUBIC)
    img_gray = img_gray.reshape(-1)
    imagesData.append(img_gray)

# save to .h5 file; the label dataset has not been added yet

hf = h5py.File(save_path, 'a')
dset = hf.create_dataset('dataset',data=imagesData)
hf.close()

A small question here: does reshape(-1) vectorize (flatten) the image? When I print the shape of the first element of imagesData, I get (22500,), whereas the image was originally (150, 150):

print(imagesData[0].shape)
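For reference, a minimal sketch of what reshape(-1) does, using a made-up 150x150 array in place of a real image (the array contents are an assumption for illustration only). The -1 tells NumPy to infer the length, so the 2-D array becomes a 1-D vector of 150 * 150 = 22500 values in row-major order:

```python
import numpy as np

# Hypothetical stand-in for a 150x150 grayscale image.
img = np.arange(150 * 150).reshape(150, 150)

# -1 asks NumPy to infer the length, so the 2-D array becomes 1-D, row by row.
flat = img.reshape(-1)
print(flat.shape)  # (22500,)

# Row-major order: pixel (r, c) ends up at index r * 150 + c.
assert flat[2 * 150 + 5] == img[2, 5]

# The original layout can be recovered with the inverse reshape.
restored = flat.reshape(150, 150)
assert np.array_equal(restored, img)
```

So the flattening loses no pixel values, only the 2-D layout, and that layout can be restored as long as the original size is known.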

The images come from a Google Drive folder (consisting of .png images). I use sorted in the loop because I want to store the NumPy arrays in the list in order from the first image to the last (1223 - 5222). I do this because I was given a text file containing some features arranged in the same order (1223 - 5222), and I am going to store both the dataset (imagesData) and the label dataset (features) inside a .h5 file. The features text file is shown below:

[Image: text file]
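One point worth checking about the ordering: sorted compares file paths as strings, not as numbers. If every file name is a four-digit number (1223 to 5222), string order and numeric order agree, but they diverge as soon as the names have different lengths. A small sketch, assuming file names of the form `<number>.png` (the paths and the helper `numeric_key` are hypothetical):

```python
import os

def numeric_key(path):
    """Return the integer in a file name such as '1223.png' (assumed naming scheme)."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return int(stem)

# Hypothetical paths; the name lengths differ on purpose to show the pitfall.
paths = ["imgs/1000.png", "imgs/999.png", "imgs/1223.png"]

by_string = sorted(paths)                    # '1000' < '1223' < '999'
by_number = sorted(paths, key=numeric_key)   # 999 < 1000 < 1223

print(by_string)
print(by_number)
```

Passing an explicit key makes the intended order independent of how the names happen to be padded.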

Am I doing this right? After storing both the image dataset and the label dataset in the .h5 file, I will load them back and run some machine learning algorithms for my project, so I have to make sure each sample row matches the correct label.
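One way to make the row-to-label match checkable is to store the image ids next to the images and labels, so the alignment can be verified after loading. The following is only a sketch under assumptions: the dataset names `labels` and `ids`, the label values, and the tiny fake images are all made up for illustration, and the file is written to a temporary directory.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical stand-ins: 4 flattened 150x150 images and one label per image.
ids = np.array([1223, 1224, 1225, 1226])
images = np.stack([np.full(150 * 150, i % 256, dtype=np.uint8) for i in ids])
labels = np.array([0, 1, 1, 0])

save_path = os.path.join(tempfile.mkdtemp(), "faces_demo.h5")

# Write images, labels, and ids side by side so that row i matches in all three.
with h5py.File(save_path, "w") as hf:
    hf.create_dataset("dataset", data=images)
    hf.create_dataset("labels", data=labels)
    hf.create_dataset("ids", data=ids)

# Read everything back and check that the rows still line up.
with h5py.File(save_path, "r") as hf:
    X = hf["dataset"][:]
    y = hf["labels"][:]
    stored_ids = hf["ids"][:]

assert X.shape == (4, 22500)
assert len(X) == len(y) == len(stored_ids)
```

With the ids stored in the file, any row can later be traced back to its source image and its line in the features text file.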

python

numpy

machine-learning

h5py

pattern-recognition

0 Answers
