1 year ago
#288858
lee en
Reshape(-1) images in an h5 dataset
I am taking a pattern recognition subject this semester. I have a project to build a face detection system from more than 3000 images. I am using Python for this project.
What I have done so far is convert each image into a numpy array and store it in a list, using the code below:
import cv2
import h5py

# img_path (list of image file paths) and save_path (.h5 file path) are assumed to be defined earlier.
imagesData = []

# For each image: read it, convert to grayscale, resize to 150x150,
# flatten to a 1-D vector, and append it to the list.
for file in sorted(img_path):
    img = cv2.imread(file)
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_gray = cv2.resize(img_gray, dsize=(150, 150), interpolation=cv2.INTER_CUBIC)
    img_gray = img_gray.reshape(-1)
    imagesData.append(img_gray)

# Save to the .h5 file (the label dataset is not stored yet).
with h5py.File(save_path, 'a') as hf:
    dset = hf.create_dataset('dataset', data=imagesData)
A small question here: does reshape(-1) mean vectorize (flatten)? When I print the shape of one element of imagesData, it shows (22500,), whereas the image was originally (150, 150):
print(imagesData[0].shape)
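To make the reshape(-1) question concrete, here is a minimal sketch of what it does to one image. The random array is only a stand-in for a real resized grayscale image (an assumption for illustration); the point is that -1 lets NumPy infer the length, and the pixels are laid out row by row, so the original shape can be recovered.

```python
import numpy as np

# Stand-in for one resized grayscale image (150 x 150, 8-bit); not real data.
rng = np.random.default_rng(0)
img_gray = rng.integers(0, 256, size=(150, 150), dtype=np.uint8)

# -1 asks NumPy to infer the length: 150 * 150 = 22500 elements in one dimension.
flat = img_gray.reshape(-1)
print(flat.shape)

# Row-major order: pixel (row, col) ends up at index row * 150 + col.
assert flat[3 * 150 + 7] == img_gray[3, 7]

# The flattening is reversible as long as the original shape is known.
restored = flat.reshape(150, 150)
assert np.array_equal(restored, img_gray)
```

So in this sense reshape(-1) does vectorize the image: same pixel values, one dimension.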
The images come from a Google Drive folder (consisting of .png images). I use sorted in the loop because I want the numpy arrays stored in the list in order from the first image to the last (1223 to 5222). I do this because I was given a text file containing some features arranged in the same order (1223 to 5222), and I am going to store both the dataset (imagesData) and the label dataset (features) inside a .h5 file. The features text file is as below:
Am I right? After storing both the dataset and the label dataset in the .h5 file, I will load them and run some machine learning algorithms for my project, so I have to make sure each sample row matches the correct label.
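One way I could check the row/label matching is sketched below. It assumes, hypothetically, that each file is named after its number (for example 1223.png) and that the labels can be looked up by that same number; the file names, the labels_by_id dictionary, the dataset names and the placeholder image vectors are all made up for illustration. The idea is to sort by the numeric ID and store the IDs next to the images and labels, so the alignment can be verified after loading.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical inputs: file names are "<id>.png", labels are keyed by the same id.
img_files = ["1225.png", "1223.png", "1224.png"]
labels_by_id = {1223: 0, 1224: 1, 1225: 0}


def image_id(path):
    """Return the numeric id encoded in a file name such as '1223.png'."""
    return int(os.path.splitext(os.path.basename(path))[0])


# Sort by the numeric id rather than by the raw string.
ordered = sorted(img_files, key=image_id)
id_list = [image_id(p) for p in ordered]

# Placeholder vectors standing in for the flattened 150x150 images.
images = np.stack([np.full(22500, i % 256, dtype=np.uint8) for i in id_list])
labels = np.array([labels_by_id[i] for i in id_list], dtype=np.int64)
ids = np.array(id_list, dtype=np.int64)

# Write all three arrays to one file so row k of each refers to the same image.
save_path = os.path.join(tempfile.mkdtemp(), "faces.h5")
with h5py.File(save_path, "w") as hf:
    hf.create_dataset("images", data=images)
    hf.create_dataset("labels", data=labels)
    hf.create_dataset("ids", data=ids)

# Load everything back and confirm the rows still line up.
with h5py.File(save_path, "r") as hf:
    loaded_images = hf["images"][:]
    loaded_labels = hf["labels"][:]
    loaded_ids = hf["ids"][:]

assert loaded_images.shape == (len(id_list), 22500)
assert all(labels_by_id[int(i)] == int(l) for i, l in zip(loaded_ids, loaded_labels))
```

Sorting on the numeric ID also guards against the case where plain string sorting would put, say, 10000.png before 1223.png, which does not happen with four-digit names but could with mixed lengths.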
python
numpy
machine-learning
h5py
pattern-recognition
0 Answers