1 year ago
#379174
user3448011
TensorFlow 2 Keras model training is very slow on CPU and most CPU cores (>95%) are idle
I am trying to train a neural network model (TensorFlow 2.8) on a CPU (EC2 instance m4.10 with about 160 GB of memory and 40 CPU cores) from a Jupyter notebook. The training data is loaded from 300+ gzip files (each 200+ MB) and processed as a dataset. However, training is very slow: each epoch takes about 75 minutes. The code:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.optimizers import Adam

# model, train_data and learning_rate are defined elsewhere (not shown).
tf.config.run_functions_eagerly(True)
model.compile(optimizer=Adam(learning_rate),
              loss=keras.losses.BinaryCrossentropy(),
              run_eagerly=True,
              metrics=[keras.metrics.BinaryAccuracy()])
model.fit(train_data, epochs=10, steps_per_epoch=1000, validation_steps=100,
          workers=16, use_multiprocessing=True)
When the model is being trained, only 1 or 2 CPU cores are busy and all other 38 cores are idle.
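To put numbers next to that observation, the core count visible to Python can be compared with TensorFlow's thread-pool settings. This is only a minimal diagnostic sketch; the helper name `describe_cpu_parallelism` is hypothetical and not part of the code above. A value of 0 for either thread setting means TensorFlow chooses the number itself.

```python
import os

import tensorflow as tf


def describe_cpu_parallelism():
    """Collect the core count and TensorFlow's thread-pool settings."""
    return {
        # Logical cores reported by the OS (may be None on some platforms).
        "logical_cores": os.cpu_count(),
        # Threads used inside a single op (e.g. one large matmul); 0 = auto.
        "intra_op_threads": tf.config.threading.get_intra_op_parallelism_threads(),
        # Threads used to run independent ops concurrently; 0 = auto.
        "inter_op_threads": tf.config.threading.get_inter_op_parallelism_threads(),
    }


if __name__ == "__main__":
    for name, value in describe_cpu_parallelism().items():
        print(f"{name}: {value}")
```

If the thread settings need to be changed with `tf.config.threading.set_intra_op_parallelism_threads` or `set_inter_op_parallelism_threads`, that has to happen before TensorFlow runs its first op, so in a notebook it usually requires a kernel restart.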
I have tried eager=False, but it made no difference. I have read several posts about why tf2 is slower than tf1, but none of them explain why most CPU cores are idle.
What am I missing here?
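For context on the data-loading side, here is a minimal sketch of a `tf.data` pipeline that reads many gzip files in parallel. It is not the actual pipeline from the question: the function names, the file pattern, and the record layout (one `<feature>,<label>` pair per text line) are assumptions made purely for illustration.

```python
import tensorflow as tf


def parse_line(line):
    # Hypothetical record layout: "<feature>,<label>" on each text line.
    feature, label = tf.io.decode_csv(line, record_defaults=[[0.0], [0.0]])
    return tf.expand_dims(feature, -1), label


def make_dataset(file_pattern, batch_size=256, cycle_length=16):
    """Build a dataset that decompresses and parses several files at once."""
    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    lines = files.interleave(
        # Each matched file becomes its own gzip-decoding line reader.
        lambda path: tf.data.TextLineDataset(path, compression_type="GZIP"),
        cycle_length=cycle_length,            # files read concurrently
        num_parallel_calls=tf.data.AUTOTUNE,  # let tf.data pick the thread count
        deterministic=False,                  # allow out-of-order reads for speed
    )
    parsed = lines.map(parse_line, num_parallel_calls=tf.data.AUTOTUNE)
    # Prefetch so input processing overlaps with the training step.
    return parsed.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

Something like `train_data = make_dataset("/path/to/data/*.gz")` (path hypothetical) could then be passed to `model.fit`. Note that, according to the Keras documentation, the `workers` and `use_multiprocessing` arguments of `fit` apply only to generator or `keras.utils.Sequence` inputs, so with a `tf.data.Dataset` any parallelism has to come from the pipeline itself.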
tensorflow
machine-learning
keras
tensorflow2.0
cpu
0 Answers