1 year ago
#344301
Quan Nguyen Ha
Multi-Class Document Classification with both known and unknown classes
Currently, I am building a multi-class document classifier that has to assign each document to one of 3 known classes, namely "Financial Report", "Insurance_Sheet", and "Endorsement", or to 1 unknown class, "Random Doc". I have tried the following methods, but none gave good results: quite a number of random documents were classified as one of the known classes ("Financial Report", "Insurance_Sheet", "Endorsement").
- Method 1: TF-IDF + Linear SVC
- Method 2: Word2Vec for word embeddings, then average the word embeddings to get an embedding vector for each document, and feed that vector to a classification model.
- Method 3: Doc2Vec to get an embedding vector for each document, then feed that vector to a classification model.
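To make the averaging step in Method 2 concrete, here is a minimal sketch. The toy embedding table and the names `EMBEDDINGS`, `DIM`, and `doc_vector` are hypothetical and only for illustration; in the real pipeline the vectors would come from a trained Word2Vec model, and out-of-vocabulary handling may differ.

```python
from typing import Dict, List, Sequence

# Hypothetical toy embedding table (assumption for illustration only).
# A real pipeline would look these vectors up in a trained Word2Vec model.
EMBEDDINGS: Dict[str, List[float]] = {
    "premium": [1.0, 0.0, 0.0],
    "policy": [0.0, 1.0, 0.0],
    "revenue": [0.0, 0.0, 1.0],
}
DIM = 3  # dimensionality of the toy vectors above


def doc_vector(
    tokens: Sequence[str],
    embeddings: Dict[str, List[float]] = EMBEDDINGS,
    dim: int = DIM,
) -> List[float]:
    """Average the embeddings of the tokens found in the table.

    Tokens missing from the table are skipped. If no token is known,
    a zero vector is returned.
    """
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    # zip(*known) groups the vectors by dimension, so each column is averaged.
    return [sum(column) / len(known) for column in zip(*known)]


if __name__ == "__main__":
    print(doc_vector(["premium", "policy", "unseen_word"]))
```

The resulting fixed-length vector is what gets passed to the classification model. Note that a document made up entirely of out-of-vocabulary words collapses to the zero vector in this sketch.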
Can you suggest a good approach for this case? Thanks a lot.
text-classification
multilabel-classification
multiclass-classification
document-classification
0 Answers