1 year ago

#344301

Quan Nguyen Ha

Multi-Class Document Classification with both known and unknown classes

Currently, I am building a multi-class document classifier that has to assign each document either to one of 3 known classes, namely "Financial Report", "Insurance_Sheet", and "Endorsement", or to 1 unknown class, "Random Doc". I have tried the following methods, but none gave good results: quite a number of random documents were classified as one of the known classes.

  • Method 1: TF-IDF + Linear SVC
  • Method 2: Word2Vec word embeddings, averaged to get one embedding vector per document, which is then fed to a classification model.
  • Method 3: Doc2Vec to get an embedding vector for each document, which is then fed to a classification model.
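
For reference, Method 1 looks roughly like the sketch below. It assumes scikit-learn; the toy texts are invented for illustration and are not my real data. It also shows the behaviour I am running into: a classifier trained this way always returns one of its training labels, even for text that shares no vocabulary with the training documents.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus (assumption); real documents would be much longer.
train_texts = [
    "annual revenue profit balance sheet fiscal year",
    "quarterly earnings cash flow statement revenue",
    "policy holder premium coverage insured amount",
    "insurance premium deductible coverage schedule",
    "endorsement amendment to policy effective date change",
    "endorsement modifies a clause of the existing policy",
    "recipe for chocolate cake with flour and sugar",
    "football match result and league table",
]
train_labels = [
    "Financial Report", "Financial Report",
    "Insurance_Sheet", "Insurance_Sheet",
    "Endorsement", "Endorsement",
    "Random Doc", "Random Doc",
]

# TF-IDF features fed into a linear SVM (one-vs-rest over the 4 labels).
clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(train_texts, train_labels)

# An unrelated document is still mapped to one of the 4 training labels;
# the model has no way to abstain.
predicted = clf.predict(["weather forecast says rain tomorrow"])
print(predicted[0])
```

The "Random Doc" examples in training can only cover a tiny part of what a random document may look like, which is probably why so many of them end up in the known classes.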

Can you suggest a good approach for this case? Thanks a lot.

text-classification

multilabel-classification

multiclass-classification

document-classification

0 Answers
