Channel: Pentaho Community Forums

Problems with text classifier

First, sorry for my English.

Hi, I'm developing a text classifier for adverts. I have never used Weka before, but I have tried it and I am having a few problems.

I am going to explain my issues to see if somebody can kindly help me.

I have classified 100,000 adverts in a dataset (field 1 -> advert text, field 2 -> category). Then I do this:

Apply the NominalToString filter to field 1.

Apply the StringToWordVector filter to field 1, with stopwords and a stemmer.

Remove the attributes that are not useful.

Train the NaiveBayesMultinomialUpdateable algorithm on the dataset. When I do this, I get a memory heap size error (even with -Xmx set to 8 GB) and the training does not finish.
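
To make the steps above concrete, here is a rough sketch in plain Java of what they amount to: split the advert text into words, drop stopwords, and update a multinomial Naive Bayes model one advert at a time. This is not Weka code and not how Weka implements it. The class name, the tiny stopword list and the smoothing choice are assumptions made only for illustration, and stemming is left out. It keeps only sparse word counts per category, so it never holds a full word vector for every advert in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustrative only: a tiny incremental multinomial Naive Bayes.
// It is NOT Weka's NaiveBayesMultinomialUpdateable; all names are hypothetical.
class AdvertClassifierSketch {
    // Hypothetical stopword list; a real one would be much longer.
    private static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("a", "the", "for", "in", "and", "of", "with"));

    // category -> (word -> count); sparse, only words actually seen are stored.
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    private final Map<String, Integer> totalWords = new HashMap<>(); // words per category
    private final Map<String, Integer> docCounts = new HashMap<>();  // adverts per category
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Lowercase, split on non-alphanumeric characters, drop stopwords (no stemming here).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty() && !STOPWORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Update the model with a single labelled advert.
    void update(String advertText, String category) {
        Map<String, Integer> counts = wordCounts.computeIfAbsent(category, k -> new HashMap<>());
        for (String token : tokenize(advertText)) {
            counts.merge(token, 1, Integer::sum);
            totalWords.merge(category, 1, Integer::sum);
            vocabulary.add(token);
        }
        docCounts.merge(category, 1, Integer::sum);
        totalDocs++;
    }

    // Return the category with the highest log-probability, or null if untrained.
    String classify(String advertText) {
        List<String> tokens = tokenize(advertText);
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String category : docCounts.keySet()) {
            double score = Math.log(docCounts.get(category) / (double) totalDocs);
            Map<String, Integer> counts = wordCounts.get(category);
            // Add-one smoothing; the extra +1 reserves a slot for unseen words (an assumption).
            double denom = totalWords.getOrDefault(category, 0) + vocabulary.size() + 1;
            for (String token : tokens) {
                score += Math.log((counts.getOrDefault(token, 0) + 1) / denom);
            }
            if (score > bestScore) {
                bestScore = score;
                best = category;
            }
        }
        return best;
    }
}
```

With something like this, each advert would be read from the file, passed to update(text, category) and then discarded, so memory grows with the vocabulary size rather than with the number of adverts.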

Could anybody give me any suggestions, documentation, links...?

Thanks in advance.
