Channel: Pentaho Community Forums

Urgent question on normalizing training and test sets separately

First of all thanks for creating a great tool. I have a fairly simple question, yet my colleague and I could not find the answer.

I want to normalize a dataset and perform input (attribute) selection on it, but with the training and test sets handled separately (also in the case of 10-fold cross-validation).

I already found that if I use the Experimenter, I can add feature selection like this:
-> Simple -> Algorithms -> Add new -> meta -> AttributeSelectedClassifier.

It would be nice if a Weka expert could confirm whether the above method performs the attribute selection on the training set only (not on the full training + test set).

Now for the second part of my question: I have not yet found how to do the same for the normalization (standardization, actually). I know certain classifiers include normalization, but I am using several and not all of them have it. Can I put a normalization filter somewhere? My reviewer is asking for this and I do not feel like doing the 10 folds manually. I really need this function urgently; otherwise I have to run my experiment by hand and normalize the training and test sets in Excel separately for each of the 10 folds. It should be possible to do this on the training set only, no? I use SimpleLogistic, NaiveBayes and SMO (OK, this last one has normalization built in).
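To make the request concrete, here is a minimal sketch in plain Java of what I mean by standardizing per fold: the mean and standard deviation are computed from the training part of a fold only, and those same values are then applied to the test part. This is not Weka code. The class `FoldStandardizer`, its methods `fit` and `apply`, the toy data and the interleaved fold split are all made up for illustration, and the use of the sample standard deviation (n - 1) is an assumption.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka: standardizes one numeric attribute
// using statistics taken from the training fold only.
class FoldStandardizer {

    // Returns {mean, sampleStdDev} computed from the training values.
    static double[] fit(double[] train) {
        double sum = 0.0;
        for (double v : train) sum += v;
        double mean = sum / train.length;

        double sq = 0.0;
        for (double v : train) sq += (v - mean) * (v - mean);
        // Sample standard deviation (n - 1); assumes at least two training values.
        double std = Math.sqrt(sq / (train.length - 1));
        return new double[] {mean, std};
    }

    // Applies training-fold statistics to any values (training or test).
    static double[] apply(double[] values, double mean, double std) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // A constant attribute (std == 0) is mapped to 0 to avoid dividing by zero.
            out[i] = std == 0.0 ? 0.0 : (values[i] - mean) / std;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; // toy attribute values
        int folds = 5;
        for (int f = 0; f < folds; f++) {
            // Simple interleaved split: instance i is in the test fold when i % folds == f.
            int testSize = 0;
            for (int i = 0; i < data.length; i++) if (i % folds == f) testSize++;
            double[] train = new double[data.length - testSize];
            double[] test = new double[testSize];
            int tr = 0, te = 0;
            for (int i = 0; i < data.length; i++) {
                if (i % folds == f) test[te++] = data[i]; else train[tr++] = data[i];
            }
            // The test fold never contributes to the mean or standard deviation.
            double[] stats = fit(train);
            double[] testZ = apply(test, stats[0], stats[1]);
            System.out.println("fold " + f + " test (standardized): " + Arrays.toString(testZ));
        }
    }
}
```

What I am looking for is a way to get this behaviour inside the Experimenter for every classifier, in the same way AttributeSelectedClassifier wraps the attribute selection, so that I do not have to repeat it by hand for each fold.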
