Hi,
I'm training an algorithm to detect whether a Twitter message is positive or negative. I have a training set (an ARFF file) with 400 messages (200 labeled positive and 200 labeled negative). The weird thing is that when I change the order of the messages in the training file, my precision also changes. For example, when the file has 200 positive messages followed by 200 negative messages, I get a precision of 80%, but when the file has 50 positive, then 50 negative, then 50 positive, and then 50 negative messages, my precision is 75%. Why does the order of the messages in the training file matter? And which order gives the best precision?
Thanks!