Hi Guys,
I am extremely new to WEKA and statistical analysis. My background is in accounting, so I don't understand statistics at a high level and am trying to learn it. I have been watching the WEKA courses (on the WEKA website) on supervised learning and building some J48 trees with their data sets. I was starting to get the hang of creating the trees, but I felt I needed to create my own data set and experiment with it in order to understand it further. So I made up a small set of data and placed it into a .csv file:
Number 1,Number 2,Buy
1,2,yes
2,1,no
1,3,yes
5,6,yes
6,2,no
3,4,yes
4,5,yes
6,1,no
Then I built a J48 tree from this data. The rule I got was: if Number 2 <= 2 then no, otherwise yes. I wanted J48 to do more than that. In my data set, I specifically made it so that if Number 1 is less than Number 2 the answer is yes, otherwise no.
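To make the gap between the two rules concrete, here is a minimal Python sketch of the kind of search a decision-tree learner performs: it tests one column at a time against a threshold and scores each split. This is not WEKA's code. It assumes plain information gain as the score (J48 is based on C4.5, which as far as I know uses a related measure, gain ratio, plus pruning and a minimum leaf size), and the names `best_threshold` and `diff` are made up for illustration.

```python
import math

# The eight rows from the CSV above: (Number 1, Number 2, Buy)
ROWS = [
    (1, 2, "yes"), (2, 1, "no"), (1, 3, "yes"), (5, 6, "yes"),
    (6, 2, "no"), (3, 4, "yes"), (4, 5, "yes"), (6, 1, "no"),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    result = 0.0
    for cls in set(labels):
        p = labels.count(cls) / total
        result -= p * math.log2(p)
    return result

def best_threshold(values, labels):
    """Return (threshold, information gain) of the best 'value <= t' split.

    Assumption: plain information gain, not J48's exact criterion.
    """
    base = entropy(labels)
    best = (None, 0.0)
    # Skip the largest value: splitting there would leave one side empty.
    for t in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        remainder = (len(left) * entropy(left)
                     + len(right) * entropy(right)) / len(labels)
        gain = base - remainder
        if gain > best[1]:
            best = (t, gain)
    return best

labels = [row[2] for row in ROWS]
num1 = [row[0] for row in ROWS]
num2 = [row[1] for row in ROWS]
# Hypothetical derived column that encodes the relationship between the two.
diff = [a - b for a, b in zip(num1, num2)]

print("Number 1:", best_threshold(num1, labels))
print("Number 2:", best_threshold(num2, labels))
print("Number 1 - Number 2:", best_threshold(diff, labels))
```

On these eight rows the sketch picks Number 2 <= 2 as the best single-column split, which matches the rule J48 reported. A test of the form "Number 1 < Number 2" compares two columns with each other, which a one-column threshold cannot express. Once the difference is supplied as its own column (added to the CSV by hand, for instance), a single threshold on it separates yes from no completely.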
How does J48 work, and what is it doing with my data? Can anyone explain this? I apologize if this is a very basic question. I am new to statistics, and the Wikipedia pages are extremely hard to follow. Also, the WEKA videos seem to assume you already understand what J48 is doing to your data set. That is why I built my own small data set to experiment with. However, I don't think I truly get it.
If someone can explain J48 on the data set above, I would be very thankful. I appreciate all the help provided in my quest to learn data analysis. Thanks :).