Hello, here are the details:
I am using the PDI 4.4.0 community binary on Linux.
I have a data file in text format that contains about 50,000,000 rows.
I load the data and send it to a Sort rows step, leaving the sort cache size at 1,000,000.
Then I find that the result is wrong.
When I change the sort cache size to 100,000,000, I get the right result.
I see that some people have reported an issue related to the sort cache on Jira. Is this similar? http://jira.pentaho.com/browse/PDI-10580
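
For anyone trying to reproduce this, one way to check whether the output is really out of order is sketched below. It is only a sketch under assumptions not stated in the post: the sorted rows are written to a delimited text file, the sort key is a single column compared as plain strings in ascending order, and the names first_unsorted_row and check_file, the ";" delimiter, and the key index are all hypothetical. It reads the file as a stream, so it should cope with 50,000,000 rows without holding them in memory.

```python
import csv
from typing import Iterable, Optional, Sequence


def first_unsorted_row(rows: Iterable[Sequence[str]], key_index: int = 0) -> Optional[int]:
    """Return the 1-based number of the first row whose key is smaller than
    the key of the row before it, or None if the rows are in ascending order.

    Assumption: keys are compared as plain strings. A numeric or
    case-insensitive sort in the transformation would need a matching
    conversion here.
    """
    previous = None
    for row_number, row in enumerate(rows, start=1):
        key = row[key_index]
        if previous is not None and key < previous:
            return row_number
        previous = key
    return None


def check_file(path: str, delimiter: str = ";", key_index: int = 0) -> Optional[int]:
    """Stream a delimited text file and report the first out-of-order row."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        return first_unsorted_row(reader, key_index)
```

Running check_file on the output produced with each sort cache setting would show whether the smaller setting gives rows that are out of order, and at which row the order first breaks.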