Using PDI (Kettle), I am filling the entry stage of my database with a CSV Input and a Table Output step. This works great; however, I also want to make sure that the data that was just inserted fulfills certain criteria, e.g. fields not being NULL, etc.
Normally this would be a job for database constraints, but we want to keep the data in the database even if it is faulty, for debugging purposes (it is a pain trying to debug a .csv file...). Since it is just a staging table anyway, this doesn't cause any trouble with integrity, etc.
To do just that, I wrote some SELECT Count(*) as test123 ... statements that immediately show whether something is wrong and are easy to handle: if the value of test123 is 0, all is good; otherwise the job needs to be aborted.
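For illustration, one such check has roughly this shape (staging_table and some_field are placeholder names, not my actual schema):

```sql
-- Placeholder names: staging_table and some_field stand in for the real schema.
-- test123 counts the offending rows; 0 means the inserted data passed the check.
SELECT COUNT(*) AS test123
FROM staging_table
WHERE some_field IS NULL;
```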
I am executing these statements using an Execute SQL Statements step within a PDI transformation. I expected the result to be passed automatically to my data stream, so I also used a Copy rows to result step to pass it up to the executing job.
This is most likely where the problem lies. I suspect the result of the SELECT statement is not automatically passed to my data stream: when I do a Simple evaluation in the main job using the variable ${test123} (which I thought would be implicitly created by executing SELECT Count(*) as test123 ...), I never get the expected result.
I couldn't find any clues to this problem in the PDI documentation, so I hope that someone here has experience with PDI and might be able to help. If anything is still unclear, just point it out and I will edit the post with more information.
best regards,
daZza