Hi there,
Currently my instance of Spoon is installed and running on a machine separate from the destination data warehouse database (apparently for security purposes).
So it has to write records across a network connection, unfortunately.
Because this final step (writing the data to a database over the network) is a bottleneck, I have used parallelization to speed things up. I have Spoon "round-robin" the data to 15 copies of the "Table Output" step. This has worked just fine: 15 connections are opened, the data is written in 20 seconds, and this process is repeated about 1,500 times for the initial data load.
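To make the setup concrete, here is a rough Python sketch of what I understand the round-robin distribution to be doing. This is not Spoon's actual code; the function name `round_robin` and the 100-row sample are made up purely for illustration.

```python
from itertools import cycle


def round_robin(rows, copies=15):
    """Deal rows out to `copies` buckets in turn, one row at a time."""
    buckets = [[] for _ in range(copies)]
    # cycle() walks the buckets over and over; zip() stops when rows run out.
    for bucket, row in zip(cycle(buckets), rows):
        bucket.append(row)
    return buckets


# Hypothetical sample: 100 rows spread over 15 step copies.
buckets = round_robin(range(100), copies=15)
sizes = [len(b) for b in buckets]
print(sizes)
```

Each bucket stands in for one copy of the step, and each copy writes over its own connection.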
Recently, we switched our data warehouse from Machine 1 to Machine 2. Now I'm getting connection errors:
Spoon error:
I/O Error: Connection reset by peer: socket write error
MS SQL Studio:
A transport-level error has occurred when receiving results from the server. Error: 0 – the semaphore timeout period has expired. (Microsoft SQL Server, Error: 121).
The process works just fine on Machine 1. It's suddenly failing on Machine 2. I'm inclined to believe that the server settings are the issue.
After talking to our DBA, I was informed that the suspected difference is that Machine 1 has Windows Service Pack 1 installed and Machine 2 has Windows Service Pack 2. Apparently the latter may be blocking connections after a certain point to prevent a DDoS attack, or enough open connections may be causing the firewall to step in.
Here are my thoughts. The connections are closing just fine. I know this because, for testing, I used a MySQL DB on HostGator, which limits me to 24 simultaneous connections (and would let me know if I exceeded that).
But maybe merely opening connections (or sessions?) repeatedly is triggering some kind of security rule or rate limiting on this server.
My question is: how do I proceed from here?
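For a sense of scale, here is a back-of-the-envelope calculation of how many connections get opened over the whole load. It assumes every repetition opens 15 fresh connections and that the repetitions run back to back with no pause, neither of which I have verified.

```python
CONNECTIONS_PER_BATCH = 15  # copies of the Table Output step
BATCHES = 1500              # approximate repetitions for the initial load
SECONDS_PER_BATCH = 20      # observed write time per repetition

# Assumption: each repetition opens fresh connections, with no gap between runs.
total_opens = CONNECTIONS_PER_BATCH * BATCHES
total_seconds = BATCHES * SECONDS_PER_BATCH
opens_per_second = total_opens / total_seconds

print(f"connections opened: {total_opens}")            # 22500
print(f"total hours:        {total_seconds / 3600:.1f}")  # 8.3
print(f"opens per second:   {opens_per_second:.2f}")   # 0.75
```

So the average rate is under one new connection per second, but they arrive in bursts of 15, and that goes on for hours.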
The way I see it, I have two options:
1. Move Spoon close enough to the destination DB that parallelization (15 connections) is not necessary. So far I've been sending data across the US (ping 48 ms) and across the Atlantic (ping 112 ms). I'm not certain whether getting the response time down to 1 ms (which is possible) by putting the two endpoints in the same room would make the SQL writes much faster. I would love for Spoon to be on the same computer as the DB, but I'm getting pushback.
2. Somehow get around Machine 2 cutting off my connections. Sure, maybe it's some sort of spam or DDoS defense, but honestly I thought it was quite common for a server to handle lots and lots of connections and queries like this.
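On option 1, here is a crude model of how much of the write time could be pure network waiting. It assumes every round trip to the server costs one ping time and ignores bandwidth, server-side work, and everything else. The row count and the rows-per-round-trip values are hypothetical; I have not verified whether the step sends rows one at a time or in batches.

```python
import math


def network_wait_seconds(rows, rows_per_round_trip, rtt_ms):
    """Time spent waiting on the network if each round trip costs one RTT."""
    round_trips = math.ceil(rows / rows_per_round_trip)
    return round_trips * rtt_ms / 1000.0


ROWS = 100_000  # hypothetical row count, not my real load

for rtt_ms in (112, 48, 1):        # ping times from my tests, plus same-room
    for per_trip in (1, 1000):     # hypothetical rows sent per round trip
        wait = network_wait_seconds(ROWS, per_trip, rtt_ms)
        print(f"rtt={rtt_ms:>3} ms, rows/trip={per_trip:>4}: {wait:>8.1f} s")
```

If this model is anywhere near right, latency dominates when rows go one per round trip (11,200 s at 112 ms versus 100 s at 1 ms for this made-up row count) and matters far less when rows are sent in large batches.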