Commit a284169

better handling of network related timeouts
With a working network connection:

* command batch is called on non gvl thread
* tinytds_err_handler is called with timeout error and returns INT_TIMEOUT
* dbcancel is called and command batch returns
* nogvl_cleanup is called and timeout error is raised

This is all great. The timeout is hit, the db connection lives, and an error is thrown.

With a network failure:

* command batch is called on non gvl thread
* tinytds_err_handler is called with timeout error and returns INT_TIMEOUT
* dbcancel is called and does not succeed; the command batch never returns
* nogvl_cleanup is not called

This is not great. dbcancel does not succeed because of the network failure, so the command batch does not return until the underlying network timeout on the OS is hit. TinyTds doesn't throw an error in the expected timeout window.

To fix, we set a flag when a timeout is encountered. We use dbsetinterrupt to check this flag periodically while waiting on a read from the server. Once the flag is set, the interrupt will send INT_CANCEL, causing the pending command batch to return early. This means nogvl_cleanup will be called and raise the timeout error.

This shouldn't have any effect in "normal" timeout conditions, because dbcancel will actually succeed and cause the normal flow before the interrupt can be called/handled. This is good because in these situations we still want the dbproc to remain in working condition.
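The two flows described above can be modeled in a few lines of Ruby. This is only a toy simulation meant to make the flag logic easier to follow. It is not TinyTds or FreeTDS code, and every name in it (`Userdata`, `on_timeout`, `run_batch`, `cleanup`, the returned symbols) is a hypothetical stand-in for the C handlers and return codes named in the message.

```ruby
# Toy model of the timeout flow described in the commit message.
# Nothing here is TinyTds or FreeTDS code; every name is a hypothetical stand-in.
Userdata = Struct.new(:timing_out)

# Error handler: the first timeout sets the flag and reports a timeout;
# a second timeout while the flag is still set asks for a cancel.
def on_timeout(userdata)
  return :int_cancel if userdata.timing_out

  userdata.timing_out = true
  :int_timeout
end

# Polled while a read from the server is pending.
def check_interrupt(userdata)
  userdata.timing_out
end

# Called only when check_interrupt returned true.
def handle_interrupt(userdata)
  userdata.timing_out ? :int_cancel : :int_continue
end

# Simulates one command batch whose timeout window elapses.
# cancel_works models whether dbcancel can still reach the server.
def run_batch(userdata, cancel_works:, polls: 3)
  on_timeout(userdata)
  return :returned_via_dbcancel if cancel_works

  polls.times do
    next unless check_interrupt(userdata)
    return :returned_via_interrupt if handle_interrupt(userdata) == :int_cancel
  end
  :still_blocked
end

# Mirrors the cleanup step: the flag is cleared once the batch has returned.
def cleanup(userdata)
  userdata.timing_out = false
end

puts run_batch(Userdata.new(false), cancel_works: true)   # prints "returned_via_dbcancel"
puts run_batch(Userdata.new(false), cancel_works: false)  # prints "returned_via_interrupt"
```

In the model, as in the message, the interrupt path only matters when the cancel cannot get through. With a healthy connection the batch returns before the interrupt is ever consulted.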
1 parent 4953cd9 commit a284169

9 files changed, with 128 additions and 26 deletions

.travis.yml

+1-1
@@ -14,9 +14,9 @@ rvm:
   - 2.7.0
 before_install:
   - docker info
+  - docker-compose up -d
   - sudo ./test/bin/install-openssl.sh
   - sudo ./test/bin/install-freetds.sh
-  - sudo ./test/bin/setup.sh
 install:
   - gem install bundler
   - bundle --version

CHANGELOG.md

+2
@@ -1,5 +1,7 @@
 ## (unreleased)

+* Improve handling of network related timeouts
+
 ## 2.1.3

 * Removed old/unused appveyor config

README.md

+7-4
@@ -419,17 +419,20 @@ First, clone the repo using the command line or your Git GUI of choice.
 $ git clone git@github.com:rails-sqlserver/tiny_tds.git
 ```

-After that, the quickest way to get setup for development is to use [Docker](https://www.docker.com/). Assuming you have [downloaded docker](https://www.docker.com/products/docker) for your platform and you have , you can run our test setup script.
+After that, the quickest way to get setup for development is to use [Docker](https://www.docker.com/). Assuming you have [downloaded docker](https://www.docker.com/products/docker) for your platform, you can use [docker-compose](https://docs.docker.com/compose/install/) to run the necessary containers for testing.

 ```shell
-$ ./test/bin/setup.sh
+$ docker-compose up -d
 ```

-This will download our SQL Server for Linux Docker image based from [microsoft/mssql-server-linux/](https://hub.docker.com/r/microsoft/mssql-server-linux/). Our image already has the `[tinytdstest]` DB and `tinytds` users created. Basically, it does the following.
+This will download our SQL Server for Linux Docker image based from [microsoft/mssql-server-linux/](https://hub.docker.com/r/microsoft/mssql-server-linux/). Our image already has the `[tinytdstest]` DB and `tinytds` users created. This will also download a [toxiproxy](https://github.com/shopify/toxiproxy) Docker image which we can use to simulate network failures for tests. Basically, it does the following.

 ```shell
+$ docker network create main-network
 $ docker pull metaskills/mssql-server-linux-tinytds
-$ docker run -p 1433:1433 -d metaskills/mssql-server-linux-tinytds
+$ docker run -p 1433:1433 -d --name sqlserver --network main-network metaskills/mssql-server-linux-tinytds
+$ docker pull shopify/toxiproxy
+$ docker run -p 8474:8474 -p 1234:1234 -d --name toxiproxy --network main-network shopify/toxiproxy
 ```

 If you are using your own database. Make sure to run these SQL commands as SA to get the test database and user installed.

docker-compose.yml

+22
@@ -0,0 +1,22 @@
+version: '3'
+
+networks:
+  main-network:
+
+services:
+  mssql:
+    image: metaskills/mssql-server-linux-tinytds:2017-GA
+    container_name: sqlserver
+    ports:
+      - "1433:1433"
+    networks:
+      - main-network
+
+  toxiproxy:
+    image: shopify/toxiproxy
+    container_name: toxiproxy
+    ports:
+      - "8474:8474"
+      - "1234:1234"
+    networks:
+      - main-network

ext/tiny_tds/client.c

+35-1
@@ -86,7 +86,13 @@ int tinytds_err_handler(DBPROCESS *dbproc, int severity, int dberr, int oserr, c
         but we don't ever want to automatically retry. Instead have the app
         decide what to do.
       */
-      return_value = INT_TIMEOUT;
+      if (userdata->timing_out) {
+        return INT_CANCEL;
+      }
+      else {
+        userdata->timing_out = 1;
+        return_value = INT_TIMEOUT;
+      }
       cancel = 1;
       break;

@@ -165,6 +171,33 @@ int tinytds_msg_handler(DBPROCESS *dbproc, DBINT msgno, int msgstate, int severi
   return 0;
 }

+/*
+  Used by dbsetinterrupt -
+  This gets called periodically while waiting on a read from the server
+  Right now, we only care about cases where a read from the server is
+  taking longer than the specified timeout and dbcancel is not working.
+  In these cases we decide that we actually want to handle the interrupt
+*/
+static int check_interrupt(void *ptr) {
+  GET_CLIENT_USERDATA((DBPROCESS *)ptr);
+  return userdata->timing_out;
+}
+
+/*
+  Used by dbsetinterrupt -
+  This gets called if check_interrupt returns TRUE.
+  Right now, this is only used in cases where a read from the server is
+  taking longer than the specified timeout and dbcancel is not working.
+  Return INT_CANCEL to abort the current command batch.
+*/
+static int handle_interrupt(void *ptr) {
+  GET_CLIENT_USERDATA((DBPROCESS *)ptr);
+  if (userdata->timing_out) {
+    return INT_CANCEL;
+  }
+  return INT_CONTINUE;
+}
+
 static void rb_tinytds_client_reset_userdata(tinytds_client_userdata *userdata) {
   userdata->timing_out = 0;
   userdata->dbsql_sent = 0;
@@ -381,6 +414,7 @@ static VALUE rb_tinytds_connect(VALUE self, VALUE opts) {
       }
     }
     dbsetuserdata(cwrap->client, (BYTE*)cwrap->userdata);
+    dbsetinterrupt(cwrap->client, check_interrupt, handle_interrupt);
     cwrap->userdata->closed = 0;
     if (!NIL_P(database) && (azure != Qtrue)) {
       dbuse(cwrap->client, StringValueCStr(database));

ext/tiny_tds/result.c

+1
@@ -91,6 +91,7 @@ static void nogvl_setup(DBPROCESS *client) {
 static void nogvl_cleanup(DBPROCESS *client) {
   GET_CLIENT_USERDATA(client);
   userdata->nonblocking = 0;
+  userdata->timing_out = 0;
   /*
   Now that the blocking operation is done, we can finally throw any
   exceptions based on errors from SQL Server.

test/client_test.rb

+38-19
@@ -68,6 +68,9 @@ class ClientTest < TinyTds::TestCase
   end

   describe 'With in-valid options' do
+    before(:all) do
+      init_toxiproxy
+    end

     it 'raises an argument error when no :host given and :dataserver is blank' do
       assert_raises(ArgumentError) { new_connection :dataserver => nil, :host => nil }
@@ -129,30 +132,46 @@ class ClientTest < TinyTds::TestCase
       end
     end

-    it 'must run this test to prove we account for dropped connections' do
-      skip
+    it 'raises TinyTds exception with tcp socket network failure' do
+      skip if ENV['CI'] && ENV['APPVEYOR_BUILD_FOLDER'] # only CI using docker
       begin
-        client = new_connection :login_timeout => 2, :timeout => 2
+        client = new_connection timeout: 2, port: 1234
         assert_client_works(client)
-        STDOUT.puts "Disconnect network!"
-        sleep 10
-        STDOUT.puts "This should not get stuck past 6 seconds!"
-        action = lambda { client.execute('SELECT 1 as [one]').each }
-        assert_raise_tinytds_error(action) do |e|
-          assert_equal 20003, e.db_error_number
-          assert_equal 6, e.severity
-          assert_match %r{timed out}i, e.message, 'ignore if non-english test run'
+        action = lambda { client.execute("waitfor delay '00:00:05'").do }
+
+        # Use toxiproxy to close the TCP socket after 1 second.
+        # We want TinyTds to execute the statement, hit the timeout configured above, and then not be able to use the network to cancel
+        # the network connection needs to close after the sql batch is sent and before the timeout above is hit
+        Toxiproxy[:sqlserver_test].toxic(:slow_close, delay: 1000).apply do
+          assert_raise_tinytds_error(action) do |e|
+            assert_equal 20003, e.db_error_number
+            assert_equal 6, e.severity
+            assert_match %r{timed out}i, e.message, 'ignore if non-english test run'
+          end
         end
       ensure
-        STDOUT.puts "Reconnect network!"
-        sleep 10
-        action = lambda { client.execute('SELECT 1 as [one]').each }
-        assert_raise_tinytds_error(action) do |e|
-          assert_equal 20047, e.db_error_number
-          assert_equal 1, e.severity
-          assert_match %r{dead or not enabled}i, e.message, 'ignore if non-english test run'
+        assert_new_connections_work
+      end
+    end
+
+    it 'raises TinyTds exception with dead connection network failure' do
+      skip if ENV['CI'] && ENV['APPVEYOR_BUILD_FOLDER'] # only CI using docker
+      begin
+        client = new_connection timeout: 2, port: 1234
+        assert_client_works(client)
+        action = lambda { client.execute("waitfor delay '00:00:05'").do }
+
+        # Use toxiproxy to close the network connection after 1 second.
+        # We want TinyTds to execute the statement, hit the timeout configured above, and then not be able to use the network to cancel
+        # the network connection needs to close after the sql batch is sent and before the timeout above is hit
+        Toxiproxy[:sqlserver_test].toxic(:timeout, timeout: 1000).apply do
+          assert_raise_tinytds_error(action) do |e|
+            assert_equal 20047, e.db_error_number
+            assert_includes [1,9], e.severity
+            assert_match %r{dead or not enabled}i, e.message, 'ignore if non-english test run'
+          end
         end
-        close_client(client)
+      ensure
         assert_new_connections_work
       end
     end
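The two tests above assert different error numbers for the two failure modes: 20003 when the socket is closed slowly, and 20047 when the connection goes dead. An application that wants to react differently to each could branch on that number. The sketch below assumes only that the error object responds to `db_error_number`, as the assertions above do. `classify_network_error` and `FakeError` are hypothetical names used for illustration and are not part of TinyTds.

```ruby
# FakeError stands in for a real TinyTds error so the sketch runs on its own.
FakeError = Struct.new(:db_error_number, :severity, :message)

# Hypothetical helper: maps the error numbers asserted in the tests to a symbol.
def classify_network_error(error)
  case error.db_error_number
  when 20003 then :timed_out        # message matches "timed out"
  when 20047 then :dead_connection  # message matches "dead or not enabled"
  else :other
  end
end

puts classify_network_error(FakeError.new(20003, 6, 'timed out'))            # prints "timed_out"
puts classify_network_error(FakeError.new(20047, 9, 'dead or not enabled'))  # prints "dead_connection"
```

Matching on the number rather than the message avoids the localization caveat the tests themselves flag with 'ignore if non-english test run'.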

test/test_helper.rb

+21-1
@@ -2,6 +2,7 @@
 require 'bundler' ; Bundler.require :development, :test
 require 'tiny_tds'
 require 'minitest/autorun'
+require 'toxiproxy'

 TINYTDS_SCHEMAS = ['sqlserver_2000', 'sqlserver_2005', 'sqlserver_2008', 'sqlserver_2014', 'sqlserver_azure', 'sybase_ase'].freeze

@@ -212,6 +213,25 @@ def rollback_transaction(client)
       client.execute("ROLLBACK TRANSACTION").do
     end

+    def init_toxiproxy
+      return if ENV['APPVEYOR_BUILD_FOLDER'] # only for CI using docker
+
+      # In order for toxiproxy to work for local docker instances of mssql, the containers must be on the same network
+      # and the host used below must match the mssql container name so toxiproxy knows where to proxy to.
+      # localhost from the perspective of toxiproxy's container is its own container an *not* the mssql container it needs to proxy to.
+      # docker-compose.yml handles this automatically for us. In instances where someone is using their own local mssql container they'll
+      # need to set up the networks manually and set TINYTDS_UNIT_HOST to their mssql container name
+      # For anything other than localhost just use the environment config
+      env_host = ENV['TINYTDS_UNIT_HOST_TEST'] || ENV['TINYTDS_UNIT_HOST'] || 'localhost'
+      host = ['localhost', '127.0.0.1', '0.0.0.0'].include?(env_host) ? 'sqlserver' : env_host
+      port = ENV['TINYTDS_UNIT_PORT'] || 1433
+      Toxiproxy.populate([
+        {
+          name: "sqlserver_test",
+          listen: "0.0.0.0:1234",
+          upstream: "#{host}:#{port}"
+        }
+      ])
+    end
   end
 end
-
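The host and port resolution inside `init_toxiproxy` is easier to check in isolation. The sketch below pulls that logic into a pure function that takes the environment as a Hash instead of reading `ENV`. `toxiproxy_upstream` and `LOOPBACK_HOSTS` are hypothetical names introduced here; the environment variable names, the `sqlserver` container name, and the 1433 default come from the helper above.

```ruby
LOOPBACK_HOSTS = ['localhost', '127.0.0.1', '0.0.0.0'].freeze

# Hypothetical pure-function version of the upstream logic in init_toxiproxy.
# Loopback hosts are rewritten to the mssql container name, because from
# inside the toxiproxy container "localhost" is toxiproxy itself.
def toxiproxy_upstream(env)
  env_host = env['TINYTDS_UNIT_HOST_TEST'] || env['TINYTDS_UNIT_HOST'] || 'localhost'
  host = LOOPBACK_HOSTS.include?(env_host) ? 'sqlserver' : env_host
  port = env['TINYTDS_UNIT_PORT'] || 1433
  "#{host}:#{port}"
end

puts toxiproxy_upstream({})  # prints "sqlserver:1433"
puts toxiproxy_upstream({ 'TINYTDS_UNIT_HOST' => 'db.internal', 'TINYTDS_UNIT_PORT' => '1444' })  # prints "db.internal:1444"
```

Here `db.internal` is a made-up host name used only to show the non-loopback branch.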

tiny_tds.gemspec

+1
@@ -26,4 +26,5 @@ Gem::Specification.new do |s|
   s.add_development_dependency 'rake-compiler-dock', '~> 1.0'
   s.add_development_dependency 'minitest', '~> 5.6'
   s.add_development_dependency 'connection_pool', '~> 2.2'
+  s.add_development_dependency 'toxiproxy', '~> 2.0.0'
 end
