Set appropriate timeouts whenever you connect to a database, an external API, a cache, an email client or anything that is running in a different process.
This advice holds true in every programming language, and for every library or driver you use. As a developer, you must find the section of the documentation that covers timeouts, and then set them explicitly. Many libraries default to no timeout at all, which means a stalled call waits forever.
MongoDB, MySQL, Postgres, Redis, Requests (Python), libcurl (C/C++), boto (for AWS in Python), RESTClient / Apache HttpClient in Java, third-party libraries for services like Keen.io and New Relic: every one of them MUST HAVE appropriate timeouts.
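To make the advice concrete, here is a minimal sketch using only Python's standard library. The helper name `read_greeting` and the 2-second default are assumptions chosen for illustration. Each driver listed above has its own timeout settings, so treat this as the general shape rather than any particular library's API.

```python
import socket
from typing import Optional


def read_greeting(host: str, port: int, timeout_s: float = 2.0) -> Optional[bytes]:
    """Connect to host:port and read up to 1024 bytes.

    Returns None if the peer cannot be reached or stays silent.
    (Hypothetical helper; the name and default are illustrative.)
    """
    try:
        # The timeout bounds the connect attempt and is also left set on
        # the returned socket, so the recv() below is bounded as well.
        with socket.create_connection((host, port), timeout=timeout_s) as sock:
            return sock.recv(1024)
    except OSError:
        # socket.timeout is a subclass of OSError, so a timed-out connect
        # or read lands here along with refused or unreachable hosts.
        return None
```

Without the `timeout` argument, the socket is in blocking mode and `recv` would wait for as long as the peer stays silent.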
A few examples:
Customer 1 – The agent that’s installed on client computers makes an API call to the server to decide whether it must update itself. A high percentage of these collector agents were still running older versions. After several days of investigation, we found that the thread making the call was blocked indefinitely trying to reach the servers. A simple timeout would have prevented two weeks of investigation and a mysterious bug that spanned several years.
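An update check like the one in this story could be bounded roughly as follows. This is a sketch under stated assumptions, not the customer's actual code: the URL, the JSON shape (`{"version": ...}`), and the function name `fetch_latest_version` are all hypothetical.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical endpoint, used only for illustration.
UPDATE_URL = "https://updates.example.com/agent/latest"


def fetch_latest_version(url: str = UPDATE_URL, timeout_s: float = 5.0) -> Optional[str]:
    """Ask the server for the latest agent version, or return None on failure."""
    try:
        # timeout applies to each blocking socket operation (connect, read),
        # so an unresponsive server cannot hold this thread forever.
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return json.load(resp)["version"]
    except (OSError, ValueError, KeyError):
        # OSError covers URLError and socket timeouts; ValueError covers
        # malformed JSON; KeyError covers a missing "version" field.
        return None
```

Returning `None` lets the agent skip this round and try again on its next scheduled check. Note that the timeout here is per socket operation rather than a total deadline for the whole request.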
Customer 2 – We did not close the connection to S3. The library has a fixed number of connection objects, and requesting a connection after the pool was exhausted caused the threads to wait indefinitely. Two problems here: we did not close the connection, and we did not specify a timeout on the connection pool.
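Both problems can be illustrated with a toy pool built on the standard library. `BoundedPool` and `PoolTimeout` are hypothetical names and this is not the S3 library's API. It only sketches the two fixes: always return the connection, and cap how long a caller waits for one.

```python
import queue
from contextlib import contextmanager


class PoolTimeout(Exception):
    """Raised when no connection becomes free in time (hypothetical name)."""


class BoundedPool:
    """Toy fixed-size pool; the 'connections' can be any objects."""

    def __init__(self, connections):
        self._free = queue.Queue()
        for conn in connections:
            self._free.put(conn)

    @contextmanager
    def connection(self, timeout_s: float = 5.0):
        try:
            # Fix 2: wait a bounded time instead of blocking forever.
            conn = self._free.get(timeout=timeout_s)
        except queue.Empty:
            raise PoolTimeout(f"no free connection after {timeout_s}s") from None
        try:
            yield conn
        finally:
            # Fix 1: always hand the connection back, even on error.
            self._free.put(conn)
```

Used as `with pool.connection(timeout_s=1.0) as conn: ...`, a leaked connection elsewhere shows up as a `PoolTimeout` in the logs instead of a silently hung thread.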
I have countless other examples from the past, and they all boil down to one simple root cause: failure to set timeouts. Set them consistently, and you’ll have fewer production problems and New Relic alerts.