.NET ThreadPool Exhaustion
More than once in my career I have come across this scenario: a .NET application frequently showing high response times. This high latency can have several causes, such as slow access to an external resource (a database or an API, for example), CPU usage hitting 100%, disk access overload, among others. I would like to add another possibility to that list, one that is often overlooked: ThreadPool exhaustion.
I will briefly present how the .NET ThreadPool works, show code examples where this problem can arise, and finally demonstrate how to avoid it.
The .NET ThreadPool
The Task-based asynchronous programming model of .NET is well known in the development community, but I believe its implementation details are little understood, and the devil is in the details, as the saying goes.
Behind the .NET Task execution engine there is a scheduler, responsible, as its name suggests, for scheduling the execution of Tasks. Unless explicitly changed, the default .NET scheduler is ThreadPoolTaskScheduler, which, as its name also suggests, uses the default .NET ThreadPool to get its job done.
The ThreadPool manages, as expected, a pool of threads, to which it assigns the Tasks it receives through a queue. Tasks wait in this queue until a thread in the pool becomes free and can start processing them. By default, the minimum number of threads in the pool equals the number of logical processors on the host.
And here is the detail in how it works: when there are more Tasks to execute than threads in the pool, the ThreadPool can either wait for a thread to become free or create more threads. If it chooses to create a new thread, and the current number of threads in the pool is equal to or greater than the configured minimum, this growth takes between 1 and 2 seconds for each new thread added to the pool.
Note: starting with .NET 6, improvements have been introduced in this process, allowing a faster increase in the number of ThreadPool threads, but the main idea still holds.
Let's look at an example to make this clearer: suppose a machine with 4 cores. The minimum size of the ThreadPool will be 4. If all arriving Tasks finish their work quickly, the pool may even keep fewer than the minimum of 4 threads active. Now imagine that 4 slightly longer-running Tasks arrive simultaneously, occupying all the threads in the pool. When the next Task reaches the queue, it will need to wait between 1 and 2 seconds until a new thread is added to the pool, and only then leave the queue and start processing. If this new Task also runs long, the next Tasks will wait in the queue again and will need to pay the 1-2 second toll before they can start running.
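This toll can be observed with a minimal console sketch (timings are illustrative and vary with runtime version and hardware): it occupies every thread up to the pool minimum with blocking work, then measures how long one more work item waits before it starts running.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class PoolTollDemo
{
    static void Main()
    {
        ThreadPool.GetMinThreads(out var minWorkers, out _);
        Console.WriteLine($"Minimum worker threads: {minWorkers}");

        // Occupy every "cheap" thread with blocking work.
        for (int i = 0; i < minWorkers; i++)
            Task.Run(() => Thread.Sleep(10_000));

        Thread.Sleep(100); // give the tasks above time to be picked up

        // This Task has to wait for the pool to grow before it can run.
        var sw = Stopwatch.StartNew();
        Task.Run(() => Console.WriteLine($"Started after {sw.ElapsedMilliseconds} ms"))
            .Wait(); // blocking here is fine: Main does not run on a pool thread
    }
}
```

On older runtimes the printed delay tends to be on the order of a second; on .NET 6 and later it is smaller, reflecting the faster thread injection mentioned above.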
If this arrival of new long-running Tasks keeps up for some time, every new task that reaches the ThreadPool queue will feel slow to the clients of this process. This scenario is called ThreadPool exhaustion (or ThreadPool starvation). It lasts until the Tasks finish their work and start returning threads to the pool, allowing the queue of pending Tasks to shrink, or until the pool manages to grow enough to meet the current demand. This can take several seconds, depending on the load, and only then does the slowness observed earlier go away.
Synchronous vs. Asynchronous Code
It is now necessary to make an important distinction between types of long-running work. In general, it can be classified into 2 types: limited by the processor (CPU-bound or GPU-bound), such as performing complex calculations, or limited by input/output operations (I/O-bound), such as database access or network calls.
In the case of CPU-bound tasks, aside from algorithm optimizations, there is not much that can be done: there must be enough processors to meet the demand.
But in the case of I/O-bound tasks, it is possible to free the processor to serve other requests while waiting for the I/O operation to finish. And that is exactly what the ThreadPool does when asynchronous I/O APIs are used. In this case, even if the specific task is still time-consuming, the thread is returned to the pool and can serve another Task from the queue. When the I/O operation finishes, the Task is re-queued and then resumes execution.
However, it is important to note that synchronous I/O APIs still exist, and they block the thread, preventing its release back to the pool. These APIs, and any other kind of call that blocks a thread before returning, compromise the proper functioning of the ThreadPool and can cause exhaustion under sufficiently large and/or long loads.
We can then say that the ThreadPool, and by extension ASP.NET Core/Kestrel, designed to operate asynchronously, is optimized for executing tasks of low computational complexity with asynchronous I/O-bound workloads. In this scenario, a small number of threads is capable of efficiently processing a very high number of tasks/requests.
Blocking threads with ASP.NET Core
Let's look at some code examples that block pool threads, using ASP.NET Core 8.
Note: these are simple examples and are not intended to represent any particular practice, recommendation, or style, except for the points specifically related to the ThreadPool demonstration.
To keep the behavior identical across the examples, each one queries a SQL Server database with a command that simulates a workload taking 1 second to return, using the WAITFOR DELAY statement.
To generate load and demonstrate the practical effects of each example, we will use siege, a free command-line utility intended for this purpose.
In all examples, a load of 120 concurrent accesses will be simulated for 1 minute, with a random delay of up to 200 milliseconds between requests. These numbers are sufficient to demonstrate the effects on the ThreadPool without causing timeouts in database access.
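That load can be reproduced with a siege invocation along these lines (the URL and port are assumptions for a local run, and flag syntax may vary slightly between siege versions):

```shell
# -c: 120 concurrent users; -t: run for 1 minute; -d: random delay of up to 0.2 s
siege -c 120 -t 1M -d 0.2 http://localhost:5000/DbCall
```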
Synchronous Version
Let's start with a completely synchronous implementation:
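A minimal sketch of such an action might look like the following (the controller name, route, and connection string are illustrative assumptions, not the article's original listing):

```csharp
using Microsoft.AspNetCore.Mvc;
using Microsoft.Data.SqlClient;

[ApiController]
[Route("[controller]")]
public class DbCallController : ControllerBase
{
    // Illustrative connection string; adjust to your environment.
    private const string ConnString =
        "Server=localhost;Database=Demo;Integrated Security=true;TrustServerCertificate=true";

    [HttpGet]
    public IActionResult DbCall()
    {
        using var conn = new SqlConnection(ConnString);
        conn.Open(); // synchronous open: blocks the ThreadPool thread

        using var cmd = new SqlCommand("WAITFOR DELAY '00:00:01'", conn);
        cmd.ExecuteNonQuery(); // blocks the thread for the full second

        return Ok();
    }
}
```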
The DbCall action is synchronous, and the ExecuteNonQuery method of DbCommand/SqlCommand is synchronous as well, so it blocks the thread until the database returns. Below is the result of the load simulation (with the siege command used).
Note that we achieved a rate of 27 requests per second (Transaction rate) and an average response time (Response time) of about 4 seconds, with the longest request (Longest transaction) lasting more than 16 seconds: very poor performance.
Asynchronous Version – Attempt 1
Let's now use an asynchronous action (returning Task), but still call the synchronous ExecuteNonQuery method.
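A sketch of this attempt, assuming the same kind of controller with an illustrative ConnString constant: the action signature is async, but the database call itself is still synchronous.

```csharp
[HttpGet]
public async Task<IActionResult> DbCall()
{
    using var conn = new SqlConnection(ConnString);
    await conn.OpenAsync();

    using var cmd = new SqlCommand("WAITFOR DELAY '00:00:01'", conn);
    cmd.ExecuteNonQuery(); // still synchronous: blocks the thread despite the async signature

    return Ok();
}
```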
Running the same load scenario as before, we have the following result.
Note that the result was even worse in this case: a request rate of 14 per second (compared to 27 for the completely synchronous version) and an average response time of more than 7 seconds (compared to 4 previously).
Asynchronous Version – Attempt 2
This next version exemplifies a common, and not recommended, attempt to turn a synchronous I/O call (in our case, ExecuteNonQuery) into an "asynchronous API" using Task.Run.
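A sketch of this attempt, under the same illustrative controller setup as before: the blocking call is wrapped in Task.Run, which only moves the block to another pool thread.

```csharp
[HttpGet]
public async Task<IActionResult> DbCall()
{
    using var conn = new SqlConnection(ConnString);
    conn.Open();

    using var cmd = new SqlCommand("WAITFOR DELAY '00:00:01'", conn);
    // Task.Run does not make the I/O asynchronous: a pool thread
    // still sits blocked inside ExecuteNonQuery for the whole second.
    await Task.Run(() => cmd.ExecuteNonQuery());

    return Ok();
}
```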
After the simulation, the result is close to the synchronous version: a request rate of 24 per second, an average response time of more than 4 seconds, and the longest request taking more than 14 seconds to return.
Asynchronous Version – Attempt 3
Now the variation known as "sync over async", where we use asynchronous methods, such as ExecuteNonQueryAsync in this example, but call .Wait() on the Task returned by the method, as shown below. Both .Wait() and the .Result property of a Task have the same behavior: they block the running thread!
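A sketch of the sync-over-async variation (same illustrative controller and ConnString assumptions):

```csharp
[HttpGet]
public IActionResult DbCall()
{
    using var conn = new SqlConnection(ConnString);
    conn.Open();

    using var cmd = new SqlCommand("WAITFOR DELAY '00:00:01'", conn);
    // .Wait() (like .Result) blocks the current thread until the Task completes.
    cmd.ExecuteNonQueryAsync().Wait();

    return Ok();
}
```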
Running our simulation, we can see that the result is also bad: a rate of 32 requests per second, an average time of more than 3 seconds, and requests taking up to 25 seconds to return. Not surprisingly, the use of .Wait() or .Result on a Task is discouraged in asynchronous code.
Solving the problem
Finally, let's look at the code written to work in the most efficient way, using asynchronous APIs and applying async/await correctly, following Microsoft's recommendation.
We then have the asynchronous action, awaiting the ExecuteNonQueryAsync call.
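A sketch of the recommended shape (same illustrative controller and connection string assumptions): asynchronous APIs awaited end to end, so the thread goes back to the pool while the database works.

```csharp
[HttpGet]
public async Task<IActionResult> DbCall()
{
    using var conn = new SqlConnection(ConnString);
    await conn.OpenAsync();

    using var cmd = new SqlCommand("WAITFOR DELAY '00:00:01'", conn);
    await cmd.ExecuteNonQueryAsync(); // thread returns to the pool while waiting

    return Ok();
}
```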
The simulation result speaks for itself: a request rate of 88 per second, an average response time of 1.23 seconds, and requests taking at most 3 seconds to return; numbers roughly 3 times better than any previous option.
The table below summarizes the results of the different versions for easier comparison.
| Code version   | Request rate (req/s) | Average time (s) | Maximum time (s) |
|----------------|----------------------|------------------|------------------|
| Synchronous    | 27.38                | 4.14             | 16.93            |
| Asynchronous 1 | 14.33                | 7.94             | 14.03            |
| Asynchronous 2 | 24.90                | 4.57             | 14.80            |
| Asynchronous 3 | 32.43                | 3.52             | 25.03            |
| Solution       | 88.91                | 1.23             | 3.18             |
Palliative solution
It is worth mentioning that we can configure the ThreadPool with a minimum number of threads larger than the default (the number of logical processors). With that, it can quickly increase the number of threads without paying the 1-2 second toll.
There are at least 3 ways to do this: by runtime configuration, using the runtimeconfig.json file; by project configuration, setting the ThreadPoolMinThreads property; or in code, by calling the ThreadPool.SetMinThreads method.
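For example, the in-code option is a single call (the value 32 is purely illustrative; measure under your own load before adopting any number):

```csharp
using System.Threading;

// Raise the minimum for worker threads and I/O completion port threads.
// Returns false if the requested values could not be set.
ThreadPool.SetMinThreads(workerThreads: 32, completionPortThreads: 32);
```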
This should be seen as a temporary measure while the appropriate code changes are not yet made, and adopted only after prior testing confirms that it brings benefits without performance side effects, as Microsoft recommends.
Conclusion
ThreadPool exhaustion is an implementation detail that can bring unexpected consequences. And it can be difficult to detect, considering that .NET offers several ways to obtain the same result, even in its best-known APIs, a consequence, I believe, of years of evolution of the language and of ASP.NET, always aiming at backward compatibility.
When we talk about operating at increasing rates or volumes, such as going from dozens to hundreds of requests, it is essential to know the latest practices and recommendations. In addition, knowing one or two implementation details can make the difference in avoiding scaling problems, or in diagnosing them more quickly.
Keep an eye out for upcoming Tech Writers posts. In a future article, we will explore how to diagnose ThreadPool exhaustion and identify the source of the problem in the code of a running process.