Introduce a timer (currently 60 seconds) so that the analysis of a
single input cannot run for more than 60 seconds. The timer is a
simple, standard POSIX signal handler. If the timeout is reached,
the module skips the current input and moves on to the next one.
This fixes the specific issue we currently have with some inputs
for which the sentiment analysis takes too long. The fix should be
improved and made more generic:
- Collect statistics on the content that times out.
- Keep a list/queue of those files so they can be processed later with a
different analysis approach, possibly by a set of "dirty" processes that
handle the edge cases without impacting the overall processing and analysis.
- Make the timer configurable per module (at least for this one).
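
The timer and the follow-ups listed above could look roughly like the
sketch below. It is an illustration only, not the code in this change:
the names AnalysisTimeout, process_items, analyse and timeout_s are
hypothetical, and Python is assumed as the language. It relies on
SIGALRM, so it works only on POSIX systems and only in the main thread.

```python
import signal
from collections import Counter, deque


class AnalysisTimeout(Exception):
    """Raised by the SIGALRM handler when an item exceeds its time budget."""


def _on_alarm(signum, frame):
    # Runs in the main thread when the alarm fires; interrupts the analysis.
    raise AnalysisTimeout()


def process_items(items, analyse, timeout_s=60):
    """Run analyse(item) on each item, skipping items that exceed timeout_s.

    analyse is a hypothetical callable standing in for the module's
    analysis step. Returns (results, stats, retry_queue).
    """
    results = []
    stats = Counter()      # counts of processed and timed-out items
    retry_queue = deque()  # timed-out items kept for a different approach
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    try:
        for item in items:
            signal.alarm(timeout_s)  # arm the timer (whole seconds)
            try:
                results.append(analyse(item))
                stats["processed"] += 1
            except AnalysisTimeout:
                stats["timed_out"] += 1
                retry_queue.append(item)
            finally:
                signal.alarm(0)  # disarm before moving to the next item
    finally:
        # Restore whatever handler was installed before we started.
        signal.signal(signal.SIGALRM, previous)
    return results, stats, retry_queue
```

Passing timeout_s as a parameter is one way to make the limit
configurable per module; the retry queue could later be drained by a
separate worker using a different analysis approach.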