We start all pushers on start up and immediately start a background
process to fetch push to send. This makes start up incredibly painful
when dealing with many pushers.
Instead, let's do a quick fast DB check to see if there *may* be push to
send and only start the background processes for those pushers. We also
stagger starting up and doing those checks so that we don't try and
handle all pushers at once.
This brings it into line with on_new_notifications and on_new_receipts. It
requires a little bit of hoop-jumping in EmailPusher to load the throttle
params before the first loop.
`on_new_notifications` and `on_new_receipts` in `HttpPusher` and `EmailPusher`
now always return synchronously, so we can remove the `defer.gatherResults` on
their results, and the `run_as_background_process` wrappers can be removed too
because the PusherPool methods will now complete quickly enough.
Each pusher has its own loop which runs for as long as it has work to do. This
should run in its own background thread with its own logcontext, as other
similar loops elsewhere in the system do - which means that CPU usage is
consistently attributed to that loop, rather than to whatever request happened
to start the loop.
There were a bunch of places where we fire off a process to happen in the
background, but don't have any exception handling on it - instead relying on
the unhandled error being logged when the relevent deferred gets
garbage-collected.
This is unsatisfactory for a number of reasons:
- logging on garbage collection is best-effort and may happen some time after
the error, if at all
- it can be hard to figure out where the error actually happened.
- it is logged as a scary CRITICAL error which (a) I always forget to grep for
and (b) it's not really CRITICAL if a background process we don't care about
fails.
So this is an attempt to add exception handling to everything we fire off into
the background.
The redact_content option never worked because it read the wrong config
section. The PR introducing it
(https://github.com/matrix-org/synapse/pull/2301) had feedback suggesting the
name be changed to not re-use the term 'redact' but this wasn't
incorporated.
This reanmes the option to give it a less confusing name, and also
means that people who've set the redact_content option won't suddenly
see a behaviour change when upgrading synapse, but instead can set
include_content if they want to.
This PR also updates the wording of the config comment to clarify
that this has no effect on event_id_only push.
Includes https://github.com/matrix-org/synapse/pull/2422
Param in the data dict of a pusher that tells an HTTP pusher to
send just the event_id of the event it's notifying about and the
notification counts. For clients that want to go & fetch the body
of the event themselves anyway.
for the email and http pushers rather than trying to make a single
method that will work with their conflicting requirements.
The http pusher needs to get the messages in ascending stream order, and
doesn't want to miss a message.
The email pusher needs to get the messages in descending timestamp order,
and doesn't mind if it misses messages.
It wasn't possible to hit the code from the API because of a typo
in parsing the request path. Since no-one was using the feature
we might as well remove the dead code.
Add a timestamp to push tokens so we know the last time they we
got them from the device. Send it to the push gateways so it can
determine whether its failure is more recent than the token.
Stop and remove pushers that have been rejected.