Commit Graph

12131 Commits (9b3ab57acd87983a2c7c0b0ab618d6847c9e2f0e)

Author SHA1 Message Date
Erik Johnston 18de00adb4 Add ratelimiting on joins 2020-07-31 15:06:56 +01:00
Olivier Wilkinson (reivilibre) a9631b7b4b 1.18.0 2020-07-30 10:56:54 +01:00
Erik Johnston 2c1b9d6763
Update worker docs with recent enhancements (#7969) 2020-07-29 23:22:13 +01:00
Richard van der Hoff 7000a215e6 1.18.0rc2 2020-07-28 11:22:32 +01:00
Erik Johnston a8f7ed28c6
Typing worker needs to handle stream update requests (#7967)
IIRC this doesn't break tests because its only hit on reconnection, or something.

Basically, when a process needs to fetch missing updates for the `typing` stream it needs to query the writer instance via HTTP (as we don't write typing notifications to the DB), the problem was that the endpoint (`streams`) was only registered on master and specifically not on the typing writer worker.
2020-07-28 11:04:53 +01:00
Richard van der Hoff f57b99af22
Handle replication commands synchronously where possible (#7876)
Most of the stuff we do for replication commands can be done synchronously. There's no point spinning up background processes if we're not going to need them.
2020-07-27 18:54:43 +01:00
Richard van der Hoff f88c48f3b8 1.18.0rc1 2020-07-27 16:57:40 +01:00
Erik Johnston 1ef9efc1e0
Fix error reporting when using `opentracing.trace` (#7961) 2020-07-27 16:20:24 +01:00
Erik Johnston 84d099ae11
Fix typing replication not being handled on master (#7959)
Handling of incoming typing stream updates from replication was not
hooked up on master, effecting set ups where typing was handled on a
different worker.

This is really only a problem if the master process is also handling
sync requests, which is unlikely for those that are at the stage of
moving typing off.

The other observable effect is that if a worker restarts or a
replication connect drops then the typing worker will issue a
`POSITION typing`, triggering master process to try and stream *all*
typing updates from position 0.

Fixes #7907
2020-07-27 14:10:53 +01:00
Patrick Cloke d8a9cd8d3e
Remove hacky error handling for inlineDeferreds. (#7950) 2020-07-27 08:35:56 -04:00
Patrick Cloke 3fc8fdd150
Support oEmbed for media previews. (#7920)
Fixes previews of Twitter URLs by using their oEmbed endpoint to grab content.
2020-07-27 07:50:44 -04:00
Patrick Cloke b975fa2e99
Convert state resolution to async/await (#7942) 2020-07-24 10:59:51 -04:00
Patrick Cloke e739b20588
Fix up types and comments that refer to Deferreds. (#7945) 2020-07-24 10:53:25 -04:00
Patrick Cloke 53f7b49f5b
Do not convert async functions to Deferreds in the interactive_auth_handler (#7944) 2020-07-24 09:43:49 -04:00
Patrick Cloke 5ea29d7f85
Convert more of the media code to async/await (#7873) 2020-07-24 09:39:02 -04:00
Patrick Cloke 6a080ea184
Return an empty body for OPTIONS requests. (#7886) 2020-07-24 07:08:07 -04:00
Richard van der Hoff 1ec688bf21
Downgrade warning on client disconnect to INFO (#7928)
Clients disconnecting before we finish processing the request happens from time
to time. We don't need to yell about it
2020-07-24 09:55:47 +01:00
Patrick Cloke fefe9943ef
Convert presence handler helpers to async/await. (#7939) 2020-07-23 16:47:36 -04:00
Patrick Cloke 83434df381
Update the auth providers to be async. (#7935) 2020-07-23 15:45:39 -04:00
Richard van der Hoff 7078866969
Put a cache on `/state_ids` (#7931)
If we send out an event which refers to `prev_events` which other servers in
the federation are missing, then (after a round or two of backfill attempts),
they will end up asking us for `/state_ids` at a particular point in the DAG.

As per https://github.com/matrix-org/synapse/issues/7893, this is quite
expensive, and we tend to see lots of very similar requests around the same
time.

We can therefore handle this much more efficiently by using a cache, which (a)
ensures that if we see the same request from multiple servers (or even the same
server, multiple times), then they share the result, and (b) any other servers
that miss the initial excitement can also benefit from the work.

[It's interesting to note that `/state` has a cache for exactly this
reason. `/state` is now essentially unused and replaced with `/state_ids`, but
evidently when we replaced it we forgot to add a cache to the new endpoint.]
2020-07-23 18:38:19 +01:00
Richard van der Hoff 4876af06dd
Abort federation requests if the client disconnects early (#7930)
For inbound federation requests, if a given remote server makes too many
requests at once, we start stacking them up rather than processing them
immediatedly.

However, that means that there is a fair chance that the requesting server will
disconnect before we start processing the request. In that case, if it was a
read-only request (ie, a GET request), there is absolutely no point in
building a response (and some requests are quite expensive to handle).

Even in the case of a POST request, one of two things will happen:

 * Most likely, the requesting server will retry the request and we'll get the
   information anyway.

 * Even if it doesn't, the requesting server has to assume that we didn't get
   the memo, and act accordingly.

In short, we're better off aborting the request at this point rather than
ploughing on with what might be a quite expensive request.
2020-07-23 16:52:33 +01:00
Patrick Cloke 68cd935826
Convert the federation agent and related code to async/await. (#7874) 2020-07-23 07:05:57 -04:00
Patrick Cloke 13d77464c9
Follow-up to admin API to re-activate accounts (#7908) 2020-07-22 12:33:19 -04:00
Patrick Cloke cc9bb3dc3f
Convert the message handler to async/await. (#7884) 2020-07-22 12:29:15 -04:00
Richard van der Hoff 923c995023
Skip serializing /sync response if client has disconnected (#7927)
... it's a load of work which may be entirely redundant.
2020-07-22 13:44:16 +01:00
Richard van der Hoff b74919c72e
Add debugging to sync response generation (#7929) 2020-07-22 13:43:10 +01:00
Richard van der Hoff 931b026844
Remove an unused prometheus metric (#7878) 2020-07-22 00:40:55 +01:00
Richard van der Hoff 05060e0223
Track command processing as a background process (#7879)
I'm going to be doing more stuff synchronously, and I don't want to lose the
CPU metrics down the sofa.
2020-07-22 00:40:42 +01:00
Richard van der Hoff 15997618e2
Clean up PreserveLoggingContext (#7877)
This had some dead code and some just plain wrong docstrings.
2020-07-22 00:40:27 +01:00
Richard van der Hoff 2ccd48e921 fix an incorrect comment 2020-07-22 00:24:56 +01:00
Patrick Cloke de119063f2
Convert room list handler to async/await. (#7912) 2020-07-21 07:51:48 -04:00
Jason Robinson 759481af6d
Element CSS and logo in email templates (#7919)
Use Element CSS and logo in notification emails when app name is Element.

Signed-off-by: Jason Robinson <jasonr@matrix.org>
2020-07-21 11:58:01 +01:00
Karthikeyan Singaravelan 5662e2b0f3
Remove unused code from synapse.logging.utils. (#7897) 2020-07-20 15:20:53 -04:00
Adrian 64d2280299
Fix a typo in the sample config. (#7890) 2020-07-20 13:42:52 -04:00
Karthikeyan Singaravelan a7b06a81f0
Fix deprecation warning: import ABC from collections.abc (#7892) 2020-07-20 13:33:04 -04:00
Andrew Morgan 5ecf98f59e
Change sample config's postgres user to synapse_user (#7889)
The [postgres setup docs](https://github.com/matrix-org/synapse/blob/develop/docs/postgres.md#set-up-database) recommend setting up your database with user `synapse_user`.

However, uncommenting the postgres defaults in the sample config leave you with user `synapse`.

This PR switches the sample config to recommend `synapse_user`. Took a me a second to figure this out, so assume this will beneficial to others.
2020-07-20 18:29:25 +01:00
Patrick Cloke d1d5fa66e4
Fix the trace function for async functions. (#7872)
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2020-07-17 13:32:01 -04:00
Erik Johnston 2d2acc1cf2
Stop using 'device_max_stream_id' (#7882)
It serves no purpose and updating everytime we write to the device inbox
stream means all such transactions will conflict, causing lots of
transaction failures and retries.
2020-07-17 17:03:27 +01:00
Erik Johnston a3ad045286
Fix TypeError in synapse.notifier (#7880)
Fixes #7774
2020-07-17 14:11:05 +01:00
Patrick Cloke 852930add7
Add a default limit (of 100) to get/sync operations. (#7858) 2020-07-17 07:59:23 -04:00
Erik Johnston 4642fd66df
Change "unknown room ver" logging to warning. (#7881)
It's somewhat expected for us to have unknown room versions in the
database due to room version experiments.
2020-07-17 12:10:43 +01:00
Patrick Cloke 6b3ac3b8cd
Convert device handler to async/await (#7871) 2020-07-17 07:09:25 -04:00
Patrick Cloke 00e57b755c
Convert synapse.app to async/await. (#7868) 2020-07-17 07:08:56 -04:00
Patrick Cloke 6fca1b3506
Convert _base, profile, and _receipts handlers to async/await (#7860) 2020-07-17 07:08:30 -04:00
Michael Albert fff483ea96
Add admin endpoint to get members in a room. (#7842) 2020-07-16 16:43:23 -04:00
Patrick Cloke f460da6031
Consistently use `db_to_json` to convert from database values to JSON objects. (#7849) 2020-07-16 11:32:19 -04:00
Richard van der Hoff e5300063ed
Optimise queueing of inbound replication commands (#7861)
When we get behind on replication, we tend to stack up background processes
behind a linearizer. Bg processes are heavy (particularly with respect to
prometheus metrics) and linearizers aren't terribly efficient once the queue
gets long either.

A better approach is to maintain a queue of requests to be processed, and
nominate a single process to work its way through the queue.

Fixes: #7444
2020-07-16 15:49:37 +01:00
Richard van der Hoff 346476df21
Reject attempts to join empty rooms over federation (#7859)
We shouldn't allow others to make_join through us if we've left the room;
reject such attempts with a 404.

Fixes #7835. Fixes #6958.
2020-07-16 15:17:31 +01:00
Erik Johnston f2e38ca867
Allow moving typing off master (#7869) 2020-07-16 15:12:54 +01:00
Erik Johnston 649a7ead5c
Add ability to run multiple pusher instances (#7855)
This reuses the same scheme as federation sender sharding
2020-07-16 14:06:28 +01:00