Commit Graph

13294 Commits (9cf6e0eae759ce0b6197ba4afc636c4f431ab606)

Author SHA1 Message Date
Richard van der Hoff 9cf6e0eae7
Rip out the DNS lookup limiter (#10190)
As I've written in various places in the past (#7113, #9865) I'm pretty sure this is doing nothing useful at all.
2021-06-17 16:22:41 +01:00
Marcus 8070b893db
update black to 21.6b0 (#10197)
Reformat all files with the new version.

Signed-off-by: Marcus Hoffmann <bubu@bubu1.eu>
2021-06-17 15:20:06 +01:00
Andrew Morgan 6f1a28de19
Fix incorrect time magnitude on delayed call (#10195)
Fixes https://github.com/matrix-org/synapse/issues/10030.

We were expecting milliseconds where we should have provided a value in seconds.

The impact of this bug isn't too bad. The code is intended to count the number of remote servers that the homeserver can see and report that as a metric. This metric is supposed to run initially 1 second after server startup, and every 60s as well. Instead, it ran 1,000 seconds after server startup, and every 60s after startup.

This fix allows for the correct metrics to be collected immediately, as well as preventing a random collection 1,000s in the future after startup.
2021-06-17 15:04:26 +01:00
Eric Eastwood a911dd768b
Add fields to better debug where events are being soft_failed (#10168)
Follow-up to https://github.com/matrix-org/synapse/pull/10156#discussion_r650292223
2021-06-17 14:59:45 +01:00
Richard van der Hoff 52c60bd0a9
Fix persist_events to stop leaking opentracing contexts (#10193) 2021-06-17 11:21:53 +01:00
Patrick Cloke 18edc9ab06
Improve comments in the structured logging code. (#10188) 2021-06-16 19:18:02 +01:00
Patrick Cloke 76f9c701c3
Always require users to re-authenticate for dangerous operations. (#10184)
Dangerous actions means deactivating an account, modifying an account
password, or adding a 3PID.

Other actions (deleting devices, uploading keys) can re-use the same UI
auth session if ui_auth.session_timeout is configured.
2021-06-16 11:07:28 -04:00
Erik Johnston 36c426e294
Add debug logging when we enter/exit Measure block (#10183)
It can be helpful to know when trying to track down slow requests.
2021-06-16 13:29:54 +01:00
Lukas Lihotzki 2c240213f4
Fix requestOpenIdToken response: integer expires_in (#10175)
`expires_in` must be an integer according to the OpenAPI spec:
https://github.com/matrix-org/matrix-doc/blob/master/data/api/client-server/definitions/openid_token.yaml#L32

True division (`/`) returns a float instead (`"expires_in": 3600.0`).
Floor division (`//`) returns an integer, so the response is spec compliant.

Signed-off-by: Lukas Lihotzki <lukas@lihotzki.de>
2021-06-16 13:16:35 +01:00
Richard van der Hoff 9e405034e5
Make opentracing trace into event persistence (#10134)
* Trace event persistence

When we persist a batch of events, set the parent opentracing span to the that
from the request, so that we can trace all the way in.

* changelog

* When we force tracing, set a baggage item

... so that we can check again later.

* Link in both directions between persist_events spans
2021-06-16 11:41:15 +01:00
Erik Johnston d09e24a52d Merge branch 'master' into develop 2021-06-15 15:52:24 +01:00
Erik Johnston 1c8045f674 1.36.0 2021-06-15 15:42:02 +01:00
Patrick Cloke 4911f7931d
Remove support for unstable MSC1772 prefixes. (#10161)
The stable prefixes have been supported since v1.34.0. The unstable
prefixes are not supported by any known clients.
2021-06-15 08:03:17 -04:00
Patrick Cloke 9e5ab6dd58
Remove the experimental flag for knocking and use stable prefixes / endpoints. (#10167)
* Room version 7 for knocking.
* Stable prefixes and endpoints (both client and federation) for knocking.
* Removes the experimental configuration flag.
2021-06-15 07:45:14 -04:00
Michael Kutzner aac2c49b9b
Fix 'ip_range_whitelist' not working for federation servers (#10115)
Add 'federation_ip_range_whitelist'. This allows backwards-compatibility, If 'federation_ip_range_blacklist' is set. Otherwise 'ip_range_whitelist' will be used for federation servers.

Signed-off-by: Michael Kutzner 1mikure@gmail.com
2021-06-15 08:53:55 +01:00
Richard van der Hoff 1dfdc87b9b
Refactor `EventPersistenceQueue` (#10145)
some cleanup, pulled out of #10134.
2021-06-14 11:59:27 +01:00
Richard van der Hoff d7808a2dde
Extend `ResponseCache` to pass a context object into the callback (#10157)
This is the first of two PRs which seek to address #8518. This first PR lays the groundwork by extending ResponseCache; a second PR (#10158) will update the SyncHandler to actually use it, and fix the bug.

The idea here is that we allow the callback given to ResponseCache.wrap to decide whether its result should be cached or not. We do that by (optionally) passing a ResponseCacheContext into it, which it can modify.
2021-06-14 10:26:09 +01:00
Erik Johnston 29966a285d Synapse 1.36.0rc2 (2021-06-11)
==============================
 
 Bugfixes
 --------
 
 - Fix a bug which caused  presence updates to stop working some time after a restart, when using a presence writer worker. Broke in v1.33.0. ([\#10149](https://github.com/matrix-org/synapse/issues/10149))
 - Fix a bug when using federation sender worker where it would send out more presence updates than necessary, leading to high resource usage. Broke in v1.33.0. ([\#10163](https://github.com/matrix-org/synapse/issues/10163))
 - Fix a bug where Synapse could send the same presence update to a remote twice. ([\#10165](https://github.com/matrix-org/synapse/issues/10165))
 -----BEGIN PGP SIGNATURE-----
 
 iQFEBAABCgAuFiEEBTGR3/RnAzBGUif3pULk7RsPrAkFAmDDdrcQHGVyaWtAbWF0
 cml4Lm9yZwAKCRClQuTtGw+sCTzfB/4qaTqW2mBiwjf52SmOu6HNyd8uQd6nLIAZ
 mJC218Wakh2tT0W4iVkKwUgpuHFbtcy0rSNTlXtW4kG8XzhpTvT56RH9qls99aD3
 SGKqYpOv6NkWZibN6NdVvLDW85ixficDTXco3BljRCIMlORhY0swy+LWLwksdjWj
 6kQ+Gi/QAtKP3Pt5epYs0Ix5o1T94DfZOWE//mqBhG5cMDAw/K/G/c8tRfEjclt9
 wACBjmt2fw/Lbn9j3b0feNVp+xnFcFNuAK2bSEd8Y3yph1mhjdsIszULnM7IFNsR
 Q8zg+i7PJKNq8pQjei8j8T/aKscPTPH5XGqOSLlizj15snsiwlkz
 =C6lV
 -----END PGP SIGNATURE-----

Merge tag 'v1.36.0rc2' into develop

Synapse 1.36.0rc2 (2021-06-11)
==============================

Bugfixes
--------

- Fix a bug which caused  presence updates to stop working some time after a restart, when using a presence writer worker. Broke in v1.33.0. ([\#10149](https://github.com/matrix-org/synapse/issues/10149))
- Fix a bug when using federation sender worker where it would send out more presence updates than necessary, leading to high resource usage. Broke in v1.33.0. ([\#10163](https://github.com/matrix-org/synapse/issues/10163))
- Fix a bug where Synapse could send the same presence update to a remote twice. ([\#10165](https://github.com/matrix-org/synapse/issues/10165))
2021-06-11 15:46:38 +01:00
Erik Johnston fb10a73e85 1.36.0rc2 2021-06-11 15:21:34 +01:00
Erik Johnston cdd985c64f Only send a presence state to a destination once (#10165)
It turns out that we were sending the same presence state to a remote
potentially multiple times.
2021-06-11 15:21:08 +01:00
Erik Johnston 5e0b4719ea Fix sending presence over federation when using workers (#10163)
When using a federation sender we'd send out all local presence updates over
federation even when they shouldn't be.

Fixes #10153.
2021-06-11 15:20:54 +01:00
Erik Johnston c955f22e2c Fix bug when running presence off master (#10149)
Hopefully fixes #10027.
2021-06-11 15:20:45 +01:00
Erik Johnston 968f8283b4
Only send a presence state to a destination once (#10165)
It turns out that we were sending the same presence state to a remote
potentially multiple times.
2021-06-11 15:19:42 +01:00
Richard van der Hoff c1b9922498
Support for database schema version ranges (#9933)
This is essentially an implementation of the proposal made at https://hackmd.io/@richvdh/BJYXQMQHO, though the details have ended up looking slightly different.
2021-06-11 14:45:53 +01:00
Erik Johnston c8dd4db9eb
Fix sending presence over federation when using workers (#10163)
When using a federation sender we'd send out all local presence updates over
federation even when they shouldn't be.

Fixes #10153.
2021-06-11 13:08:30 +01:00
Andrew Morgan a15a046c93
Clean up a broken import in admin_cmd.py (#10154) 2021-06-11 11:34:40 +01:00
Erik Johnston d26d15ba3d
Fix bug when running presence off master (#10149)
Hopefully fixes #10027.
2021-06-11 10:27:12 +01:00
Eric Eastwood b31daac01c
Add metrics to track how often events are `soft_failed` (#10156)
Spawned from missing messages we were seeing on `matrix.org` from a
federated Gtiter bridged room, https://gitlab.com/gitterHQ/webapp/-/issues/2770.
The underlying issue in Synapse is tracked by https://github.com/matrix-org/synapse/issues/10066
where the message and join event race and the message is `soft_failed` before the
`join` event reaches the remote federated server.

Less soft_failed events = better and usually this should only trigger for events
where people are doing bad things and trying to fuzz and fake everything.
2021-06-11 10:12:35 +01:00
Aaron Raimist e6245e6d48
Mention that you need to configure max upload size in reverse proxy as well (#10122)
Signed-off-by: Aaron Raimist <aaron@raim.ist>
2021-06-10 11:40:24 +01:00
Andrew Morgan a7a37437bc
Integrate knock rooms with the public rooms directory (#9359)
This PR implements the ["Changes regarding the Public Rooms Directory"](https://github.com/Sorunome/matrix-doc/blob/soru/knock/proposals/2403-knock.md#changes-regarding-the-public-rooms-directory) section of knocking MSC2403.

Specifically, it:

* Allows rooms with `join_rule` "knock" to be returned by the query behind the public rooms directory
* Adds the field `join_rule` to each room entry returned by a public rooms directory query, so clients can know whether to attempt a join or knock on a room

Based on https://github.com/matrix-org/synapse/issues/6739. Complement tests for this change: https://github.com/matrix-org/complement/pull/72
2021-06-09 20:31:31 +01:00
Sorunome d936371b69
Implement knock feature (#6739)
This PR aims to implement the knock feature as proposed in https://github.com/matrix-org/matrix-doc/pull/2403

Signed-off-by: Sorunome mail@sorunome.de
Signed-off-by: Andrew Morgan andrewm@element.io
2021-06-09 19:39:51 +01:00
Patrick Cloke 11846dff8c
Limit the number of in-flight /keys/query requests from a single device. (#10144) 2021-06-09 07:05:32 -04:00
Richard van der Hoff 1bf83a191b
Clean up the interface for injecting opentracing over HTTP (#10143)
* Remove unused helper functions

* Clean up the interface for injecting opentracing over HTTP

* changelog
2021-06-09 11:33:00 +01:00
Patrick Cloke c7f3fb2745
Add type hints to the federation server transport. (#10080) 2021-06-08 11:19:25 -04:00
Andrew Morgan 8df9941cc2 1.36.0rc1 2021-06-08 14:09:00 +01:00
Erik Johnston 1092718cac
Fix logging context when opening new DB connection (#10141)
Fixes #10140
2021-06-08 13:49:29 +01:00
Patrick Cloke 9e4610cc27
Correct type hints for parse_string(s)_from_args. (#10137) 2021-06-08 08:30:48 -04:00
Erik Johnston c842c581ed
When joining a remote room limit the number of events we concurrently check signatures/hashes for (#10117)
If we do hundreds of thousands at once the memory overhead can easily reach 500+ MB.
2021-06-08 11:07:46 +01:00
Erik Johnston a0101fc021
Handle /backfill returning no events (#10133)
Fixes #10123
2021-06-08 10:37:01 +01:00
Richard van der Hoff 0acb5010ec
More database opentracing (#10136)
Add a couple of extra logs/spans, to give a bit of a better idea.
2021-06-07 18:01:32 +01:00
Richard van der Hoff b2557cbf42
opentracing: use a consistent name for background processes (#10135)
... otherwise we tend to get a namespace clash between the bg process and the
functions that it calls.
2021-06-07 17:57:49 +01:00
14mRh4X0r 8942e23a69
Always update AS last_pos, even on no events (#10107)
Fixes #1834.

`get_new_events_for_appservice` internally calls `get_events_as_list`, which will filter out any rejected events. If all returned events are filtered out, `_notify_interested_services` will return without updating the last handled stream position. If there are 100 consecutive such events, processing will halt altogether.

Breaking the loop is now done by checking whether we're up-to-date with `current_max` in the loop condition, instead of relying on an empty `events` list.


Signed-off-by: Willem Mulder <14mRh4X0r@gmail.com>
2021-06-07 15:42:05 +01:00
Dirk Klimpel d558292548
Add missing type hints to the admin API servlets (#10105) 2021-06-07 15:12:34 +01:00
Richard van der Hoff fa1db8f156
Delete completes to-device messages earlier in /sync (#10124)
I hope this will improve
https://github.com/matrix-org/synapse/issues/9564.
2021-06-07 09:19:06 +01:00
Erik Johnston a0cd8ae8cb
Don't try and backfill the same room in parallel. (#10116)
If backfilling is slow then the client may time out and retry, causing
Synapse to start a new `/backfill` before the existing backfill has
finished, duplicating work.
2021-06-04 10:47:58 +01:00
Erik Johnston c96ab31dff
Limit number of events in a replication request (#10118)
Fixes #9956.
2021-06-04 10:35:47 +01:00
Richard van der Hoff d8be7d493d
Enable Prometheus metrics for the jaeger client library (#10112) 2021-06-04 09:25:33 +01:00
Richard van der Hoff 9eea4646be
Add OpenTracing for database activity. (#10113)
This adds quite a lot of OpenTracing decoration for database activity. Specifically it adds tracing at four different levels:

 * emit a span for each "interaction" - ie, the top level database function that we tend to call "transaction", but isn't really, because it can end up as multiple transactions.
 * emit a span while we hold a database connection open
 * emit a span for each database transaction - actual actual transaction.
 * emit a span for each database query.

I'm aware this might be quite a lot of overhead, but even just running it on a local Synapse it looks really interesting, and I hope the overhead can be offset just by turning down the sampling frequency and finding other ways of tracing requests of interest (eg, the `force_tracing_for_users` setting).
2021-06-03 16:31:56 +01:00
Richard van der Hoff 1d143074c5
Improve opentracing annotations for Notifier (#10111)
The existing tracing reports an error each time there is a timeout, which isn't
really representative.

Additionally, we log things about the way `wait_for_events` works
(eg, the result of the callback) to the *parent* span, which is confusing.
2021-06-03 16:01:30 +01:00
Travis Ralston 5325f0308c
r0.6.1 support: /rooms/:roomId/aliases endpoint (#9224)
[MSC2432](https://github.com/matrix-org/matrix-doc/pull/2432) added this endpoint originally but it has since been included in the spec for nearly a year. 

This is progress towards https://github.com/matrix-org/synapse/issues/8334
2021-06-03 13:50:49 +01:00