Commit Graph

1070 Commits (94375f7a913f75fe0a93a3eda2bfe5060e975290)

Author SHA1 Message Date
Eric Eastwood 0a4efbc1dd
Instrument the federation/backfill part of `/messages` (#13489)
Instrument the federation/backfill part of `/messages` so it's easier to follow what's going on in Jaeger when viewing a trace.

Split out from https://github.com/matrix-org/synapse/pull/13440

Follow-up from https://github.com/matrix-org/synapse/pull/13368

Part of https://github.com/matrix-org/synapse/issues/13356
2022-08-16 12:39:40 -05:00
Eric Eastwood 344a2f767c
Instrument `FederationStateIdsServlet` - `/state_ids` (#13499)
Instrument FederationStateIdsServlet - `/state_ids` so it's easier to follow what's going on in Jaeger when viewing a trace.
2022-08-15 19:41:23 +01:00
reivilibre e9e6aacfbe
Faster Room Joins: prevent Synapse from answering federated join requests for a room which it has not fully joined yet. (#13416) 2022-08-04 16:27:04 +01:00
Eric Eastwood 92d21faf12
Instrument `/messages` for understandable traces in Jaeger (#13368)
In Jaeger:

 - Before: huge list of uncategorized database calls
 - After: nice and collapsible into units of work
2022-08-03 10:57:38 -05:00
Will Hunt 502f075e96
Implement MSC3848: Introduce errcodes for specific event sending failures (#13343)
Implements MSC3848
2022-07-27 13:44:40 +01:00
reivilibre 39be5bc550
Make minor clarifications to the error messages given when we fail to join a room via any server. (#13160) 2022-07-27 10:37:50 +00:00
Eric Eastwood 4f3082d6bf
Fix `get_pdu` asking every remote destination even after it finds an event (#13346) 2022-07-27 10:40:04 +01:00
Patrick Cloke 50122754c8
Add missing types to opentracing. (#13345)
After this change `synapse.logging` is fully typed.
2022-07-21 12:01:52 +00:00
Eric Eastwood 0f971ca68e
Update `get_pdu` to return the original, pristine `EventBase` (#13320)
Update `get_pdu` to return the untouched, pristine `EventBase` as it was originally seen over federation (no metadata added). Previously, we returned the same `event` reference that we stored in the cache which downstream code modified in place and added metadata like setting it as an `outlier`  and essentially poisoned our cache. Now we always return a copy of the `event` so the original can stay pristine in our cache and re-used for the next cache call.

Split out from https://github.com/matrix-org/synapse/pull/13205

As discussed at:

 - https://github.com/matrix-org/synapse/pull/13205#discussion_r918365746
 - https://github.com/matrix-org/synapse/pull/13205#discussion_r918366125

Related to https://github.com/matrix-org/synapse/issues/12584. This PR doesn't fix that issue because it hits [`get_event` which exists from the local database before it tries to `get_pdu`](7864f33e28/synapse/federation/federation_client.py (L581-L594)).
2022-07-20 15:58:51 -05:00
Patrick Cloke a6895dd576
Add type annotations to `trace` decorator. (#13328)
Functions that are decorated with `trace` are now properly typed
and the type hints for them are fixed.
2022-07-19 14:14:30 -04:00
David Robertson b977867358
Rate limit joins per-room (#13276) 2022-07-19 11:45:17 +00:00
Nick Mills-Barrett 21eeacc995
Federation Sender & Appservice Pusher Stream Optimisations (#13251)
* Replace `get_new_events_for_appservice` with `get_all_new_events_stream`

The functions were near identical and this brings the AS worker closer
to the way federation senders work which can allow for multiple workers
to handle AS traffic.

* Pull received TS alongside events when processing the stream

This avoids an extra query -per event- when both federation sender
and appservice pusher process events.
2022-07-15 09:36:56 +01:00
Sean Quah 68db233f0c
Handle race between persisting an event and un-partial stating a room (#13100)
Whenever we want to persist an event, we first compute an event context,
which includes the state at the event and a flag indicating whether the
state is partial. After a lot of processing, we finally try to store the
event in the database, which can fail for partial state events when the
containing room has been un-partial stated in the meantime.

We detect the race as a foreign key constraint failure in the data store
layer and turn it into a special `PartialStateConflictError` exception,
which makes its way up to the method in which we computed the event
context.

To make things difficult, the exception needs to cross a replication
request: `/fed_send_events` for events coming over federation and
`/send_event` for events from clients. We transport the
`PartialStateConflictError` as a `409 Conflict` over replication and
turn `409`s back into `PartialStateConflictError`s on the worker making
the request.

All client events go through
`EventCreationHandler.handle_new_client_event`, which is called in
*a lot* of places. Instead of trying to update all the code which
creates client events, we turn the `PartialStateConflictError` into a
`429 Too Many Requests` in
`EventCreationHandler.handle_new_client_event` and hope that clients
take it as a hint to retry their request.

On the federation event side, there are 7 places which compute event
contexts. 4 of them use outlier event contexts:
`FederationEventHandler._auth_and_persist_outliers_inner`,
`FederationHandler.do_knock`, `FederationHandler.on_invite_request` and
`FederationHandler.do_remotely_reject_invite`. These events won't have
the partial state flag, so we do not need to do anything for then.

The remaining 3 paths which create events are
`FederationEventHandler.process_remote_join`,
`FederationEventHandler.on_send_membership_event` and
`FederationEventHandler._process_received_pdu`.

We can't experience the race in `process_remote_join`, unless we're
handling an additional join into a partial state room, which currently
blocks, so we make no attempt to handle it correctly.

`on_send_membership_event` is only called by
`FederationServer._on_send_membership_event`, so we catch the
`PartialStateConflictError` there and retry just once.

`_process_received_pdu` is called by `on_receive_pdu` for incoming
events and `_process_pulled_event` for backfill. The latter should never
try to persist partial state events, so we ignore it. We catch the
`PartialStateConflictError` in `on_receive_pdu` and retry just once.

Refering to the graph of code paths in
https://github.com/matrix-org/synapse/issues/12988#issuecomment-1156857648
may make the above make more sense.

Signed-off-by: Sean Quah <seanq@matrix.org>
2022-07-05 16:12:52 +01:00
Patrick Cloke 81608490e3
Stop depending on `room_id` to be returned for children state in the hierarchy response. (#12991)
The `room_id` field was removed from MSC2946 before
it was accepted. It was initially kept for backwards compatibility
and should be removed now that the stable form of the API
is used.

This change only stops Synapse from validating that it is returned,
a future PR will remove returning it as part of the response.
2022-06-10 07:15:51 -04:00
David Robertson f30bcbd84a
Fix Synapse git info missing in version strings (#12973) 2022-06-07 15:24:11 +01:00
Erik Johnston a7e506ddee
Reduce amount of state we pull out when attempting to send catchup PDUs. (#12963)
* Don't pull out state for catchup

* Newsfile

* Merge newsfile
2022-06-07 14:35:56 +01:00
Erik Johnston 44de53bb79
Reduce state pulled from DB due to sending typing and receipts over federation (#12964)
Reducing the amount of state we pull from the DB is useful as fetching state is expensive in terms of DB, CPU and memory.
2022-06-06 16:46:11 +01:00
Erik Johnston e3163e2e11
Reduce the amount of state we pull from the DB (#12811) 2022-06-06 09:24:12 +01:00
Erik Johnston 888a29f412
Wait for lazy join to complete when getting current state (#12872) 2022-06-01 16:02:53 +01:00
Richard van der Hoff f0aec0abef
Improve logging when signature checks fail (#12925)
* Raise a dedicated `InvalidEventSignatureError` from `_check_sigs_on_pdu`

* Downgrade logging about redactions to DEBUG

this can be very spammy during a room join, and it's not very useful.

* Raise `InvalidEventSignatureError` from `_check_sigs_and_hash`

... and, more importantly, move the logging out to the callers.

* changelog
2022-05-31 23:32:56 +01:00
Sean Quah 2fba1076c5
Faster room joins: Try other destinations when resyncing the state of a partial-state room (#12812)
Signed-off-by: Sean Quah <seanq@matrix.org>
2022-05-31 15:50:29 +01:00
Erik Johnston 3594f6c1f3 Merge branch 'master' into develop 2022-05-31 14:48:22 +01:00
Erik Johnston 1e453053cb
Rename storage classes (#12913) 2022-05-31 12:17:50 +00:00
Brendan Abolivier 8fd87739bf
Fix import in module_api module and docs on the new check_event_for_spam signature (#12918)
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2022-05-31 12:04:53 +02:00
Patrick Cloke c52abc1cfd
Additional constants for EDU types. (#12884)
Instead of hard-coding strings in many places.
2022-05-27 07:14:36 -04:00
Patrick Cloke d9f092285b
Remove federation client code for groups. (#12563) 2022-05-27 07:13:58 -04:00
Sean Quah 053ca5f3ca Synapse 1.60.0rc2 (2022-05-27)
==============================
 
 This release of Synapse adds a unique index to the `state_group_edges` table, in
 order to prevent accidentally introducing duplicate information (for example,
 because a database backup was restored multiple times). If your Synapse database
 already has duplicate rows in this table, this could fail with an error and
 require manual remediation.
 
 Additionally, the signature of the `check_event_for_spam` module callback has changed.
 The previous signature has been deprecated and remains working for now. Module authors
 should update their modules to use the new signature where possible.
 
 See [the upgrade notes](https://github.com/matrix-org/synapse/blob/develop/docs/upgrade.md#upgrading-to-v1600)
 for more details.
 
 Features
 --------
 
 - Add an option allowing users to use their password to reauthenticate for privileged actions even though password login is disabled. ([\#12883](https://github.com/matrix-org/synapse/issues/12883))
 
 Bugfixes
 --------
 
 - Explicitly close `ijson` coroutines once we are done with them, instead of leaving the garbage collector to close them. ([\#12875](https://github.com/matrix-org/synapse/issues/12875))
 
 Internal Changes
 ----------------
 
 - Improve URL previews by not including the content of media tags in the generated description. ([\#12887](https://github.com/matrix-org/synapse/issues/12887))
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEWMTnW8Z8khaaf90R+84KzgcyGG8FAmKQqMcACgkQ+84Kzgcy
 GG9Z2Av+N+b/fvaB3D56UkFqTW/xLmCEyri65njcXU8625bWiLSPM6hssmyJB1FA
 xlc2RBKr8QxlnHRS/v31wDtONC8YZ2O3fyzYPFfY1fF5Ul7Kg3XCzLeUH4/j1/Ar
 5bqriDqaN9FQ/6QJybShXlA4l7lY1Fs0C4P23jDBgqfKjnlToeVLqhVA70dDaFu/
 ir+vVprKCkQI1iqnYXwIxGRmgBzLWGoVqQFGbSI6hugGwXpGIyX7+2I+0v8tI6vA
 SZ99vLFWcvnd6DJTyBhIeV22Ff4qA7eQsyPvSrMETdsaZmrxGlG+t332HNCgplv8
 gv2gUpJL0br++5WTAX+nRc7HpfKo/74vKeTktqPmlvFP8kUOg+PbzmoJFUu21PhA
 rnq5TzgsPHK0dqBhM1RC2vtOiJ5v3ZBqzJJzSRXl6lsFpWxxFmwesEcIDAYS0Nmh
 QoJb7/L8cPCHksHvZM76bzB465tSH9NhuFYZQoLGHcpxa0kYekrdlYasP8U0FU7L
 nF3C0Pgw
 =D3F+
 -----END PGP SIGNATURE-----

Merge tag 'v1.60.0rc2' into develop

Synapse 1.60.0rc2 (2022-05-27)
==============================

This release of Synapse adds a unique index to the `state_group_edges` table, in
order to prevent accidentally introducing duplicate information (for example,
because a database backup was restored multiple times). If your Synapse database
already has duplicate rows in this table, this could fail with an error and
require manual remediation.

Additionally, the signature of the `check_event_for_spam` module callback has changed.
The previous signature has been deprecated and remains working for now. Module authors
should update their modules to use the new signature where possible.

See [the upgrade notes](https://github.com/matrix-org/synapse/blob/develop/docs/upgrade.md#upgrading-to-v1600)
for more details.

Features
--------

- Add an option allowing users to use their password to reauthenticate for privileged actions even though password login is disabled. ([\#12883](https://github.com/matrix-org/synapse/issues/12883))

Bugfixes
--------

- Explicitly close `ijson` coroutines once we are done with them, instead of leaving the garbage collector to close them. ([\#12875](https://github.com/matrix-org/synapse/issues/12875))

Internal Changes
----------------

- Improve URL previews by not including the content of media tags in the generated description. ([\#12887](https://github.com/matrix-org/synapse/issues/12887))
2022-05-27 12:07:18 +01:00
Sean Quah bb7a637765
Close `ijson` coroutines ourselves instead of letting the GC close them (#12875)
Hopefully this means that exceptions raised due to truncated JSON
get a sensible logging context and stack.

Signed-off-by: Sean Quah <seanq@matrix.org>
2022-05-27 11:03:05 +01:00
Patrick Cloke 1885ee0113
Remove unstable APIs for /hierarchy. (#12851)
Removes the unstable endpoint as well as a duplicated field
which was modified during stabilization.
2022-05-26 07:10:28 -04:00
Patrick Cloke b5707ceaba
Avoid attempting to delete push actions for remote users. (#12879)
Remote users will never have push actions, so we can avoid a database
round-trip/transaction completely.
2022-05-26 07:09:16 -04:00
Richard van der Hoff 1b338476af
Allow bigger responses to `/federation/v1/state` (#12877)
* Refactor HTTP response size limits

Rather than passing a separate `max_response_size` down the stack, make it an
attribute of the `parser`.

* Allow bigger responses on `federation/v1/state`

`/state` can return huge responses, so we need to handle that.
2022-05-25 22:24:28 +01:00
Patrick Cloke a8db8c6eba
Remove user-visible groups/communities code (#12553)
Makes it so that groups/communities no longer exist from a user-POV. E.g. we remove:

* All API endpoints (including Client-Server, Server-Server, and admin).
* Documented configuration options (and the experimental flag, which is now unused).
* Special handling during room upgrades.
* The `groups` section of the `/sync` response.
2022-05-25 07:53:40 -04:00
David Teller 28199e9357
Uniformize spam-checker API, part 2: check_event_for_spam (#12808)
Signed-off-by: David Teller <davidt@element.io>
2022-05-23 17:27:39 +00:00
Jess Porter a608ac847b
add SpamChecker callback for silently dropping inbound federated events (#12744)
Signed-off-by: jesopo <github@lolnerd.net>
2022-05-23 16:36:21 +00:00
Hubert Chathi 8afb7b55d0
Make handling of federation Authorization header (more) compliant with RFC7230 (#12774)
The main differences are:
- values with delimiters (such as colons) should be quoted, so always
  quote the origin, since it could contain a colon followed by a port
  number
- should allow more than one space after "X-Matrix"
- quoted values with backslash-escaped characters should be unescaped
- names should be case insensitive
2022-05-18 11:19:30 +01:00
Dirk Klimpel 6edefef602
Add some type hints to datastore (#12717) 2022-05-17 15:29:06 +01:00
Sean Quah 6ee61b9052
Complain if a federation endpoint has the `@cancellable` flag (#12705)
`BaseFederationServlet` wraps its endpoints in a bunch of async code
that has not been vetted for compatibility with cancellation.
Fail CI if a `@cancellable` flag is applied to a federation endpoint.

Signed-off-by: Sean Quah <seanq@element.io>
2022-05-11 14:52:26 +01:00
Val Lorentz bf0c3ca20a
Fix inconsistent spelling of 'M_UNRECOGNIZED'. (#12665) 2022-05-09 20:29:07 +00:00
DeepBlueV7.X a377a43386
Support MSC3266 room summaries over federation (#11507)
Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
2022-05-05 15:25:00 +01:00
Richard van der Hoff d66d68f917
Add extra debug logging to federation sender (#12614)
... in order to debug some problems we've been having with certain events not
being sent when expected.
2022-05-03 16:32:40 +01:00
Richard van der Hoff db2edf5a65
Exclude OOB memberships from the federation sender (#12570)
As the comment says, there is no need to process such events, and indeed we
need to avoid doing so.

Fixes #12509.
2022-05-03 12:47:56 +00:00
David Robertson 6463244375
Remove unused `# type: ignore`s (#12531)
Over time we've begun to use newer versions of mypy, typeshed, stub
packages---and of course we've improved our own annotations. This makes
some type ignore comments no longer necessary. I have removed them.

There was one exception: a module that imports `select.epoll`. The
ignore is redundant on Linux, but I've kept it ignored for those of us
who work on the source tree using not-Linux. (#11771)

I'm more interested in the config line which enforces this. I want
unused ignores to be reported, because I think it's useful feedback when
annotating to know when you've fixed a problem you had to previously
ignore.

* Installing extras before typechecking

Lacking an easy way to install all extras generically, let's bite the bullet and
make install the hand-maintained `all` extra before typechecking.

Now that https://github.com/matrix-org/backend-meta/pull/6 is merged to
the release/v1 branch.
2022-04-27 14:03:44 +01:00
Jan Christian Grünhage a1f87f57ff
Implement MSC3383: include destination in X-Matrix auth header (#11398)
Co-authored-by: Jan Christian Grünhage <jan.christian@gruenhage.xyz>
Co-authored-by: Marcus Hoffmann <bubu@bubu1.eu>
2022-04-19 16:23:53 +01:00
Richard van der Hoff b121a3ad2b
Back out implementation of MSC2314 (#12474)
MSC2314 has now been closed, so we're backing out its implementation, which
originally happened in #6176.

Unfortunately it's not a direct revert, as that PR mixed in a bunch of
unrelated changes to tests etc.
2022-04-19 11:17:29 +00:00
Patrick Cloke 4bdbebccb9
Remove the unstable event field for `/send_join` per MSC3083. (#12395)
This was missed when initially stabilising room version 8 and was
left in as a compatibility shim. Most homeservers have upgraded
to a version which expects the proper field name, and the failure
mode is reasonable (a user on an older server may have to attempt
joining the room twice with an obscure error message the first time).
2022-04-12 11:27:45 -04:00
David Robertson 95a038c106
Unify HTTP query parameter type hints (#12415)
* Pull out query param types to `synapse.http.types`
* Use QueryParams everywhere
* Simplify `encode_query_args`
* Add annotation which would have caught #12410
2022-04-08 13:06:51 +01:00
Erik Johnston 36af768c13
Fix fetching public rooms over federation (#12410)
Broke by #12364
2022-04-07 14:18:02 +00:00
Sean Quah 800ba87cc8
Refactor and convert `Linearizer` to async (#12357)
Refactor and convert `Linearizer` to async. This makes a `Linearizer`
cancellation bug easier to fix.

Also refactor to use an async context manager, which eliminates an
unlikely footgun where code that doesn't immediately use the context
manager could forget to release the lock.

Signed-off-by: Sean Quah <seanq@element.io>
2022-04-05 15:43:52 +01:00
reivilibre 42d8710f38
Fix a spec compliance issue where requests to the `/publicRooms` federation API would specify `limit` as a string. (#12364) 2022-04-05 12:45:36 +01:00
Richard van der Hoff 38adf14998
Enhance logging for inbound federation events (#12301)
It is currently rather hard to see which rooms are causing inbound federation
traffic. Add the room id to the logs.
2022-03-25 14:44:57 +00:00