MatrixSynapse

Commit Graph

Author	SHA1	Message	Date
David Robertson	8e37ece015	Bump the client-side timeout for /state (#14912 ) * Bump the client-side timeout for /state to allow faster joins resyncs the chance to complete for large rooms. We have seen this fare poorly (~90s for Matrix HQ's /state) in testing, causing the resync to advance to another HS who hasn't seen our join yet. * Changelog * Milliseconds!!!!	2023-01-25 16:11:06 +00:00
Sean Quah	d329a566df	Faster joins: Fix incompatibility with restricted joins (#14882 ) * Avoid clearing out forward extremities when doing a second remote join When joining a restricted room where the local homeserver does not have a user able to issue invites, we perform a second remote join. We want to avoid clearing out forward extremities in this case because the forward extremities we have are up to date and clearing out forward extremities creates a window in which the room can get bricked if Synapse crashes. Signed-off-by: Sean Quah <seanq@matrix.org> * Do a full join when doing a second remote join into a full state room We cannot persist a partial state join event into a joined full state room, so we perform a full state join for such rooms instead. As a future optimization, we could always perform a partial state join and compute or retrieve the full state ourselves if necessary. Signed-off-by: Sean Quah <seanq@matrix.org> * Add lock around partial state flag for rooms Signed-off-by: Sean Quah <seanq@matrix.org> * Preserve partial state info when doing a second partial state join Signed-off-by: Sean Quah <seanq@matrix.org> * Add newsfile * Add a TODO(faster_joins) marker Signed-off-by: Sean Quah <seanq@matrix.org>	2023-01-22 19:19:31 +00:00
David Robertson	5b3af1c7d0	Stabilise serving partial join responses (#14839 ) Serving partial join responses is no longer experimental. They will only be served under the stable identifier if the undocumented config flag experimental.msc3706_enabled is set to true. Synapse continues to request a partial join only if the undocumented config flag experimental.faster_joins is set to true; this setting remains present and unaffected.	2023-01-17 12:44:15 +00:00
Sean Quah	db5145a31d	Add parameter to control whether we do a partial state join (#14843 ) When the local homeserver is already joined to a room and wants to perform another remote join, we may find it useful to do a non-partial state join if we already have the full state for the room. Signed-off-by: Sean Quah <seanq@matrix.org>	2023-01-16 23:15:17 +00:00
David Robertson	85a7a201fa	Also use stable name in SendJoinResponse struct (#14841 ) * Also use stable name in SendJoinResponse struct follow-up to #14832 * Changelog * Fix a rename I missed * Run black * Update synapse/federation/federation_client.py Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com> Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>	2023-01-16 12:40:25 +00:00
David Robertson	52ae80dd1a	Use stable identifiers for faster joins (#14832 ) * Use new query param when requesting a partial join * Read new query param when serving partial join * Provide new field names when serving partial joins * Read new field names from partial join response * Changelog	2023-01-13 17:58:53 +00:00
Patrick Cloke	9b6224577e	Failover on proper error responses. (#14620 ) When querying a remote server handle a 404/405 with an errcode of M_UNRECOGNIZED as an unimplemented endpoint.	2022-12-06 07:23:03 -05:00
Richard van der Hoff	cb59e08062	Improve logging and opentracing for to-device message handling (#14598 ) A batch of changes intended to make it easier to trace to-device messages through the system. The intention here is that a client can set a property org.matrix.msgid in any to-device message it sends. That ID is then included in any tracing or logging related to the message. (Suggestions as to where this field should be documented welcome. I'm not enthusiastic about speccing it - it's very much an optional extra to help with debugging.) I've also generally improved the data we send to opentracing for these messages.	2022-12-06 09:52:55 +00:00
Mathieu Velten	4569eda944	Use servers list approx to send read receipts when in partial state (#14549 ) Signed-off-by: Mathieu Velten <mathieuv@matrix.org>	2022-11-30 13:39:47 +01:00
Eric Eastwood	8f10c8b054	Move MSC3030 `/timestamp_to_event` endpoint to stable v1 location (#14471 ) Fix https://github.com/matrix-org/synapse/issues/14390 - Client API: `/_matrix/client/unstable/org.matrix.msc3030/rooms/<roomID>/timestamp_to_event?ts=<timestamp>&dir=<direction>` -> `/_matrix/client/v1/rooms/<roomID>/timestamp_to_event?ts=<timestamp>&dir=<direction>` - Federation API: `/_matrix/federation/unstable/org.matrix.msc3030/timestamp_to_event/<roomID>?ts=<timestamp>&dir=<direction>` -> `/_matrix/federation/v1/timestamp_to_event/<roomID>?ts=<timestamp>&dir=<direction>` Complement test changes: https://github.com/matrix-org/complement/pull/559	2022-11-28 15:54:18 -06:00
Patrick Cloke	d748bbc8f8	Include thread information when sending receipts over federation. (#14466 ) Include the thread_id field when sending read receipts over federation. This might result in the same user having multiple read receipts per-room, meaning multiple EDUs must be sent to encapsulate those receipts. This restructures the PerDestinationQueue APIs to support multiple receipt EDUs, queue_read_receipt now becomes linear time in the number of queued threaded receipts in the room for the given user, it is expected this is a small number since receipt EDUs are sent as filler in transactions.	2022-11-28 14:40:17 +00:00
Mathieu Velten	39cde585bf	Faster joins: use initial list of servers if we don't have the full state yet (#14408 ) Signed-off-by: Mathieu Velten <mathieuv@matrix.org> Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>	2022-11-24 18:09:47 +01:00
Mathieu Velten	1526ff389f	Faster joins: filter out non local events when a room doesn't have its full state (#14404 ) Signed-off-by: Mathieu Velten <mathieuv@matrix.org>	2022-11-21 16:46:14 +01:00
Patrick Cloke	d8cc86eff4	Remove redundant types from comments. (#14412 ) Remove type hints from comments which have been added as Python type hints. This helps avoid drift between comments and reality, as well as removing redundant information. Also adds some missing type hints which were simple to fill in.	2022-11-16 15:25:24 +00:00
David Robertson	1eed795fc5	Include heroes in partial join responses' state (#14442 ) * Pull out hero selection logic * Include heroes in partial join response's state * Changelog * Fixup trial test * Remove TODO	2022-11-15 17:35:19 +00:00
David Robertson	d4fac8a3e2	Fix typo in #13320 which could cause log spam (#14347 )	2022-11-01 19:20:35 +00:00
Eric Eastwood	40fa8294e3	Refactor MSC3030 `/timestamp_to_event` to move away from our snowflake pull from `destination` pattern (#14096 ) 1. `federation_client.timestamp_to_event(...)` now handles all `destination` looping and uses our generic `_try_destination_list(...)` helper. 2. Consistently handling `NotRetryingDestination` and `FederationDeniedError` across `get_pdu` , backfill, and the generic `_try_destination_list` which is used for many places we use this pattern. 3. `get_pdu(...)` now returns `PulledPduInfo` so we know which `destination` we ended up pulling the PDU from	2022-10-26 16:10:55 -05:00
Olivier Wilkinson (reivilibre)	85fcbba595	Merge branch 'release-v1.70' into develop	2022-10-25 15:39:35 +01:00
Erik Johnston	09b588854e	Fix `TypeError: 'dict_keys' object is not reversible` (#14280 )	2022-10-24 13:05:14 +01:00
Andrew Morgan	da2c93d4b6	Stop returning `unsigned.invite_room_state` in `PUT /_matrix/federation/v2/invite/{roomId}/{eventId}` responses (#14064 ) Co-authored-by: David Robertson <davidr@element.io>	2022-10-20 15:17:45 +01:00
Eric Eastwood	70b3396506	Explain `SynapseError` and `FederationError` better (#14191 ) Explain `SynapseError` and `FederationError` better Spawning from https://github.com/matrix-org/synapse/pull/13816#discussion_r993262622	2022-10-19 15:39:43 -05:00
Andrew Morgan	97b3d037c0	Don't require optional `invite_room_state` field on fed v2 invite (#14083 )	2022-10-14 13:48:33 +01:00
Andrew Morgan	9c23442ac9	Correct field name for stripped state events when knocking. `knock_state_events` -> `knock_room_state` (#14102 )	2022-10-12 14:37:20 +01:00
Shay	a86b2f6837	Fix a bug where redactions were not being sent over federation if we did not have the original event. (#13813 )	2022-10-11 11:18:45 -07:00
David Robertson	cb20b885cb	Always close _all_ `ijson` coroutines, even if doing so raises Exceptions (#14065 )	2022-10-06 18:17:50 +00:00
Eric Eastwood	70a4317692	Track when the pulled event signature fails (#13815 ) Because we're doing the recording in `_check_sigs_and_hash_for_pulled_events_and_fetch` (previously named `_check_sigs_and_hash_and_fetch`), this means we will track signature failures for `backfill`, `get_room_state`, `get_event_auth`, and `get_missing_events` (all pulled event scenarios). And we also record signature failures from `get_pdu`. Part of https://github.com/matrix-org/synapse/issues/13700 Part of https://github.com/matrix-org/synapse/issues/13676 and https://github.com/matrix-org/synapse/issues/13356 This PR will be especially important for https://github.com/matrix-org/synapse/pull/13816 so we can avoid the costly `_get_state_ids_after_missing_prev_event` down the line when `/messages` calls backfill.	2022-10-03 14:53:29 -05:00
Erik Johnston	299b00d968	Prioritize outbound to-device over device list updates (#13922 ) Otherwise device list changes for large accounts can temporarily delay to-device messages.	2022-09-27 15:17:41 +01:00
reivilibre	c06b2b7142	Faster Remote Room Joins: tell remote homeservers that we are unable to authorise them if they query a room which has partial state on our server. (#13823 )	2022-09-23 11:47:16 +01:00
Denis	c802ef1411	Don't include redundant prev_state in new events (#13791 )	2022-09-20 09:44:38 +01:00
reivilibre	21687ec189	Fix a long-standing spec compliance bug where Synapse would accept a trailing slash on the end of `/get_missing_events` federation requests. (#13789 ) * Don't accept a trailing slash on the end of /get_missing_events * Newsfile Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org> Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>	2022-09-14 09:28:12 +01:00
reivilibre	526f84bc2e	Fix Prometheus recording rules to not use legacy metric names. (#13718 )	2022-09-08 15:01:42 +01:00
reivilibre	c2fe48a6ff	Rename the `EventFormatVersions` enum values so that they line up with room version numbers. (#13706 )	2022-09-07 11:08:20 +01:00
Erik Johnston	2318603772	Add some logging to help track down #13444 (#13679 )	2022-09-01 13:54:52 +01:00
reivilibre	7bc110a19e	Generalise the `@cancellable` annotation so it can be used on functions other than just servlet methods. (#13662 )	2022-08-31 11:16:05 +00:00
reivilibre	ba882c0357	Faster Room Joins: fix `/make_knock` blocking indefinitely when the room in question is a partial-stated room. (#13583 ) Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>	2022-08-24 09:09:59 +00:00
Eric Eastwood	7af07f9716	Instrument `_check_sigs_and_hash_and_fetch` to trace time spent in child concurrent calls (#13588 ) Instrument `_check_sigs_and_hash_and_fetch` to trace time spent in child concurrent calls because I've seen `_check_sigs_and_hash_and_fetch` take [10.41s to process 100 events](https://github.com/matrix-org/synapse/issues/13587) Fix https://github.com/matrix-org/synapse/issues/13587 Part of https://github.com/matrix-org/synapse/issues/13356	2022-08-23 21:53:37 -05:00
Eric Eastwood	0a4efbc1dd	Instrument the federation/backfill part of `/messages` (#13489 ) Instrument the federation/backfill part of `/messages` so it's easier to follow what's going on in Jaeger when viewing a trace. Split out from https://github.com/matrix-org/synapse/pull/13440 Follow-up from https://github.com/matrix-org/synapse/pull/13368 Part of https://github.com/matrix-org/synapse/issues/13356	2022-08-16 12:39:40 -05:00
Eric Eastwood	344a2f767c	Instrument `FederationStateIdsServlet` - `/state_ids` (#13499 ) Instrument FederationStateIdsServlet - `/state_ids` so it's easier to follow what's going on in Jaeger when viewing a trace.	2022-08-15 19:41:23 +01:00
reivilibre	e9e6aacfbe	Faster Room Joins: prevent Synapse from answering federated join requests for a room which it has not fully joined yet. (#13416 )	2022-08-04 16:27:04 +01:00
Eric Eastwood	92d21faf12	Instrument `/messages` for understandable traces in Jaeger (#13368 ) In Jaeger: - Before: huge list of uncategorized database calls - After: nice and collapsible into units of work	2022-08-03 10:57:38 -05:00
Will Hunt	502f075e96	Implement MSC3848: Introduce errcodes for specific event sending failures (#13343 ) Implements MSC3848	2022-07-27 13:44:40 +01:00
reivilibre	39be5bc550	Make minor clarifications to the error messages given when we fail to join a room via any server. (#13160 )	2022-07-27 10:37:50 +00:00
Eric Eastwood	4f3082d6bf	Fix `get_pdu` asking every remote destination even after it finds an event (#13346 )	2022-07-27 10:40:04 +01:00
Patrick Cloke	50122754c8	Add missing types to opentracing. (#13345 ) After this change `synapse.logging` is fully typed.	2022-07-21 12:01:52 +00:00
Eric Eastwood	0f971ca68e	Update `get_pdu` to return the original, pristine `EventBase` (#13320 ) Update `get_pdu` to return the untouched, pristine `EventBase` as it was originally seen over federation (no metadata added). Previously, we returned the same `event` reference that we stored in the cache which downstream code modified in place and added metadata like setting it as an `outlier` and essentially poisoned our cache. Now we always return a copy of the `event` so the original can stay pristine in our cache and re-used for the next cache call. Split out from https://github.com/matrix-org/synapse/pull/13205 As discussed at: - https://github.com/matrix-org/synapse/pull/13205#discussion_r918365746 - https://github.com/matrix-org/synapse/pull/13205#discussion_r918366125 Related to https://github.com/matrix-org/synapse/issues/12584. This PR doesn't fix that issue because it hits [`get_event` which exists from the local database before it tries to `get_pdu`](`7864f33e28/synapse/federation/federation_client.py (L581-L594)`).	2022-07-20 15:58:51 -05:00
Patrick Cloke	a6895dd576	Add type annotations to `trace` decorator. (#13328 ) Functions that are decorated with `trace` are now properly typed and the type hints for them are fixed.	2022-07-19 14:14:30 -04:00
David Robertson	b977867358	Rate limit joins per-room (#13276 )	2022-07-19 11:45:17 +00:00
Nick Mills-Barrett	21eeacc995	Federation Sender & Appservice Pusher Stream Optimisations (#13251 ) * Replace `get_new_events_for_appservice` with `get_all_new_events_stream` The functions were near identical and this brings the AS worker closer to the way federation senders work which can allow for multiple workers to handle AS traffic. * Pull received TS alongside events when processing the stream This avoids an extra query -per event- when both federation sender and appservice pusher process events.	2022-07-15 09:36:56 +01:00
Sean Quah	68db233f0c	Handle race between persisting an event and un-partial stating a room (#13100 ) Whenever we want to persist an event, we first compute an event context, which includes the state at the event and a flag indicating whether the state is partial. After a lot of processing, we finally try to store the event in the database, which can fail for partial state events when the containing room has been un-partial stated in the meantime. We detect the race as a foreign key constraint failure in the data store layer and turn it into a special `PartialStateConflictError` exception, which makes its way up to the method in which we computed the event context. To make things difficult, the exception needs to cross a replication request: `/fed_send_events` for events coming over federation and `/send_event` for events from clients. We transport the `PartialStateConflictError` as a `409 Conflict` over replication and turn `409`s back into `PartialStateConflictError`s on the worker making the request. All client events go through `EventCreationHandler.handle_new_client_event`, which is called in a lot of places. Instead of trying to update all the code which creates client events, we turn the `PartialStateConflictError` into a `429 Too Many Requests` in `EventCreationHandler.handle_new_client_event` and hope that clients take it as a hint to retry their request. On the federation event side, there are 7 places which compute event contexts. 4 of them use outlier event contexts: `FederationEventHandler._auth_and_persist_outliers_inner`, `FederationHandler.do_knock`, `FederationHandler.on_invite_request` and `FederationHandler.do_remotely_reject_invite`. These events won't have the partial state flag, so we do not need to do anything for them. The remaining 3 paths which create events are `FederationEventHandler.process_remote_join`, `FederationEventHandler.on_send_membership_event` and `FederationEventHandler._process_received_pdu`. 
We can't experience the race in `process_remote_join`, unless we're handling an additional join into a partial state room, which currently blocks, so we make no attempt to handle it correctly. `on_send_membership_event` is only called by `FederationServer._on_send_membership_event`, so we catch the `PartialStateConflictError` there and retry just once. `_process_received_pdu` is called by `on_receive_pdu` for incoming events and `_process_pulled_event` for backfill. The latter should never try to persist partial state events, so we ignore it. We catch the `PartialStateConflictError` in `on_receive_pdu` and retry just once. Referring to the graph of code paths in https://github.com/matrix-org/synapse/issues/12988#issuecomment-1156857648 may make the above make more sense. Signed-off-by: Sean Quah <seanq@matrix.org>	2022-07-05 16:12:52 +01:00
Patrick Cloke	81608490e3	Stop depending on `room_id` to be returned for children state in the hierarchy response. (#12991 ) The `room_id` field was removed from MSC2946 before it was accepted. It was initially kept for backwards compatibility and should be removed now that the stable form of the API is used. This change only stops Synapse from validating that it is returned, a future PR will remove returning it as part of the response.	2022-06-10 07:15:51 -04:00
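
Commit 0f971ca68e (#13320) above describes a cache that was poisoned because callers mutated the cached event in place, and a fix that hands out a copy instead. The general pattern can be sketched as follows. This is a minimal illustration and not Synapse's code: `Event` and `PristineEventCache` are hypothetical names invented for the example, and `copy.deepcopy` is only one possible way to make the copy.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Event:
    """Hypothetical stand-in for an event object; not Synapse's EventBase."""

    event_id: str
    content: dict = field(default_factory=dict)
    outlier: bool = False


class PristineEventCache:
    """Hypothetical cache that never hands out its stored objects."""

    def __init__(self) -> None:
        self._events: Dict[str, Event] = {}

    def put(self, event: Event) -> None:
        # Store a private copy so later changes by the caller do not leak in.
        self._events[event.event_id] = copy.deepcopy(event)

    def get(self, event_id: str) -> Optional[Event]:
        cached = self._events.get(event_id)
        # Return a copy so downstream code can add metadata freely.
        return copy.deepcopy(cached) if cached is not None else None


cache = PristineEventCache()
cache.put(Event("$abc", {"body": "hello"}))

first = cache.get("$abc")
first.outlier = True  # downstream code marks its own copy as an outlier

second = cache.get("$abc")  # a fresh copy, unaffected by the change above
```

The trade-off is an extra copy on every read in exchange for a cached original that later lookups can rely on.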

1106 Commits (dc901a885f9a7111c97b2935d51d2d05a26db47b)