MatrixSynapse

Commit Graph

Author	SHA1	Message	Date
DeepBlueV7.X	753d1d9cde	Fix joining rooms you have been unbanned from (#15323 ) * Fix joining rooms you have been unbanned from Since forever synapse did not allow you to join a room after you have been unbanned from it over federation. This was not actually because of the unban event not federating. Synapse simply used outdated state to validate the join transition. This skips the validation if we are not in the room and for that reason won't have the current room state. Fixes #1563 Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de> * Add changelog Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de> * Update changelog.d/15323.bugfix --------- Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>	2023-03-29 08:37:27 +00:00
Mathieu Velten	6cddf24e36	Faster joins: don't stall when a user joins during a fast join (#14606 ) Fixes #12801. Complement tests are at https://github.com/matrix-org/complement/pull/567. Avoid blocking on full state when handling a subsequent join into a partial state room. Also always perform a remote join into partial state rooms, since we do not know whether the joining user has been banned and want to avoid leaking history to banned users. Signed-off-by: Mathieu Velten <mathieuv@matrix.org> Co-authored-by: Sean Quah <seanq@matrix.org> Co-authored-by: David Robertson <davidr@element.io>	2023-02-10 23:31:05 +00:00
Shay	03bccd542b	Add a class UnpersistedEventContext to allow for the batching up of storing state groups (#14675 ) * add class UnpersistedEventContext * modify create new client event to create unpersistedeventcontexts * persist event contexts after creation * fix tests to persist unpersisted event contexts * cleanup * misc lints + cleanup * changelog + fix comments * lints * fix batch insertion? * reduce redundant calculation * add unpersisted event classes * rework compute_event_context, split into function that returns unpersisted event context and then persists it * use calculate_context_info to create unpersisted event contexts * update typing * $%#^&* * black * fix comments and consolidate classes, use attr.s for class * requested changes * lint * requested changes * requested changes * refactor to be stupidly explicit * clearer renaming and flow * make partial state non-optional * update docstrings --------- Co-authored-by: Erik Johnston <erik@matrix.org>	2023-02-09 13:05:02 -08:00
Patrick Cloke	ba79fb4a61	Use StrCollection in place of Collection[str] in (most) handlers code. (#14922 ) Due to the increased safety of StrCollection over Collection[str] and Sequence[str].	2023-01-26 12:31:58 -05:00
Erik Johnston	9187fd940e	Wait for streams to catch up when processing HTTP replication. (#14820 ) This should hopefully mitigate a class of races where data gets out of sync due to an HTTP replication request racing with the replication streams.	2023-01-18 19:35:29 +00:00
reivilibre	ba4ea7d13f	Batch up replication requests to request the resyncing of remote users' devices. (#14716 )	2023-01-10 11:17:59 +00:00
reivilibre	fb60cb16fe	Faster remote room joins: stream the un-partial-stating of events over replication. [rei:frrj/streams/unpsr] (#14545 )	2022-12-14 14:47:11 +00:00
reivilibre	62ed877433	Improve validation of field size limits in events. (#14664 )	2022-12-13 13:19:19 +00:00
David Robertson	b5b5f66084	Move `StateFilter` to `synapse.types` (#14668 ) * Move `StateFilter` to `synapse.types` * Changelog	2022-12-12 16:19:30 +00:00
David Robertson	d10a85ec9e	Quieter logging for stateres failure at missing prev events (#14346 )	2022-11-10 12:17:46 +00:00
Eric Eastwood	40fa8294e3	Refactor MSC3030 `/timestamp_to_event` to move away from our snowflake pull from `destination` pattern (#14096 ) 1. `federation_client.timestamp_to_event(...)` now handles all `destination` looping and uses our generic `_try_destination_list(...)` helper. 2. Consistently handling `NotRetryingDestination` and `FederationDeniedError` across `get_pdu` , backfill, and the generic `_try_destination_list` which is used for many places we use this pattern. 3. `get_pdu(...)` now returns `PulledPduInfo` so we know which `destination` we ended up pulling the PDU from	2022-10-26 16:10:55 -05:00
Shay	b7a7ff6ee3	Add initial power level event to batch of bulk persisted events when creating a new room. (#14228 )	2022-10-21 10:46:22 -07:00
Andrew Morgan	dc02d9f8c5	Avoid checking the event cache when backfilling events (#14164 )	2022-10-18 10:33:35 +01:00
Eric Eastwood	40bb37eb27	Stop getting missing `prev_events` after we already know their signature is invalid (#13816 ) While https://github.com/matrix-org/synapse/pull/13635 stops us from doing the slow thing after we've already done it once, this PR stops us from doing one of the slow things in the first place. Related to - https://github.com/matrix-org/synapse/issues/13622 - https://github.com/matrix-org/synapse/pull/13635 - https://github.com/matrix-org/synapse/issues/13676 Part of https://github.com/matrix-org/synapse/issues/13356 Follow-up to https://github.com/matrix-org/synapse/pull/13815 which tracks event signature failures. With this PR, we avoid the call to the costly `_get_state_ids_after_missing_prev_event` because the signature failure will count as an attempt before and we filter events based on the backoff before calling `_get_state_ids_after_missing_prev_event` now. For example, this will save us 156s out of the 185s total that this `matrix.org` `/messages` request takes. If you want to see the full Jaeger trace of this, you can drag and drop this `trace.json` into your own Jaeger, https://gist.github.com/MadLittleMods/4b12d0d0afe88c2f65ffcc907306b761 To explain this exact scenario around `/messages` -> backfill, we call `/backfill` and first check the signatures of the 100 events. We see bad signatures for `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` and `$zuOn2Rd2vsC7SUia3Hp3r6JSkSFKcc5j3QTTqW_0jDw` (both member events). Then we process the 98 events remaining that have valid signatures but one of the events references `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` as a `prev_event`. So we have to do the whole `_get_state_ids_after_missing_prev_event` rigmarole which pulls in those same events which fail again because the signatures are still invalid.
- `backfill` - `outgoing-federation-request` `/backfill` - `_check_sigs_and_hash_and_fetch` - `_check_sigs_and_hash_and_fetch_one` for each event received over backfill - ❗ `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` fails with `Signature on retrieved event was invalid.`: `unable to verify signature for sender domain xxx: 401: Failed to find any key to satisfy: _FetchKeyRequest(...)` - ❗ `$zuOn2Rd2vsC7SUia3Hp3r6JSkSFKcc5j3QTTqW_0jDw` fails with `Signature on retrieved event was invalid.`: `unable to verify signature for sender domain xxx: 401: Failed to find any key to satisfy: _FetchKeyRequest(...)` - `_process_pulled_events` - `_process_pulled_event` for each validated event - ❗ Event `$Q0iMdqtz3IJYfZQU2Xk2WjB5NDF8Gg8cFSYYyKQgKJ0` references `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` as a `prev_event` which is missing so we try to get it - `_get_state_ids_after_missing_prev_event` - `outgoing-federation-request` `/state_ids` - ❗ `get_pdu` for `$luA4l7QHhf_jadH3mI-AyFqho0U2Q-IXXUbGSMq6h6M` which fails the signature check again - ❗ `get_pdu` for `$zuOn2Rd2vsC7SUia3Hp3r6JSkSFKcc5j3QTTqW_0jDw` which fails the signature check	2022-10-15 00:36:49 -05:00
Shay	b6baa46db0	Fix a bug where the joined hosts for a given event were not being properly cached (#14125 )	2022-10-12 11:01:00 -07:00
Shay	7b7478e8b6	Batch up notifications after event persistence (#14033 )	2022-10-05 10:12:48 -07:00
Eric Eastwood	2769ef4df1	Revert the general exception recording introduced in #13814 (#13969 ) * Maybe not catch all errors to avoid things in the nature-of CancelledError See https://github.com/matrix-org/synapse/pull/13815#discussion_r983384698 * Remove general exception tracking * Add changelog	2022-10-03 10:14:45 +01:00
Kateřina Churanová	6caa303083	fix: Push notifications for invite over federation (#13719 )	2022-09-28 12:31:53 +00:00
reivilibre	c06b2b7142	Faster Remote Room Joins: tell remote homeservers that we are unable to authorise them if they query a room which has partial state on our server. (#13823 )	2022-09-23 11:47:16 +01:00
Eric Eastwood	140af0cdb6	Record any exception when processing a pulled event (#13814 ) Part of https://github.com/matrix-org/synapse/issues/13700 and https://github.com/matrix-org/synapse/issues/13356 Follow-up to https://github.com/matrix-org/synapse/pull/13589	2022-09-15 14:40:49 -05:00
Eric Eastwood	957e3d74fc	Keep track when we try and fail to process a pulled event (#13589 ) We can follow-up this PR with: 1. Only try to backfill from an event if we haven't tried recently -> https://github.com/matrix-org/synapse/issues/13622 1. When we decide to backfill that event again, process it in the background so it doesn't block and make `/messages` slow when we know it will probably fail again -> https://github.com/matrix-org/synapse/issues/13623 1. Generally track failures everywhere we try and fail to pull an event over federation -> https://github.com/matrix-org/synapse/issues/13700 Fix https://github.com/matrix-org/synapse/issues/13621 Part of https://github.com/matrix-org/synapse/issues/13356 Mentioned in [internal doc](https://docs.google.com/document/d/1lvUoVfYUiy6UaHB6Rb4HicjaJAU40-APue9Q4vzuW3c/edit#bookmark=id.qv7cj51sv9i5)	2022-09-14 13:57:50 -05:00
Eric Eastwood	0bf180cbb4	Comment about a better future where we can get the state diff between two events (#13586 ) Split off from https://github.com/matrix-org/synapse/pull/13561 Part of https://github.com/matrix-org/synapse/issues/13356 Mentioned in [internal doc](https://docs.google.com/document/d/1lvUoVfYUiy6UaHB6Rb4HicjaJAU40-APue9Q4vzuW3c/edit#bookmark=id.2tvwz3yhcafh)	2022-08-24 18:59:27 -05:00
Eric Eastwood	9385c41ba4	Fix Prometheus metrics being negative (mixed up start/end) (#13584 ) Fix: - https://github.com/matrix-org/synapse/pull/13535#discussion_r949582508 - https://github.com/matrix-org/synapse/pull/13533#discussion_r949577244	2022-08-23 08:47:30 +01:00
Eric Eastwood	06df5d4250	MSC2716v4 room version - remove namespace from MSC2716 event content fields (#13551 ) Complement PR: https://github.com/matrix-org/complement/pull/450 As suggested in https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941444525	2022-08-19 15:37:01 -05:00
Eric Eastwood	088bcb7ecb	Time how long it takes us to do backfill processing (#13535 )	2022-08-17 10:33:19 +01:00
Eric Eastwood	0a4efbc1dd	Instrument the federation/backfill part of `/messages` (#13489 ) Instrument the federation/backfill part of `/messages` so it's easier to follow what's going on in Jaeger when viewing a trace. Split out from https://github.com/matrix-org/synapse/pull/13440 Follow-up from https://github.com/matrix-org/synapse/pull/13368 Part of https://github.com/matrix-org/synapse/issues/13356	2022-08-16 12:39:40 -05:00
Eric Eastwood	92d21faf12	Instrument `/messages` for understandable traces in Jaeger (#13368 ) In Jaeger: - Before: huge list of uncategorized database calls - After: nice and collapsible into units of work	2022-08-03 10:57:38 -05:00
Patrick Cloke	f8e7a9418a	Fix missing import in `federation_event` handler. (#13431 ) #13404 removed an import of `Optional` which was still needed because #13413 added more usages.	2022-08-01 14:14:29 +00:00
Sean Quah	224d792dd7	Refactor `_resolve_state_at_missing_prevs` to return an `EventContext` (#13404 ) Previously, `_resolve_state_at_missing_prevs` returned the resolved state before an event and a partial state flag. These were unwieldy to carry around and would only ever be used to build an event context. Build the event context directly instead. Signed-off-by: Sean Quah <seanq@matrix.org>	2022-08-01 13:53:56 +01:00
Richard van der Hoff	23768ccb4d	Faster joins: fix rejected events becoming un-rejected during resync (#13413 ) Make sure that we re-check the auth rules during state resync, otherwise rejected events get un-rejected.	2022-08-01 11:20:05 +01:00
Richard van der Hoff	ca3db044a3	Fix infinite loop in partial-state resync (#13353 ) Make sure that we only pull out events from the db once they have no prev-events with partial state.	2022-07-26 11:47:31 +00:00
Sean Quah	335ebb21cc	Faster room joins: avoid blocking when pulling events with missing prevs (#13355 ) Avoid blocking on full state in `_resolve_state_at_missing_prevs` and return a new flag indicating whether the resolved state is partial. Thread that flag around so that it makes it into the event context. Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>	2022-07-26 12:39:23 +01:00
Eric Eastwood	357561c1a2	Backfill remote event fetched by MSC3030 so we can paginate from it later (#13205 ) Depends on https://github.com/matrix-org/synapse/pull/13320 Complement tests: https://github.com/matrix-org/complement/pull/406 We could use the same method to backfill for `/context` as well in the future, see https://github.com/matrix-org/synapse/issues/3848	2022-07-22 16:00:11 -05:00
Sean Quah	158782c3ce	Skip soft fail checks for rooms with partial state (#13354 ) When a room has the partial state flag, we may not have an accurate `m.room.member` event for event senders in the room's current state, and so cannot perform soft fail checks correctly. Skip the soft fail check entirely in this case. As an alternative, we could block until we have full state, but that would prevent us from receiving incoming events over federation, which is undesirable. Signed-off-by: Sean Quah <seanq@matrix.org>	2022-07-22 10:13:01 +01:00
Eric Eastwood	0f971ca68e	Update `get_pdu` to return the original, pristine `EventBase` (#13320 ) Update `get_pdu` to return the untouched, pristine `EventBase` as it was originally seen over federation (no metadata added). Previously, we returned the same `event` reference that we stored in the cache which downstream code modified in place and added metadata like setting it as an `outlier` and essentially poisoned our cache. Now we always return a copy of the `event` so the original can stay pristine in our cache and re-used for the next cache call. Split out from https://github.com/matrix-org/synapse/pull/13205 As discussed at: - https://github.com/matrix-org/synapse/pull/13205#discussion_r918365746 - https://github.com/matrix-org/synapse/pull/13205#discussion_r918366125 Related to https://github.com/matrix-org/synapse/issues/12584. This PR doesn't fix that issue because it hits [`get_event` which exists from the local database before it tries to `get_pdu`](`7864f33e28/synapse/federation/federation_client.py (L581-L594)`).	2022-07-20 15:58:51 -05:00
Sean Quah	172ce29b14	Fix spurious warning when fetching state after a missing prev event (#13258 )	2022-07-19 19:15:54 +01:00
David Robertson	b977867358	Rate limit joins per-room (#13276 )	2022-07-19 11:45:17 +00:00
Richard van der Hoff	fe15a865a5	Rip out auth-event reconciliation code (#12943 ) There is a corner in `_check_event_auth` (long known as "the weird corner") where, if we get an event with auth_events which don't match those we were expecting, we attempt to resolve the difference between our state and the remote's with a state resolution. This isn't specced, and there's general agreement we shouldn't be doing it. However, it turns out that the faster-joins code was relying on it, so we need to introduce something similar (but rather simpler) for that.	2022-07-14 21:52:26 +00:00
Sean Quah	68db233f0c	Handle race between persisting an event and un-partial stating a room (#13100 ) Whenever we want to persist an event, we first compute an event context, which includes the state at the event and a flag indicating whether the state is partial. After a lot of processing, we finally try to store the event in the database, which can fail for partial state events when the containing room has been un-partial stated in the meantime. We detect the race as a foreign key constraint failure in the data store layer and turn it into a special `PartialStateConflictError` exception, which makes its way up to the method in which we computed the event context. To make things difficult, the exception needs to cross a replication request: `/fed_send_events` for events coming over federation and `/send_event` for events from clients. We transport the `PartialStateConflictError` as a `409 Conflict` over replication and turn `409`s back into `PartialStateConflictError`s on the worker making the request. All client events go through `EventCreationHandler.handle_new_client_event`, which is called in a lot of places. Instead of trying to update all the code which creates client events, we turn the `PartialStateConflictError` into a `429 Too Many Requests` in `EventCreationHandler.handle_new_client_event` and hope that clients take it as a hint to retry their request. On the federation event side, there are 7 places which compute event contexts. 4 of them use outlier event contexts: `FederationEventHandler._auth_and_persist_outliers_inner`, `FederationHandler.do_knock`, `FederationHandler.on_invite_request` and `FederationHandler.do_remotely_reject_invite`. These events won't have the partial state flag, so we do not need to do anything for them. The remaining 3 paths which create events are `FederationEventHandler.process_remote_join`, `FederationEventHandler.on_send_membership_event` and `FederationEventHandler._process_received_pdu`.
We can't experience the race in `process_remote_join`, unless we're handling an additional join into a partial state room, which currently blocks, so we make no attempt to handle it correctly. `on_send_membership_event` is only called by `FederationServer._on_send_membership_event`, so we catch the `PartialStateConflictError` there and retry just once. `_process_received_pdu` is called by `on_receive_pdu` for incoming events and `_process_pulled_event` for backfill. The latter should never try to persist partial state events, so we ignore it. We catch the `PartialStateConflictError` in `on_receive_pdu` and retry just once. Referring to the graph of code paths in https://github.com/matrix-org/synapse/issues/12988#issuecomment-1156857648 may make the above make more sense. Signed-off-by: Sean Quah <seanq@matrix.org>	2022-07-05 16:12:52 +01:00
Richard van der Hoff	6da861ae69	`_process_received_pdu`: Improve exception handling (#13145 ) `_check_event_auth` is expected to raise `AuthError`s, so no need to log it again.	2022-07-01 10:52:10 +01:00
Sean Quah	9372f6f842	Fix logging context misuse when we fail to persist a federation event (#13089 ) When we fail to persist a federation event, we kick off a task to remove its push actions in the background, using the current logging context. Since we don't `await` that task, we may finish our logging context before the task finishes. There's no reason to not `await` the task, so let's do that. Signed-off-by: Sean Quah <seanq@matrix.org>	2022-06-17 10:22:50 +01:00
Richard van der Hoff	8ecf6be1e1	Move some event auth checks out to a different method (#13065 ) * Add auth events to events used in tests * Move some event auth checks out to a different method Some of the event auth checks apply to an event's auth_events, rather than the state at the event - which means they can play no part in state resolution. Move them out to a separate method. * Rename check_auth_rules_for_event Now it only checks the state-dependent auth rules, it needs a better name.	2022-06-15 19:48:22 +01:00
Richard van der Hoff	f68b5e5773	Merge branch 'rav/simplify_event_auth_interface' into develop	2022-06-13 11:34:59 +01:00
Richard van der Hoff	0d9d36b15c	Remove `room_version` param from `check_auth_rules_for_event` Instead, use the `room_version` property of the event we're checking. The `room_version` was originally added as a parameter somewhere around #4482, but really it's been redundant since #6875 added a `room_version` field to `EventBase`.	2022-06-12 23:13:10 +01:00
Richard van der Hoff	68be42f6b6	Remove `room_version` param from `validate_event_for_room_version` Instead, use the `room_version` property of the event we're validating. The `room_version` was originally added as a parameter somewhere around #4482, but really it's been redundant since #6875 added a `room_version` field to `EventBase`.	2022-06-12 23:13:09 +01:00
Richard van der Hoff	7c6b2204d1	Faster joins: add issue links to the TODOs (#13004 ) ... to help us keep track of these things	2022-06-09 10:13:03 +00:00
Erik Johnston	e3163e2e11	Reduce the amount of state we pull from the DB (#12811 )	2022-06-06 09:24:12 +01:00
Sean Quah	2fba1076c5	Faster room joins: Try other destinations when resyncing the state of a partial-state room (#12812 ) Signed-off-by: Sean Quah <seanq@matrix.org>	2022-05-31 15:50:29 +01:00
Erik Johnston	1e453053cb	Rename storage classes (#12913 )	2022-05-31 12:17:50 +00:00
Erik Johnston	b83bc5fab5	Pull out less state when handling gaps mk2 (#12852 )	2022-05-26 09:48:12 +00:00

105 Commits (a3bad89d57645b2ea304d2900adab71a786b0172)