MatrixSynapse

Commit Graph

Author	SHA1	Message	Date
David Robertson	e9eb26e3af	Cache device resync requests over replication (#16241 )	2023-09-04 11:57:59 +01:00
Patrick Cloke	40901af5e0	Pass the device ID around in the presence handler (#16171 ) Refactoring to pass the device ID (in addition to the user ID) through the presence handler (specifically the `user_syncing`, `set_state`, and `bump_presence_active_time` methods and their replication versions).	2023-08-28 13:08:49 -04:00
Patrick Cloke	1bf143699c	Combine logic about not overriding BUSY presence. (#16170 ) Simplify some of the presence code by reducing duplicated code between worker & non-worker modes. The main change is to push some of the logic from `user_syncing` into `set_state`. This is done by passing whether the user is setting the presence via a `/sync` with a new `is_sync` flag to `set_state`. If this is `true` some additional logic is performed: * Don't override `busy` presence. * Update the `last_user_sync_ts`. * Never update the status message.	2023-08-28 11:03:23 -04:00
Shay	68b2611783	Clarify comment on key uploads over replication (#16016 )	2023-07-27 15:08:46 -07:00
Jason Little	1df0221bda	Use a custom scheme & the worker name for replication requests. (#15578 ) All the information needed is already in the `instance_map`, so use that instead of passing the hostname / IP & port manually for each replication request. This consolidates logic for future improvements of using e.g. UNIX sockets for workers.	2023-05-23 09:05:30 -04:00
Jason Little	e4f545c452	Remove `worker_replication_` settings (#15491 ) Add master to the instance_map as part of Complement, have ReplicationEndpoint look at instance_map for master. * Fix typo in drive by. * Remove unnecessary worker_replication_* bits from unit tests and add master to instance_map(hopefully in the right place) * Several updates: 1. Switch from master to main for naming the main process in the instance_map. Add useful constants for easier adjustment of names in the future. 2. Add backwards compatibility for worker_replication_* to allow time to transition to new style. Make sure to prioritize declaring main directly on the instance_map. 3. Clean up old comments/commented out code. 4. Adjust unit tests to match with new code. 5. Adjust Complement setup infrastructure to only add main to the instance_map if workers are used and remove now unused options from the worker.yaml template. * Initial Docs upload * Changelog * Missed some commented out code that can go now * Remove TODO comment that no longer holds true. * Fix links in docs * More docs * Remove debug logging * Apply suggestions from code review Co-authored-by: reivilibre <olivier@librepush.net> * Apply suggestions from code review Co-authored-by: reivilibre <olivier@librepush.net> * Update version to latest, include completeish before/after examples in upgrade notes. * Fix up and docs too --------- Co-authored-by: reivilibre <olivier@librepush.net>	2023-05-11 11:30:56 +01:00
Jason Little	d3bd03559b	HTTP Replication Client (#15470 ) Separate out a HTTP client for replication in preparation for also supporting using UNIX sockets. The major difference from the base class is that this does not use treq to handle HTTP requests.	2023-05-09 14:25:20 -04:00
Alok Kumar Singh	197fbb123b	Remove legacy code of single user device resync api (#15418 ) * Removed single-user resync usage and updated it to use multi-user counterpart Signed-off-by: Alok Kumar Singh alokaks601@gmail.com	2023-04-21 12:06:39 +01:00
David Robertson	1bc9985eb7	Have replication clients remove _INT_STREAM_POS (#15309 ) * Have replication clients remove _INT_STREAM_POS Suppose worker A makes an internal http request from worker B. B may make changes that A later learns about over replication. We want A's request to block until it has seen those changes—mainly to ensure A's caches are invalidated promptly. This helps provide read-after-write consistency, eliminating entire categories of races and test flakes. To implement this, B includes a top-level field `_INT_STREAM_POS` in its response JSON. Roughly speaking, the field's value tells A what to wait for. But we weren't removing that internal field before A's request completed! Introduced in https://github.com/matrix-org/synapse/pull/14820. Fixes #15308. * Changelog	2023-03-22 12:53:55 +00:00
Dirk Klimpel	ecbe0ddbe7	Add support for knocking to workers. (#15133 )	2023-03-02 12:59:53 -05:00
dependabot[bot]	9bb2eac719	Bump black from 22.12.0 to 23.1.0 (#15103 )	2023-02-22 15:29:09 -05:00
Erik Johnston	c78c67c5a9	Fix bug in replication where response is cached (#15024 )	2023-02-08 16:41:55 +00:00
Erik Johnston	0ec12a3753	Reduce max time we wait for stream positions (#14881 ) Now that we wait for stream positions whenever we do a HTTP replication hit, we need to be less brutal in the case where we do time out (as we have bugs around this).	2023-01-20 21:04:33 +00:00
Erik Johnston	9187fd940e	Wait for streams to catch up when processing HTTP replication. (#14820 ) This should hopefully mitigate a class of races where data gets out of sync due to a HTTP replication request racing with the replication streams.	2023-01-18 19:35:29 +00:00
reivilibre	ba4ea7d13f	Batch up replication requests to request the resyncing of remote users' devices. (#14716 )	2023-01-10 11:17:59 +00:00
Andrew Morgan	c4456114e1	Add experimental support for MSC3391: deleting account data (#14714 )	2023-01-01 03:40:46 +00:00
Patrick Cloke	6d47b7e325	Add a type hint for `get_device_handler()` and fix incorrect types. (#14055 ) This was the last untyped handler from the HomeServer object. Since it was being treated as Any (and thus unchecked) it was being used incorrectly in a few places.	2022-11-22 14:08:04 -05:00
realtyem	c15e9a0edb	Remove need for `worker_main_http_uri` setting to use /keys/upload. (#14400 )	2022-11-16 22:16:25 +00:00
Patrick Cloke	d8cc86eff4	Remove redundant types from comments. (#14412 ) Remove type hints from comments which have been added as Python type hints. This helps avoid drift between comments and reality, as well as removing redundant information. Also adds some missing type hints which were simple to fill in.	2022-11-16 15:25:24 +00:00
Tuomas Ojamies	b5ab2c428a	Support using SSL on worker endpoints. (#14128 ) * Fix missing SSL support in worker endpoints. * Add changelog * SSL for Replication endpoint * Remove unit test change * Refactor listener creation to reduce duplicated code * Fix the logger message * Update synapse/app/_base.py Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com> * Update synapse/app/_base.py Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com> * Update synapse/app/_base.py Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com> * Add config documentation for new TLS option Co-authored-by: Tuomas Ojamies <tojamies@palantir.com> Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com> Co-authored-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>	2022-11-15 12:55:00 +00:00
Brendan Abolivier	422cff7df6	Fallback if 'approved' isn't included in a registration replication request (#14135 )	2022-10-11 14:41:06 +02:00
Brendan Abolivier	be76cd8200	Allow admins to require a manual approval process before new accounts can be used (using MSC3866) (#13556 )	2022-09-29 15:23:24 +02:00
Shay	8ab16a92ed	Persist CreateRoom events to DB in a batch (#13800 )	2022-09-28 10:11:48 +00:00
reivilibre	7bc110a19e	Generalise the `@cancellable` annotation so it can be used on functions other than just servlet methods. (#13662 )	2022-08-31 11:16:05 +00:00
Patrick Cloke	a6895dd576	Add type annotations to `trace` decorator. (#13328 ) Functions that are decorated with `trace` are now properly typed and the type hints for them are fixed.	2022-07-19 14:14:30 -04:00
Sean Quah	1391a76cd2	Faster room joins: fix race in recalculation of current room state (#13151 ) Bounce recalculation of current state to the correct event persister and move recalculation of current state into the event persistence queue, to avoid concurrent updates to a room's current state. Also give recalculation of a room's current state a real stream ordering. Signed-off-by: Sean Quah <seanq@matrix.org>	2022-07-07 12:19:31 +00:00
Sean Quah	68db233f0c	Handle race between persisting an event and un-partial stating a room (#13100 ) Whenever we want to persist an event, we first compute an event context, which includes the state at the event and a flag indicating whether the state is partial. After a lot of processing, we finally try to store the event in the database, which can fail for partial state events when the containing room has been un-partial stated in the meantime. We detect the race as a foreign key constraint failure in the data store layer and turn it into a special `PartialStateConflictError` exception, which makes its way up to the method in which we computed the event context. To make things difficult, the exception needs to cross a replication request: `/fed_send_events` for events coming over federation and `/send_event` for events from clients. We transport the `PartialStateConflictError` as a `409 Conflict` over replication and turn `409`s back into `PartialStateConflictError`s on the worker making the request. All client events go through `EventCreationHandler.handle_new_client_event`, which is called in a lot of places. Instead of trying to update all the code which creates client events, we turn the `PartialStateConflictError` into a `429 Too Many Requests` in `EventCreationHandler.handle_new_client_event` and hope that clients take it as a hint to retry their request. On the federation event side, there are 7 places which compute event contexts. 4 of them use outlier event contexts: `FederationEventHandler._auth_and_persist_outliers_inner`, `FederationHandler.do_knock`, `FederationHandler.on_invite_request` and `FederationHandler.do_remotely_reject_invite`. These events won't have the partial state flag, so we do not need to do anything for them. The remaining 3 paths which create events are `FederationEventHandler.process_remote_join`, `FederationEventHandler.on_send_membership_event` and `FederationEventHandler._process_received_pdu`. We can't experience the race in `process_remote_join`, unless we're handling an additional join into a partial state room, which currently blocks, so we make no attempt to handle it correctly. `on_send_membership_event` is only called by `FederationServer._on_send_membership_event`, so we catch the `PartialStateConflictError` there and retry just once. `_process_received_pdu` is called by `on_receive_pdu` for incoming events and `_process_pulled_event` for backfill. The latter should never try to persist partial state events, so we ignore it. We catch the `PartialStateConflictError` in `on_receive_pdu` and retry just once. Referring to the graph of code paths in https://github.com/matrix-org/synapse/issues/12988#issuecomment-1156857648 may make the above make more sense. Signed-off-by: Sean Quah <seanq@matrix.org>	2022-07-05 16:12:52 +01:00
Erik Johnston	1e453053cb	Rename storage classes (#12913 )	2022-05-31 12:17:50 +00:00
Sean Quah	a559c8b0d9	Respect the `@cancellable` flag for `ReplicationEndpoint`s (#12700 ) While `ReplicationEndpoint`s register themselves via `JsonResource`, they pass a method that calls the handler, instead of the handler itself, to `register_paths`. As a result, `JsonResource` will not correctly pick up the `@cancellable` flag and we have to apply it ourselves. Signed-off-by: Sean Quah <seanq@element.io>	2022-05-11 12:25:39 +01:00
David Robertson	a2b00a4486	Bump `black` and `click` versions (#12320 )	2022-03-29 10:41:19 +00:00
Nick Mills-Barrett	180d8ff0d4	Retry some http replication failures (#12182 ) This allows for the target process to be down for around a minute which provides time for restarts during synapse upgrades/config updates. Closes: #12178 Signed off by Nick Mills-Barrett nick@beeper.com	2022-03-09 14:53:28 +00:00
Richard van der Hoff	e24ff8ebe3	Remove `HomeServer.get_datastore()` (#12031 ) The presence of this method was confusing, and mostly present for backwards compatibility. Let's get rid of it. Part of #11733	2022-02-23 11:04:02 +00:00
Erik Johnston	6d14b3dabf	Better error message when failing to request from another process (#12060 )	2022-02-22 15:52:08 +00:00
Patrick Cloke	63d90f10ec	Add missing type hints to synapse.replication.http. (#11856 )	2022-02-08 07:44:39 -05:00
Quentin Gliech	a15a893df8	Save the OIDC session ID (sid) with the device on login (#11482 ) As a step towards allowing back-channel logout for OIDC.	2021-12-06 12:43:06 -05:00
Sean Quah	2b82ec425f	Add type hints for most `HomeServer` parameters (#11095 )	2021-10-22 18:15:41 +01:00
Sean Quah	6b18eb4430	Fix opentracing and Prometheus metrics for replication requests (#10996 ) This commit fixes two bugs to do with decorators not instrumenting `ReplicationEndpoint`'s `send_request` correctly. There are two decorators on `send_request`: Prometheus' `Gauge.track_inprogress()` and Synapse's `opentracing.trace`. `Gauge.track_inprogress()` does not have any support for async functions when used as a decorator. Since async functions behave like regular functions that return coroutines, only the creation of the coroutine was covered by the metric and none of the actual body of `send_request`. `Gauge.track_inprogress()` returns a regular, non-async function wrapping `send_request`, which is the source of the next bug. The `opentracing.trace` decorator would normally handle async functions correctly, but since the wrapped `send_request` is a non-async function, the decorator ends up suffering from the same issue as `Gauge.track_inprogress()`: the opentracing span only measures the creation of the coroutine and none of the actual function body. Using `Gauge.track_inprogress()` as a context manager instead of a decorator resolves both bugs.	2021-10-12 11:23:46 +01:00
Patrick Cloke	bb7fdd821b	Use direct references for configuration variables (part 5). (#10897 )	2021-09-24 07:25:21 -04:00
Richard van der Hoff	1800aabfc2	Split `FederationHandler` in half (#10692 ) The idea here is to take anything to do with incoming events and move it out to a separate handler, as a way of making FederationHandler smaller.	2021-08-26 21:41:44 +01:00
Jonathan de Jong	bf72d10dbf	Use inline type hints in various other places (in `synapse/`) (#10380 )	2021-07-15 11:02:43 +01:00
Quentin Gliech	bd4919fb72	MSC2918 Refresh tokens implementation (#9450 ) This implements refresh tokens, as defined by MSC2918 This MSC has been implemented client side in Hydrogen Web: vector-im/hydrogen-web#235 The basics of the MSC works: requesting refresh tokens on login, having the access tokens expire, and using the refresh token to get a new one. Signed-off-by: Quentin Gliech <quentingliech@gmail.com>	2021-06-24 14:33:20 +01:00
Richard van der Hoff	d7808a2dde	Extend `ResponseCache` to pass a context object into the callback (#10157 ) This is the first of two PRs which seek to address #8518. This first PR lays the groundwork by extending ResponseCache; a second PR (#10158) will update the SyncHandler to actually use it, and fix the bug. The idea here is that we allow the callback given to ResponseCache.wrap to decide whether its result should be cached or not. We do that by (optionally) passing a ResponseCacheContext into it, which it can modify.	2021-06-14 10:26:09 +01:00
Sorunome	d936371b69	Implement knock feature (#6739 ) This PR aims to implement the knock feature as proposed in https://github.com/matrix-org/matrix-doc/pull/2403 Signed-off-by: Sorunome mail@sorunome.de Signed-off-by: Andrew Morgan andrewm@element.io	2021-06-09 19:39:51 +01:00
Richard van der Hoff	1bf83a191b	Clean up the interface for injecting opentracing over HTTP (#10143 ) * Remove unused helper functions * Clean up the interface for injecting opentracing over HTTP * changelog	2021-06-09 11:33:00 +01:00
Andrew Morgan	4d6e5a5e99	Use a database table to hold the users that should have full presence sent to them, instead of something in-memory (#9823 )	2021-05-18 14:13:45 +01:00
Erik Johnston	9d25a0ae65	Split presence out of master (#9820 )	2021-04-23 12:21:55 +01:00
Jonathan de Jong	4b965c862d	Remove redundant "coding: utf-8" lines (#9786 ) Part of #9744 Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as Python 3 automatically reads source code as utf-8 now. `Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`	2021-04-14 15:34:27 +01:00
Erik Johnston	963f4309fe	Make RateLimiter class check for ratelimit overrides (#9711 ) This should fix a class of bug where we forget to check if e.g. the appservice shouldn't be ratelimited. We also check the `ratelimit_override` table to check if the user has ratelimiting disabled. That table is really only meant to override the event sender ratelimiting, so we don't use any values from it (as they might not make sense for different rate limits), but we do infer that if ratelimiting is disabled for the user we should disable all ratelimits. Fixes #9663	2021-03-30 12:06:09 +01:00
Richard van der Hoff	567f88f835	Prep work for removing `outlier` from `internal_metadata` (#9411 ) * Populate `internal_metadata.outlier` based on `events` table Rather than relying on `outlier` being in the `internal_metadata` column, populate it based on the `events.outlier` column. * Move `outlier` out of InternalMetadata._dict Ultimately, this will allow us to stop writing it to the database. For now, we have to grandfather it back in so as to maintain compatibility with older versions of Synapse.	2021-03-17 12:33:18 +00:00
Richard van der Hoff	1107214a1d	Fix the auth provider on the logins metric (#9573 ) We either need to pass the auth provider over the replication api, or make sure we report the auth provider on the worker that received the request. I've gone with the latter.	2021-03-10 18:15:03 +00:00
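
The decorator pitfall described in commit 6b18eb4430 above (a non-async decorator wrapping an async `send_request` only covers the creation of the coroutine, not its body) can be illustrated with a small sketch. This is not Synapse's or Prometheus' code: `InProgressTracker`, `track` and `decorate` are hypothetical stand-ins for a gauge that counts in-flight calls, and the two `send_request_*` functions are illustrative only.

```python
import asyncio
import functools
from contextlib import contextmanager


class InProgressTracker:
    """Hypothetical stand-in for a gauge that counts in-flight calls."""

    def __init__(self):
        self.value = 0

    @contextmanager
    def track(self):
        # Count one in-flight call for the duration of the `with` block.
        self.value += 1
        try:
            yield
        finally:
            self.value -= 1

    def decorate(self, func):
        # A plain (non-async) decorator: it wraps the *call* to func.
        # For an async func, the call merely creates a coroutine object.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with self.track():
                return func(*args, **kwargs)

        return wrapper


tracker = InProgressTracker()


@tracker.decorate
async def send_request_decorated():
    await asyncio.sleep(0)
    # By the time the body runs, the decorator's `with` block has exited.
    return tracker.value


async def send_request_context_manager():
    # Tracking inside the body covers the awaited work as well.
    with tracker.track():
        await asyncio.sleep(0)
        return tracker.value


def observe():
    """Return the tracker value seen inside each variant's body."""
    decorated = asyncio.run(send_request_decorated())
    managed = asyncio.run(send_request_context_manager())
    return decorated, managed
```

In this sketch the decorated variant observes a count of zero while its body runs, whereas the context-manager variant observes one, which matches the commit's reasoning for switching to a context manager.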

160 Commits (1e571cd66437ea2455c203dafb94c20ba48cdcc1)