MatrixSynapse

Commit Graph

Author	SHA1	Message	Date
Sean Quah	a302d3ecf7	Remove unnecessary reactor reference from `_PerHostRatelimiter` (#14842 ) Fix up #14812 to avoid introducing a reference to the reactor. Signed-off-by: Sean Quah <seanq@matrix.org>	2023-01-16 13:16:19 +00:00
Sean Quah	772e8c2385	Fix stack overflow in `_PerHostRatelimiter` due to synchronous requests (#14812 ) When there are many synchronous requests waiting on a `_PerHostRatelimiter`, each request will be started recursively just after the previous request has completed. Under the right conditions, this leads to stack exhaustion. A common way for requests to become synchronous is when the remote client disconnects early, because the homeserver is overloaded and slow to respond. Avoid stack exhaustion under these conditions by deferring subsequent requests until the next reactor tick. Fixes #14480. Signed-off-by: Sean Quah <seanq@matrix.org>	2023-01-13 00:16:21 +00:00
reivilibre	ba4ea7d13f	Batch up replication requests to request the resyncing of remote users' devices. (#14716 )	2023-01-10 11:17:59 +00:00
Patrick Cloke	630d0aeaf6	Support RFC7636 PKCE in the OAuth 2.0 flow. (#14750 ) PKCE can protect against certain attacks and is enabled by default. Support can be controlled manually by setting the pkce_method of each oidc_providers entry to 'auto' (default), 'always', or 'never'. This is required by Twitter OAuth 2.0 support.	2023-01-04 14:58:08 -05:00
Patrick Cloke	3aeca2588b	Add missing type hints to tests.config. (#14681 )	2022-12-16 08:53:28 -05:00
reivilibre	864c3f85b0	Improve type annotations for the helper methods on a `CachedFunction`. (#14685 )	2022-12-16 13:04:54 +00:00
Mathieu Velten	54c012c5a8	Make `handle_new_client_event` throw `PartialStateConflictError` (#14665 ) Then adapt calling code to retry when needed so it doesn't 500 to clients. Signed-off-by: Mathieu Velten <mathieuv@matrix.org> Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>	2022-12-15 16:04:23 +00:00
Patrick Cloke	9d8a3234ba	Respond with proper error responses on unknown paths. (#14621 ) Returns a proper 404 with an errcode of M_UNRECOGNIZED for unknown endpoints per MSC3743.	2022-12-08 11:37:05 -05:00
Patrick Cloke	da77720752	Check the stream position before checking if the cache is empty. (#14639 ) An empty cache does not mean the entity has not changed. If the requested position is earlier than the earliest known stream position, return that the entity has changed, since the cache cannot accurately answer that query.	2022-12-08 11:35:49 -05:00
Erik Johnston	cee9445884	Better return type for `get_all_entities_changed` (#14604 ) Help prevent callers from using the return value incorrectly by ensuring that callers explicitly check if there was a cache hit or not.	2022-12-05 15:19:14 -05:00
Patrick Cloke	6a8310f3df	Compare to the earliest known stream pos in the stream change cache. (#14435 ) The internal methods of the StreamChangeCache were inconsistently treating the earliest known stream position as valid. It is now treated as invalid, meaning the cache cannot determine if an entity at the earliest known stream position has changed or not.	2022-12-05 09:00:59 -05:00
Patrick Cloke	1799a54a54	Batch fetch bundled annotations (#14491 ) Avoid an n+1 query problem and fetch the bundled aggregations for m.annotation relations in a single query instead of a query per event. This applies logic similar to what was previously done for edits in `8b309adb43` (#11660) and threads in `b65acead42` (#11752).	2022-11-22 07:26:11 -05:00
Patrick Cloke	d8cc86eff4	Remove redundant types from comments. (#14412 ) Remove type hints from comments which have been added as Python type hints. This helps avoid drift between comments and reality, as well as removing redundant information. Also adds some missing type hints which were simple to fill in.	2022-11-16 15:25:24 +00:00
Patrick Cloke	13ca8bb2fc	Remove duplicated code to evict entries. (#14410 ) This code was factored out to a method, but also left in-place. Calling this twice in a row makes no sense: the first call will reduce the size appropriately, but the loop will immediately exit since the cache size was already reduced.	2022-11-10 15:33:34 -05:00
Eric Eastwood	40fa8294e3	Refactor MSC3030 `/timestamp_to_event` to move away from our snowflake pull from `destination` pattern (#14096 ) 1. `federation_client.timestamp_to_event(...)` now handles all `destination` looping and uses our generic `_try_destination_list(...)` helper. 2. Consistently handle `NotRetryingDestination` and `FederationDeniedError` across `get_pdu`, backfill, and the generic `_try_destination_list`, which is used in many places where we use this pattern. 3. `get_pdu(...)` now returns `PulledPduInfo` so we know which `destination` we ended up pulling the PDU from.	2022-10-26 16:10:55 -05:00
Quentin Gliech	8756d5c87e	Save login tokens in database (#13844 ) * Save login tokens in database Signed-off-by: Quentin Gliech <quenting@element.io> * Add upgrade notes * Track login token reuse in a Prometheus metric Signed-off-by: Quentin Gliech <quenting@element.io>	2022-10-26 11:45:41 +01:00
Nick Mills-Barrett	c9dffd5b33	Remove unused `@lru_cache` decorator (#13595 ) * Remove unused `@lru_cache` decorator Spotted this working on something else. Co-authored-by: David Robertson <davidr@element.io>	2022-10-25 11:39:25 +01:00
dependabot[bot]	0b7830e457	Bump flake8-bugbear from 21.3.2 to 22.9.23 (#14042 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Erik Johnston <erik@matrix.org> Co-authored-by: David Robertson <davidr@element.io>	2022-10-19 19:38:24 +00:00
Abdullah Osama	a9934d48c1	Making parse_server_name more consistent (#14007 ) Fixes #12122	2022-10-11 12:42:11 +00:00
David Robertson	cb20b885cb	Always close _all_ `ijson` coroutines, even if doing so raises Exceptions (#14065 )	2022-10-06 18:17:50 +00:00
David Robertson	6f0c3e669d	Don't require `setuptools_rust` at runtime (#13952 )	2022-09-29 20:16:08 +00:00
Eric Eastwood	29269d9d3f	Fix `have_seen_event` cache not being invalidated (#13863 ) Fix https://github.com/matrix-org/synapse/issues/13856 Fix https://github.com/matrix-org/synapse/issues/13865 > Discovered while trying to make Synapse fast enough for [this MSC2716 test for importing many batches](https://github.com/matrix-org/complement/pull/214#discussion_r741678240). As an example, disabling the `have_seen_event` cache saves 10 seconds for each `/messages` request in that MSC2716 Complement test because we're not making as many federation requests for `/state` (speeding up `have_seen_event` itself is related to https://github.com/matrix-org/synapse/issues/13625) > > But this will also make `/messages` faster in general so we can include it in the [faster `/messages` milestone](https://github.com/matrix-org/synapse/milestone/11). > > -- https://github.com/matrix-org/synapse/issues/13856 ### The problem `_invalidate_caches_for_event` doesn't run in monolith mode, which means we never even tried to clear the `have_seen_event` and other caches. And even in worker mode, it only runs on the workers, not the master (AFAICT). Additionally, there was a bug with the key being wrong, so `_invalidate_caches_for_event` never invalidated the `have_seen_event` cache even when it did run. Because we were using the `@cachedList` wrong, it was putting items in the cache under keys like `((room_id, event_id),)` with a `set` in a `set` (ex. `(('!TnCIJPKzdQdUlIyXdQ:test', '$Iu0eqEBN7qcyF1S9B3oNB3I91v2o5YOgRNPwi_78s-k'),)`) and we were trying to invalidate with just `(room_id, event_id)`, which did nothing.	2022-09-27 15:55:43 -05:00
Mathieu Velten	6bd8763804	Add cache invalidation across workers to module API (#13667 ) Signed-off-by: Mathieu Velten <mathieuv@matrix.org>	2022-09-21 15:32:01 +02:00
reivilibre	cf65433de2	Fix a memory leak when running the unit tests. (#13798 )	2022-09-14 15:29:05 +00:00
Erik Johnston	ebfeac7c5d	Check if Rust lib needs rebuilding. (#13759 ) This protects against the common mistake of failing to remember to rebuild Rust code after making changes.	2022-09-12 10:03:42 +00:00
reivilibre	cf11919ddd	Fix cache metrics not being updated when not using the legacy exposition module. (#13717 )	2022-09-08 15:30:48 +01:00
reivilibre	b455c2a5ec	Update Grafana dashboard to not use legacy metric names. (#13714 )	2022-09-06 12:21:21 +01:00
reivilibre	7bc110a19e	Generalise the `@cancellable` annotation so it can be used on functions other than just servlet methods. (#13662 )	2022-08-31 11:16:05 +00:00
David Robertson	4249082eed	Merge branch 'release-v1.66' into develop	2022-08-30 15:31:51 +01:00
Eric Eastwood	1eea73b413	Fix rate limit metrics registering twice and misreporting (#13649 ) * Fix rate limit metrics registering twice and misreporting Fix https://github.com/matrix-org/synapse/issues/13641 * Fix lints * Add changelog * Document `metrics_name=None`.	2022-08-30 12:08:29 +01:00
reivilibre	be4250c7a8	Add experimental configuration option to allow disabling legacy Prometheus metric names. (#13540 ) Co-authored-by: David Robertson <davidr@element.io>	2022-08-24 11:35:54 +00:00
Erik Johnston	f7ddfe17a3	Speed up `@cachedList` (#13591 ) This speeds things up by ~2x. The vast majority of the time is now spent in `LruCache` moving things around the linked lists. We do this via two things: 1. Don't create a deferred per-key during bulk set operations in `DeferredCache`. Instead, only create them if a subsequent caller asks for the key. 2. Add a bulk lookup API to `DeferredCache` rather than use a loop.	2022-08-23 14:53:27 +00:00
Nick Mills-Barrett	5e7847dc92	Cache user IDs instead of profile objects (#13573 ) The profile objects are never used and increase cache size significantly.	2022-08-23 09:49:59 +00:00
Sean Quah	b251cff819	Fix incorrect juggling of logging contexts in `_PerHostRatelimiter` (#13554 ) Signed-off-by: Sean Quah <seanq@matrix.org> Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>	2022-08-18 16:26:26 +01:00
Eric Eastwood	d64653d062	Track number of hosts affected by the rate limiter (#13541 ) Track number of hosts affected by the rate limiter so we can differentiate one really noisy homeserver from a general ratelimit tuning problem across the federation. Follow-up to https://github.com/matrix-org/synapse/pull/13534 Part of https://github.com/matrix-org/synapse/issues/13356	2022-08-18 10:05:07 -05:00
Eric Eastwood	49d04e43df	Add metrics to track how the rate limiter is affecting requests (sleep/reject) (#13534 ) Related to https://github.com/matrix-org/synapse/pull/13499 Part of https://github.com/matrix-org/synapse/issues/13356	2022-08-17 16:10:07 -05:00
Eric Eastwood	c6ee9c0ee4	Add metrics to track rate limiter queue timing (#13544 )	2022-08-17 10:38:05 +01:00
Eric Eastwood	344a2f767c	Instrument `FederationStateIdsServlet` - `/state_ids` (#13499 ) Instrument FederationStateIdsServlet - `/state_ids` so it's easier to follow what's going on in Jaeger when viewing a trace.	2022-08-15 19:41:23 +01:00
Nick Mills-Barrett	41320a0554	Optimise async get event lookups (#13435 ) Still maintains local in memory lookup optimisation, but does any external lookup as part of the deferred that prevents duplicate lookups for the same event at once. This makes the assumption that fetching from an external cache is a non-zero load operation.	2022-08-04 15:49:55 +01:00
Dirk Klimpel	d6e94ad9d9	Rename `RateLimitConfig` to `RatelimitSettings` (#13442 )	2022-08-03 10:40:20 +01:00
Erik Johnston	0b87eb8e0c	Make DictionaryCache have better expiry properties (#13292 )	2022-07-21 17:13:44 +01:00
Nick Mills-Barrett	cc21a431f3	Async get event cache prep (#13242 ) Some experimental prep work to enable external event caching based on #9379 & #12955. Doesn't actually move the cache at all, just lays the groundwork for async implemented caches. Signed off by Nick @ Beeper (@Fizzadar)	2022-07-15 09:30:46 +00:00
David Robertson	6ba732fefe	Type `tests.utils` (#13028 ) * Cast to postgres types when handling postgres db * Remove unused method * Easy annotations * Annotate create_room * Use `ParamSpec` to annotate looping_call * Annotate `default_config` * Track `now` as a float `time_ms` returns an int like the proper Synapse `Clock` * Introduce a `Timer` dataclass * Introduce a Looper type * Suppress checking of a mock * tests.utils is typed * Changelog * Whoops, import ParamSpec from typing_extensions * ditch the psycopg2 casts	2022-07-05 15:13:47 +01:00
Quentin Gliech	fe1daad672	Move the "email unsubscribe" resource, refactor the macaroon generator & simplify the access token verification logic. (#12986 ) This simplifies the access token verification logic by removing the `rights` parameter which was only ever used for the unsubscribe link in email notifications. The latter has been moved under the `/_synapse` namespace, since it is not a standard API. This also makes the email verification link more secure, by embedding the app_id and pushkey in the macaroon and verifying it. This prevents the user from tampering with the query parameters of that unsubscribe link. Macaroon generation is refactored: - Centralised all macaroon generation and verification logic to the `MacaroonGenerator` - Moved to `synapse.utils` - Changed the constructor to require only a `Clock`, hostname, and a secret key (instead of a full `Homeserver`). - Added tests for all methods.	2022-06-14 09:12:08 -04:00
David Robertson	f30bcbd84a	Fix Synapse git info missing in version strings (#12973 )	2022-06-07 15:24:11 +01:00
Patrick Cloke	759f9c09e1	Fix caching behavior for relations push rules. (#12859 ) By always returning all requested values from the function wrapped by cachedList. Otherwise implicit None values get added into the cache, which are unexpected.	2022-05-25 07:49:54 -04:00
Sean Quah	2be5a2b07b	Fix `RetryDestinationLimiter` re-starting finished log contexts (#12803 ) Signed-off-by: Sean Quah <seanq@matrix.org>	2022-05-19 20:17:10 +01:00
David Robertson	d4713d3e33	Discard null-containing strings before updating the user directory (#12762 )	2022-05-18 11:28:14 +01:00
Shay	cde8af9a49	Add config flags to allow for cache auto-tuning (#12701 )	2022-05-13 12:32:39 -07:00
Erik Johnston	8dd3e0e084	Immediately retry any requests that have backed off when a server comes back online. (#12500 ) Otherwise it can take up to a minute for any in-flight `/send` requests to be retried.	2022-05-10 10:39:54 +01:00

812 Commits (6d14fdc2710688014a7a66cc48485462c6e86a1e)
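
Commit 630d0aeaf6 above adds RFC 7636 PKCE support to the OAuth 2.0 flow. As a rough illustration of what PKCE involves on the client side, the sketch below derives an `S256` code challenge from a code verifier using only the Python standard library. It is not taken from Synapse: the function names `make_code_verifier` and `s256_code_challenge` are hypothetical and chosen only for this example.

```python
import base64
import hashlib
import secrets


def make_code_verifier(num_bytes: int = 32) -> str:
    """Return a random code verifier (hypothetical helper, not Synapse code).

    secrets.token_urlsafe yields unpadded base64url text; 32 random bytes
    give 43 characters, the minimum verifier length RFC 7636 allows.
    """
    return secrets.token_urlsafe(num_bytes)


def s256_code_challenge(code_verifier: str) -> str:
    """Compute BASE64URL(SHA256(ASCII(code_verifier))) without '=' padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


if __name__ == "__main__":
    verifier = make_code_verifier()
    # The challenge goes in the authorization request; the verifier is
    # sent later with the token request so the server can recompute it.
    print(verifier, s256_code_challenge(verifier))
```

The client sends the challenge with `code_challenge_method=S256` in the authorization request and keeps the verifier secret until the token exchange, so an intercepted authorization code cannot be redeemed without it.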