MatrixSynapse

Commit Graph

Author	SHA1	Message	Date
Jason Little	21fea6b749	Prefill events after invalidate not before when persisting events (#15758 ) Fixes #15757	2023-06-14 09:42:18 +01:00
Eric Eastwood	8ddb2de553	Document `looping_call()` functionality that will wait for the given function to finish before scheduling another (#15772 ) Thanks to @erikjohnston for clarifying, https://github.com/matrix-org/synapse/pull/15743#discussion_r1226544457 We don't have to worry about calls stacking up if the given function takes longer than the scheduled time.	2023-06-13 16:34:54 -05:00
Erik Johnston	c485ed1c5a	Clear event caches when we purge history (#15609 ) This should help a little with #13476 --------- Co-authored-by: Patrick Cloke <patrickc@matrix.org>	2023-06-08 13:14:40 +01:00
Patrick Cloke	c01343de43	Add stricter mypy options (#15694 ) Enable warn_unused_configs, strict_concatenate, disallow_subclassing_any, and disallow_incomplete_defs.	2023-05-31 07:18:29 -04:00
Eric Eastwood	77156a4bc1	Process previously failed backfill events in the background (#15585 ) Process previously failed backfill events in the background because they are bound to fail again and we don't need to waste time holding up the request for something that is bound to fail again. Fix https://github.com/matrix-org/synapse/issues/13623 Follow-up to https://github.com/matrix-org/synapse/issues/13621 and https://github.com/matrix-org/synapse/issues/13622 Part of making `/messages` faster: https://github.com/matrix-org/synapse/issues/13356	2023-05-24 23:22:24 -05:00
Patrick Cloke	1f55c04cbc	Improve type hints for cached decorator. (#15658 ) The cached decorators always return a Deferred, which was not properly propagated. It was close enough when wrapping coroutines, but failed if a bare function was wrapped.	2023-05-24 12:59:31 +00:00
Sean Quah	d0de452d12	Fix `HomeServer`s leaking during `trial` test runs (#15630 ) This change fixes two memory leaks during `trial` test runs. Garbage collection is disabled during each test case and a gen-0 GC is run at the end of each test. However, when the gen-0 GC is run, the `TestCase` object usually still holds references to the `HomeServer` used during the test. As a result, the `HomeServer` gets promoted to gen-1 and then never garbage collected. Fix this by periodically running full GCs. Additionally, fix `HomeServer`s leaking after tests that touch inbound federation due to `FederationRateLimiter`s adding themselves to a global set, by turning the set into a `WeakSet`. Resolves #15622. Signed-off-by: Sean Quah <seanq@matrix.org>	2023-05-19 11:17:12 +01:00
Sean Quah	68dcd2cbcb	Re-type config paths in `ConfigError`s to be `StrSequence`s (#15615 ) Part of #14809. Signed-off-by: Sean Quah <seanq@matrix.org>	2023-05-18 11:11:30 +01:00
Andrew Morgan	7c95b65873	Clean up and clarify "Create or modify Account" Admin API documentation (#15544 )	2023-05-05 15:51:46 +01:00
David Robertson	3b0083c92a	Use immutabledict instead of frozendict (#15113 ) Additionally: * Consistently use `freeze()` in test --------- Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com> Co-authored-by: 6543 <6543@obermui.de>	2023-03-22 17:15:34 +00:00
dependabot[bot]	9bb2eac719	Bump black from 22.12.0 to 23.1.0 (#15103 )	2023-02-22 15:29:09 -05:00
Sean Quah	a302d3ecf7	Remove unnecessary reactor reference from `_PerHostRatelimiter` (#14842 ) Fix up #14812 to avoid introducing a reference to the reactor. Signed-off-by: Sean Quah <seanq@matrix.org>	2023-01-16 13:16:19 +00:00
Sean Quah	772e8c2385	Fix stack overflow in `_PerHostRatelimiter` due to synchronous requests (#14812 ) When there are many synchronous requests waiting on a `_PerHostRatelimiter`, each request will be started recursively just after the previous request has completed. Under the right conditions, this leads to stack exhaustion. A common way for requests to become synchronous is when the remote client disconnects early, because the homeserver is overloaded and slow to respond. Avoid stack exhaustion under these conditions by deferring subsequent requests until the next reactor tick. Fixes #14480. Signed-off-by: Sean Quah <seanq@matrix.org>	2023-01-13 00:16:21 +00:00
reivilibre	ba4ea7d13f	Batch up replication requests to request the resyncing of remote users' devices. (#14716 )	2023-01-10 11:17:59 +00:00
Patrick Cloke	630d0aeaf6	Support RFC7636 PKCE in the OAuth 2.0 flow. (#14750 ) PKCE can protect against certain attacks and is enabled by default. Support can be controlled manually by setting the pkce_method of each oidc_providers entry to 'auto' (default), 'always', or 'never'. This is required by Twitter OAuth 2.0 support.	2023-01-04 14:58:08 -05:00
Patrick Cloke	3aeca2588b	Add missing type hints to tests.config. (#14681 )	2022-12-16 08:53:28 -05:00
reivilibre	864c3f85b0	Improve type annotations for the helper methods on a `CachedFunction`. (#14685 )	2022-12-16 13:04:54 +00:00
Mathieu Velten	54c012c5a8	Make `handle_new_client_event` throw `PartialStateConflictError` (#14665 ) Then adapt calling code to retry when needed so it doesn't 500 to clients. Signed-off-by: Mathieu Velten <mathieuv@matrix.org> Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>	2022-12-15 16:04:23 +00:00
Patrick Cloke	9d8a3234ba	Respond with proper error responses on unknown paths. (#14621 ) Returns a proper 404 with an errcode of M_UNRECOGNIZED for unknown endpoints per MSC3743.	2022-12-08 11:37:05 -05:00
Patrick Cloke	da77720752	Check the stream position before checking if the cache is empty. (#14639 ) An empty cache does not mean the entity has not changed: if the position is earlier than the earliest known stream position, return that the entity has changed, since the cache cannot accurately answer that query.	2022-12-08 11:35:49 -05:00
Erik Johnston	cee9445884	Better return type for `get_all_entities_changed` (#14604 ) Help prevent callers from using the return value incorrectly by ensuring that callers explicitly check if there was a cache hit or not.	2022-12-05 15:19:14 -05:00
Patrick Cloke	6a8310f3df	Compare to the earliest known stream pos in the stream change cache. (#14435 ) The internal methods of the StreamChangeCache were inconsistently treating the earliest known stream position as valid. It is now treated as invalid, meaning the cache cannot determine if an entity at the earliest known stream position has changed or not.	2022-12-05 09:00:59 -05:00
Patrick Cloke	1799a54a54	Batch fetch bundled annotations (#14491 ) Avoid an n+1 query problem and fetch the bundled aggregations for m.annotation relations in a single query instead of a query per event. This applies logic similar to what was previously done for edits in `8b309adb43` (#11660) and threads in `b65acead42` (#11752).	2022-11-22 07:26:11 -05:00
Patrick Cloke	d8cc86eff4	Remove redundant types from comments. (#14412 ) Remove type hints from comments which have been added as Python type hints. This helps avoid drift between comments and reality, as well as removing redundant information. Also adds some missing type hints which were simple to fill in.	2022-11-16 15:25:24 +00:00
Patrick Cloke	13ca8bb2fc	Remove duplicated code to evict entries. (#14410 ) This code was factored out to a method, but also left in-place. Calling this twice in a row makes no sense: the first call will reduce the size appropriately, but the loop will immediately exit since the cache size was already reduced.	2022-11-10 15:33:34 -05:00
Eric Eastwood	40fa8294e3	Refactor MSC3030 `/timestamp_to_event` to move away from our snowflake pull from `destination` pattern (#14096 ) 1. `federation_client.timestamp_to_event(...)` now handles all `destination` looping and uses our generic `_try_destination_list(...)` helper. 2. Consistently handling `NotRetryingDestination` and `FederationDeniedError` across `get_pdu`, backfill, and the generic `_try_destination_list` which is used for many places we use this pattern. 3. `get_pdu(...)` now returns `PulledPduInfo` so we know which `destination` we ended up pulling the PDU from	2022-10-26 16:10:55 -05:00
Quentin Gliech	8756d5c87e	Save login tokens in database (#13844 ) * Save login tokens in database Signed-off-by: Quentin Gliech <quenting@element.io> * Add upgrade notes * Track login token reuse in a Prometheus metric Signed-off-by: Quentin Gliech <quenting@element.io>	2022-10-26 11:45:41 +01:00
Nick Mills-Barrett	c9dffd5b33	Remove unused `@lru_cache` decorator (#13595 ) * Remove unused `@lru_cache` decorator Spotted this working on something else. Co-authored-by: David Robertson <davidr@element.io>	2022-10-25 11:39:25 +01:00
dependabot[bot]	0b7830e457	Bump flake8-bugbear from 21.3.2 to 22.9.23 (#14042 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Erik Johnston <erik@matrix.org> Co-authored-by: David Robertson <davidr@element.io>	2022-10-19 19:38:24 +00:00
Abdullah Osama	a9934d48c1	Making parse_server_name more consistent (#14007 ) Fixes #12122	2022-10-11 12:42:11 +00:00
David Robertson	cb20b885cb	Always close _all_ `ijson` coroutines, even if doing so raises Exceptions (#14065 )	2022-10-06 18:17:50 +00:00
David Robertson	6f0c3e669d	Don't require `setuptools_rust` at runtime (#13952 )	2022-09-29 20:16:08 +00:00
Eric Eastwood	29269d9d3f	Fix `have_seen_event` cache not being invalidated (#13863 ) Fix https://github.com/matrix-org/synapse/issues/13856 Fix https://github.com/matrix-org/synapse/issues/13865 > Discovered while trying to make Synapse fast enough for [this MSC2716 test for importing many batches](https://github.com/matrix-org/complement/pull/214#discussion_r741678240). As an example, disabling the `have_seen_event` cache saves 10 seconds for each `/messages` request in that MSC2716 Complement test because we're not making as many federation requests for `/state` (speeding up `have_seen_event` itself is related to https://github.com/matrix-org/synapse/issues/13625) > > But this will also make `/messages` faster in general so we can include it in the [faster `/messages` milestone](https://github.com/matrix-org/synapse/milestone/11). > > -- https://github.com/matrix-org/synapse/issues/13856 ### The problem `_invalidate_caches_for_event` doesn't run in monolith mode which means we never even tried to clear the `have_seen_event` and other caches. And even in worker mode, it only runs on the workers, not the master (AFAICT). Additionally there was a bug with the key being wrong so `_invalidate_caches_for_event` never invalidates the `have_seen_event` cache even when it does run. Because we were using the `@cachedList` wrong, it was putting items in the cache under keys like `((room_id, event_id),)` with a `set` in a `set` (ex. `(('!TnCIJPKzdQdUlIyXdQ:test', '$Iu0eqEBN7qcyF1S9B3oNB3I91v2o5YOgRNPwi_78s-k'),)`) and we were trying to invalidate with just `(room_id, event_id)` which did nothing.	2022-09-27 15:55:43 -05:00
Mathieu Velten	6bd8763804	Add cache invalidation across workers to module API (#13667 ) Signed-off-by: Mathieu Velten <mathieuv@matrix.org>	2022-09-21 15:32:01 +02:00
reivilibre	cf65433de2	Fix a memory leak when running the unit tests. (#13798 )	2022-09-14 15:29:05 +00:00
Erik Johnston	ebfeac7c5d	Check if Rust lib needs rebuilding. (#13759 ) This protects against the common mistake of failing to remember to rebuild Rust code after making changes.	2022-09-12 10:03:42 +00:00
reivilibre	cf11919ddd	Fix cache metrics not being updated when not using the legacy exposition module. (#13717 )	2022-09-08 15:30:48 +01:00
reivilibre	b455c2a5ec	Update Grafana dashboard to not use legacy metric names. (#13714 )	2022-09-06 12:21:21 +01:00
reivilibre	7bc110a19e	Generalise the `@cancellable` annotation so it can be used on functions other than just servlet methods. (#13662 )	2022-08-31 11:16:05 +00:00
David Robertson	4249082eed	Merge branch 'release-v1.66' into develop	2022-08-30 15:31:51 +01:00
Eric Eastwood	1eea73b413	Fix rate limit metrics registering twice and misreporting (#13649 ) * Fix rate limit metrics registering twice and misreporting Fix https://github.com/matrix-org/synapse/issues/13641 * Fix lints * Add changelog * Document `metrics_name=None`.	2022-08-30 12:08:29 +01:00
reivilibre	be4250c7a8	Add experimental configuration option to allow disabling legacy Prometheus metric names. (#13540 ) Co-authored-by: David Robertson <davidr@element.io>	2022-08-24 11:35:54 +00:00
Erik Johnston	f7ddfe17a3	Speed up `@cachedList` (#13591 ) This speeds things up by ~2x. The vast majority of the time is now spent in `LruCache` moving things around the linked lists. We do this via two things: 1. Don't create a deferred per-key during bulk set operations in `DeferredCache`. Instead, only create them if a subsequent caller asks for the key. 2. Add a bulk lookup API to `DeferredCache` rather than use a loop.	2022-08-23 14:53:27 +00:00
Nick Mills-Barrett	5e7847dc92	Cache user IDs instead of profile objects (#13573 ) The profile objects are never used and increase cache size significantly.	2022-08-23 09:49:59 +00:00
Sean Quah	b251cff819	Fix incorrect juggling of logging contexts in `_PerHostRatelimiter` (#13554 ) Signed-off-by: Sean Quah <seanq@matrix.org> Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>	2022-08-18 16:26:26 +01:00
Eric Eastwood	d64653d062	Track number of hosts affected by the rate limiter (#13541 ) Track number of hosts affected by the rate limiter so we can differentiate one really noisy homeserver from a general ratelimit tuning problem across the federation. Follow-up to https://github.com/matrix-org/synapse/pull/13534 Part of https://github.com/matrix-org/synapse/issues/13356	2022-08-18 10:05:07 -05:00
Eric Eastwood	49d04e43df	Add metrics to track how the rate limiter is affecting requests (sleep/reject) (#13534 ) Related to https://github.com/matrix-org/synapse/pull/13499 Part of https://github.com/matrix-org/synapse/issues/13356	2022-08-17 16:10:07 -05:00
Eric Eastwood	c6ee9c0ee4	Add metrics to track rate limiter queue timing (#13544 )	2022-08-17 10:38:05 +01:00
Eric Eastwood	344a2f767c	Instrument `FederationStateIdsServlet` - `/state_ids` (#13499 ) Instrument FederationStateIdsServlet - `/state_ids` so it's easier to follow what's going on in Jaeger when viewing a trace.	2022-08-15 19:41:23 +01:00
Nick Mills-Barrett	41320a0554	Optimise async get event lookups (#13435 ) Still maintains local in memory lookup optimisation, but does any external lookup as part of the deferred that prevents duplicate lookups for the same event at once. This makes the assumption that fetching from an external cache is a non-zero load operation.	2022-08-04 15:49:55 +01:00
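
Note on 630d0aeaf6 (PKCE support): the commit message says PKCE protects the OAuth 2.0 flow but does not show the mechanism. As a rough illustration only, the `S256` method from RFC 7636 derives a `code_challenge` from a random `code_verifier` as sketched below. This is not Synapse's code; the function names `make_code_verifier` and `s256_challenge` are hypothetical and chosen for this example.

```python
import base64
import hashlib
import secrets


def _b64url_nopad(data: bytes) -> str:
    """Base64url-encode without '=' padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_code_verifier(n_bytes: int = 32) -> str:
    """Hypothetical helper: a random verifier (32 bytes gives 43 characters)."""
    return _b64url_nopad(secrets.token_bytes(n_bytes))


def s256_challenge(code_verifier: str) -> str:
    """code_challenge = BASE64URL(SHA256(ASCII(code_verifier)))."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return _b64url_nopad(digest)


if __name__ == "__main__":
    verifier = make_code_verifier()
    # The client sends the challenge with the authorization request and
    # the verifier with the later token request.
    print(verifier, s256_challenge(verifier))
```

The client keeps the verifier private until the token request, so an intercepted authorization code alone is not enough to obtain a token.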


823 Commits (0f02f0b4da92229e88e27a92ea3bfa523457bfc1)