Commit Graph

136 Commits (matrix-org-hotfixes)

Author SHA1 Message Date
Olivier Wilkinson (reivilibre) 8c60ebf209 Revert "TEMPORARY Subdivide _resolve_events Measure blocks"
This reverts commit f3db863420.
2023-10-16 18:24:46 +01:00
Olivier Wilkinson (reivilibre) 1e1cf4bb9d Revert "TEMPORARY Add more Measure blocks"
This reverts commit adfa0fded3.
2023-10-16 18:24:45 +01:00
Olivier Wilkinson (reivilibre) adfa0fded3 TEMPORARY Add more Measure blocks 2023-10-16 18:15:48 +01:00
Olivier Wilkinson (reivilibre) f3db863420 TEMPORARY Subdivide _resolve_events Measure blocks 2023-10-16 17:55:05 +01:00
Patrick Cloke cdb89dcefe
Improve state types. (#16395) 2023-09-28 07:01:46 -04:00
Patrick Cloke d38d0dffc9
Use StrCollection in additional places. (#16301) 2023-09-13 07:57:19 -04:00
Erik Johnston fc1e534e41
Speed up updating state in large rooms (#15971)
This should speed up updating state in rooms with lots of state.
2023-07-20 15:51:28 +01:00
Erik Johnston 67f9e5293e
Ensure a long state res does not starve CPU (#15960)
We do this by yielding the reactor in hot loops.
2023-07-19 17:00:33 +00:00
Eric Eastwood 703a8f9c67
Instrument `state` and `state_group` storage related things (tracing) (#15610)
Instrument `state` and `state_group` storage related things (tracing) so it's a little more clear where these database transactions are coming from as there is a lot of wires crossing in these functions.

Part of `/messages` performance investigation: https://github.com/matrix-org/synapse/issues/13356
2023-05-19 12:26:58 -05:00
David Robertson 3b0083c92a
Use immutabledict instead of frozendict (#15113)
Additionally:

* Consistently use `freeze()` in test

---------

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: 6543 <6543@obermui.de>
2023-03-22 17:15:34 +00:00
Sean Quah d0c713cc85
Return read-only collections from `@cached` methods (#13755)
It's important that collections returned from `@cached` methods are not
modified, otherwise future retrievals from the cache will return the
modified collection.

This applies to the return values from `@cached` methods and the values
inside the dictionaries returned by `@cachedList` methods. It's not
necessary for the dictionaries returned by `@cachedList` methods
themselves to be read-only.

Signed-off-by: Sean Quah <seanq@matrix.org>
Co-authored-by: David Robertson <davidr@element.io>
2023-02-10 23:29:00 +00:00
Shay 03bccd542b
Add a class UnpersistedEventContext to allow for the batching up of storing state groups (#14675)
* add class UnpersistedEventContext

* modify create new client event to create unpersistedeventcontexts

* persist event contexts after creation

* fix tests to persist unpersisted event contexts

* cleanup

* misc lints + cleanup

* changelog + fix comments

* lints

* fix batch insertion?

* reduce redundant calculation

* add unpersisted event classes

* rework compute_event_context, split into function that returns unpersisted event context and then persists it

* use calculate_context_info to create unpersisted event contexts

* update typing

* $%#^&*

* black

* fix comments and consolidate classes, use attr.s for class

* requested changes

* lint

* requested changes

* requested changes

* refactor to be stupidly explicit

* clearer renaming and flow

* make partial state non-optional

* update docstrings

---------

Co-authored-by: Erik Johnston <erik@matrix.org>
2023-02-09 13:05:02 -08:00
David Robertson 4f4d690423
Allow `compute_state_after_events` to use partial state (#14676)
* Allow `compute_state_after_events` to use partial state

if fetching a subset of state that is trusted during a partial join.

* Changelog
2022-12-14 14:52:35 +00:00
David Robertson b5b5f66084
Move `StateFilter` to `synapse.types` (#14668)
* Move `StateFilter` to `synapse.types`

* Changelog
2022-12-12 16:19:30 +00:00
Mathieu Velten 75888c2b1f
Faster joins: do not wait for full state when creating events to send (#14403)
Signed-off-by: Mathieu Velten <mathieuv@matrix.org>
2022-11-17 17:01:14 +01:00
Shay a2cf66a94d
Prepatory work for batching events to send (#13487)
This PR begins work on batching up events during the creation of a room. The PR splits out the creation and sending/persisting of the events. The first three events in the creation of the room-creating the room, joining the creator to the room, and the power levels event are sent sequentially, while the subsequent events are created and collected to be sent at the end of the function. This is currently done by appending them to a list and then iterating over the list to send, the next step (after this PR) would be to send and persist the collected events as a batch.
2022-09-28 10:39:03 +01:00
Sean Quah b73cbb8215
Avoid putting rejected events in room state (#13723)
Signed-off-by: Sean Quah <seanq@matrix.org>
2022-09-16 12:45:04 +01:00
Nick Mills-Barrett 42b11d5565
Remove cached wrap on `_get_joined_users_from_context` method (#13569)
The method doesn't actually do any data fetching and the method that
does, `_get_joined_profile_from_event_id`, has its own cache.

Signed off by Nick @ Beeper (@Fizzadar).
2022-08-31 12:19:39 +01:00
David Robertson c406d50d2d
Rename `event_map` to `unpersisted_events` (#13603) 2022-08-24 21:06:31 +01:00
Nick Mills-Barrett 5e7847dc92
Cache user IDs instead of profile objects (#13573)
The profile objects are never used and increase cache size significantly.
2022-08-23 09:49:59 +00:00
David Robertson 7a19995120
Correct a misnamed argument in state res v2 (#13467)
In state res v2, we apply two passes of iterative auth checks. The first
pass replays power events and events in their auth chains, but only
those belonging to the full conflicted set. The source code as written
suggests that we want only those belonging to the auth difference (which
is a smaller set of events).

At runtime we were doing the correct thing anyway, because the only
callsite of `_reverse_topological_power_sort` passes in the
`full_conflicted_set`. So this really is just a rename.
2022-08-08 16:59:56 +01:00
Sean Quah 224d792dd7
Refactor `_resolve_state_at_missing_prevs` to return an `EventContext` (#13404)
Previously, `_resolve_state_at_missing_prevs` returned the resolved
state before an event and a partial state flag. These were unwieldy to
carry around would only ever be used to build an event context. Build
the event context directly instead.

Signed-off-by: Sean Quah <seanq@matrix.org>
2022-08-01 13:53:56 +01:00
Sean Quah 335ebb21cc
Faster room joins: avoid blocking when pulling events with missing prevs (#13355)
Avoid blocking on full state in `_resolve_state_at_missing_prevs` and
return a new flag indicating whether the resolved state is partial.
Thread that flag around so that it makes it into the event context.

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2022-07-26 12:39:23 +01:00
Erik Johnston 13341dde5a
Don't hold onto full state in state cache (#13324) 2022-07-21 16:02:02 +01:00
Erik Johnston c6a05063ff
Don't pull out the full state when creating an event (#13281) 2022-07-18 10:05:30 +01:00
Erik Johnston 0731e0829c
Don't pull out the full state when storing state (#13274) 2022-07-15 12:59:45 +00:00
David Robertson 7281591f4c
Use state before join to determine if we `_should_perform_remote_join` (#13270)
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2022-07-15 12:20:47 +00:00
Erik Johnston 7be954f59b
Fix a bug which could lead to incorrect state (#13278)
There are two fixes here:
1. A long-standing bug where we incorrectly calculated `delta_ids`; and
2. A bug introduced in #13267 where we got current state incorrect.
2022-07-15 11:06:41 +00:00
Richard van der Hoff fe15a865a5
Rip out auth-event reconciliation code (#12943)
There is a corner in `_check_event_auth` (long known as "the weird corner") where, if we get an event with auth_events which don't match those we were expecting, we attempt to resolve the diffence between our state and the remote's with a state resolution.

This isn't specced, and there's general agreement we shouldn't be doing it.

However, it turns out that the faster-joins code was relying on it, so we need to introduce something similar (but rather simpler) for that.
2022-07-14 21:52:26 +00:00
Erik Johnston 0ca4172b5d
Don't pull out state in `compute_event_context` for unconflicted state (#13267) 2022-07-14 13:57:02 +00:00
Sean Quah 1391a76cd2
Faster room joins: fix race in recalculation of current room state (#13151)
Bounce recalculation of current state to the correct event persister and
move recalculation of current state into the event persistence queue, to
avoid concurrent updates to a room's current state.

Also give recalculation of a room's current state a real stream
ordering.

Signed-off-by: Sean Quah <seanq@matrix.org>
2022-07-07 12:19:31 +00:00
Richard van der Hoff 8c2825276f
Skip waiting for full state for incoming events (#13144)
When we receive an event over federation during a faster join, there is no need
to wait for full state, since we have a whole reconciliation process designed
to take the partial state into account.
2022-07-01 10:19:27 +01:00
Richard van der Hoff 8ecf6be1e1
Move some event auth checks out to a different method (#13065)
* Add auth events to events used in tests

* Move some event auth checks out to a different method

Some of the event auth checks apply to an event's auth_events, rather than the
state at the event - which means they can play no part in state
resolution. Move them out to a separate method.

* Rename check_auth_rules_for_event

Now it only checks the state-dependent auth rules, it needs a better name.
2022-06-15 19:48:22 +01:00
David Robertson 941dc3db13
Track a histogram of state res durations (#13036) 2022-06-15 15:19:49 +01:00
Richard van der Hoff f68b5e5773 Merge branch 'rav/simplify_event_auth_interface' into develop 2022-06-13 11:34:59 +01:00
Richard van der Hoff 0d9d36b15c Remove `room_version` param from `check_auth_rules_for_event`
Instead, use the `room_version` property of the event we're checking.

The `room_version` was originally added as a parameter somewhere around #4482,
but really it's been redundant since #6875 added a `room_version` field to `EventBase`.
2022-06-12 23:13:10 +01:00
David Robertson 97053c9406
Type annotations for `test_v2` (#12985) 2022-06-09 09:48:04 +01:00
Erik Johnston 44de53bb79
Reduce state pulled from DB due to sending typing and receipts over federation (#12964)
Reducing the amount of state we pull from the DB is useful as fetching state is expensive in terms of DB, CPU and memory.
2022-06-06 16:46:11 +01:00
Erik Johnston e3163e2e11
Reduce the amount of state we pull from the DB (#12811) 2022-06-06 09:24:12 +01:00
Erik Johnston 1e453053cb
Rename storage classes (#12913) 2022-05-31 12:17:50 +00:00
Erik Johnston b83bc5fab5
Pull out less state when handling gaps mk2 (#12852) 2022-05-26 09:48:12 +00:00
Erik Johnston 4660d9fdcf
Fix up `state_store` naming (#12871) 2022-05-25 12:59:04 +01:00
Shay 19d79b6ebe
Refactor `resolve_state_groups_for_events` to not pull out full state when no state resolution happens. (#12775) 2022-05-18 10:15:52 -07:00
Dirk Klimpel 6edefef602
Add some type hints to datastore (#12717) 2022-05-17 15:29:06 +01:00
Erik Johnston c72d26c1e1
Refactor `EventContext` (#12689)
Refactor how the `EventContext` class works, with the intention of reducing the amount of state we fetch from the DB during event processing.

The idea here is to get rid of the cached `current_state_ids` and `prev_state_ids` that live in the `EventContext`, and instead defer straight to the database (and its caching). 

One change that may have a noticeable effect is that we now no longer prefill the `get_current_state_ids` cache on a state change. However, that query is relatively light, since its just a case of reading a table from the DB (unlike fetching state at an event which is more heavyweight). For deployments with workers this cache isn't even used.


Part of #12684
2022-05-10 19:43:13 +00:00
andrew do 01e625513a
remove constantly lib use and switch to enums. (#12624) 2022-05-04 11:26:11 +00:00
Sean Quah 800ba87cc8
Refactor and convert `Linearizer` to async (#12357)
Refactor and convert `Linearizer` to async. This makes a `Linearizer`
cancellation bug easier to fix.

Also refactor to use an async context manager, which eliminates an
unlikely footgun where code that doesn't immediately use the context
manager could forget to release the lock.

Signed-off-by: Sean Quah <seanq@element.io>
2022-04-05 15:43:52 +01:00
Richard van der Hoff d56202b038
Fix type of `events` in `StateGroupStorage` and `StateHandler` (#12156)
We make multiple passes over this, so a regular iterable won't do.
2022-03-04 10:25:18 +00:00
Richard van der Hoff e2e1d90a5e
Faster joins: persist to database (#12012)
When we get a partial_state response from send_join, store information in the
database about it:
 * store a record about the room as a whole having partial state, and stash the
   list of member servers too.
 * flag the join event itself as having partial state
 * also, for any new events whose prev-events are partial-stated, note that
   they will *also* be partial-stated.

We don't yet make any attempt to interpret this data, so API calls (and a bunch
of other things) are just going to get incorrect data.
2022-03-01 12:49:54 +00:00
Richard van der Hoff e24ff8ebe3
Remove `HomeServer.get_datastore()` (#12031)
The presence of this method was confusing, and mostly present for backwards
compatibility. Let's get rid of it.

Part of #11733
2022-02-23 11:04:02 +00:00