Commit Graph

726 Commits (6dd7fa12dc18835b18c0632b89c2563d806cd8ef)

Author SHA1 Message Date
Eric Eastwood 92d21faf12
Instrument `/messages` for understandable traces in Jaeger (#13368)
In Jaeger:

 - Before: huge list of uncategorized database calls
 - After: nice and collapsible into units of work
2022-08-03 10:57:38 -05:00
Richard van der Hoff 23768ccb4d
Faster joins: fix rejected events becoming un-rejected during resync (#13413)
Make sure that we re-check the auth rules during state resync, otherwise
rejected events get un-rejected.
2022-08-01 11:20:05 +01:00
Šimon Brandner 583f22780f
Use stable prefixes for MSC3827: filtering of `/publicRooms` by room type (#13370)
Signed-off-by: Šimon Brandner <simon.bra.ag@gmail.com>
2022-07-27 19:46:57 +01:00
Richard van der Hoff ca3db044a3
Fix infinite loop in partial-state resync (#13353)
Make sure that we only pull out events from the db once they have no
prev-events with partial state.
2022-07-26 11:47:31 +00:00
Patrick Cloke 8b603299bf
Remove unused argument for get_relations_for_event. (#13383) 2022-07-26 07:19:20 -04:00
Erik Johnston 43adf2521c
Refactor presence so we can prune user in room caches (#13313)
See #10826 and #10786 for context as to why we had to disable pruning on
those caches.

Now that `get_users_who_share_room_with_user` is called frequently only
for presence, we just need to make calls to it less frequent and then we
can remove the various levels of caching that is going on.
2022-07-25 09:21:06 +00:00
Erik Johnston 0b87eb8e0c
Make DictionaryCache have better expiry properties (#13292) 2022-07-21 17:13:44 +01:00
Patrick Cloke 50122754c8
Add missing types to opentracing. (#13345)
After this change `synapse.logging` is fully typed.
2022-07-21 12:01:52 +00:00
Nick Mills-Barrett 190f49d8ab
Use cache store remove base slaved (#13329)
This comes from two identical definitions in each of the base stores, and means the base slaved store is now empty and can be removed.
2022-07-21 11:51:30 +01:00
Eric Eastwood 0f971ca68e
Update `get_pdu` to return the original, pristine `EventBase` (#13320)
Update `get_pdu` to return the untouched, pristine `EventBase` as it was originally seen over federation (no metadata added). Previously, we returned the same `event` reference that we stored in the cache which downstream code modified in place and added metadata like setting it as an `outlier`  and essentially poisoned our cache. Now we always return a copy of the `event` so the original can stay pristine in our cache and re-used for the next cache call.

Split out from https://github.com/matrix-org/synapse/pull/13205

As discussed at:

 - https://github.com/matrix-org/synapse/pull/13205#discussion_r918365746
 - https://github.com/matrix-org/synapse/pull/13205#discussion_r918366125

Related to https://github.com/matrix-org/synapse/issues/12584. This PR doesn't fix that issue because it hits [`get_event` which exists from the local database before it tries to `get_pdu`](7864f33e28/synapse/federation/federation_client.py (L581-L594)).
2022-07-20 15:58:51 -05:00
Patrick Cloke a6895dd576
Add type annotations to `trace` decorator. (#13328)
Functions that are decorated with `trace` are now properly typed
and the type hints for them are fixed.
2022-07-19 14:14:30 -04:00
Erik Johnston de70b25e84
Reduce memory usage of state group cache (#13323) 2022-07-19 14:40:37 +01:00
David Robertson b977867358
Rate limit joins per-room (#13276) 2022-07-19 11:45:17 +00:00
Nick Mills-Barrett 2ee0b6ef4b
Safe async event cache (#13308)
Fix race conditions in the async cache invalidation logic, by separating
the async & local invalidation calls and ensuring any async call i
executed first.

Signed off by Nick @ Beeper (@Fizzadar).
2022-07-19 11:25:29 +00:00
Shay 7864f33e28
Increase batch size of `bulk_get_push_rules` and `_get_joined_profiles_from_event_ids`. (#13300) 2022-07-18 13:15:23 -07:00
Shay 15edf23626
Improve performance of query ` _get_subset_users_in_room_with_profiles` (#13299) 2022-07-18 12:35:45 -07:00
Erik Johnston f721f1baba
Revert "Make all `process_replication_rows` methods async (#13304)" (#13312)
This reverts commit 5d4028f217.
2022-07-18 14:28:14 +01:00
Nick Mills-Barrett 6785b0f39d
Use READ COMMITTED isolation level when purging rooms (#12942)
To close: #10294.

Signed off by Nick @ Beeper.
2022-07-18 14:17:24 +01:00
Nick Mills-Barrett 5d4028f217
Make all `process_replication_rows` methods async (#13304)
More prep work for asyncronous caching, also makes all process_replication_rows methods consistent (presence handler already is so).

Signed off by Nick @ Beeper (@Fizzadar)
2022-07-17 22:19:43 +01:00
Erik Johnston 0731e0829c
Don't pull out the full state when storing state (#13274) 2022-07-15 12:59:45 +00:00
Richard van der Hoff b116d3ce00
Bg update to populate new `events` table columns (#13215)
These columns were added back in Synapse 1.52, and have been populated for new
events since then. It's now (beyond) time to back-populate them for existing
events.
2022-07-15 12:47:26 +01:00
Nick Mills-Barrett cc21a431f3
Async get event cache prep (#13242)
Some experimental prep work to enable external event caching based on #9379 & #12955. Doesn't actually move the cache at all, just lays the groundwork for async implemented caches.

Signed off by Nick @ Beeper (@Fizzadar)
2022-07-15 09:30:46 +00:00
Nick Mills-Barrett 21eeacc995
Federation Sender & Appservice Pusher Stream Optimisations (#13251)
* Replace `get_new_events_for_appservice` with `get_all_new_events_stream`

The functions were near identical and this brings the AS worker closer
to the way federation senders work which can allow for multiple workers
to handle AS traffic.

* Pull received TS alongside events when processing the stream

This avoids an extra query -per event- when both federation sender
and appservice pusher process events.
2022-07-15 09:36:56 +01:00
Erik Johnston 0ca4172b5d
Don't pull out state in `compute_event_context` for unconflicted state (#13267) 2022-07-14 13:57:02 +00:00
andrew do 2d82cdafd2
expose whether a room is a space in the Admin API (#13208) 2022-07-12 15:30:53 +01:00
Erik Johnston e5716b631c
Don't pull out the full state when calculating push actions (#13078) 2022-07-11 20:08:39 +00:00
Erik Johnston f1711e1f5c
Remove delay when rotating event push actions (#13211)
We want to be as up to date as possible, and sleeping doesn't help here
and can mean we fall behind.
2022-07-11 16:51:30 +01:00
Erik Johnston 757bc0caef
Fix notification count after a highlighted message (#13223)
Fixes #13196

Broke by #13005
2022-07-08 14:00:29 +01:00
Sean Quah 1391a76cd2
Faster room joins: fix race in recalculation of current room state (#13151)
Bounce recalculation of current state to the correct event persister and
move recalculation of current state into the event persistence queue, to
avoid concurrent updates to a room's current state.

Also give recalculation of a room's current state a real stream
ordering.

Signed-off-by: Sean Quah <seanq@matrix.org>
2022-07-07 12:19:31 +00:00
Erik Johnston a0f51b059c
Fix bug where we failed to delete old push actions (#13194)
This happened if we encountered a stream ordering in `event_push_actions` that had more rows than the batch size of the delete, as If we don't delete any rows in an iteration then the next time round we get the exact same stream ordering and get stuck.
2022-07-06 12:09:19 +01:00
Sean Quah 68db233f0c
Handle race between persisting an event and un-partial stating a room (#13100)
Whenever we want to persist an event, we first compute an event context,
which includes the state at the event and a flag indicating whether the
state is partial. After a lot of processing, we finally try to store the
event in the database, which can fail for partial state events when the
containing room has been un-partial stated in the meantime.

We detect the race as a foreign key constraint failure in the data store
layer and turn it into a special `PartialStateConflictError` exception,
which makes its way up to the method in which we computed the event
context.

To make things difficult, the exception needs to cross a replication
request: `/fed_send_events` for events coming over federation and
`/send_event` for events from clients. We transport the
`PartialStateConflictError` as a `409 Conflict` over replication and
turn `409`s back into `PartialStateConflictError`s on the worker making
the request.

All client events go through
`EventCreationHandler.handle_new_client_event`, which is called in
*a lot* of places. Instead of trying to update all the code which
creates client events, we turn the `PartialStateConflictError` into a
`429 Too Many Requests` in
`EventCreationHandler.handle_new_client_event` and hope that clients
take it as a hint to retry their request.

On the federation event side, there are 7 places which compute event
contexts. 4 of them use outlier event contexts:
`FederationEventHandler._auth_and_persist_outliers_inner`,
`FederationHandler.do_knock`, `FederationHandler.on_invite_request` and
`FederationHandler.do_remotely_reject_invite`. These events won't have
the partial state flag, so we do not need to do anything for then.

The remaining 3 paths which create events are
`FederationEventHandler.process_remote_join`,
`FederationEventHandler.on_send_membership_event` and
`FederationEventHandler._process_received_pdu`.

We can't experience the race in `process_remote_join`, unless we're
handling an additional join into a partial state room, which currently
blocks, so we make no attempt to handle it correctly.

`on_send_membership_event` is only called by
`FederationServer._on_send_membership_event`, so we catch the
`PartialStateConflictError` there and retry just once.

`_process_received_pdu` is called by `on_receive_pdu` for incoming
events and `_process_pulled_event` for backfill. The latter should never
try to persist partial state events, so we ignore it. We catch the
`PartialStateConflictError` in `on_receive_pdu` and retry just once.

Refering to the graph of code paths in
https://github.com/matrix-org/synapse/issues/12988#issuecomment-1156857648
may make the above make more sense.

Signed-off-by: Sean Quah <seanq@matrix.org>
2022-07-05 16:12:52 +01:00
Erik Johnston 578a5e24a9
Use upserts for updating `event_push_summary` (#13153) 2022-07-05 13:51:04 +01:00
Andrew Morgan 6180e1bc4b Synapse 1.62.0rc3 (2022-07-04)
==============================
 
 Bugfixes
 --------
 
 - Update the version of the [ldap3 plugin](https://github.com/matrix-org/matrix-synapse-ldap3/) included in the `matrixdotorg/synapse` DockerHub images and the Debian packages hosted on `packages.matrix.org` to 0.2.1. This fixes [a bug](https://github.com/matrix-org/matrix-synapse-ldap3/pull/163) with usernames containing uppercase characters. ([\#13156](https://github.com/matrix-org/synapse/issues/13156))
 - Fix a bug introduced in Synapse 1.62.0rc1 affecting unread counts for users on small servers. ([\#13168](https://github.com/matrix-org/synapse/issues/13168))
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEgQG31Z317NrSMt0QiISIDS7+X/QFAmLDDVgACgkQiISIDS7+
 X/Q+KQ//WuWB9hfAW8XEyYHWox95zaITsAzY/TTG1IXygAMjgEk2+9utdRaX3wbk
 YDaZCeEw+vbK/3w/lt1RzI30K3uVCZVcW2DTQr1Qi4B+UWLOlCsVfOT9LcMvNoJe
 ww/cOK6RpPgTlqk5ij0MtdjfWkAJeToi7ESMooORxhWFm3Zd8e5BpbNv89WUBZhk
 zCqCjIdjSF+Mwk8NwmU1iJi5JQY/+Xl51uk2+wGIAe4vtgPTz7PJmoPF1E6nGGVF
 9OYdlWU4H7u6js8n05QL2jKtX34uszCo2hwoW2aFPPmF0B2CFEV6WFBiDOppLZ1g
 ZMJv1s/34RXoBu8pAuJnq2BZkWxu99LRmPV+f/R+S0jDT1MH9tdSdhfcGu7iH/Y9
 uguGX3OOlxnkUb5o825Xt3mvBcVaTGY+sspFtB12RtXmWRdll/Hq6w11ZN5f6qDy
 Nr/DuoPjMAH7kzelFn/GpP6K8zX8iYjf0lLCyrbYV7OYAI6/I+Vao+sT2ctHD1T8
 s4aTTx1bEl23mo/RiqH2fRHaPhBjZKW0uv6iRNqDE2ThYPAXinVtt7MiUU0QGco5
 vMca/RZBkEj0Lov0AleBx4XRXlBTyq5BX2V1frYLenKp42bDzN9sgsPAOPeKieHW
 qjr+Ti9i47wGADXs2GI/mke/C8jlONEKJm/v8mwXItn8Za7wBJc=
 =SpI6
 -----END PGP SIGNATURE-----

Merge tag 'v1.62.0rc3' into develop

Synapse 1.62.0rc3 (2022-07-04)
==============================

Bugfixes
--------

- Update the version of the [ldap3 plugin](https://github.com/matrix-org/matrix-synapse-ldap3/) included in the `matrixdotorg/synapse` DockerHub images and the Debian packages hosted on `packages.matrix.org` to 0.2.1. This fixes [a bug](https://github.com/matrix-org/matrix-synapse-ldap3/pull/163) with usernames containing uppercase characters. ([\#13156](https://github.com/matrix-org/synapse/issues/13156))
- Fix a bug introduced in Synapse 1.62.0rc1 affecting unread counts for users on small servers. ([\#13168](https://github.com/matrix-org/synapse/issues/13168))
2022-07-04 17:35:06 +01:00
Erik Johnston 723ce73d02
Fix stuck notification counts on small servers (#13168) 2022-07-04 16:02:21 +01:00
Patrick Cloke b0366853ca Merge remote-tracking branch 'origin/release-v1.62' into develop 2022-06-30 13:27:24 -04:00
Erik Johnston dbce28b2f1
Fix unread counts on large servers (#13140) 2022-06-30 15:08:40 +01:00
Erik Johnston a3a05c812d
Add index to help delete old push actions (#13141) 2022-06-30 14:05:49 +00:00
Šimon Brandner 13e359aec8
Implement MSC3827: Filtering of `/publicRooms` by room type (#13031)
Signed-off-by: Šimon Brandner <simon.bra.ag@gmail.com>
2022-06-29 17:12:45 +00:00
Erik Johnston 92a0c18ef0
Improve performance of getting unread counts in rooms (#13119) 2022-06-29 10:32:38 +00:00
Erik Johnston 7469824d58
Fix serialization errors when rotating notifications (#13118) 2022-06-28 13:13:44 +01:00
reivilibre b26cbe3d45
Fix type error that made its way onto develop (#13098)
* Fix type error introduced accidentally by #13045

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2022-06-17 13:05:27 +01:00
Erik Johnston 5ef05c70c3
Rotate notifications more frequently (#13096) 2022-06-17 10:58:00 +00:00
Erik Johnston 5099b5ecc7
Use new `device_list_changes_in_room` table when getting device list changes (#13045) 2022-06-17 11:42:03 +01:00
Erik Johnston 8ceed5e6b5
Add desc to `get_earliest_token_for_stats` (#13085) 2022-06-16 17:50:46 +00:00
David Robertson 97e9fbe1b2
Type annotations in `synapse.databases.main.devices` (#13025)
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2022-06-15 15:20:04 +00:00
Erik Johnston 0d1d3e0708
Speed up `get_unread_event_push_actions_by_room` (#13005)
Fixes #11887 hopefully.

The core change here is that `event_push_summary` now holds a summary of counts up until a much more recent point, meaning that the range of rows we need to count in `event_push_actions` is much smaller.

This needs two major changes:
1. When we get a receipt we need to recalculate `event_push_summary` rather than just delete it
2. The logic for deleting `event_push_actions` is now divorced from calculating `event_push_summary`.

In future it would be good to calculate `event_push_summary` while we persist a new event (it should just be a case of adding one to the relevant rows in `event_push_summary`), as that will further simplify the get counts logic and remove the need for us to periodically update `event_push_summary` in a background job.
2022-06-15 15:17:14 +00:00
Richard van der Hoff 75fb10ee45
Clean up schema for `event_edges` (#12893)
* Remove redundant references to `event_edges.room_id`

We don't need to care about the room_id here, because we are already checking
the event id.

* Clean up the event_edges table

We make a number of changes to `event_edges`:

 * We give the `room_id` and `is_state` columns defaults (null and false
   respectively) so that we can stop populating them.
 * We drop any rows that have `is_state` set true - they should no longer
   exist.
 * We drop any rows that do not exist in `events` - these should not exist
   either.
 * We drop the old unique constraint on all the colums, which wasn't much use.
 * We create a new unique index on `(event_id, prev_event_id)`.
 * We add a foreign key constraint to `events`.

These happen rather differently depending on whether we are on Postgres or
SQLite. For SQLite, we just rebuild the whole table, copying only the rows we
want to keep. For Postgres, we try to do things in the background as much as
possible.

* Stop populating `event_edges.room_id` and `is_state`

We can just rely on the defaults.
2022-06-15 12:29:42 +01:00
Patrick Cloke 53b77b203a
Replace noop background updates with DELETE. (#12954)
Removes the `register_noop_background_update` and deletes the background
updates directly in a delta file.
2022-06-13 14:06:27 -04:00
Richard van der Hoff 7c6b2204d1
Faster joins: add issue links to the TODOs (#13004)
... to help us keep track of these things
2022-06-09 10:13:03 +00:00
Nick Mills-Barrett 04ca3a52f6
Use READ COMMITTED isolation level when inserting read receipts (#12957) 2022-06-09 09:44:16 +01:00