MatrixSynapse

Commit Graph

Author	SHA1	Message	Date
Andrew Morgan	4e0fd35bc9	Revert "Experimental Federation Speedup (#9702 )" This reverts commit `05e8c70c05`.	2021-04-28 11:38:33 +01:00
Richard van der Hoff	294c675033	Remove `synapse.types.Collection` (#9856 ) This is no longer required, since we have dropped support for Python 3.5.	2021-04-22 16:43:50 +01:00
Erik Johnston	db70435de7	Fix bug where we sent remote presence states to remote servers (#9850 )	2021-04-20 13:37:54 +01:00
Erik Johnston	2b7dd21655	Don't send normal presence updates over federation replication stream (#9828 )	2021-04-19 10:50:49 +01:00
Richard van der Hoff	5a153772c1	remove `HomeServer.get_config` (#9815 ) Every single time I want to access the config object, I have to remember whether or not we use `get_config`. Let's just get rid of it.	2021-04-14 19:09:08 +01:00
Jonathan de Jong	05e8c70c05	Experimental Federation Speedup (#9702 ) This basically speeds up federation by "squeezing" each individual dual database call (to destinations and destination_rooms), which previously happened per every event, into one call for an entire batch (100 max). Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>	2021-04-14 17:19:02 +01:00
Jonathan de Jong	4b965c862d	Remove redundant "coding: utf-8" lines (#9786 ) Part of #9744 Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now. `Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`	2021-04-14 15:34:27 +01:00
Erik Johnston	3a569fb200	Fix sharded federation sender sometimes using 100% CPU. We pull all destinations requiring catchup from the DB in batches. However, if all those destinations get filtered out (due to the federation sender being sharded), then the `last_processed` destination doesn't get updated, and we keep requesting the same set repeatedly.	2021-04-08 17:34:07 +01:00
Andrew Morgan	04819239ba	Add a Synapse Module for configuring presence update routing (#9491 ) At the moment, if you'd like to share presence between local or remote users, those users must be sharing a room together. This isn't always the most convenient or useful situation though. This PR adds a module to Synapse that will allow deployments to set up extra logic on where presence updates should be routed. The module must implement two methods, `get_users_for_states` and `get_interested_users`. These methods are given presence updates or user IDs and must return information that Synapse will use to grant passing presence updates around. A method is additionally added to `ModuleApi` which allows triggering a set of users to receive the current, online presence information for all users they are considered interested in. This is the equivalent of that user receiving presence information during an initial sync. The goal of this module is to be fairly generic and useful for a variety of applications, with hard requirements being: * Sending state for a specific set or all known users to a defined set of local and remote users. * The ability to trigger an initial sync for specific users, so they receive all current state.	2021-04-06 14:38:30 +01:00
Erik Johnston	33548f37aa	Improve tracing for to device messages (#9686 )	2021-04-01 17:08:21 +01:00
Patrick Cloke	da75d2ea1f	Add type hints for the federation sender. (#9681 ) Includes an abstract base class which both the FederationSender and the FederationRemoteSendQueue must implement.	2021-03-29 11:43:20 -04:00
Erik Johnston	c602ba8336	Fixed undefined variable error in catchup (#9664 ) Broke in #9640 Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>	2021-03-24 16:12:47 +00:00
Erik Johnston	dd71eb0f8a	Make federation catchup send last event from any server. (#9640 ) Currently federation catchup will send the last local event that we failed to send to the remote. This can cause issues for large rooms where lots of servers have sent events while the remote server was down, as when it comes back up again it'll be flooded with events from various points in the DAG. Instead, let's make it so that all the servers send the most recent events, even if it's not theirs. The remote should deduplicate the events, so there shouldn't be much overhead in doing this. Alternatively, the servers could only send local events if they were also extremities and hope that the other server will send the event over, but that is a bit risky.	2021-03-18 15:52:26 +00:00
Erik Johnston	026503fa3b	Don't go into federation catch up mode so easily (#9561 ) Federation catch up mode is very inefficient if the number of events that the remote server has missed is small, since handling gaps can be very expensive, c.f. #9492. Instead of going into catch up mode whenever we see an error, we instead do so only if we've backed off from trying the remote for more than an hour (the assumption being that in such a case it is more than a transient failure).	2021-03-15 14:42:40 +00:00
Richard van der Hoff	8a4b3738f3	Replace `last_*_pdu_age` metrics with timestamps (#9540 ) Following the advice at https://prometheus.io/docs/practices/instrumentation/#timestamps-not-time-since, it's preferable to export unix timestamps, not ages. There doesn't seem to be any particular naming convention for timestamp metrics.	2021-03-04 16:40:18 +00:00
Andrew Morgan	8bcfc2eaad	Be smarter about which hosts to send presence to when processing room joins (#9402 ) This PR attempts to eliminate unnecessary presence sending work when your local server joins a room, or when a remote server joins a room your server is participating in by processing state deltas in chunks rather than individually. --- When your server joins a room for the first time, it requests the historical state as well. This chunk of new state is passed to the presence handler which, after filtering that state down to only membership joins, will send presence updates to homeservers for each join processed. It turns out that we were being a bit naive and processing each event individually, and sending out presence updates for every one of those joins. Even if many different joins were users on the same server (hello IRC bridges), we'd send presence to that same homeserver for every remote user join we saw. This PR attempts to deduplicate all of that by processing the entire batch of state deltas at once, instead of only doing each join individually. We process the joins and note down which servers need which presence: * If it was a local user join, send that user's latest presence to all servers in the room * If it was a remote user join, send the presence for all local users in the room to that homeserver We deduplicate by inserting all of those pending updates into a dictionary of the form: ``` { server_name1: {presence_update1, ...}, server_name2: {presence_update1, presence_update2, ...} } ``` Only after building this dict do we then start sending out presence updates.	2021-02-19 11:37:29 +00:00
Eric Eastwood	0a00b7ff14	Update black, and run auto formatting over the codebase (#9381 ) - Update black version to the latest - Run black auto formatting over the codebase - Run autoformatting according to [`docs/code_style.md`](80d6dc9783/docs/code_style.md) - Update `code_style.md` docs around installing black to use the correct version	2021-02-16 22:32:34 +00:00
Erik Johnston	dd8da8c5f6	Precompute joined hosts and store in Redis (#9198 )	2021-01-26 13:57:31 +00:00
Erik Johnston	921a3f8a59	Fix not sending events over federation when using sharded event persisters (#8536 ) * Fix outbound federation with multiple event persisters. We incorrectly notified federation senders that the minimum persisted stream position had advanced when we got an `RDATA` from an event persister. Notifying of federation senders already correctly happens in the notifier, so we just delete the offending line. * Change some interfaces to use RoomStreamToken. By enforcing use of `RoomStreamTokens` we make it less likely that people pass in random ints that they got from somewhere random.	2020-10-14 13:27:51 +01:00
Richard van der Hoff	f31f8e6319	Remove stream ordering from Metadata dict (#8452 ) There's no need for it to be in the dict as well as the events table. Instead, we store it in a separate attribute in the EventInternalMetadata object, and populate that on load. This means that we can rely on it being correctly populated for any event which has been persisted to the database.	2020-10-05 14:43:14 +01:00
Richard van der Hoff	3bd3707cb9	Fix malformed log line in new federation "catch up" logic (#8442 )	2020-10-02 11:05:29 +01:00
Richard van der Hoff	c1ef579b63	Add prometheus metrics to track federation delays (#8430 ) Add a pair of federation metrics to track the delays in sending PDUs to/from particular servers.	2020-10-01 11:09:12 +01:00
reivilibre	36efbcaf51	Catch-up after Federation Outage (bonus): Catch-up on Synapse Startup (#8322 ) Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net> Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com> * Fix _set_destination_retry_timings This came about because the code assumed that retry_interval could not be NULL — which has been challenged by catch-up.	2020-09-18 14:59:13 +01:00
reivilibre	576bc37d31	Catch-up after Federation Outage (split, 4): catch-up loop (#8272 )	2020-09-15 09:07:19 +01:00
reivilibre	17fa4c7ca7	Catch up after Federation Outage (split, 2): Track last successful stream ordering after transmission (#8247 ) Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>	2020-09-04 15:06:51 +01:00
reivilibre	58f61f10f7	Catch-up after Federation Outage (split, 1) (#8230 ) Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>	2020-09-04 12:22:23 +01:00
Patrick Cloke	c619253db8	Stop sub-classing object (#8249 )	2020-09-04 06:54:56 -04:00
reivilibre	4535e849d7	Remove obsolete order field in `send_new_transaction` (#8245 ) Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>	2020-09-03 19:23:07 +01:00
Patrick Cloke	5758dcf30c	Add type hints for state. (#8140 )	2020-08-24 14:25:27 -04:00
Patrick Cloke	eebf52be06	Be stricter about JSON that is accepted by Synapse (#8106 )	2020-08-19 07:26:03 -04:00
Patrick Cloke	ad6190c925	Convert stream database to async/await. (#8074 )	2020-08-17 07:24:46 -04:00
reivilibre	ff0e894656	Drop federation transmission queues during a significant remote outage. (#7864 ) * Empty federation transmission queues when we are backing off. Fixes #7828. Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net> * Address feedback Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net> * Reword newsfile	2020-08-13 12:35:04 +01:00
Erik Johnston	9d1e4942ab	Fix typing for notifier (#8064 )	2020-08-12 14:03:08 +01:00
Olivier Wilkinson (reivilibre)	3aa36b782c	Merge branch 'master' into develop	2020-07-30 15:18:36 +01:00
Patrick Cloke	c978f6c451	Convert federation client to async/await. (#7975 )	2020-07-30 08:01:33 -04:00
Erik Johnston	2c1b9d6763	Update worker docs with recent enhancements (#7969 )	2020-07-29 23:22:13 +01:00
Patrick Cloke	b975fa2e99	Convert state resolution to async/await (#7942 )	2020-07-24 10:59:51 -04:00
Patrick Cloke	fefe9943ef	Convert presence handler helpers to async/await. (#7939 )	2020-07-23 16:47:36 -04:00
Erik Johnston	649a7ead5c	Add ability to run multiple pusher instances (#7855 ) This reuses the same scheme as federation sender sharding	2020-07-16 14:06:28 +01:00
Olivier Wilkinson (reivilibre)	12528dc42f	Remove obsolete comment. It was correct at the time of our friend Jorik writing it (checking git blame), but the world has moved now and it is no longer a generator. Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>	2020-07-16 11:12:48 +01:00
Erik Johnston	f299441cc6	Add ability to shard the federation sender (#7798 )	2020-07-10 18:26:36 +01:00
Patrick Cloke	38e1fac886	Fix some spelling mistakes / typos. (#7811 )	2020-07-09 09:52:58 -04:00
Erik Johnston	1e03513f9a	Fix new metric where we used ms instead of seconds (#7771 ) Introduced in #7755, not yet released.	2020-07-01 15:23:58 +01:00
Erik Johnston	a99658074d	Add some metrics for inbound and outbound federation processing times (#7755 )	2020-06-30 16:58:06 +01:00
Patrick Cloke	bd6dc17221	Replace iteritems/itervalues/iterkeys with native versions. (#7692 )	2020-06-15 07:03:36 -04:00
Richard van der Hoff	075375bbc9	add a comment	2020-05-21 13:25:41 +01:00
Richard van der Hoff	d5aa7d93ed	Fix catchup-on-reconnect for the Federation Stream (#7374 ) looks like we managed to break this during the refactorathon.	2020-05-05 14:15:57 +01:00
Erik Johnston	4cff617df1	Move catchup of replication streams to worker. (#7024 ) This changes the replication protocol so that the server does not send down `RDATA` for rows that happened before the client connected. Instead, the server will send a `POSITION` and clients then query the database (or master out of band) to get up to date.	2020-03-25 14:54:01 +00:00
Erik Johnston	b08b0a22d5	Add typing to synapse.federation.sender (#6871 )	2020-02-07 13:56:38 +00:00
Erik Johnston	a8a50f5b57	Wake up transaction queue when remote server comes back online (#6706 ) This will be used to retry outbound transactions to a remote server if we think it might have come back up.	2020-01-17 10:27:19 +00:00


75 Commits (ef889c98a6cde0cfa95f7fdaf7f99ec3c1e9bb7f)