MatrixSynapse

Commit Graph

Author	SHA1	Message	Date
Richard van der Hoff	b8a5b0097c	Various cleanups in the federation client code (#4031) - Improve logging: log things in the right order, include destination and txids in all log lines, don't log successful responses twice - Fix the docstring on TransportLayerClient.send_transaction - Don't use treq.request, which is overcomplicated for our purposes: just use a twisted.web.client.Agent. - simplify the logic for setting up the bodyProducer - fix bytes/str confusions	2018-10-16 10:44:49 +01:00
Richard van der Hoff	965154d60a	Fix complete fail to do the right thing	2018-09-28 12:45:54 +01:00
Richard van der Hoff	9453c65948	remove spurious federation checks on localhost There's really no point in checking for destinations called "localhost" because there is nothing stopping people creating other DNS entries which point to 127.0.0.1. The right fix for this is https://github.com/matrix-org/synapse/issues/3953. Blocking localhost, on the other hand, means that you get a surprise when trying to connect a test server on localhost to an existing server (with a 'normal' server_name).	2018-09-26 16:53:52 +01:00
Erik Johnston	6707a3212c	Limit the number of PDUs/EDUs per federation transaction	2018-09-06 15:23:55 +01:00
Amber Brown	c334ca67bb	Integrate presence from hotfixes (#3694)	2018-08-18 01:08:45 +10:00
Richard van der Hoff	53bca4690b	more metrics for the federation and appservice senders	2018-08-07 19:09:48 +01:00
Travis Ralston	e908b86832	Remove pdu_failures from transactions The field is never read from, and all the opportunities given to populate it are not utilized. It should be very safe to remove this.	2018-07-30 16:28:47 -06:00
Richard van der Hoff	667fba68f3	Run things as background processes This fixes #3518, and ensures that we get useful logs and metrics for lots of things that happen in the background. (There are certainly more things that happen in the background; these are just the common ones I've found running a single-process synapse locally).	2018-07-18 20:55:05 +01:00
Richard van der Hoff	6e3fc657b4	Resource tracking for background processes This introduces a mechanism for tracking resource usage by background processes, along with an example of how it will be used. This will help address #3518, but more importantly will give us better insights into things which are happening but not being shown up by the request metrics. We could do this with Measure blocks, but: - I think having them pulled out as a completely separate metric class will make it easier to distinguish top-level processes from those which are nested. - I want to be able to report on in-flight background processes, and I don't think we want to do this for all Measure blocks.	2018-07-18 10:50:33 +01:00
Amber Brown	49af402019	run isort	2018-07-09 16:09:20 +10:00
Amber Brown	c2eff937ac	Populate synapse_federation_client_sent_pdu_destinations:count again (#3386)	2018-06-21 09:39:58 +01:00
Amber Brown	a61738b316	Remove run_on_reactor (#3395)	2018-06-14 18:27:37 +10:00
Amber Brown	c936a52a9e	Consistently use six's iteritems and wrap lazy keys/values in list() if they're not meant to be lazy (#3307)	2018-05-31 19:03:47 +10:00
Amber Brown	e987079037	fixes	2018-05-23 13:03:51 -05:00
Amber Brown	071206304d	cleanup pep8 errors	2018-05-22 16:54:22 -05:00
Amber Brown	85ba83eb51	fixes	2018-05-22 16:28:23 -05:00
Amber Brown	df9f72d9e5	replacing portions	2018-05-21 19:47:37 -05:00
Richard van der Hoff	9255a6cb17	Improve exception handling for background processes There were a bunch of places where we fire off a process to happen in the background, but don't have any exception handling on it - instead relying on the unhandled error being logged when the relevant deferred gets garbage-collected. This is unsatisfactory for a number of reasons: - logging on garbage collection is best-effort and may happen some time after the error, if at all - it can be hard to figure out where the error actually happened. - it is logged as a scary CRITICAL error which (a) I always forget to grep for and (b) it's not really CRITICAL if a background process we don't care about fails. So this is an attempt to add exception handling to everything we fire off into the background.	2018-04-27 11:07:40 +01:00
Erik Johnston	f67e906e18	Set all metrics at the same time	2018-04-12 11:18:19 +01:00
Erik Johnston	4dae4a97ed	Track last processed event received_ts	2018-04-11 14:27:09 +01:00
Erik Johnston	92e34615c5	Track where event stream processing has gotten up to	2018-04-11 12:13:40 +01:00
Erik Johnston	a060dfa132	Use run_in_background instead	2018-04-10 14:25:11 +01:00
Erik Johnston	1246d23710	Preserve log contexts correctly	2018-04-10 12:04:32 +01:00
Erik Johnston	d49cbf712f	Log event ID on exception	2018-04-10 12:03:41 +01:00
Erik Johnston	6e025a97b4	Handle all events in a room correctly	2018-04-09 16:02:48 +01:00
Erik Johnston	11974f3787	Send federation events concurrently	2018-04-09 11:47:10 +01:00
Erik Johnston	145d14656b	Handle exceptions in get_hosts_for_room when sending events over federation	2018-04-09 11:47:01 +01:00
Matthew Hodgson	ab9f844aaf	Add federation_domain_whitelist option (#2820) Add federation_domain_whitelist gives a way to restrict which domains your HS is allowed to federate with. useful mainly for gracefully preventing a private but internet-connected HS from trying to federate to the wider public Matrix network	2018-01-22 19:11:18 +01:00
Richard van der Hoff	a027c2af8d	Metrics for events processed in appservice and fed sender More metrics I wished I'd had	2018-01-15 18:23:24 +00:00
Richard van der Hoff	d4fb4f7c52	Clear logcontext before starting fed txn queue runner These processes take a long time compared to the request, so there is lots of "Entering|Restoring dead context" in the logs. Let's try to shut it up a bit.	2017-11-28 15:26:14 +00:00
Richard van der Hoff	01bbacf3c4	Fix up logcontext handling in (federation) TransactionQueue Avoid using preserve_context_over_function, which has problems with respect to logcontexts.	2017-10-06 22:39:25 +01:00
Erik Johnston	6e2a7ee1bc	Remove spurious log lines	2017-06-07 11:05:17 +01:00
Erik Johnston	dfbda5e025	Faster cache for get_joined_hosts	2017-05-25 17:24:44 +01:00
Erik Johnston	ec5c4499f4	Make presence use cached users/hosts in room	2017-05-16 16:01:43 +01:00
Erik Johnston	7166854f41	Add cache for get_current_hosts_in_room	2017-05-02 10:36:35 +01:00
Erik Johnston	247c736b9b	Merge pull request #2115 from matrix-org/erikj/dedupe_federation_repl Reduce federation replication traffic	2017-04-12 11:07:13 +01:00
Erik Johnston	1745069543	Comment	2017-04-12 10:17:10 +01:00
Erik Johnston	c7ddb5ef7a	Reuse get_interested_parties	2017-04-12 10:16:26 +01:00
Paul "LeoNerd" Evans	11dbceb761	Add a counter metric for successfully-sent transactions	2017-04-11 17:16:12 +01:00
Erik Johnston	a8c8e4efd4	Comment	2017-04-11 15:35:49 +01:00
Erik Johnston	6308ac45b0	Move get_interested_remotes back to presence handler	2017-04-11 15:19:26 +01:00
Erik Johnston	b9b72bc6e2	Comments	2017-04-11 15:15:34 +01:00
Erik Johnston	29574fd5b3	Reduce federation presence replication traffic This is mainly done by moving the calculation of where to send presence updates from the presence handler to the transaction queue, so we only need to send the presence event (and not the destinations) across the replication connection. Before we were duplicating by sending the full state across once per destination.	2017-04-10 16:48:30 +01:00
Erik Johnston	85be3dde81	Bail early if remote wouldn't be retried (#2064) * Bail early if remote wouldn't be retried * Don't always return true * Just use get_retry_limiter * Spelling	2017-03-29 11:48:27 +01:00
Erik Johnston	2a28b79e04	Batch sending of device list pokes	2017-03-24 14:44:49 +00:00
Richard van der Hoff	4bd597d9fc	push federation retry limiter down to matrixfederationclient rather than having to instrument everywhere we make a federation call, make the MatrixFederationHttpClient manage the retry limiter.	2017-03-23 09:28:46 +00:00
Richard van der Hoff	29ed09e80a	Fix assertion to stop transaction queue getting wedged ... and update some docstrings to correctly reflect the types being used. get_new_device_msgs_for_remote can return a long under some circumstances, which was being stored in last_device_list_stream_id_by_dest, and was then upsetting things on the next loop.	2017-03-15 12:16:55 +00:00
Richard van der Hoff	0c4cf9372b	Fix a race in transaction queue It was theoretically possible for a PDU to get queued and not sent for ages. On closer inspection I think there were bigger problems elsewhere, but we might as well fix this since it's easy.	2017-02-20 16:46:25 +00:00
Erik Johnston	df4ecff5a9	Correctly raise exceptions for ratelimiting. Ratelimit on 401	2017-02-01 15:42:19 +00:00
Erik Johnston	ae7a132f38	Better handle 404 response for federation /send/	2017-01-31 13:40:09 +00:00

114 Commits (0f6ec6d1aedc88a2057f50b77ce9d6a405177096)