129 Commits (a189bb03abe06b2559aba5d31a4e3183941ea076)

Author SHA1 Message Date
Richard van der Hoff ed8ccc3737 Reinstate EDU-batching hacks
This reverts commit c7285607a3.
2019-03-13 14:42:11 +00:00
Erik Johnston 27dbc9ac42 Reenable presence tests and remove pointless change 2019-03-06 17:12:45 +00:00
Richard van der Hoff c7285607a3 Revert EDU-batching hacks from matrix-org-hotfixes
Firstly: we want to do this in a better way, which is the intention of
too many RRs, which means we need to make it happen again.

This reverts commits: 8d7c0264b 000d23090 eb0334b07 4d07dc0d1
2019-03-06 11:04:53 +00:00
Erik Johnston a6e2546980 Fix outbound federation 2019-03-05 14:50:37 +00:00
Erik Johnston dc510e0e43 Merge branch 'develop' of github.com:matrix-org/synapse into matrix-org-hotfixes 2019-03-05 14:41:13 +00:00
Richard van der Hoff 856c83f5f8 Avoid rebuilding Edu objects in worker mode (#4770)
In worker mode, on the federation sender, when we receive an edu for sending
over the replication socket, it is parsed into an Edu object. There is no point
extracting the contents of it so that we can then immediately build another Edu.
2019-03-04 12:57:44 +00:00
Richard van der Hoff 8d7c0264bc more fix edu batching hackery 2019-02-24 23:27:52 +00:00
Richard van der Hoff 000d230901 fix edu batching hackery 2019-02-24 23:19:37 +00:00
Richard van der Hoff eb0334b07c more edu batching hackery 2019-02-24 23:15:09 +00:00
Richard van der Hoff 4d07dc0d18 Add a delay to the federation loop for EDUs 2019-02-24 22:24:36 +00:00
Richard van der Hoff 82ca6d1f9f Add metrics for number of outgoing EDUs, by type (#4695) 2019-02-20 14:13:14 +00:00
Erik Johnston 55d9024835 Use sender and not event ID domain to check if ours
The transaction queue only sends out events that we generate. This was
done by checking domain of event ID, but that can no longer be used.
Instead, we may as well use the sender field.
2019-01-29 16:54:23 +00:00
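The sender-based check described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `domain_of` and `is_ours` are hypothetical helper names, and the event is modelled as a plain dict with a `sender` field.

```python
def domain_of(matrix_id: str) -> str:
    """Return the server name part of an ID such as "@alice:example.org".

    Hypothetical helper. Splits on the first colon only, so a server
    name that carries a port ("example.org:8448") stays intact.
    """
    if ":" not in matrix_id:
        raise ValueError(f"Not a valid Matrix ID: {matrix_id!r}")
    return matrix_id.split(":", 1)[1]


def is_ours(event: dict, server_name: str) -> bool:
    """Decide whether we generated the event by looking at its sender,
    rather than at the domain of the event ID."""
    return domain_of(event["sender"]) == server_name
```

For example, `is_ours({"sender": "@alice:example.org"}, "example.org")` is true, while an event sent by `@bob:other.example` is not ours.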
Erik Johnston 1371d5b798 Don't log stack traces for HTTP error responses 2019-01-08 12:28:30 +00:00
Erik Johnston b970cb0e96 Refactor request sending to have better exceptions (#4358)
* Correctly retry and back off if we get an HTTP error response

* Refactor request sending to have better exceptions

MatrixFederationHttpClient blindly reraised exceptions to the caller
without differentiating "expected" failures (e.g. connection timeouts
etc) versus more severe problems (e.g. programming errors).

This commit adds a RequestSendFailed exception that is raised when
"expected" failures happen, allowing the TransactionQueue to log them as
warnings while allowing us to log other exceptions as actual exceptions.
2019-01-08 11:04:28 +00:00
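The pattern described in the commit above, wrapping "expected" failures in a dedicated exception so the caller can log them differently, might look roughly like this. It is a sketch under assumptions, not the code from #4358: the class body, the `EXPECTED_FAILURES` tuple, and the `send_request` and `attempt_transaction` functions are all hypothetical, and only the name `RequestSendFailed` comes from the commit message.

```python
import logging

logger = logging.getLogger(__name__)


class RequestSendFailed(Exception):
    """Raised when a request fails for an "expected" reason (sketch only)."""

    def __init__(self, inner_exception):
        super().__init__(f"Failed to send request: {inner_exception!r}")
        self.inner_exception = inner_exception


# Assumption: which failures count as "expected" is chosen for illustration.
EXPECTED_FAILURES = (TimeoutError, ConnectionError)


def send_request(do_send):
    """Call do_send(), wrapping expected failures in RequestSendFailed."""
    try:
        return do_send()
    except EXPECTED_FAILURES as e:
        raise RequestSendFailed(e) from e


def attempt_transaction(do_send, destination):
    """Return True on success; log expected failures as warnings and
    anything else as a full exception with a stack trace."""
    try:
        send_request(do_send)
        return True
    except RequestSendFailed as e:
        logger.warning("Could not reach %s: %r", destination, e.inner_exception)
    except Exception:
        logger.exception("Unexpected error sending to %s", destination)
    return False
```

With this split, a connection timeout produces a one-line warning, while a programming error such as a `KeyError` is still logged with its traceback.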
Erik Johnston bc80b3f454 Add helpers for getting prev and auth events (#4139)
* Add helpers for getting prev and auth events

This is in preparation for allowing the event format to change between
room versions.
2018-11-06 00:35:15 +11:00
Richard van der Hoff b8a5b0097c Various cleanups in the federation client code (#4031)
- Improve logging: log things in the right order, include destination and txids
  in all log lines, don't log successful responses twice

- Fix the docstring on TransportLayerClient.send_transaction

- Don't use treq.request, which is overcomplicated for our purposes: just use a
  twisted.web.client.Agent.

- simplify the logic for setting up the bodyProducer

- fix bytes/str confusions
2018-10-16 10:44:49 +01:00
Richard van der Hoff 965154d60a Fix complete fail to do the right thing 2018-09-28 12:45:54 +01:00
Richard van der Hoff 9453c65948 remove spurious federation checks on localhost
There's really no point in checking for destinations called "localhost" because
there is nothing stopping people creating other DNS entries which point to
127.0.0.1. The right fix for this is
https://github.com/matrix-org/synapse/issues/3953.

Blocking localhost, on the other hand, means that you get a surprise when
trying to connect a test server on localhost to an existing server (with a
'normal' server_name).
2018-09-26 16:53:52 +01:00
Erik Johnston 6707a3212c Limit the number of PDUs/EDUs per federation transaction 2018-09-06 15:23:55 +01:00
Amber Brown c334ca67bb Integrate presence from hotfixes (#3694) 2018-08-18 01:08:45 +10:00
Richard van der Hoff 53bca4690b more metrics for the federation and appservice senders 2018-08-07 19:09:48 +01:00
Travis Ralston e908b86832 Remove pdu_failures from transactions
The field is never read from, and all the opportunities given to populate it are not utilized. It should be very safe to remove this.
2018-07-30 16:28:47 -06:00
Richard van der Hoff 667fba68f3 Run things as background processes
This fixes #3518, and ensures that we get useful logs and metrics for lots of
things that happen in the background.

(There are certainly more things that happen in the background; these are just
the common ones I've found running a single-process synapse locally).
2018-07-18 20:55:05 +01:00
Richard van der Hoff 6e3fc657b4 Resource tracking for background processes
This introduces a mechanism for tracking resource usage by background
processes, along with an example of how it will be used.

This will help address #3518, but more importantly will give us better insights
into things which are happening but not being shown up by the request metrics.

We *could* do this with Measure blocks, but:
 - I think having them pulled out as a completely separate metric class will
   make it easier to distinguish top-level processes from those which are
   nested.

 - I want to be able to report on in-flight background processes, and I don't
   think we want to do this for *all* Measure blocks.
2018-07-18 10:50:33 +01:00
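As a rough illustration of the idea in the commit above, a separate tracker for top-level background processes that can also report what is currently in flight could be sketched like this. Everything here is hypothetical (the `track_background_process` name, the two counters), and it records only wall-clock time, not the fuller resource usage the commit refers to.

```python
import time
from collections import Counter
from contextlib import contextmanager

# Hypothetical module-level metrics, keyed by process name.
in_flight = Counter()      # number of processes currently running
total_seconds = Counter()  # cumulative wall-clock time per process name


@contextmanager
def track_background_process(name):
    """Count a background process as in flight while its body runs."""
    in_flight[name] += 1
    start = time.monotonic()
    try:
        yield
    finally:
        # Always undo the in-flight count, even if the body raised.
        in_flight[name] -= 1
        total_seconds[name] += time.monotonic() - start
```

Keeping these counters apart from per-block timing makes it possible to report on in-flight processes without doing so for every measured block.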
Amber Brown 49af402019 run isort 2018-07-09 16:09:20 +10:00
Amber Brown c2eff937ac Populate synapse_federation_client_sent_pdu_destinations:count again (#3386) 2018-06-21 09:39:58 +01:00
Amber Brown a61738b316 Remove run_on_reactor (#3395) 2018-06-14 18:27:37 +10:00
Amber Brown c936a52a9e Consistently use six's iteritems and wrap lazy keys/values in list() if they're not meant to be lazy (#3307) 2018-05-31 19:03:47 +10:00
Amber Brown e987079037 fixes 2018-05-23 13:03:51 -05:00
Amber Brown 071206304d cleanup pep8 errors 2018-05-22 16:54:22 -05:00
Amber Brown 85ba83eb51 fixes 2018-05-22 16:28:23 -05:00
Amber Brown df9f72d9e5 replacing portions 2018-05-21 19:47:37 -05:00
Richard van der Hoff 9255a6cb17 Improve exception handling for background processes
There were a bunch of places where we fire off a process to happen in the
background, but don't have any exception handling on it - instead relying on
the unhandled error being logged when the relevant deferred gets
garbage-collected.

This is unsatisfactory for a number of reasons:
 - logging on garbage collection is best-effort and may happen some time after
   the error, if at all
 - it can be hard to figure out where the error actually happened.
 - it is logged as a scary CRITICAL error which (a) I always forget to grep for
   and (b) it's not really CRITICAL if a background process we don't care about
   fails.

So this is an attempt to add exception handling to everything we fire off into
the background.
2018-04-27 11:07:40 +01:00
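The approach described in the commit above, attaching explicit error handling to anything fired off into the background, can be sketched as follows. Note the swap: the commit concerns Twisted deferreds, whereas this sketch uses asyncio tasks from the standard library to stay self-contained. `run_in_background_logged` is a hypothetical name.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


def run_in_background_logged(coro, description):
    """Start coro as a background task and log any failure promptly,
    instead of relying on a report at garbage-collection time.

    Must be called while an event loop is running.
    """
    task = asyncio.ensure_future(coro)

    def _log_failure(finished_task):
        if finished_task.cancelled():
            return
        exc = finished_task.exception()
        if exc is not None:
            # Logged at ERROR with the traceback and a description of
            # which background process failed.
            logger.error(
                "Background process %r failed", description, exc_info=exc
            )

    task.add_done_callback(_log_failure)
    return task
```

The failure is logged as soon as the task finishes, with a description that says where it came from, which addresses the first two complaints in the commit message.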
Erik Johnston f67e906e18 Set all metrics at the same time 2018-04-12 11:18:19 +01:00
Erik Johnston 4dae4a97ed Track last processed event received_ts 2018-04-11 14:27:09 +01:00
Erik Johnston 92e34615c5 Track where event stream processing has gotten up to 2018-04-11 12:13:40 +01:00
Erik Johnston a060dfa132 Use run_in_background instead 2018-04-10 14:25:11 +01:00
Erik Johnston 1246d23710 Preserve log contexts correctly 2018-04-10 12:04:32 +01:00
Erik Johnston d49cbf712f Log event ID on exception 2018-04-10 12:03:41 +01:00
Erik Johnston 6e025a97b4 Handle all events in a room correctly 2018-04-09 16:02:48 +01:00
Erik Johnston 11974f3787 Send federation events concurrently 2018-04-09 11:47:10 +01:00
Erik Johnston 145d14656b Handle exceptions in get_hosts_for_room when sending events over federation 2018-04-09 11:47:01 +01:00
Matthew Hodgson ab9f844aaf Add federation_domain_whitelist option (#2820)
Add federation_domain_whitelist

gives a way to restrict which domains your HS is allowed to federate with.
useful mainly for gracefully preventing a private but internet-connected HS from trying to federate to the wider public Matrix network
2018-01-22 19:11:18 +01:00
Richard van der Hoff a027c2af8d Metrics for events processed in appservice and fed sender
More metrics I wished I'd had
2018-01-15 18:23:24 +00:00
Richard van der Hoff d4fb4f7c52 Clear logcontext before starting fed txn queue runner
These processes take a long time compared to the request, so there is lots of
"Entering|Restoring dead context" in the logs. Let's try to shut it up a bit.
2017-11-28 15:26:14 +00:00
Richard van der Hoff 01bbacf3c4 Fix up logcontext handling in (federation) TransactionQueue
Avoid using preserve_context_over_function, which has problems with respect to
logcontexts.
2017-10-06 22:39:25 +01:00
Erik Johnston 6e2a7ee1bc Remove spurious log lines 2017-06-07 11:05:17 +01:00
Erik Johnston dfbda5e025 Faster cache for get_joined_hosts 2017-05-25 17:24:44 +01:00
Erik Johnston ec5c4499f4 Make presence use cached users/hosts in room 2017-05-16 16:01:43 +01:00
Erik Johnston 7166854f41 Add cache for get_current_hosts_in_room 2017-05-02 10:36:35 +01:00