Commit Graph

8914 Commits (abd9914683109d50f3998627eaa859caa500d63c)

Author SHA1 Message Date
Amber Brown d86794325f Merge branch 'master' into develop 2018-10-04 22:41:52 +10:00
Amber Brown dd59dfc51f full version 2018-10-04 22:37:55 +10:00
Richard van der Hoff a59d899668 Pin to prometheus_client<0.4 to avoid renaming all of our metrics 2018-10-03 17:20:15 +01:00
Erik Johnston 8935ec5a93
Merge pull request #3999 from matrix-org/erikj/fix_3pid_invite_rejetion
Fix handling of rejected threepid invites
2018-10-03 14:47:41 +01:00
Erik Johnston 81e2813948
Merge pull request #3996 from matrix-org/erikj/fix_bg_iteration
Fix exception in background metrics collection
2018-10-03 14:14:38 +01:00
Erik Johnston 52e6e815be Sanitise error messages when user doesn't have permission to invite 2018-10-03 14:13:07 +01:00
Erik Johnston 69e857853f Fix handling of rejected threepid invites 2018-10-03 11:57:30 +01:00
Erik Johnston 495a9d06bb Fix exception handling in fetching remote profiles 2018-10-03 11:34:30 +01:00
Erik Johnston 7c570bff74 Fix exception in background metrics collection
We attempted to iterate through a list on a separate thread without
doing the necessary copying.
2018-10-03 11:28:01 +01:00
Richard van der Hoff 9693625e55 actually exclude outliers 2018-10-03 10:19:41 +01:00
Richard van der Hoff 3e39783d5d remove debugging 2018-10-02 23:44:14 +01:00
Richard van der Hoff ae61ade891 Fix bug in forward_extremity update logic
An event does not stop being a forward_extremity just because an outlier or
rejected event refers to it.
2018-10-02 23:36:08 +01:00
Amber Brown da8f82008d version 2018-10-03 02:15:48 +10:00
Erik Johnston 7258d081a5 Fix bug when invalidating destination retry timings 2018-10-02 15:47:57 +01:00
Richard van der Hoff 2b8d28b095
Merge pull request #3960 from matrix-org/rav/fix_missing_create_event_error
Fix error handling for missing auth_event
2018-10-02 13:57:54 +01:00
Amber Brown 7232917f12
Disable frozen dicts by default (#3987) 2018-10-02 22:53:47 +10:00
Erik Johnston 334e075dd8 Fix error when logging incomplete requests
If a connection is lost before a request is read from Request, Twisted
sets `method` (and `uri`) attributes to dummy values. These dummy values
have incorrect types (i.e. they're not bytes), and so things like
`__repr__` would raise an exception.

To fix this we had a helper method to return the method with a
consistent type.
2018-10-02 12:06:22 +01:00
Richard van der Hoff 8c41b0ca66
Merge pull request #3989 from matrix-org/rav/better_stacktraces
Avoid reraise, to improve stacktraces
2018-10-02 10:47:38 +01:00
Erik Johnston bc29946809
Merge pull request #3986 from matrix-org/erikj/fix_sync_with_redacted_state
Fix lazy loaded sync with rejected state events
2018-10-02 11:38:35 +02:00
Richard van der Hoff 8174c6725b Avoid reraise, to improve stacktraces 2018-10-01 18:50:34 +01:00
Richard van der Hoff 5908a8f6f5 Merge branch 'develop' into rav/fix_missing_create_event_error 2018-10-01 17:16:20 +01:00
Richard van der Hoff b5b93f45d5
Merge pull request #3968 from matrix-org/rav/fix_federation_errors
Fix exceptions when handling incoming transactions
2018-10-01 15:54:24 +01:00
Amber Brown 6e05fd032c
Fix userconsent on Python 3 (#3938) 2018-10-02 00:11:58 +10:00
Erik Johnston 82f922b4af Fix lazy loaded sync with rejected state events
In particular, we assume that the name and canonical alias events in
the state have not been rejected. In practice this may not be the case
(though we should probably think about fixing that) so lets ensure that
we gracefully handle that case, rather than 404'ing the sync request
like we do now.
2018-10-01 14:19:40 +01:00
Erik Johnston 8f5c23d0cd
Merge pull request #3933 from matrix-org/erikj/destination_retry_cache
Add a five minute cache to get_destination_retry_timings
2018-10-01 14:20:30 +02:00
Erik Johnston 4f3e3ac192 Correctly match 'dict.pop' api 2018-10-01 12:25:27 +01:00
Erik Johnston 8ea887856c Don't update eviction metrics on explicit removal 2018-10-01 12:00:58 +01:00
Richard van der Hoff 3deaad2fb4
Merge pull request #3964 from matrix-org/rav/remove_localhost_checks
remove spurious federation checks on localhost
2018-09-28 13:35:47 +01:00
Richard van der Hoff 965154d60a Fix complete fail to do the right thing 2018-09-28 12:45:54 +01:00
Richard van der Hoff 19475cf337 Remove redundant call to start_get_pdu_cache
I think this got forgotten in #3932. We were getting away with it because it
was the last call in this function.
2018-09-28 12:01:23 +01:00
Richard van der Hoff 9c8cec5dab Merge remote-tracking branch 'origin/develop' into erikj/destination_retry_cache 2018-09-28 10:51:09 +01:00
Richard van der Hoff f094f715cf Merge remote-tracking branch 'origin/develop' into rav/fix_federation_errors 2018-09-27 15:18:21 +01:00
Richard van der Hoff 36c62a67c4
Merge pull request #3794 from matrix-org/erikj/faster_typing
Improve performance of getting typing updates for replication
2018-09-27 15:14:26 +01:00
Richard van der Hoff e1e3e77bfd
Merge pull request #3967 from matrix-org/rav/federation_handler_cleanups
Clarifications in FederationHandler
2018-09-27 15:06:59 +01:00
Amber Brown 6de3884e5e
Merge pull request #3965 from matrix-org/rav/notify_app_services_bg_process
Run notify_app_services as a bg process
2018-09-27 23:40:30 +10:00
Amber Brown a512e637ac
Merge pull request #3970 from schnuffle/develop-py3
Replaced all occurences of e.message with str(e)
2018-09-27 23:39:45 +10:00
Amber Brown b3064532d0
Run our oldest supported configuration in CI (#3952) 2018-09-27 23:21:54 +10:00
Schnuffle dc5db01ff2 Replaced all occurences of e.message with str(e)
Signed-off-by: Schnuffle  <schnuffle@github.com>
2018-09-27 13:38:50 +02:00
Richard van der Hoff 51d33d5178
Merge pull request #3961 from matrix-org/neilj/lock_mau_upserts
fix #3854 MAU transaction errors
2018-09-27 12:31:37 +01:00
Richard van der Hoff 333bee27f5 Include event when resolving state for missing prevs
If we have a forward extremity for a room as `E`, and you receive `A`, `B`,
s.t. `A -> B -> E`, and `B` also points to an unknown event `X`, then we need
to do state res between `X` and `E`.

When that happens, we need to make sure we include `X` in the state that goes
into the state res alg.

Fixes #3934.
2018-09-27 11:37:39 +01:00
Richard van der Hoff bd61c82bdf Include state from remote servers in pdu handling
If we've fetched state events from remote servers in order to resolve the state
for a new event, we need to actually pass those events into
resolve_events_with_factory (so that it can do the state res) and then persist
the ones we need - otherwise other bits of the codebase get confused about why
we have state groups pointing to non-existent events.
2018-09-27 11:37:39 +01:00
Richard van der Hoff a215b698c4 Fix "unhashable type: 'list'" exception in federation handling
get_state_groups returns a map from state_group_id to a list of FrozenEvents,
so was very much the wrong thing to be putting as one of the entries in the
list passed to resolve_events_with_factory (which expects maps from
(event_type, state_key) to event id).

We actually want get_state_groups_ids().values() rather than
get_state_groups().

This fixes the main problem in #3923, but there are other problems with this
bit of code which get discovered once you do so.
2018-09-27 11:37:39 +01:00
Richard van der Hoff 28223841e0 more comments 2018-09-27 11:31:51 +01:00
Richard van der Hoff e3c159863d Clarifications in FederationHandler
* add some comments on things that look a bit bogus
* rename this `state` variable to avoid confusion with the `state` used
  elsewhere in this function. (There was no actual conflict, but it was
  a confusing bit of spaghetti.)
2018-09-27 11:31:51 +01:00
Richard van der Hoff 92abd3d6d6
Merge pull request #3966 from matrix-org/rav/rx_txn_logging_2
Logging improvements
2018-09-27 11:27:46 +01:00
Richard van der Hoff 4a15a3e4d5
Include eventid in log lines when processing incoming federation transactions (#3959)
when processing incoming transactions, it can be hard to see what's going on,
because we process a bunch of stuff in parallel, and because we may end up
recursively working our way through a chain of three or four events.

This commit creates a way to use logcontexts to add the relevant event ids to
the log lines.
2018-09-27 11:25:34 +01:00
Richard van der Hoff ae6ad4cf41
docstrings and unittests for storage.state (#3958)
I spent ages trying to figure out how I was going mad...
2018-09-27 11:22:25 +01:00
Richard van der Hoff e70b4ce069 Logging improvements
Some logging tweaks to help with debugging incoming federation transactions
2018-09-26 17:36:14 +01:00
Richard van der Hoff a87d419a85 Run notify_app_services as a bg process
This ensures that its resource usage metrics get recorded somewhere rather than
getting lost.

(It also fixes an error when called from a nested logging context which
completes before the bg process)
2018-09-26 17:00:40 +01:00
Richard van der Hoff 9453c65948 remove spurious federation checks on localhost
There's really no point in checking for destinations called "localhost" because
there is nothing stopping people creating other DNS entries which point to
127.0.0.1. The right fix for this is
https://github.com/matrix-org/synapse/issues/3953.

Blocking localhost, on the other hand, means that you get a surprise when
trying to connect a test server on localhost to an existing server (with a
'normal' server_name).
2018-09-26 16:53:52 +01:00
Richard van der Hoff 607eec0456 fix docstring for FederationClient.get_state_for_room
trivial fixes for docstring
2018-09-26 16:52:24 +01:00
Neil Johnson 28781b65e7 fix #3854 2018-09-26 16:16:41 +01:00
Richard van der Hoff 8afddf7afe Fix error handling for missing auth_event
When we were authorizing an event, if there was no `m.room.create` in its
auth_events, we would raise a SynapseError with a cryptic message, which then
meant that we would bail out of processing any incoming events, rather than
storing a rejection for the faulty event and moving on.

We should treat the absent event the same as any other auth failure, by
raising an AuthError, so that the event is marked as rejected.
2018-09-26 14:40:16 +01:00
Richard van der Hoff 4e8276a34a
Merge pull request #3956 from matrix-org/rav/fix_expiring_cache_len
Fix ExpiringCache.__len__ to be accurate
2018-09-26 13:10:13 +01:00
Richard van der Hoff 5b4028fa78 Merge branch 'rav/fix_expiring_cache_len' into erikj/destination_retry_cache 2018-09-26 12:55:53 +01:00
Richard van der Hoff 7ee94fc1ba Log which cache is throwing exceptions 2018-09-26 12:43:08 +01:00
Amber Brown 66a1d57adb
Merge pull request #3948 from matrix-org/rav/no_symlink_synctl
Move synctl into top dir to avoid a symlink
2018-09-26 21:41:58 +10:00
Amber Brown c2185f14d7
Merge pull request #3924 from matrix-org/rav/clean_up_on_receive_pdu
Comments and interface cleanup for on_receive_pdu
2018-09-26 21:41:26 +10:00
Erik Johnston 3baf6e1667 Fix ExpiringCache.__len__ to be accurate
It used to try and produce an estimate, which was sometimes negative.
This caused metrics to be sad, so lets always just calculate it from
scratch.

(This appears to have been a longstanding bug, but one which has been made more
of a problem by #3932 and #3933).

(This was originally done by Erik as part of #3933. I'm cherry-picking it
because really it's a fix in its own right)
2018-09-26 12:32:29 +01:00
Richard van der Hoff a1cd37390f Merge remote-tracking branch 'origin/develop' into erikj/destination_retry_cache 2018-09-25 12:03:54 +01:00
Richard van der Hoff 4c3e7eeec5
Merge pull request #3932 from matrix-org/erikj/auto_start_expiring_caches
Fix some instances of ExpiringCache not expiring cache items
2018-09-25 12:02:57 +01:00
Jérémy Farnaud 6cf261930a added "media-src: 'self'" to CSP for resources (#3578)
Synapse doesn’t allow for media resources to be played directly from
Chrome. It is a problem for users on other networks (e.g. IRC)
communicating with Matrix users through a gateway. The gateway sends
them the raw URL for the resource when a Matrix user uploads a video
and the video cannot be played directly in Chrome using that URL.

Chrome argues it is not authorized to play the video because of the
Content Security Policy. Chrome checks for the "media-src" policy which
is missing, and defauts to the "default-src" policy which is "none".

As Synapse already sends "object-src: 'self'" I thought it wouldn’t be
a problem to add "media-src: 'self'" to the CSP to fix this problem.
2018-09-25 11:55:02 +01:00
Richard van der Hoff 94f7befc31
Merge pull request #3925 from matrix-org/erikj/fix_producers_unregistered
Fix spurious exceptions when client closes conncetion
2018-09-25 11:52:06 +01:00
Richard van der Hoff c53336986d Move synctl into top dir to avoid a symlink
symlinks apparently break setuptools on python3 and alpine
(https://bugs.python.org/issue31940), so let's stop using a symlink and just
use the file directly.
2018-09-25 11:19:27 +01:00
Richard van der Hoff a9d84f4e44 We require attrs 16.0.0
Ref: https://github.com/matrix-org/synapse/issues/3945
2018-09-25 10:43:39 +01:00
Matthew Hodgson 787d22ed6c
Only lazy load self-members on initial sync
Given we have disabled lazy loading for incr syncs in #3840, we can make self-LL more efficient by only doing it on initial sync.  Also adds a bounds check for if/when we change our mind, so that we don't try to include LL members on sync responses with no timeline.
2018-09-25 00:49:26 +01:00
Amber Brown fbe5ba25f6 Merge branch 'master' into develop 2018-09-25 03:10:01 +10:00
Amber Brown 6b6cb32297 bump version 2018-09-25 02:54:34 +10:00
Amber Brown 04eed80a73 Merge branch 'master' into develop 2018-09-24 23:42:25 +10:00
Amber Brown e302f40e20 update version 2018-09-24 23:40:05 +10:00
Erik Johnston 19dc676d1a Fix ExpiringCache.__len__ to be accurate
It used to try and produce an estimate, which was sometimes negative.
This caused metrics to be sad, so lets always just calculate it from
scratch.
2018-09-21 16:25:42 +01:00
Erik Johnston fdd1a62e8d Add a five minute cache to get_destination_retry_timings
Hopefully helps with #3931
2018-09-21 14:56:12 +01:00
Erik Johnston 79eded1ae4 Make ExpiringCache slightly more performant 2018-09-21 14:52:21 +01:00
Erik Johnston 8601c24287 Fix some instances of ExpiringCache not expiring cache items
ExpiringCache required that `start()` be called before it would actually
start expiring entries. A number of places didn't do that.

This PR removes `start` from ExpiringCache, and automatically starts
backround reaping process on creation instead.
2018-09-21 14:19:46 +01:00
Erik Johnston ad53a5497d
Merge pull request #3927 from matrix-org/erikj/handle_background_errors
Handle exceptions thrown by background tasks
2018-09-21 09:26:30 +01:00
Matthew Hodgson a2ddaa90f2
Always LL ourselves if we're in a room to simplify clients (#3916)
Should fix https://github.com/vector-im/riot-web/issues/7209
2018-09-20 21:21:54 +01:00
Erik Johnston 94ae1dea3c Add missing logger 2018-09-20 17:05:34 +01:00
Erik Johnston 9ea408441f Handle exceptions thrown by background tasks
Fixes #3921
2018-09-20 16:15:21 +01:00
Erik Johnston b28a7ed503 Fix spurious exceptions when client closes conncetion
If a HTTP handler throws an exception while processing a request we
automatically write a JSON error response. If the handler had already
started writing a response twisted throws an exception.

We should check for this case and simple abort the connection if there
was an error after the response had started being written.
2018-09-20 13:44:20 +01:00
Neil Johnson 23b53b4ef8
Merge pull request #3868 from matrix-org/neilj/fix_room_invite_mail_links
Neilj/fix room invite mail links
2018-09-20 13:32:38 +01:00
Richard van der Hoff 703de4ec13 Comments and interface cleanup for on_receive_pdu
Add some informative comments about what's going on here.

Also, `sent_to_us_directly` and `get_missing` were doing the same thing (apart
from in `_handle_queued_pdus`, which looks like a bug), so let's get rid of
`get_missing` and use `sent_to_us_directly` consistently.
2018-09-20 13:06:55 +01:00
Amber Brown 1f3f5fcf52
Fix client IPs being broken on Python 3 (#3908) 2018-09-20 20:14:34 +10:00
Erik Johnston 3fd68d533b
Merge pull request #3914 from matrix-org/erikj/remove_retry_cache
Remove get_destination_retry_timings cache
2018-09-20 10:54:49 +01:00
Amber Brown aeca5a5ed5
Add a regression test for logging on failed connections (#3912) 2018-09-20 16:28:18 +10:00
Richard van der Hoff 642199570c
Improve the logging when handling a federation transaction (#3904)
Let's try to rationalise the logging that happens when we are processing an
incoming transaction, to make it easier to figure out what is going wrong when
they take ages. In particular:

- make everything start with a [room_id event_id] prefix
- make sure we log a warning when catching exceptions rather than just turning
  them into other, more cryptic, exceptions.
2018-09-19 17:28:18 +01:00
Erik Johnston ce846bb620 Merge branch 'develop' of github.com:matrix-org/synapse into erikj/faster_typing 2018-09-19 15:08:36 +01:00
Erik Johnston bbab6ebfd9 Fix up changelog and remove spurious comment 2018-09-19 14:45:14 +01:00
Erik Johnston 392a54128c pep8 2018-09-19 14:37:49 +01:00
Erik Johnston b9158ac2bf Remove get_destination_retry_timings cache
Currently we rely on the master to invalidate this cache promptly.
However, after having moved most federation endpoints off of master this
no longer happens, causing outbound fedeariont to get blackholed.

Fixes #3798
2018-09-19 14:22:57 +01:00
Erik Johnston 80d2d50f47 Fixup 2018-09-19 11:19:47 +01:00
Erik Johnston 9407bcf37a Replace custom DeferredTimeoutError with defer.TimeoutError 2018-09-19 11:07:29 +01:00
Erik Johnston 6c48aa0256 Run canceller first to allow it to generate correct error 2018-09-19 11:07:27 +01:00
Erik Johnston a334e1cace Update to use new timeout function everywhere.
The existing deferred timeout helper function (and the one into twisted)
suffer from a bug when a deferred's canceller throws an exception, #3842.

The new helper function doesn't suffer from this problem.
2018-09-19 10:39:40 +01:00
Amber Brown 47c02e6332
Merge pull request #3909 from turt2live/travis/fix-logging-1
Fix matrixfederationclient.py logging: Destination is a string
2018-09-19 18:14:47 +10:00
Amber Brown 3d6b24fb1b
Merge pull request #3907 from matrix-org/rav/set_sni_to_server_name
Set SNI to the server_name, not whatever was in the SRV record
2018-09-19 17:59:33 +10:00
Amber Brown f773ecbd61
Merge pull request #3903 from matrix-org/rav/increase_get_missing_events_timeout
Bump timeout on get_missing_events request
2018-09-19 17:57:48 +10:00
Travis Ralston 35aec19f0a Destination is a string 2018-09-18 15:29:30 -06:00
Richard van der Hoff 38ead946a9 Merge remote-tracking branch 'origin/develop' into neilj/fix_room_invite_mail_links 2018-09-18 19:02:45 +01:00
Richard van der Hoff a219ce8726
Use directory server for room joins (#3899)
When we do a join, always try the server we used for the alias lookup first.

Fixes #2418
2018-09-18 18:27:37 +01:00
Richard van der Hoff 31c15dcb80
Refactor matrixfederationclient to fix logging (#3906)
We want to wait until we have read the response body before we log the request
as complete, otherwise a confusing thing happens where the request appears to
have completed, but we later fail it.

To do this, we factor the salient details of a request out to a separate
object, which can then keep track of the txn_id, so that it can be logged.
2018-09-18 18:17:15 +01:00