Commit Graph

8249 Commits (f59be4eb0e4c96d6744b4a479a04991bd53a50a6)

Author SHA1 Message Date
Richard van der Hoff 8462c26485 Improvements to the Limiter
* give them names, to improve logging
* use a deque rather than a list for efficiency
2018-07-20 12:50:27 +01:00
Richard van der Hoff d7275eecf3 Add a sleep to the Limiter to fix stack overflows.
Fixes #3570
2018-07-20 12:37:12 +01:00
Amber Brown 9a8acdc720 Merge branch 'master' into develop 2018-07-19 21:16:44 +10:00
Amber Brown 38f53399a2 0.33 final 2018-07-19 21:11:40 +10:00
Amber Brown 95ccb6e2ec
Don't spew errors because we can't save metrics (#3563) 2018-07-19 20:58:18 +10:00
Matthew Hodgson 9e40834f74 yes, we do need to invalidate the device_id_exists_cache when deleting a remote device 2018-07-19 11:15:10 +01:00
Richard van der Hoff f1a15ea206 revert 00bc979
... we've fixed the things that caused the warnings, so we should reinstate the
warning.
2018-07-19 11:14:20 +01:00
Richard van der Hoff 1ffb7bec20 Merge remote-tracking branch 'origin/release-v0.33.0' into develop 2018-07-19 11:12:33 +01:00
Richard van der Hoff c754e006f4
Merge pull request #3556 from matrix-org/rav/background_processes
Run things as background processes
2018-07-19 11:04:18 +01:00
Amber Brown a97c845271
Move v1-only APIs into their own module & isolate deprecated ones (#3460) 2018-07-19 20:03:33 +10:00
Matthew Hodgson c0685f67c0 spell out that include_deleted_devices requires include_all_devices 2018-07-19 10:59:02 +01:00
Richard van der Hoff 00bc979137 Disable logcontext warning
Temporary workaround to #3518 while we release 0.33.0.
2018-07-19 10:51:15 +01:00
Erik Johnston 6f62a6ef21
Merge pull request #3554 from matrix-org/erikj/response_metrics_code
Add response code to response timer metrics
2018-07-19 10:25:18 +01:00
Richard van der Hoff 8c69b735e3 Make Distributor run its processes as a background process
This is more involved than it might otherwise be, because the current
implementation just drops its logcontexts and runs everything in the sentinel
context.

It turns out that we aren't actually using a bunch of the functionality here
(notably suppress_failures and the fact that Distributor.fire returns a
deferred), so the easiest way to fix this is actually by simplifying a bunch of
code.
2018-07-18 20:55:05 +01:00
Richard van der Hoff 667fba68f3 Run things as background processes
This fixes #3518, and ensures that we get useful logs and metrics for lots of
things that happen in the background.

(There are certainly more things that happen in the background; these are just
the common ones I've found running a single-process synapse locally).
2018-07-18 20:55:05 +01:00
Erik Johnston 9b596177ae Add some room read only APIs to client_reader 2018-07-18 15:33:03 +01:00
Erik Johnston bacdf0cbf9 Move RoomContextHandler out of Handlers
This is in preparation for moving GET /context/ to a worker
2018-07-18 15:33:03 +01:00
Erik Johnston 8cb8df55e9 Split MessageHandler into read only and writers
This will let us call the read only parts from workers, and so be able
to move some APIs off of master, e.g. the `/state` API.
2018-07-18 15:33:03 +01:00
Richard van der Hoff 65d6a0e477
Merge pull request #3553 from matrix-org/rav/background_process_tracking
Resource tracking for background processes
2018-07-18 14:27:50 +01:00
Erik Johnston 5bd0a47fcd pep8 2018-07-18 14:19:00 +01:00
Erik Johnston e45a46b6e4 Add response code to response timer metrics 2018-07-18 13:59:36 +01:00
Richard van der Hoff dab00faa83
Merge pull request #3367 from matrix-org/rav/drop_re_signing_hacks
Remove event re-signing hacks
2018-07-18 12:46:27 +01:00
David Baker c91a44572e
Merge pull request #3514 from matrix-org/dbkr/turn_dont_add_defaults
Comment dummy TURN parameters in default config
2018-07-18 11:56:18 +01:00
Richard van der Hoff 6e3fc657b4 Resource tracking for background processes
This introduces a mechanism for tracking resource usage by background
processes, along with an example of how it will be used.

This will help address #3518, but more importantly will give us better insights
into things which are happening but not being shown up by the request metrics.

We *could* do this with Measure blocks, but:
 - I think having them pulled out as a completely separate metric class will
   make it easier to distinguish top-level processes from those which are
   nested.

 - I want to be able to report on in-flight background processes, and I don't
   think we want to do this for *all* Measure blocks.
2018-07-18 10:50:33 +01:00
Amber Brown 5f3d02f6eb bump to 0.33.0rc1 2018-07-18 12:52:56 +10:00
Richard van der Hoff 79eb339c66 add a comment 2018-07-17 14:53:34 +01:00
Richard van der Hoff d897be6a98 Fix visibility of events from erased users over federation 2018-07-17 14:02:07 +01:00
Richard van der Hoff 9c04b4abf9
Merge pull request #3541 from matrix-org/rav/optimize_filter_events_for_server
Refactor and optimze filter_events_for_server
2018-07-17 14:01:39 +01:00
Richard van der Hoff 94440ae994 fix imports 2018-07-17 11:51:26 +01:00
Amber Brown bc006b3c9d
Refactor REST API tests to use explicit reactors (#3351) 2018-07-17 20:43:18 +10:00
Richard van der Hoff 2172a3d8cb add a comment 2018-07-17 11:13:57 +01:00
Erik Johnston b2aa05a8d6 Use efficient .intersection 2018-07-17 11:07:04 +01:00
Erik Johnston 547b1355d3 Fix perf regression in PR #3530
The get_entities_changed function was changed to return all changed
entities since the given stream position, rather than only those changed
from a given list of entities. This resulted in the function incorrectly
returning large numbers of entities that, for example, caused large
increases in database usage.
2018-07-17 10:27:51 +01:00
Amber Brown 3fe0938b76
Merge pull request #3530 from matrix-org/erikj/stream_cache
Don't return unknown entities in get_entities_changed
2018-07-17 13:44:46 +10:00
Richard van der Hoff 09e29fb58b Attempt to make _filter_events_for_server more efficient 2018-07-16 14:06:09 +01:00
Krombel 78a9ddcf9a rerun isort with latest version 2018-07-16 14:23:25 +02:00
Richard van der Hoff ea69d35651 Move filter_events_for_server out of FederationHandler
for easier unit testing.
2018-07-16 13:06:24 +01:00
Krombel 4a27000548 check isort by travis 2018-07-16 13:57:33 +02:00
Amber Brown 8a4f05fefb
Fix develop because I broke it :( (#3535) 2018-07-14 09:51:00 +10:00
Amber Brown 8532953c04
Merge pull request #3534 from krombel/use_parse_and_asserts_from_servlet
Use parse and asserts from http.servlet
2018-07-14 09:09:19 +10:00
Amber Brown a2374b2c7f
fix sytests 2018-07-14 07:52:58 +10:00
Amber Brown 33b60c01b5
Make auth & transactions more testable (#3499) 2018-07-14 07:34:49 +10:00
Krombel 516f960ad8 add changelog 2018-07-13 22:19:19 +02:00
Krombel 3366b9c534 rename assert_params_in_request to assert_params_in_dict
the method "assert_params_in_request" does handle dicts and not
requests. A request body has to be parsed to json before this method
can be used
2018-07-13 21:53:01 +02:00
Krombel 32fd6910d0 Use parse_{int,str} and assert from http.servlet
parse_integer and parse_string can take a request and raise errors
in case we have wrong or missing params.
This PR tries to use them more to deduplicate some code and make it
better readable
2018-07-13 21:40:14 +02:00
Richard van der Hoff 2aba1f549c
Merge pull request #3533 from matrix-org/rav/fix_federation_ratelimite_queue
Make FederationRateLimiter queue requests properly
2018-07-13 16:59:18 +01:00
Richard van der Hoff 33b40d0a25 Make FederationRateLimiter queue requests properly
popitem removes the *most recent* item by default [1]. We want the oldest.

Fixes #3524

[1]: https://docs.python.org/2/library/collections.html#collections.OrderedDict.popitem
2018-07-13 16:19:40 +01:00
Erik Johnston 77b692e65d Don't return unknown entities in get_entities_changed
The stream cache keeps track of all entities that have changed since
a particular stream position, so get_entities_changed does not need to
return unknown entites when given a larger stream position.

This makes it consistent with the behaviour of has_entity_changed.
2018-07-13 15:26:10 +01:00
Matthew Hodgson ba22b6a456 typo 2018-07-13 12:03:39 +01:00
Richard van der Hoff 6dff49b8a9
Merge pull request #3521 from matrix-org/rav/optimise_stream_change_cache
Reduce set building in get_entities_changed
2018-07-12 12:08:49 +01:00
Matthew Hodgson 12ec58301f shift to using an explicit deleted flag on m.device_list_update EDUs
and generally make it work.
2018-07-12 11:39:43 +01:00
Richard van der Hoff fa5c2bc082 Reduce set building in get_entities_changed
This line shows up as about 5% of cpu time on a synchrotron:

    not_known_entities = set(entities) - set(self._entity_to_key)

Presumably the problem here is that _entity_to_key can be largeish, and
building a set for its keys every time this function is called is slow.

Here we rewrite the logic to avoid building so many sets.
2018-07-12 11:37:44 +01:00
Richard van der Hoff 482d17b58b Merge branch 'develop' into rav/enforce_report_api 2018-07-12 09:56:28 +01:00
Erik Johnston 0456e05977
Merge pull request #3505 from matrix-org/erikj/receipts_cahce
Use stream cache in get_linearized_receipts_for_room
2018-07-12 09:46:29 +01:00
Erik Johnston aff1dfdf3d Update return value docstring 2018-07-12 09:45:37 +01:00
Matthew Hodgson 5797f5542b WIP to announce deleted devices over federation
Previously we queued up the poke correctly when the device was deleted,
but then the actual EDU wouldn't get sent, as the device was no longer known.
Instead, we now send EDUs for deleted devices too if there's a poke for them.
2018-07-12 01:32:39 +01:00
David Baker 36f4fd3e1e Comment dummy TURN parameters in default config
This default config is parsed and used a base before the actual
config is overlaid, so with these values not commented out, the
code to detect when no turn params were set and refuse to generate
credentials was never firing because the dummy default was always set.
2018-07-11 15:49:29 +01:00
Erik Johnston 6ccefef07a Use 'is not None' and add comments 2018-07-10 18:12:39 +01:00
Matthew Hodgson ea752bdd99 s/becuase/because/g 2018-07-10 17:58:18 +01:00
Erik Johnston 05f5dabc10 Use stream cache in get_linearized_receipts_for_room
This avoids us from uncessarily hitting the database when there has been
no change for the room
2018-07-10 17:22:42 +01:00
Richard van der Hoff c3c29aa196
Attempt to include db threads in cpu usage stats (#3496)
Let's try to include time spent in the DB threads in the per-request/block cpu
usage metrics.
2018-07-10 16:12:36 +01:00
Richard van der Hoff 55370331da
Refactor logcontext resource usage tracking (#3501)
Factor out the resource usage tracking out to a separate object, which can be
passed around and copied independently of the logcontext itself.
2018-07-10 13:56:07 +01:00
Matthew Hodgson 16b10666e7 another typo 2018-07-10 12:28:42 +01:00
Matthew Hodgson 4ea391a6ae typo (i think) 2018-07-10 12:08:09 +01:00
Richard van der Hoff e31e5dee38 Add CPU metrics for _fetch_event_list
add a Measure block on _fetch_event_list, in the hope that we can better
measure CPU usage here.
2018-07-09 18:15:54 +01:00
Amber Brown 49af402019 run isort 2018-07-09 16:09:20 +10:00
Amber Brown 3060bcc8e9 version 2018-07-07 10:48:06 +10:00
Amber Brown e845fd41c2
Correct attrs package name in requirements (#3492) 2018-07-07 10:46:59 +10:00
Richard van der Hoff 1cfc2c4790 Prepare 0.32.1 release 2018-07-06 16:50:52 +01:00
Richard van der Hoff 1464a0578a Add explicit dependency on netaddr
the dependencies file, causing failures on upgrade (and presumably for new
installs).
2018-07-06 16:27:17 +01:00
Neil Johnson 277c561766 0.32.0 version bump, update changelog 2018-07-06 15:07:29 +01:00
Amber Brown d196fe42a9 bump version to 0.32.0rc1 2018-07-05 20:22:35 +10:00
Richard van der Hoff 3cf3e08a97 Implementation of server_acls
... as described at
https://docs.google.com/document/d/1EttUVzjc2DWe2ciw4XPtNpUpIl9lWXGEsy2ewDS7rtw.
2018-07-04 19:06:20 +01:00
Richard van der Hoff 546bc9e28b More server_name validation
We need to do a bit more validation when we get a server name, but don't want
to be re-doing it all over the shop, so factor out a separate
parse_and_validate_server_name, and do the extra validation.

Also, use it to verify the server name in the config file.
2018-07-04 18:59:51 +01:00
Erik Johnston 13f7adf84b
Merge pull request #3473 from matrix-org/erikj/thread_cache
Invalidate cache on correct thread
2018-07-04 10:11:38 +01:00
Erik Johnston 40252d13d1
Merge pull request #3474 from matrix-org/erikj/py3_auth
Fix up auth check
2018-07-04 09:41:33 +01:00
Richard van der Hoff a4ab491371
Merge branch 'develop' into rav/drop_re_signing_hacks 2018-07-04 07:13:38 +01:00
Richard van der Hoff 508196e08a
Reject invalid server names (#3480)
Make sure that server_names used in auth headers are sane, and reject them with
a sensible error code, before they disappear off into the depths of the system.
2018-07-03 14:36:14 +01:00
Erik Johnston 2c33b55738 Avoid relying on int vs None comparison
Python 3 doesn't support comparing None to ints
2018-07-02 11:40:32 +01:00
Erik Johnston cbf82dddf1 Ensure that we define sender_domain 2018-07-02 11:37:57 +01:00
Erik Johnston 3905c693c5 Invalidate cache on correct thread 2018-07-02 11:36:44 +01:00
Matthew Hodgson fc4f8f33be replace invalid utf8 with \ufffd 2018-07-02 11:33:02 +01:00
Matthew Hodgson 1c867f5391 a fix which doesn't NPE everywhere 2018-07-01 11:56:33 +01:00
Matthew Hodgson f131bf8d3e don't mix unicode strings with utf8-in-byte-strings
otherwise we explode with:

```
Traceback (most recent call last):
  File /usr/lib/python2.7/logging/handlers.py, line 78, in emit
    logging.FileHandler.emit(self, record)
  File /usr/lib/python2.7/logging/__init__.py, line 950, in emit
    StreamHandler.emit(self, record)
  File /usr/lib/python2.7/logging/__init__.py, line 887, in emit
    self.handleError(record)
  File /usr/lib/python2.7/logging/__init__.py, line 810, in handleError
    None, sys.stderr)
  File /usr/lib/python2.7/traceback.py, line 124, in print_exception
    _print(file, 'Traceback (most recent call last):')
  File /usr/lib/python2.7/traceback.py, line 13, in _print
    file.write(str+terminator)
  File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_io.py, line 170, in write
    self.log.emit(self.level, format=u{log_io}, log_io=line)
  File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_logger.py, line 144, in emit
    self.observer(event)
  File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_observer.py, line 136, in __call__
    errorLogger = self._errorLoggerForObserver(brokenObserver)
  File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_observer.py, line 156, in _errorLoggerForObserver
    if obs is not observer
  File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_observer.py, line 81, in __init__
    self.log = Logger(observer=self)
  File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_logger.py, line 64, in __init__
    namespace = self._namespaceFromCallingContext()
  File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/logger/_logger.py, line 42, in _namespaceFromCallingContext
    return currentframe(2).f_globals[__name__]
  File /home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/python/compat.py, line 93, in currentframe
    for x in range(n + 1):
RuntimeError: maximum recursion depth exceeded while calling a Python object
Logged from file site.py, line 129
  File /usr/lib/python2.7/logging/__init__.py, line 859, in emit
    msg = self.format(record)
  File /usr/lib/python2.7/logging/__init__.py, line 732, in format
    return fmt.format(record)
  File /usr/lib/python2.7/logging/__init__.py, line 471, in format
    record.message = record.getMessage()
  File /usr/lib/python2.7/logging/__init__.py, line 335, in getMessage
    msg = msg % self.args
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 4: ordinal not in range(128)
Logged from file site.py, line 129
```

...where the logger apparently recurses whilst trying to log the error, hitting the
maximum recursion depth and killing everything badly.
2018-07-01 05:08:58 +01:00
Erik Johnston e3b4043800
Merge pull request #3456 from matrix-org/hawkowl/federation-prevevent-checking
Check the state of prev_events a bit more thoroughly when coming over federation
2018-06-29 13:55:02 +01:00
Matthew Hodgson e72234f6bd fix tests 2018-06-28 20:56:07 +01:00
Matthew Hodgson f4f1cda928 add ip_range_whitelist parameter to limit where ASes can connect from 2018-06-28 20:32:00 +01:00
Amber Brown 6350bf925e
Attempt to be more performant on PyPy (#3462) 2018-06-28 14:49:57 +01:00
Amber Brown 72d2143ea8
Revert "Revert "Try to not use as much CPU in the StreamChangeCache"" (#3454) 2018-06-28 11:04:18 +01:00
Amber Brown 99800de63d try and clean up 2018-06-27 11:40:27 +01:00
Amber Brown f03a5d1a17 pep8 2018-06-27 11:38:14 +01:00
Amber Brown a7ecf34b70 cleanups 2018-06-27 11:36:03 +01:00
Amber Brown 77078d6c8e handle federation not telling us about prev_events 2018-06-27 11:27:32 +01:00
Matthew Hodgson 8057489b26
Revert "Try to not use as much CPU in the StreamChangeCache" 2018-06-26 18:09:01 +01:00
Matthew Hodgson d91efb06cf
Merge pull request #3451 from matrix-org/hawkowl/sorteddict-api
Try to not use as much CPU in the cache
2018-06-26 17:49:55 +01:00
Amber Brown 1202508067 fixes 2018-06-26 17:29:01 +01:00
Amber Brown bd3d329c88 fixes 2018-06-26 17:28:12 +01:00
Amber Brown abfe4b2957 try and make loading items from the cache faster 2018-06-26 17:25:34 +01:00
David Baker 028490afd4 Fix error on deleting users pending deactivation
Use simple_delete instead of simple_delete_one as commented
2018-06-26 10:52:52 +01:00
Matthew Hodgson c7f6b420ae
Merge pull request #3448 from matrix-org/matthew/gdpr-deactivate-admin-api
add GDPR erase param to deactivate API
2018-06-26 10:43:14 +01:00