MatrixSynapse

Commit Graph

Author	SHA1	Message	Date
Richard van der Hoff	fc149b4eeb	Merge remote-tracking branch 'origin/develop' into rav/use_run_in_background	2018-04-27 14:31:23 +01:00
Richard van der Hoff	6146332387	Merge remote-tracking branch 'origin/develop' into rav/deferred_timeout	2018-04-27 14:18:00 +01:00
Richard van der Hoff	2a13af23bc	Use run_in_background in preference to preserve_fn While I was going through uses of preserve_fn for other PRs, I converted places which only use the wrapped function once to use run_in_background, to avoid creating the function object.	2018-04-27 12:55:51 +01:00
Richard van der Hoff	9d2c1b8429	Backport deferred.addTimeout Twisted 16.0 doesn't have addTimeout, so let's backport it.	2018-04-27 12:52:30 +01:00
Richard van der Hoff	13843f771e	Trap exceptions thrown within run_in_background Turn any exceptions that get thrown synchronously within run_in_background into Failures instead.	2018-04-27 12:17:13 +01:00
Richard van der Hoff	9255a6cb17	Improve exception handling for background processes There were a bunch of places where we fire off a process to happen in the background, but don't have any exception handling on it - instead relying on the unhandled error being logged when the relevant deferred gets garbage-collected. This is unsatisfactory for a number of reasons: - logging on garbage collection is best-effort and may happen some time after the error, if at all - it can be hard to figure out where the error actually happened. - it is logged as a scary CRITICAL error which (a) I always forget to grep for and (b) it's not really CRITICAL if a background process we don't care about fails. So this is an attempt to add exception handling to everything we fire off into the background.	2018-04-27 11:07:40 +01:00
Richard van der Hoff	1ea904b9f0	Use deferred.addTimeout instead of time_bound_deferred This doesn't feel like a wheel we need to reinvent.	2018-04-23 00:53:18 +01:00
Richard van der Hoff	8dc4a6144b	Merge pull request #3107 from NotAFile/py3-bool-nonzero add __bool__ alias to __nonzero__ methods	2018-04-20 15:43:39 +01:00
Richard van der Hoff	c09a6daf09	Merge pull request #3110 from NotAFile/py3-six-queue Replace Queue with six.moves.queue	2018-04-20 15:35:00 +01:00
Richard van der Hoff	11a67b7c9d	Merge pull request #3093 from matrix-org/rav/response_cache_wrap Refactor ResponseCache usage	2018-04-20 11:31:17 +01:00
Adrian Tschira	878995e660	Replace Queue with six.moves.queue and a six.range change which I missed the last time Signed-off-by: Adrian Tschira <nota@notafile.com>	2018-04-16 00:46:21 +02:00
Adrian Tschira	f63ff73c7f	add __bool__ alias to __nonzero__ methods Signed-off-by: Adrian Tschira <nota@notafile.com>	2018-04-15 20:40:47 +02:00
Richard van der Hoff	d3347ad485	Revert "Use sortedcontainers instead of blist" This reverts commit `9fbe70a7dc`. It turns out that sortedcontainers.SortedDict is not an exact match for blist.sorteddict; in particular, `popitem()` removes things from the opposite end of the dict. This is trivial to fix, but I want to add some unit tests, and potentially some more thought about it, before we do so.	2018-04-13 11:16:43 +01:00
Richard van der Hoff	60f6014bb7	ResponseCache: fix handling of completed results Turns out that ObservableDeferred.observe doesn't return a deferred if the result is already completed. Fix handling and improve documentation.	2018-04-13 07:32:29 +01:00
Richard van der Hoff	b78395b7fe	Refactor ResponseCache usage Adds a `.wrap` method to ResponseCache which wraps up the boilerplate of a (get, set) pair, and then use it throughout the codebase. This will be largely non-functional, but does include the following functional changes: * federation_server.on_context_state_request: drops use of _server_linearizer which looked redundant and could cause incorrect cache misses by yielding between the get and the set. * RoomListHandler.get_remote_public_room_list(): fixes logcontext leaks * the wrap function includes some logging. I'm hoping this won't be too noisy on production.	2018-04-12 13:02:15 +01:00
Richard van der Hoff	d5c74b9f6c	Merge pull request #3092 from matrix-org/rav/response_cache_metrics Add metrics for ResponseCache	2018-04-12 12:59:36 +01:00
Richard van der Hoff	261124396e	Merge pull request #3059 from matrix-org/rav/doc_response_cache Document the behaviour of ResponseCache	2018-04-12 11:22:30 +01:00
Richard van der Hoff	b3384232a0	Add metrics for ResponseCache	2018-04-10 23:14:47 +01:00
Vincent Breitmoser	9fbe70a7dc	Use sortedcontainers instead of blist This commit drop-in replaces blist with SortedContainers. They are written in pure python so work with pypy, but perform as well as native implementations, at least in a couple of benchmarks: http://www.grantjenks.com/docs/sortedcontainers/performance.html	2018-04-10 11:29:51 +02:00
Richard van der Hoff	13decdbf96	Revert "Merge pull request #3066 from matrix-org/rav/remove_redundant_metrics" We aren't ready to release this yet, so I'm reverting it for now. This reverts commit `d1679a4ed7`, reversing changes made to `e089100c62`.	2018-04-09 12:59:12 +01:00
Richard van der Hoff	3449da3bc7	Merge pull request #3068 from matrix-org/rav/fix_cache_invalidation Improve database cache performance	2018-04-05 17:21:44 +01:00
Richard van der Hoff	01afc563c3	Fix overzealous cache invalidation Fixes an issue where a cache invalidation would invalidate all pending entries, rather than just the entry that we intended to invalidate.	2018-04-05 16:24:04 +01:00
Richard van der Hoff	518f6de088	Remove redundant metrics which were deprecated in 0.27.0.	2018-04-04 19:46:28 +01:00
Richard van der Hoff	a9a74101a4	Document the behaviour of ResponseCache it looks like everything that uses ResponseCache expects to have to `make_deferred_yieldable` its results. It's debatable whether that is the best approach, but let's document it for now to avoid further confusion.	2018-04-04 09:06:22 +01:00
Richard van der Hoff	05630758f2	Use static JSONEncoders using json.dumps with custom options requires us to create a new JSONEncoder on each call. It's more efficient to create one upfront and reuse it.	2018-03-29 23:13:33 +01:00
Matthew Hodgson	8cbbfaefc1	404 correctly on missing paths via NoResource fixes https://github.com/matrix-org/synapse/issues/2043 and https://github.com/matrix-org/synapse/issues/2029	2018-03-23 10:32:50 +00:00
Erik Johnston	9a0d783c11	Add comments	2018-03-19 11:35:53 +00:00
Richard van der Hoff	5a6e54264d	Make 'unexpected logging context' into warnings I think we've now fixed enough of these that the rest can be logged at warning.	2018-03-15 18:40:38 +00:00
Erik Johnston	7c7706f42b	Fix bug where state cache used lots of memory The state cache bases its size on the sum of the size of entries. The size of the entry is calculated once on insertion, so it is important that the size of entries does not change. The DictionaryCache modified the entries size, which caused the state cache to incorrectly think it was smaller than it actually was.	2018-03-15 15:46:54 +00:00
Richard van der Hoff	20f40348d4	Factor run_in_background out from preserve_fn It annoys me that we create temporary function objects when there's really no need for it. Let's factor the gubbins out of preserve_fn and start using it.	2018-03-08 11:50:11 +00:00
Richard van der Hoff	3a75de923b	Rewrite make_deferred_yieldable avoiding inlineCallbacks ... because (a) it's actually simpler (b) it might be marginally more performant?	2018-03-01 12:40:05 +00:00
Richard van der Hoff	bc496df192	report metrics on number of cache evictions	2018-02-05 15:34:01 +00:00
Matthew Hodgson	ab9f844aaf	Add federation_domain_whitelist option (#2820) Add federation_domain_whitelist, which gives a way to restrict which domains your HS is allowed to federate with. Useful mainly for gracefully preventing a private but internet-connected HS from trying to federate to the wider public Matrix network	2018-01-22 19:11:18 +01:00
Matthew Hodgson	d84f65255e	Merge pull request #2813 from matrix-org/matthew/registrations_require_3pid add registrations_require_3pid and allow_local_3pids	2018-01-22 13:57:22 +00:00
Matthew Hodgson	8fe253f19b	fix PR nitpicking	2018-01-19 18:23:45 +00:00
Matthew Hodgson	447f4f0d5f	rewrite based on PR feedback: * [ ] split config options into allowed_local_3pids and registrations_require_3pid * [ ] simplify and comment logic for picking registration flows * [ ] fix docstring and move check_3pid_allowed into a new util module * [ ] use check_3pid_allowed everywhere @erikjohnston PTAL	2018-01-19 15:33:55 +00:00
Erik Johnston	b6dc7044a9	Merge pull request #2804 from matrix-org/erikj/file_consumer Add decent impl of a FileConsumer	2018-01-18 16:31:33 +00:00
Richard van der Hoff	d57765fc8a	Fix bugs in block metrics ... which I introduced in #2785	2018-01-18 12:24:42 +00:00
Erik Johnston	be0dfcd4a2	Do logcontexts correctly	2018-01-18 11:57:57 +00:00
Erik Johnston	1432f7ccd5	Move test stuff to tests	2018-01-18 11:57:57 +00:00
Erik Johnston	2f18a2647b	Make all fields private	2018-01-18 11:57:54 +00:00
Erik Johnston	dc519602ac	Ensure registerProducer isn't called twice	2018-01-18 11:07:17 +00:00
Erik Johnston	17b54389fe	Fix _notify_empty typo	2018-01-18 11:05:34 +00:00
Erik Johnston	28b338ed9b	Move definition of paused_producer to __init__	2018-01-18 11:04:41 +00:00
Erik Johnston	a177325b49	Fix comments	2018-01-18 11:02:43 +00:00
Erik Johnston	bc67e7d260	Add decent impl of a FileConsumer Twisted core doesn't have a general purpose one, so we need to write one ourselves. Features: - All writing happens in background thread - Supports both push and pull producers - Push producers get paused if the consumer falls behind	2018-01-17 16:43:03 +00:00
Richard van der Hoff	3d12d97415	Track DB scheduling delay per-request For each request, track the amount of time spent waiting for a db connection. This entails adding it to the LoggingContext and we may as well add metrics for it while we are passing.	2018-01-16 17:23:32 +00:00
Richard van der Hoff	6324b65f08	Track db txn time in millisecs ... to reduce the amount of floating-point foo we do.	2018-01-16 15:53:18 +00:00
Richard van der Hoff	44a498418c	Optimise LoggingContext creation and copying It turns out that the only thing we use the __dict__ of LoggingContext for is `request`, and given we create lots of LoggingContexts and then copy them every time we do a db transaction or log line, using the __dict__ seems a bit redundant. Let's try to optimise things by making the request attribute explicit.	2018-01-16 15:49:42 +00:00
Richard van der Hoff	39f4e29d01	Reorganise request and block metrics In order to circumvent the number of duplicate foo:count metrics increasing without bounds, it's time for a rearrangement. The following are all deprecated, and replaced with synapse_util_metrics_block_count: synapse_util_metrics_block_timer:count synapse_util_metrics_block_ru_utime:count synapse_util_metrics_block_ru_stime:count synapse_util_metrics_block_db_txn_count:count synapse_util_metrics_block_db_txn_duration:count The following are all deprecated, and replaced with synapse_http_server_response_count: synapse_http_server_requests synapse_http_server_response_time:count synapse_http_server_response_ru_utime:count synapse_http_server_response_ru_stime:count synapse_http_server_response_db_txn_count:count synapse_http_server_response_db_txn_duration:count The following are renamed (the old metrics are kept for now, but deprecated): synapse_util_metrics_block_timer:total -> synapse_util_metrics_block_time_seconds synapse_util_metrics_block_ru_utime:total -> synapse_util_metrics_block_ru_utime_seconds synapse_util_metrics_block_ru_stime:total -> synapse_util_metrics_block_ru_stime_seconds synapse_util_metrics_block_db_txn_count:total -> synapse_util_metrics_block_db_txn_count synapse_util_metrics_block_db_txn_duration:total -> synapse_util_metrics_block_db_txn_duration_seconds synapse_http_server_response_time:total -> synapse_http_server_response_time_seconds synapse_http_server_response_ru_utime:total -> synapse_http_server_response_ru_utime_seconds synapse_http_server_response_ru_stime:total -> synapse_http_server_response_ru_stime_seconds synapse_http_server_response_db_txn_count:total -> synapse_http_server_response_db_txn_count synapse_http_server_response_db_txn_duration:total -> synapse_http_server_response_db_txn_duration_seconds	2018-01-15 17:09:44 +00:00
Richard van der Hoff	b2cd6accf5	Remove __PreservingContextDeferred too	2017-11-14 23:00:10 +00:00
Richard van der Hoff	7e6fa29cb5	Remove preserve_context_over_{fn, deferred} Both of these functions are known to leak logcontexts. Replace the remaining calls to them and kill them off.	2017-11-14 11:22:42 +00:00
Richard van der Hoff	bf993db11c	Logging and logcontext fixes for Limiter Add some logging to the Limiter in a similar spirit to the Linearizer, to help debug issues. Also fix a logcontext leak. Also refactor slightly to avoid throwing exceptions.	2017-11-07 00:48:57 +00:00
Richard van der Hoff	0be99858f3	fix vars named `l` E741 says "do not use variables named ‘l’, ‘O’, or ‘I’".	2017-10-23 15:56:38 +01:00
Richard van der Hoff	eaaabc6c4f	replace 'except:' with 'except Exception:' what could possibly go wrong	2017-10-23 15:52:32 +01:00
Richard van der Hoff	2e9f5ea31a	Fix logcontext handling for persist_events * don't use preserve_context_over_deferred, which is known broken. * remove a redundant preserve_fn. * add/improve some comments	2017-10-17 10:59:30 +01:00
Richard van der Hoff	cc794d60e7	Merge pull request #2532 from matrix-org/rav/fix_linearizer Fix stackoverflow and logcontexts from linearizer	2017-10-11 17:29:32 +01:00
Richard van der Hoff	f30c4ed2bc	logformatter: fix AttributeError make sure we have the relevant fields before we try to log them.	2017-10-11 17:26:17 +01:00
Richard van der Hoff	4fad8efbfb	Fix stackoverflow and logcontexts from linearizer 1. make it not blow out the stack when there are more than 50 things waiting for a lock. Fixes https://github.com/matrix-org/synapse/issues/2505. 2. Make it not mess up the log contexts.	2017-10-11 15:05:05 +01:00
Richard van der Hoff	3cc852d339	Fancy logformatter to format exceptions better This is a bit of an experimental change at this point; the idea is to see if it helps us track down where our stack overflows are coming from by logging the stack when the exception was caught and turned into a Failure. (We'll also need `edf2704420`). If we deploy this, we'll be able to enable it via the log config yaml.	2017-10-09 17:44:42 +01:00
Richard van der Hoff	148428ce76	Fix logcontext handling for concurrently_execute Avoid preserve_context_over_deferred, which is broken.	2017-10-06 22:24:28 +01:00
David Baker	8ad5f34908	pep8	2017-09-26 19:21:41 +01:00
David Baker	9fd086e506	unnecessary parens	2017-09-26 17:59:46 +01:00
David Baker	0b03a97708	Add module_loader.py	2017-09-26 17:56:41 +01:00
Erik Johnston	495f075b41	Increase default cache factor size.	2017-07-04 09:58:32 +01:00
Erik Johnston	b5e8d529e6	Define CACHE_SIZE_FACTOR once	2017-07-04 09:56:44 +01:00
Erik Johnston	c72058bcc6	Use an ExpiringCache for storing registration sessions This is because pruning them was a significant performance drain on matrix.org	2017-06-29 14:08:37 +01:00
Erik Johnston	efc2b7db95	Rewrite conditional	2017-06-09 13:35:15 +01:00
Erik Johnston	eed59dcc1e	Fix has_any_entity_changed Occasionally has_any_entity_changed would throw the error: "Set changed size during iteration" when taking the max of the `sorteddict`. While it's uncertain how that happens, it's quite inefficient to iterate over the entire dict anyway so we change to using the more traditional `bisect_*` functions.	2017-06-09 11:44:01 +01:00
Erik Johnston	304880d185	Add stream change cache	2017-05-31 15:46:36 +01:00
Erik Johnston	bd7bb5df71	Pull out if statement from for loop	2017-05-22 15:12:19 +01:00
Erik Johnston	e3417a06e2	Update list cache to handle one arg case We update the normal cache descriptors to handle caches with a single argument specially so that the key wasn't a 1-tuple. We need to update the cache list to be aware of this.	2017-05-22 15:04:42 +01:00
Erik Johnston	bbfe4e996c	Make get_state_groups_from_groups faster. Most of the time was spent copying a dict to filter out sentinel values that indicated that keys did not exist in the dict. The sentinel values were added to ensure that we cached the non-existence of keys. By updating DictionaryCache to keep track of which keys were known to not exist itself we can remove a dictionary copy.	2017-05-17 15:12:15 +01:00
Erik Johnston	ffad4fe35b	Don't update event cache hit ratio from get_joined_users Otherwise the hit ratio of plain get_events gets completely skewed by calls to get_joined_users* functions.	2017-05-08 16:06:17 +01:00
Erik Johnston	d2d8ed4884	Optimise caches with single key	2017-05-04 14:18:46 +01:00
Richard van der Hoff	2e996271fe	Instantiate DeferredTimedOutError correctly Call `super` correctly, so that we correctly initialise the `errcode` field. Fixes https://github.com/matrix-org/synapse/issues/2179.	2017-05-02 13:26:17 +01:00
Erik Johnston	d9aa645f86	Reduce size of joined_user cache The _get_joined_users_from_context cache stores a mapping from user_id to avatar_url and display_name. Instead of storing those in a dict, store them in a namedtuple as that uses much less memory. We also try converting the string to ascii to further reduce the size.	2017-04-25 14:38:51 +01:00
Erik Johnston	efab1dadde	Remove DEBUG_CACHES	2017-04-25 10:54:09 +01:00
Erik Johnston	119cb9bbcf	Reduce cache size by not storing deferreds Currently the cache descriptors store deferreds rather than raw values; this is a simple way of triggering only one database hit and sharing the result if two callers attempt to get the same value. However, there are a few caches that simply store a mapping from string to string (or int). These caches can have a large number of entries, under the assumption that each entry is small. However, the size of a deferred (specifically the size of ObservableDeferred) is significantly larger than that of the raw value, 2kb vs 32b. This PR therefore changes the cache descriptors to store the raw values rather than the deferreds. As a side effect cached storage functions now either return a deferred or the actual value, as the cached list descriptor already does. This is fine as we always end up just yield'ing on the returned value eventually, which handles that case correctly.	2017-04-25 10:23:11 +01:00
Erik Johnston	d134d0935e	Only intern ascii strings	2017-04-24 14:07:48 +01:00
Richard van der Hoff	e2eebf1696	Fix fixme in preserve_fn `preserve_fn` is no longer used as a decorator anywhere, so we can safely fix a fixme therein.	2017-04-03 15:38:02 +01:00
Erik Johnston	4d17add8de	Remove unused instance variable	2017-03-31 09:38:27 +01:00
Erik Johnston	5b5b171f3e	Docs	2017-03-30 17:05:53 +01:00
Erik Johnston	b282fe7170	Revert log context change	2017-03-30 17:03:59 +01:00
Erik Johnston	6194a64ae9	Doc new instance variables	2017-03-30 14:19:10 +01:00
Erik Johnston	014fee93b3	Manually calculate cache key as getcallargs is expensive This is because getcallargs recomputes the getargspec, amongst other things, which we don't need to do as it's already been done	2017-03-30 14:14:46 +01:00
Erik Johnston	86780a8bc3	Don't convert to deferreds when not necessary	2017-03-30 14:14:36 +01:00
Richard van der Hoff	f9b4bb05e0	Fix the logcontext handling in the cache wrappers (#2077) The cache wrappers had a habit of leaking the logcontext into the reactor while the lookup function was running, and then not restoring it correctly when the lookup function had completed. It's all the fault of `preserve_context_over_{fn,deferred}` which are basically a bit broken.	2017-03-30 13:22:24 +01:00
Richard van der Hoff	9397edb28b	Merge pull request #2050 from matrix-org/rav/federation_backoff push federation retry limiter down to matrixfederationclient	2017-03-23 22:27:01 +00:00
Richard van der Hoff	06ce7335e9	Merge pull request #2052 from matrix-org/rav/time_bound_deferred Fix time_bound_deferred to throw the right exception	2017-03-23 22:22:54 +00:00
Richard van der Hoff	5a16cb4bf0	Ignore backoff history for invites, aliases, and roomdirs Add a param to the federation client which lets us ignore historical backoff data for federation queries, and set it for a handful of operations.	2017-03-23 12:23:22 +00:00
Richard van der Hoff	b88a323ffb	Fix time_bound_deferred to throw the right exception Due to a failure to instantiate DeferredTimedOutError, time_bound_deferred would throw a CancelledError when the deferred timed out, which was rather confusing.	2017-03-23 12:07:11 +00:00
Richard van der Hoff	4bd597d9fc	push federation retry limiter down to matrixfederationclient rather than having to instrument everywhere we make a federation call, make the MatrixFederationHttpClient manage the retry limiter.	2017-03-23 09:28:46 +00:00
Richard van der Hoff	19b9366d73	Fix a couple of logcontext leaks Use preserve_fn to correctly manage the logcontexts around things we don't want to yield on.	2017-03-23 00:17:46 +00:00
Richard van der Hoff	95f21c7a66	Fix caching of remote servers' signature keys The `@cached` decorator on `KeyStore._get_server_verify_key` was missing its `num_args` parameter, which meant that it was returning the wrong key for any server which had more than one recorded key. By way of a fix, change the default for `num_args` to be all arguments. To implement that, factor out a common base class for `CacheDescriptor` and `CacheListDescriptor`.	2017-03-22 15:11:30 +00:00
Richard van der Hoff	bd08ee7a46	Merge pull request #2026 from matrix-org/rav/logcontext_docs Logcontext docs	2017-03-20 12:05:21 +00:00
Richard van der Hoff	f40c2db05a	Stop preserve_fn leaking context into the reactor Fix a bug in ``logcontext.preserve_fn`` which made it leak context into the reactor, and add a test for it. Also, get rid of ``logcontext.reset_context_after_deferred``, which tried to do the same thing but had its own, different, set of bugs.	2017-03-18 00:07:43 +00:00
Richard van der Hoff	d2d146a314	Logcontext docs	2017-03-17 23:59:28 +00:00
Richard van der Hoff	2abe85d50e	Merge pull request #2016 from matrix-org/rav/queue_pdus_during_join Queue up federation PDUs while a room join is in progress	2017-03-17 11:32:44 +00:00
Richard van der Hoff	29ed09e80a	Fix assertion to stop transaction queue getting wedged ... and update some docstrings to correctly reflect the types being used. get_new_device_msgs_for_remote can return a long under some circumstances, which was being stored in last_device_list_stream_id_by_dest, and was then upsetting things on the next loop.	2017-03-15 12:16:55 +00:00


437 Commits (55370331da54c46c04253b009865097fe9e95191)