Commit Graph

168 Commits (8123b2f90945e6bb3d65cf48730c2133b46d2b9a)

Author SHA1 Message Date
Richard van der Hoff 13683a3a22
Extend StreamChangeCache to support multiple entities per stream ID (#7303)
First some background: StreamChangeCache is used to keep track of what "entities" have 
changed since a given stream ID. So for example, we might use it to keep track of when the last
to-device message for a given user was received [1], and hence whether we need to pull any to-device messages from the database on a sync [2].

Now, it turns out that StreamChangeCache didn't support more than one thing being changed at
a given stream_id (this was part of the problem with #7206). However, it's entirely valid to send
to-device messages to more than one user at a time.

As it turns out, this did in fact work, because *some* methods of StreamChangeCache coped
ok with having multiple things changing on the same stream ID, and it seems we never actually
use the methods which don't work on the stream change caches where we allow multiple
changes at the same stream ID. But that feels horribly fragile, hence: let's update
StreamChangeCache to properly support this, and add some typing and some more tests while
we're at it.

[1]: https://github.com/matrix-org/synapse/blob/release-v1.12.3/synapse/storage/data_stores/main/deviceinbox.py#L301
[2]: https://github.com/matrix-org/synapse/blob/release-v1.12.3/synapse/storage/data_stores/main/deviceinbox.py#L47-L51
2020-04-22 13:45:40 +01:00
Richard van der Hoff 0f8f02bc39
On catchup, process each row with its own stream id (#7286)
Other parts of the code (such as the StreamChangeCache) assume that there will
not be multiple changes with the same stream id.

This code was introduced in #7024, and I hope this fixes #7206.
2020-04-20 11:43:29 +01:00
Erik Johnston ed630ea17c
Reduce amount of logging at INFO level. (#6862)
A lot of the things we log at INFO are now a bit superfluous, so lets
make them DEBUG logs to reduce the amount we log by default.

Co-Authored-By: Brendan Abolivier <babolivier@matrix.org>
Co-authored-by: Brendan Abolivier <github@brendanabolivier.com>
2020-02-06 13:31:05 +00:00
Hubert Chathi cb2db17994
look up cross-signing keys from the DB in bulk (#6486) 2019-12-12 12:03:28 -05:00
Erik Johnston f166a8d1f5 Remove SnapshotCache in favour of ResponseCache 2019-12-09 13:42:49 +00:00
V02460 affcc2cc36 Fix LruCache callback deduplication (#6213) 2019-11-07 09:43:51 +00:00
Andrew Morgan 54fef094b3
Remove usage of deprecated logger.warn method from codebase (#6271)
Replace every instance of `logger.warn` with `logger.warning` as the former is deprecated.
2019-10-31 10:23:24 +00:00
Erik Johnston e6c7e239ef Update docstring 2019-10-29 11:48:30 +00:00
Erik Johnston d0d8a22c13 Quick fix to ensure cache descriptors always return deferreds 2019-10-28 13:33:04 +00:00
Amber Brown 864f144543
Fix up some typechecking (#6150)
* type checking fixes

* changelog
2019-10-02 05:29:01 -07:00
Erik Johnston 17e1e80726 Retry well-known lookup before expiry.
This gives a bit of a grace period where we can attempt to refetch a
remote `well-known`, while still using the cached result if that fails.

Hopefully this will make the well-known resolution a bit more torelant
of failures, rather than it immediately treating failures as "no result"
and caching that for an hour.
2019-08-13 16:20:38 +01:00
Richard van der Hoff 618bd1ee76
Fix some error cases in the caching layer. (#5749)
There was some inconsistent behaviour in the caching layer around how
exceptions were handled - particularly synchronously-thrown ones.

This seems to be most easily handled by pushing the creation of
ObservableDeferreds down from CacheDescriptor to the Cache.
2019-07-25 15:59:45 +01:00
Richard van der Hoff 418635e68a
Add a prometheus metric for active cache lookups. (#5750)
* Add a prometheus metric for active cache lookups.

* changelog
2019-07-24 11:33:13 +01:00
Amber Brown 4806651744
Replace returnValue with return (#5736) 2019-07-23 23:00:55 +10:00
Amber Brown 463b072b12
Move logging utilities out of the side drawer of util/ and into logging/ (#5606) 2019-07-04 00:07:04 +10:00
Andrew Morgan ef8c62758c
Prevent multiple upgrades on the same room at once (#5051)
Closes #4583

Does slightly less than #5045, which prevented a room from being upgraded multiple times, one after another. This PR still allows that, but just prevents two from happening at the same time.

Mostly just to mitigate the fact that servers are slow and it can take a moment for the room upgrade to actually complete. We don't want people sending another request to upgrade the room when really they just thought the first didn't go through.
2019-06-25 14:19:21 +01:00
Amber Brown 32e7c9e7f2
Run Black. (#5482) 2019-06-20 19:32:02 +10:00
Richard van der Hoff bc5f6e1797
Add a caching layer to .well-known responses (#4516) 2019-01-30 10:55:25 +00:00
Amber Brown e1728dfcbe
Make scripts/ and scripts-dev/ pass pyflakes (and the rest of the codebase on py3) (#4068) 2018-10-20 11:16:55 +11:00
Erik Johnston 4f3e3ac192 Correctly match 'dict.pop' api 2018-10-01 12:25:27 +01:00
Erik Johnston 8ea887856c Don't update eviction metrics on explicit removal 2018-10-01 12:00:58 +01:00
Richard van der Hoff 5b4028fa78 Merge branch 'rav/fix_expiring_cache_len' into erikj/destination_retry_cache 2018-09-26 12:55:53 +01:00
Richard van der Hoff 7ee94fc1ba Log which cache is throwing exceptions 2018-09-26 12:43:08 +01:00
Erik Johnston 3baf6e1667 Fix ExpiringCache.__len__ to be accurate
It used to try and produce an estimate, which was sometimes negative.
This caused metrics to be sad, so lets always just calculate it from
scratch.

(This appears to have been a longstanding bug, but one which has been made more
of a problem by #3932 and #3933).

(This was originally done by Erik as part of #3933. I'm cherry-picking it
because really it's a fix in its own right)
2018-09-26 12:32:29 +01:00
Erik Johnston 19dc676d1a Fix ExpiringCache.__len__ to be accurate
It used to try and produce an estimate, which was sometimes negative.
This caused metrics to be sad, so lets always just calculate it from
scratch.
2018-09-21 16:25:42 +01:00
Erik Johnston fdd1a62e8d Add a five minute cache to get_destination_retry_timings
Hopefully helps with #3931
2018-09-21 14:56:12 +01:00
Erik Johnston 79eded1ae4 Make ExpiringCache slightly more performant 2018-09-21 14:52:21 +01:00
Erik Johnston 8601c24287 Fix some instances of ExpiringCache not expiring cache items
ExpiringCache required that `start()` be called before it would actually
start expiring entries. A number of places didn't do that.

This PR removes `start` from ExpiringCache, and automatically starts
backround reaping process on creation instead.
2018-09-21 14:19:46 +01:00
Amber Brown b37c472419
Rename async to async_helpers because `async` is a keyword on Python 3.7 (#3678) 2018-08-10 23:50:21 +10:00
Richard van der Hoff a8cbce0ced fix invalidation 2018-07-27 16:17:17 +01:00
Richard van der Hoff f102c05856 Rewrite cache list decorator
Because it was complicated and annoyed me. I suspect this will be more
efficient too.
2018-07-27 13:47:04 +01:00
Richard van der Hoff 03751a6420 Fix some looping_call calls which were broken in #3604
It turns out that looping_call does check the deferred returned by its
callback, and (at least in the case of client_ips), we were relying on this,
and I broke it in #3604.

Update run_as_background_process to return the deferred, and make sure we
return it to clock.looping_call.
2018-07-26 11:48:08 +01:00
Richard van der Hoff 667fba68f3 Run things as background processes
This fixes #3518, and ensures that we get useful logs and metrics for lots of
things that happen in the background.

(There are certainly more things that happen in the background; these are just
the common ones I've found running a single-process synapse locally).
2018-07-18 20:55:05 +01:00
Erik Johnston b2aa05a8d6 Use efficient .intersection 2018-07-17 11:07:04 +01:00
Erik Johnston 547b1355d3 Fix perf regression in PR #3530
The get_entities_changed function was changed to return all changed
entities since the given stream position, rather than only those changed
from a given list of entities. This resulted in the function incorrectly
returning large numbers of entities that, for example, caused large
increases in database usage.
2018-07-17 10:27:51 +01:00
Erik Johnston 77b692e65d Don't return unknown entities in get_entities_changed
The stream cache keeps track of all entities that have changed since
a particular stream position, so get_entities_changed does not need to
return unknown entites when given a larger stream position.

This makes it consistent with the behaviour of has_entity_changed.
2018-07-13 15:26:10 +01:00
Richard van der Hoff fa5c2bc082 Reduce set building in get_entities_changed
This line shows up as about 5% of cpu time on a synchrotron:

    not_known_entities = set(entities) - set(self._entity_to_key)

Presumably the problem here is that _entity_to_key can be largeish, and
building a set for its keys every time this function is called is slow.

Here we rewrite the logic to avoid building so many sets.
2018-07-12 11:37:44 +01:00
Amber Brown 49af402019 run isort 2018-07-09 16:09:20 +10:00
Amber Brown 72d2143ea8
Revert "Revert "Try to not use as much CPU in the StreamChangeCache"" (#3454) 2018-06-28 11:04:18 +01:00
Matthew Hodgson 8057489b26
Revert "Try to not use as much CPU in the StreamChangeCache" 2018-06-26 18:09:01 +01:00
Amber Brown 1202508067 fixes 2018-06-26 17:29:01 +01:00
Amber Brown bd3d329c88 fixes 2018-06-26 17:28:12 +01:00
Amber Brown abfe4b2957 try and make loading items from the cache faster 2018-06-26 17:25:34 +01:00
Richard van der Hoff 43e02c409d Disable partial state group caching for wildcard lookups
When _get_state_for_groups is given a wildcard filter, just do a complete
lookup. Hopefully this will give us the best of both worlds by not filling up
the ram if we only need one or two keys, but also making the cache still work
for the federation reader usecase.
2018-06-22 11:52:07 +01:00
Amber Brown f7869f8f8b
Port to sortedcontainers (with tests!) (#3332) 2018-06-06 00:13:57 +10:00
Erik Johnston 042eedfa2b Add hacky cache factor override system 2018-06-04 15:39:28 +01:00
Amber Brown c936a52a9e
Consistently use six's iteritems and wrap lazy keys/values in list() if they're not meant to be lazy (#3307) 2018-05-31 19:03:47 +10:00
Amber Brown debff7ae09
Merge pull request #3281 from NotAFile/py3-six-isinstance
remaining isintance fixes
2018-05-30 12:44:46 +10:00
Amber Brown 357c74a50f add comment about why unreg 2018-05-28 19:14:41 +10:00
Amber Brown 754826a830 Merge remote-tracking branch 'origin/develop' into 3218-official-prom 2018-05-28 18:57:23 +10:00