Commit Graph

2008 Commits (8075504a600b47ac93faf9b605ade691ae0fbcd3)

Author SHA1 Message Date
Richard van der Hoff 0a08cd1065
Merge pull request #8548 from matrix-org/rav/deferred_cache
Rename Cache to DeferredCache, and related changes
2020-10-15 11:42:07 +01:00
Richard van der Hoff 4433d01519
Merge pull request #8537 from matrix-org/rav/simplify_locally_reject_invite
Simplify `_locally_reject_invite`
2020-10-15 10:20:19 +01:00
Richard van der Hoff 470dedd266 Combine the two sets of DeferredCache tests 2020-10-14 23:49:27 +01:00
Richard van der Hoff 4182bb812f move DeferredCache into its own module 2020-10-14 23:38:14 +01:00
Richard van der Hoff 9f87da0a84 Rename Cache->DeferredCache 2020-10-14 23:38:14 +01:00
Erik Johnston 1264c8ac89
Add basic tests for sync/pagination with vector clock tokens. (#8488)
These are tests for #8439
2020-10-14 13:53:20 +01:00
Erik Johnston 921a3f8a59
Fix not sending events over federation when using sharded event persisters (#8536)
* Fix outbound federaion with multiple event persisters.

We incorrectly notified federation senders that the minimum persisted
stream position had advanced when we got an `RDATA` from an event
persister.

Notifying of federation senders already correctly happens in the
notifier, so we just delete the offending line.

* Change some interfaces to use RoomStreamToken.

By enforcing use of `RoomStreamTokens` we make it less likely that
people pass in random ints that they got from somewhere random.
2020-10-14 13:27:51 +01:00
Richard van der Hoff a34b17e492 Simplify `_locally_reject_invite`
Update `EventCreationHandler.create_event` to accept an auth_events param, and
use it in `_locally_reject_invite` instead of reinventing the wheel.
2020-10-13 23:58:48 +01:00
Richard van der Hoff d9d86c2996 Remove redundant `token_id` parameter to create_event
this is always the same as requester.access_token_id.
2020-10-13 23:06:36 +01:00
Patrick Cloke 629a951b49
Move additional tasks to the background worker, part 4 (#8513) 2020-10-13 08:20:32 -04:00
Erik Johnston b2486f6656
Fix message duplication if something goes wrong after persisting the event (#8476)
Should fix #3365.
2020-10-13 12:07:56 +01:00
Erik Johnston 8de3703d21
Make event persisters periodically announce position over replication. (#8499)
Currently background proccesses stream the events stream use the "minimum persisted position" (i.e. `get_current_token()`) rather than the vector clock style tokens. This is broadly fine as it doesn't matter if the background processes lag a small amount. However, in extreme cases (i.e. SyTests) where we only write to one event persister the background processes will never make progress.

This PR changes it so that the `MultiWriterIDGenerator` keeps the current position of a given instance as up to date as possible (i.e using the latest token it sees if its not in the process of persisting anything), and then periodically announces that over replication. This then allows the "minimum persisted position" to advance, albeit with a small lag.
2020-10-12 15:51:41 +01:00
Patrick Cloke d35a451399
Clean-up some broken/unused code in the test framework (#8514) 2020-10-09 14:19:29 -04:00
Richard van der Hoff 9789b1fba5
Fix threadsafety in ThreadedMemoryReactorClock (#8497)
This could, very occasionally, cause:

```
tests.test_visibility.FilterEventsForServerTestCase.test_large_room
===============================================================================
[ERROR]
Traceback (most recent call last):
  File "/src/tests/rest/media/v1/test_media_storage.py", line 86, in test_ensure_media_is_in_local_cache
    self.wait_on_thread(x)
  File "/src/tests/unittest.py", line 296, in wait_on_thread
    self.reactor.advance(0.01)
  File "/src/.tox/py35/lib/python3.5/site-packages/twisted/internet/task.py", line 826, in advance
    self._sortCalls()
  File "/src/.tox/py35/lib/python3.5/site-packages/twisted/internet/task.py", line 787, in _sortCalls
    self.calls.sort(key=lambda a: a.getTime())
builtins.ValueError: list modified during sort

tests.rest.media.v1.test_media_storage.MediaStorageTests.test_ensure_media_is_in_local_cache
```
2020-10-09 17:22:25 +01:00
Andrew Morgan 66ac4b1e34
Allow modules to create and send events into rooms (#8479)
This PR allows Synapse modules making use of the `ModuleApi` to create and send non-membership events into a room. This can useful to have modules send messages, or change power levels in a room etc. Note that they must send event through a user that's already in the room.

The non-membership event limitation is currently arbitrary, as it's another chunk of work and not necessary at the moment.
2020-10-09 13:46:36 +01:00
Patrick Cloke c9c0ad5e20
Remove the deprecated Handlers object (#8494)
All handlers now available via get_*_handler() methods on the HomeServer.
2020-10-09 07:24:34 -04:00
Hubert Chathi a97cec18bb
Invalidate the cache when an olm fallback key is uploaded (#8501) 2020-10-08 13:24:46 -04:00
Erik Johnston ae5b2a72c0
Reduce serialization errors in MultiWriterIdGen (#8456)
We call `_update_stream_positions_table_txn` a lot, which is an UPSERT
that can conflict in `REPEATABLE READ` isolation level. Instead of doing
a transaction consisting of a single query we may as well run it outside
of a transaction.
2020-10-07 15:15:57 +01:00
Hubert Chathi 4cb44a1585
Add support for MSC2697: Dehydrated devices (#8380)
This allows a user to store an offline device on the server and
then restore it at a subsequent login.
2020-10-07 08:00:17 -04:00
Richard van der Hoff 43c622885c
Merge pull request #8463 from matrix-org/rav/clean_up_event_handling
Reduce inconsistencies between codepaths for membership and non-membership events.
2020-10-07 12:20:44 +01:00
Richard van der Hoff 4f0637346a
Combine `SpamCheckerApi` with the more generic `ModuleApi`. (#8464)
Lots of different module apis is not easy to maintain.

Rather than adding yet another ModuleApi(hs, hs.get_auth_handler()) incantation, first add an hs.get_module_api() method and use it where possible.
2020-10-07 12:03:26 +01:00
Hubert Chathi 3cd78bbe9e
Add support for MSC2732: olm fallback keys (#8312) 2020-10-06 13:26:29 -04:00
Richard van der Hoff a024461130
Additional tests for third-party event rules (#8468)
* Optimise and test state fetching for 3p event rules

Getting all the events at once is much more efficient than getting them
individually

* Test that 3p event rules can modify events
2020-10-06 16:31:31 +01:00
Richard van der Hoff 9c0b168cff
Merge pull request #8467 from matrix-org/rav/fix_3pevent_rules
Fix third-party event modules for `check_visibility_can_be_modified` check
2020-10-06 11:32:53 +01:00
Richard van der Hoff 785437dc0d
Update default room version to 6 (#8461)
Per https://github.com/matrix-org/matrix-doc/pull/2788
2020-10-05 21:40:51 +01:00
Richard van der Hoff 4cd1448d0e Fix third-party event modules for `check_visibility_can_be_modified` check
PR #8292 tried to maintain backwards compat with modules which don't provide a
`check_visibility_can_be_modified` method, but the tests weren't being run,
and the check didn't work.
2020-10-05 20:29:52 +01:00
Richard van der Hoff e775b5bb5b kill off `send_nonmember_event`
This is now redundant, and we can just call `handle_new_client_event` directly.
2020-10-05 19:04:10 +01:00
Andrew Morgan 0991a2da93
Allow ThirdPartyEventRules modules to manipulate public room state (#8292)
This PR allows `ThirdPartyEventRules` modules to view, manipulate and block changes to the state of whether a room is published in the public rooms directory.

While the idea of whether a room is in the public rooms list is not kept within an event in the room, `ThirdPartyEventRules` generally deal with controlling which modifications can happen to a room. Public rooms fits within that idea, even if its toggle state isn't controlled through a state event.
2020-10-05 14:57:46 +01:00
Erik Johnston e3debf9682
Add logging on startup/shutdown (#8448)
This is so we can tell what is going on when things are taking a while to start up.

The main change here is to ensure that transactions that are created during startup get correctly logged like normal transactions.
2020-10-02 15:20:45 +01:00
Erik Johnston ec10bdd32b
Speed up unit tests when using PostgreSQL (#8450) 2020-10-02 15:09:31 +01:00
Patrick Cloke 62894673e6
Allow background tasks to be run on a separate worker. (#8369) 2020-10-02 08:23:15 -04:00
Erik Johnston 6c5d5e507e
Add unit test for event persister sharding (#8433) 2020-10-02 09:57:12 +01:00
BBBSnowball 05ee048f2c
Add config option for always using "userinfo endpoint" for OIDC (#7658)
This allows for connecting to certain IdPs, e.g. GitLab.
2020-10-01 13:54:35 -04:00
Erik Johnston 7941372ec8
Make token serializing/deserializing async (#8427)
The idea is that in future tokens will encode a mapping of instance to position. However, we don't want to include the full instance name in the string representation, so instead we'll have a mapping between instance name and an immutable integer ID in the DB that we can use instead. We'll then do the lookup when we serialize/deserialize the token (we could alternatively pass around an `Instance` type that includes both the name and ID, but that turns out to be a lot more invasive).
2020-09-30 20:29:19 +01:00
Richard van der Hoff a0a1ba6973
Merge pull request #8425 from matrix-org/rav/extremity_metrics
Add an improved "forward extremities" metric
2020-09-30 19:33:27 +01:00
Patrick Cloke 8b40843392
Allow additional SSO properties to be passed to the client (#8413) 2020-09-30 13:02:43 -04:00
Richard van der Hoff 6d2d42f8fb Rewrite BucketCollector
This was a bit unweildy for what I wanted: in particular, I wanted to assign
each measurement straight into a bucket, rather than storing an intermediate
Counter which didn't do any bucketing at all.

I've replaced it with something that is hopefully a bit easier to use.

(I'm not entirely sure what the difference between a HistogramMetricFamily and
a GaugeHistogramMetricFamily is, but given our counters can go down as well as
up the latter *sounds* more accurate?)
2020-09-30 16:49:15 +01:00
Erik Johnston ea70f1c362
Various clean ups to room stream tokens. (#8423) 2020-09-29 21:48:33 +01:00
Erik Johnston b1433bf231
Don't table scan events on worker startup (#8419)
* Fix table scan of events on worker startup.

This happened because we assumed "new" writers had an initial stream
position of 0, so the replication code tried to fetch all events written
by the instance between 0 and the current position.

Instead, set the initial position of new writers to the current
persisted up to position, on the assumption that new writers won't have
written anything before that point.

* Consider old writers coming back as "new".

Otherwise we'd try and fetch entries between the old stale token and the
current position, even though it won't have written any rows.

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>

Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
2020-09-29 16:42:19 +01:00
Will Hunt 8676d8ab2e
Filter out appservices from mau count (#8404)
This is an attempt to fix #8403.
2020-09-29 13:11:02 +01:00
Andrew Morgan 1c6b8752b8
Only assert valid next_link params when provided (#8417)
Broken in https://github.com/matrix-org/synapse/pull/8275 and has yet to be put in a release. Fixes https://github.com/matrix-org/synapse/issues/8418.

`next_link` is an optional parameter. However, we were checking whether the `next_link` param was valid, even if it wasn't provided. In that case, `next_link` was `None`, which would clearly not be a valid URL.

This would prevent password reset and other operations if `next_link` was not provided, and the `next_link_domain_whitelist` config option was set.
2020-09-29 12:36:44 +01:00
Richard van der Hoff 1c262431f9
Fix handling of connection timeouts in outgoing http requests (#8400)
* Remove `on_timeout_cancel` from `timeout_deferred`

The `on_timeout_cancel` param to `timeout_deferred` wasn't always called on a
timeout (in particular if the canceller raised an exception), so it was
unreliable. It was also only used in one place, and to be honest it's easier to
do what it does a different way.

* Fix handling of connection timeouts in outgoing http requests

Turns out that if we get a timeout during connection, then a different
exception is raised, which wasn't always handled correctly.

To fix it, catch the exception in SimpleHttpClient and turn it into a
RequestTimedOutError (which is already a documented exception).

Also add a description to RequestTimedOutError so that we can see which stage
it failed at.

* Fix incorrect handling of timeouts reading federation responses

This was trapping the wrong sort of TimeoutError, so was never being hit.

The effect was relatively minor, but we should fix this so that it does the
expected thing.

* Fix inconsistent handling of `timeout` param between methods

`get_json`, `put_json` and `delete_json` were applying a different timeout to
the response body to `post_json`; bring them in line and test.

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: Erik Johnston <erik@matrix.org>
2020-09-29 10:29:21 +01:00
Erik Johnston bd380d942f
Add checks for postgres sequence consistency (#8402) 2020-09-28 18:00:30 +01:00
Richard van der Hoff 5e3ca12b15
Create a mechanism for marking tests "logcontext clean" (#8399) 2020-09-28 17:58:33 +01:00
Richard van der Hoff fec6f9ac17
Fix occasional "Re-starting finished log context" from keyring (#8398)
* Fix test_verify_json_objects_for_server_awaits_previous_requests

It turns out that this wasn't really testing what it thought it was testing
(in particular, `check_context` was turning failures into success, which was
making the tests pass even though it wasn't clear they should have been.

It was also somewhat overcomplex - we can test what it was trying to test
without mocking out perspectives servers.

* Fix warnings about finished logcontexts in the keyring

We need to make sure that we finish the key fetching magic before we run the
verifying code, to ensure that we don't mess up our logcontexts.
2020-09-25 12:29:54 +01:00
Tdxdxoz abd04b6af0
Allow existing users to login via OpenID Connect. (#8345)
Co-authored-by: Benjamin Koch <bbbsnowball@gmail.com>

This adds configuration flags that will match a user to pre-existing users
when logging in via OpenID Connect. This is useful when switching to
an existing SSO system.
2020-09-25 07:01:45 -04:00
Erik Johnston f112cfe5bb
Fix MultiWriteIdGenerator's handling of restarts. (#8374)
On startup `MultiWriteIdGenerator` fetches the maximum stream ID for
each instance from the table and uses that as its initial "current
position" for each writer. This is problematic as a) it involves either
a scan of events table or an index (neither of which is ideal), and b)
if rows are being persisted out of order elsewhere while the process
restarts then using the maximum stream ID is not correct. This could
theoretically lead to race conditions where e.g. events that are
persisted out of order are not sent down sync streams.

We fix this by creating a new table that tracks the current positions of
each writer to the stream, and update it each time we finish persisting
a new entry. This is a relatively small overhead when persisting events.
However for the cache invalidation stream this is a much bigger relative
overhead, so instead we note that for invalidation we don't actually
care about reliability over restarts (as there's no caches to
invalidate) and simply don't bother reading and writing to the new table
in that particular case.
2020-09-24 16:53:51 +01:00
Erik Johnston ac11fcbbb8
Add EventStreamPosition type (#8388)
The idea is to remove some of the places we pass around `int`, where it can represent one of two things:

1. the position of an event in the stream; or
2. a token that partitions the stream, used as part of the stream tokens.

The valid operations are then:

1. did a position happen before or after a token;
2. get all events that happened before or after a token; and
3. get all events between two tokens.

(Note that we don't want to allow other operations as we want to change the tokens to be vector clocks rather than simple ints)
2020-09-24 13:24:17 +01:00
Erik Johnston cbabb312e0
Use `async with` for ID gens (#8383)
This will allow us to hit the DB after we've finished using the generated stream ID.
2020-09-23 16:11:18 +01:00
Dirk Klimpel 8998217540
Fixed a bug with reactivating users with the admin API (#8362)
Fixes: #8359 

Trying to reactivate a user with the admin API (`PUT /_synapse/admin/v2/users/<user_name>`) causes an internal server error.

Seems to be a regression in #8033.
2020-09-22 18:19:01 +01:00