Commit Graph

78 Commits (0df618f81324a9f2cc2b064c33bd9dfc22391402)

Author SHA1 Message Date
Andrew Morgan f4e6495b5d
Performance improvements and refactor of Ratelimiter (#7595)
While working on https://github.com/matrix-org/synapse/issues/5665 I found myself digging into the `Ratelimiter` class and seeing that it was both:

* Rather undocumented, and
* causing a *lot* of config checks

This PR attempts to refactor and comment the `Ratelimiter` class, as well as encourage config file accesses to only be done at instantiation. 

Best to be reviewed commit-by-commit.
2020-06-05 10:47:20 +01:00
Richard van der Hoff 00db90f409
Fix recording of federation stream token (#7564)
A couple of changes of significance:

 * remove the `_last_ack < federation_position` condition, so that
   updates will still be correctly processed after restart

 * Correctly wire up send_federation_ack to the right class.
2020-05-26 11:41:38 +01:00
Richard van der Hoff 164f50f5f2
fix mypy for tests/replication (#7518) 2020-05-18 10:43:05 +01:00
Richard van der Hoff 6c1f7c722f
Fix limit logic for AccountDataStream (#7384)
Make sure that the AccountDataStream presents complete updates, in the right
order.

This is much the same fix as #7337 and #7358, but applied to a different stream.
2020-05-15 19:03:25 +01:00
Erik Johnston 18c1e52d82
Clean up replication unit tests. (#7490) 2020-05-13 16:01:47 +01:00
Richard van der Hoff d5aa7d93ed
Fix catchup-on-reconnect for the Federation Stream (#7374)
looks like we managed to break this during the refactorathon.
2020-05-05 14:15:57 +01:00
Erik Johnston 0e719f2398
Thread through instance name to replication client. (#7369)
For in memory streams when fetching updates on workers we need to query the source of the stream, which currently is hard coded to be master. This PR threads through the source instance we received via `POSITION` through to the update function in each stream, which can then be passed to the replication client for in memory streams.
2020-05-01 17:19:56 +01:00
Erik Johnston 3085cde577
Use `stream.current_token()` and remove `stream_positions()` (#7172)
We move the processing of typing and federation replication traffic into their handlers so that `Stream.current_token()` points to a valid token. This allows us to remove `get_streams_to_replicate()` and `stream_positions()`.
2020-05-01 15:21:35 +01:00
Erik Johnston 37f6823f5b
Add instance name to RDATA/POSITION commands (#7364)
This is primarily for allowing us to send those commands from workers, but for now simply allows us to ignore echoed RDATA/POSITION commands that we sent (we get echoes of sent commands when using redis). Currently we log a WARNING on the master process every time we receive an echoed RDATA.
2020-04-29 16:23:08 +01:00
Erik Johnston 3eab76ad43
Don't relay REMOTE_SERVER_UP cmds to same conn. (#7352)
For direct TCP connections we need the master to relay REMOTE_SERVER_UP
commands to the other connections so that all instances get notified
about it. The old implementation just relayed to all connections,
assuming that sending back to the original sender of the command was
safe. This is not true for redis, where commands sent get echoed back to
the sender, which was causing master to effectively infinite loop
sending and then re-receiving REMOTE_SERVER_UP commands that it sent.

The fix is to ensure that we only relay to *other* connections and not
to the connection we received the notification from.

Fixes #7334.
2020-04-29 14:10:59 +01:00
Richard van der Hoff c2e1a2110f
Fix limit logic for EventsStream (#7358)
* Factor out functions for injecting events into database

I want to add some more flexibility to the tools for injecting events into the
database, and I don't want to clutter up HomeserverTestCase with them, so let's
factor them out to a new file.

* Rework TestReplicationDataHandler

This wasn't very easy to work with: the mock wrapping was largely superfluous,
and it's useful to be able to inspect the received rows, and clear out the
received list.

* Fix AssertionErrors being thrown by EventsStream

Part of the problem was that there was an off-by-one error in the assertion,
but also the limit logic was too simple. Fix it all up and add some tests.
2020-04-29 12:30:36 +01:00
Erik Johnston fce663889b
Add some replication tests (#7278)
Specifically some tests for the typing stream, which means we test streams that fetch missing updates via HTTP (rather than via the DB).

We also shuffle things around a bit so that we create two separate `HomeServer` objects, rather than trying to insert a slaved store into places.

Note: `test_typing.py` is heavily inspired by `test_receipts.py`
2020-04-28 17:42:03 +01:00
Richard van der Hoff 82d8b1dd1f
Another go at fixing one-word commands (#7326)
I messed this up last time I tried (#7239 / e13c6c7).
2020-04-22 14:34:31 +01:00
Erik Johnston 51f7eaf908
Add ability to run replication protocol over redis. (#7040)
This is configured via the `redis` config options.
2020-04-22 13:07:41 +01:00
Erik Johnston 5016b162fc
Move client command handling out of TCP protocol (#7185)
The aim here is to move the command handling out of the TCP protocol classes and to also merge the client and server command handling (so that we can reuse them for redis protocol). This PR simply moves the client paths to the new `ReplicationCommandHandler`, a future PR will move the server paths too.
2020-04-06 09:58:42 +01:00
Erik Johnston 4cff617df1
Move catchup of replication streams to worker. (#7024)
This changes the replication protocol so that the server does not send down `RDATA` for rows that happened before the client connected. Instead, the server will send a `POSITION` and clients then query the database (or master out of band) to get up to date.
2020-03-25 14:54:01 +00:00
Richard van der Hoff a564b92d37
Convert `*StreamRow` classes to inner classes (#7116)
This just helps keep the rows closer to their streams, so that it's easier to
see what the format of each stream is.
2020-03-23 13:59:11 +00:00
Richard van der Hoff 8ef8fb2c1c
Read the room version from database when fetching events (#6874)
This is a precursor to giving EventBase objects the knowledge of which room version they belong to.
2020-03-04 13:11:04 +00:00
Richard van der Hoff 799001f2c0
Add a `make_event_from_dict` method (#6858)
... and use it in places where it's trivial to do so.

This will make it easier to pass room versions into the FrozenEvent
constructors.
2020-02-07 15:30:04 +00:00
Erik Johnston 48c3a96886
Port synapse.replication.tcp to async/await (#6666)
* Port synapse.replication.tcp to async/await

* Newsfile

* Correctly document type of on_<FOO> functions as async

* Don't be overenthusiastic with the asyncing....
2020-01-16 09:16:12 +00:00
Erik Johnston 28c98e51ff
Add `local_current_membership` table (#6655)
Currently we rely on `current_state_events` to figure out what rooms a
user was in and their last membership event in there. However, if the
server leaves the room then the table may be cleaned up and that
information is lost. So lets add a table that separately holds that
information.
2020-01-15 14:59:33 +00:00
Erik Johnston 2284eb3a53
Add database config class (#6513)
This encapsulates config for a given database and is the way to get new
connections.
2019-12-18 10:45:12 +00:00
Erik Johnston 852f80d8a6 Fixup tests 2019-12-06 16:02:50 +00:00
Amber Brown 0f87b912ab
Implementation of MSC2314 (#6176) 2019-11-28 08:54:07 +11:00
Erik Johnston 3ca4c7c516 Use new EventPersistenceStore 2019-10-23 16:15:03 +01:00
Amber Brown b36c82576e
Run Black on the tests again (#5170) 2019-05-10 00:12:11 -05:00
Richard van der Hoff 297bf2547e
Fix sync bug when accepting invites (#4956)
Hopefully this time we really will fix #4422.

We need to make sure that the cache on
`get_rooms_for_user_with_stream_ordering` is invalidated *before* the
SyncHandler is notified for the new events, and we can now do so reliably via
the `events` stream.
2019-04-02 12:42:39 +01:00
Richard van der Hoff a5798de067 Move replication.tcp.streams into a package 2019-03-27 21:13:14 +00:00
Richard van der Hoff 9bde730ef8
Fix bug where read-receipts lost their timestamps (#4927)
Make sure that they are sent correctly over the replication stream.

Fixes: #4898
2019-03-25 16:38:05 +00:00
Brendan Abolivier a4c3a361b7
Add rate-limiting on registration (#4735)
* Rate-limiting for registration

* Add unit test for registration rate limiting

* Add config parameters for rate limiting on auth endpoints

* Doc

* Fix doc of rate limiting function

Co-Authored-By: babolivier <contact@brendanabolivier.com>

* Incorporate review

* Fix config parsing

* Fix linting errors

* Set default config for auth rate limiting

* Fix tests

* Add changelog

* Advance reactor instead of mocked clock

* Move parameters to registration specific config and give them more sensible default values

* Remove unused config options

* Don't mock the rate limiter un MAU tests

* Rename _register_with_store into register_with_store

* Make CI happy

* Remove unused import

* Update sample config

* Fix ratelimiting test for py2

* Add non-guest test
2019-03-05 14:25:33 +00:00
Erik Johnston b86d05a279 Clean up event accesses and tests
This is in preparation to refactor FrozenEvent to support different
event formats for different room versions
2018-11-02 13:44:14 +00:00
Amber Brown 7232917f12
Disable frozen dicts by default (#3987) 2018-10-02 22:53:47 +10:00
Richard van der Hoff 31c15dcb80
Refactor matrixfederationclient to fix logging (#3906)
We want to wait until we have read the response body before we log the request
as complete, otherwise a confusing thing happens where the request appears to
have completed, but we later fail it.

To do this, we factor the salient details of a request out to a separate
object, which can then keep track of the txn_id, so that it can be logged.
2018-09-18 18:17:15 +01:00
Amber Brown 77055dba92
Fix tests on postgresql (#3740) 2018-09-04 02:21:48 +10:00
Erik Johnston 4d664278af Merge branch 'develop' of github.com:matrix-org/synapse into erikj/refactor_state_handler 2018-08-20 14:49:43 +01:00
Amber Brown 99dd975dae
Run tests under PostgreSQL (#3423) 2018-08-13 16:47:46 +10:00
black 8b3d9b6b19 Run black. 2018-08-10 23:54:09 +10:00
Erik Johnston 3e19beb941 Fix tests 2018-08-09 14:58:49 +01:00
Richard van der Hoff f59be4eb0e Fix unit tests
on_notifier_poke no longer runs synchonously, so we have to do a different hack
to make sure that the replication data has been sent. Let's actually listen for
its arrival.
2018-07-25 10:30:36 +01:00
Erik Johnston 8fbe418777 Fix unit tests 2018-07-23 13:33:49 +01:00
Amber Brown 49af402019 run isort 2018-07-09 16:09:20 +10:00
Richard van der Hoff bd348f0af6 remove dead filter_events_for_clients
This is only used by filter_events_for_client, so we can simplify the whole
thing by just doing one user at a time, and removing a dead storage function to
boot.
2018-06-12 09:51:31 +01:00
Erik Johnston cb9f8e527c s/replication_client/federation_client/ 2018-03-13 13:26:52 +00:00
Erik Johnston 6ea27fafad Fix tests 2018-03-13 10:55:47 +00:00
Erik Johnston e440e28456 Fix unit tests 2018-02-20 11:41:40 +00:00
Erik Johnston 4810f7effd Remove context.push_actions 2018-02-15 15:47:06 +00:00
Erik Johnston 3d33eef6fc
Store state groups separately from events (#2784)
* Split state group persist into seperate storage func

* Add per database engine code for state group id gen

* Move store_state_group to StateReadStore

This allows other workers to use it, and so resolve state.

* Hook up store_state_group

* Fix tests

* Rename _store_mult_state_groups_txn

* Rename StateGroupReadStore

* Remove redundant _have_persisted_state_group_txn

* Update comments

* Comment compute_event_context

* Set start val for state_group_id_seq

... otherwise we try to recreate old state groups

* Update comments

* Don't store state for outliers

* Update comment

* Update docstring as state groups are ints
2018-02-06 14:31:24 +00:00
Richard van der Hoff 5c431f421c Matthew's fixes to the unit tests
Extracted from https://github.com/matrix-org/synapse/pull/2820
2018-01-22 16:46:16 +00:00
Erik Johnston 85a0d6c7ab Remove test of replication resource 2017-04-11 10:59:27 +01:00
Erik Johnston 3a1f3f8388 Change slave storage to use new replication interface
As the TCP replication uses a slightly different API and streams than
the HTTP replication.

This breaks HTTP replication.
2017-04-03 15:34:19 +01:00