Commit Graph

57 Commits (d02e4b2825b96db6a76ce82430337f88a9ff74a5)

Author SHA1 Message Date
Patrick Cloke d1eb1b96e8
Register the /devices endpoint on workers. (#9092) 2021-01-13 12:35:40 -05:00
Erik Johnston c9195744a4
Move more encryption endpoints off master (#9068) 2021-01-11 18:01:27 +00:00
Richard van der Hoff 671138f658
Clean up exception handling in the startup code (#9059)
Factor out the exception handling in the startup code to a utility function,
and fix the some logging and exit code stuff.
2021-01-11 15:55:05 +00:00
Erik Johnston b530eaa262
Allow running sendToDevice on workers (#9044) 2021-01-07 20:19:26 +00:00
Patrick Cloke 68bb26da69
Allow redacting events on workers (#8994)
Adds the redacts endpoint to workers that have the client listener.
2020-12-29 07:40:12 -05:00
Patrick Cloke 30fba62108
Apply an IP range blacklist to push and key revocation requests. (#8821)
Replaces the `federation_ip_range_blacklist` configuration setting with an
`ip_range_blacklist` setting with wider scope. It now applies to:

* Federation
* Identity servers
* Push notifications
* Checking key validitity for third-party invite events

The old `federation_ip_range_blacklist` setting is still honored if present, but
with reduced scope (it only applies to federation and identity servers).
2020-12-02 11:09:24 -05:00
Erik Johnston 921a3f8a59
Fix not sending events over federation when using sharded event persisters (#8536)
* Fix outbound federaion with multiple event persisters.

We incorrectly notified federation senders that the minimum persisted
stream position had advanced when we got an `RDATA` from an event
persister.

Notifying of federation senders already correctly happens in the
notifier, so we just delete the offending line.

* Change some interfaces to use RoomStreamToken.

By enforcing use of `RoomStreamTokens` we make it less likely that
people pass in random ints that they got from somewhere random.
2020-10-14 13:27:51 +01:00
Patrick Cloke e4f72ddc44
Move additional tasks to the background worker (#8458) 2020-10-07 11:27:56 -04:00
Patrick Cloke 62894673e6
Allow background tasks to be run on a separate worker. (#8369) 2020-10-02 08:23:15 -04:00
Patrick Cloke 8a4a4186de
Simplify super() calls to Python 3 syntax. (#8344)
This converts calls like super(Foo, self) -> super().

Generated with:

    sed -i "" -Ee 's/super\([^\(]+\)/super()/g' **/*.py
2020-09-18 09:56:44 -04:00
Patrick Cloke c619253db8
Stop sub-classing object (#8249) 2020-09-04 06:54:56 -04:00
Erik Johnston 0f1afbe8dc Change HomeServer definition to work with typing.
Duplicating function signatures between server.py and server.pyi is
silly. This commit changes that by changing all `build_*` methods to
`get_*` methods and changing the `_make_dependency_method` to work work
as a descriptor that caches the produced value.

There are some changes in other files that were made to fix the typing
in server.py.
2020-08-11 18:00:17 +01:00
Erik Johnston 7620912d84
Add health check endpoint (#8048) 2020-08-07 14:21:24 +01:00
Erik Johnston a7bdf98d01
Rename database classes to make some sense (#8033) 2020-08-05 21:38:57 +01:00
Olivier Wilkinson (reivilibre) 3aa36b782c Merge branch 'master' into develop 2020-07-30 15:18:36 +01:00
Patrick Cloke 3950ae51ef
Ensure that remove_pusher is always async (#7981) 2020-07-30 06:56:55 -04:00
Erik Johnston 2c1b9d6763
Update worker docs with recent enhancements (#7969) 2020-07-29 23:22:13 +01:00
Erik Johnston 84d099ae11
Fix typing replication not being handled on master (#7959)
Handling of incoming typing stream updates from replication was not
hooked up on master, effecting set ups where typing was handled on a
different worker.

This is really only a problem if the master process is also handling
sync requests, which is unlikely for those that are at the stage of
moving typing off.

The other observable effect is that if a worker restarts or a
replication connect drops then the typing worker will issue a
`POSITION typing`, triggering master process to try and stream *all*
typing updates from position 0.

Fixes #7907
2020-07-27 14:10:53 +01:00
Patrick Cloke 00e57b755c
Convert synapse.app to async/await. (#7868) 2020-07-17 07:08:56 -04:00
Erik Johnston f2e38ca867
Allow moving typing off master (#7869) 2020-07-16 15:12:54 +01:00
Erik Johnston f299441cc6
Add ability to shard the federation sender (#7798) 2020-07-10 18:26:36 +01:00
Patrick Cloke 8fa7fdd4cb
Pass original request headers from workers to the main process. (#7797) 2020-07-09 07:34:46 -04:00
Richard van der Hoff 03619324fc
Create a ListenerConfig object (#7681)
This ended up being a bit more invasive than I'd hoped for (not helped by
generic_worker duplicating some of the code from homeserver), but hopefully
it's an improvement.

The idea is that, rather than storing unstructured `dict`s in the config for
the listener configurations, we instead parse it into a structured
`ListenerConfig` object.
2020-06-16 12:44:07 +01:00
Patrick Cloke 7d2532be36
Discard RDATA from already seen positions. (#7648) 2020-06-15 08:44:54 -04:00
Erik Johnston ef3934ec8f Ensure we persist and ack the same token 2020-05-27 19:45:42 +01:00
Erik Johnston 35c308731d Speed up processing of federation stream RDATA rows.
Instead of storing and sending an ACK for every single row we send
synchronously, we instead do it asynchronously while batching up
updates.
2020-05-27 19:34:07 +01:00
Richard van der Hoff 04729b86f8
Fix incorrect exception handling in KeyUploadServlet.on_POST (#7563)
Introduced in #7556
2020-05-26 11:42:22 +01:00
Richard van der Hoff 00db90f409
Fix recording of federation stream token (#7564)
A couple of changes of significance:

 * remove the `_last_ack < federation_position` condition, so that
   updates will still be correctly processed after restart

 * Correctly wire up send_federation_ack to the right class.
2020-05-26 11:41:38 +01:00
Erik Johnston e5c67d04db
Add option to move event persistence off master (#7517) 2020-05-22 16:11:35 +01:00
Patrick Cloke 4429764c9f
Return 200 OK for all OPTIONS requests (#7534) 2020-05-22 09:30:07 -04:00
Erik Johnston 547e4dd83e
Fix exception reporting due to HTTP request errors. (#7556)
These are business as usual errors, rather than stuff we want to log at
error.
2020-05-22 11:39:20 +01:00
Richard van der Hoff 0bbbd10513
Stub out GET presence requests in the frontend proxy (#7545)
We don't really make any promises about returning accurate presence data when
presence is disabled, so we may as well just return a static response, rather
than making the master handle a request.
2020-05-21 14:36:46 +01:00
Erik Johnston 51055c8c44
Allow ReplicationRestResource to be added to workers (#7515)
This allows workers to talk to each other over HTTP replication.
2020-05-18 12:24:48 +01:00
Erik Johnston 03aff4c75e
Add a worker store for search insertion. (#7516)
This is required as both event persistence and the background update needs access to this function. It should be perfectly safe for two workers to write to that table at the same time.
2020-05-15 17:22:47 +01:00
Erik Johnston 4734a7bbe4
Move EventStream handling into default ReplicationDataHandler (#7493)
This is so that the logic can happen on both master and workers when we move event persistence out.
2020-05-14 14:01:39 +01:00
Erik Johnston 1124111a12
Allow censoring of events to happen on workers. (#7492)
This is safe as we can now write to cache invalidation stream on workers, and is required for when we move event persistence off master.
2020-05-13 17:15:40 +01:00
Erik Johnston 0e719f2398
Thread through instance name to replication client. (#7369)
For in memory streams when fetching updates on workers we need to query the source of the stream, which currently is hard coded to be master. This PR threads through the source instance we received via `POSITION` through to the update function in each stream, which can then be passed to the replication client for in memory streams.
2020-05-01 17:19:56 +01:00
Erik Johnston 3085cde577
Use `stream.current_token()` and remove `stream_positions()` (#7172)
We move the processing of typing and federation replication traffic into their handlers so that `Stream.current_token()` points to a valid token. This allows us to remove `get_streams_to_replicate()` and `stream_positions()`.
2020-05-01 15:21:35 +01:00
Patrick Cloke 627b0f5f27
Persist user interactive authentication sessions (#7302)
By persisting the user interactive authentication sessions to the database, this fixes
situations where a user hits different works throughout their auth session and also
allows sessions to persist through restarts of Synapse.
2020-04-30 13:47:49 -04:00
Erik Johnston 38919b521e
Run replication streamers on workers (#7146)
Currently we never write to streams from workers, but that will change soon
2020-04-28 13:34:12 +01:00
Richard van der Hoff 71a1abb8a1
Stop the master relaying USER_SYNC for other workers (#7318)
Long story short: if we're handling presence on the current worker, we shouldn't be sending USER_SYNC commands over replication.

In an attempt to figure out what is going on here, I ended up refactoring some bits of the presencehandler code, so the first 4 commits here are non-functional refactors to move this code slightly closer to sanity. (There's still plenty to do here :/). Suggest reviewing individual commits.

Fixes (I hope) #7257.
2020-04-22 22:39:04 +01:00
Richard van der Hoff 2aa5bf13c8 Merge branch 'release-v1.12.4' into develop 2020-04-22 13:09:23 +01:00
Richard van der Hoff 974c0d726a
Support GET account_data requests on a worker (#7311) 2020-04-21 10:46:30 +01:00
Erik Johnston 5016b162fc
Move client command handling out of TCP protocol (#7185)
The aim here is to move the command handling out of the TCP protocol classes and to also merge the client and server command handling (so that we can reuse them for redis protocol). This PR simply moves the client paths to the new `ReplicationCommandHandler`, a future PR will move the server paths too.
2020-04-06 09:58:42 +01:00
Richard van der Hoff bae32740da
Remove some `run_in_background` calls in replication code (#7203)
By running this stuff with `run_in_background`, it won't be correctly reported
against the relevant CPU usage stats.

Fixes #7202
2020-04-03 12:29:30 +01:00
Erik Johnston db098ec994 Fix starting workers when federation sending not split out. 2020-03-31 11:25:21 +01:00
Erik Johnston 4f21c33be3
Remove usage of "conn_id" for presence. (#7128)
* Remove `conn_id` usage for UserSyncCommand.

Each tcp replication connection is assigned a "conn_id", which is used
to give an ID to a remotely connected worker. In a redis world, there
will no longer be a one to one mapping between connection and instance,
so instead we need to replace such usages with an ID generated by the
remote instances and included in the replicaiton commands.

This really only effects UserSyncCommand.

* Add CLEAR_USER_SYNCS command that is sent on shutdown.

This should help with the case where a synchrotron gets restarted
gracefully, rather than rely on 5 minute timeout.
2020-03-30 16:37:24 +01:00
Erik Johnston 4cff617df1
Move catchup of replication streams to worker. (#7024)
This changes the replication protocol so that the server does not send down `RDATA` for rows that happened before the client connected. Instead, the server will send a `POSITION` and clients then query the database (or master out of band) to get up to date.
2020-03-25 14:54:01 +00:00
Erik Johnston b1cfaf08af
Merge pull request #7133 from matrix-org/erikj/fix_worker_startup
Fix starting workers when federation sending not split out.
2020-03-25 09:42:39 +00:00
Erik Johnston c816072d47 Fix starting workers when federation sending not split out. 2020-03-24 10:35:00 +00:00