Commit Graph

15555 Commits (6774f265b4761302cab84405a181d437e69b89c0)

Author SHA1 Message Date
Eric Eastwood 703a8f9c67
Instrument `state` and `state_group` storage related things (tracing) (#15610)
Instrument `state` and `state_group` storage related things (tracing) so it's a little more clear where these database transactions are coming from as there is a lot of wires crossing in these functions.

Part of `/messages` performance investigation: https://github.com/matrix-org/synapse/issues/13356
2023-05-19 12:26:58 -05:00
Eric Eastwood ca3c07e833
Trace how many new events from the backfill response we need to process (#15633)
You can kinda derive this information from how many `_process_pulled_event` spans there are but it would be nice to quickly glance.
2023-05-19 11:18:45 -05:00
reivilibre 736199b763
Remove old R30 because R30v2 supercedes it (#10428)
R30v2 has been out since 2021-07-19 (https://github.com/matrix-org/synapse/pull/10332)
and we started collecting stats on 2021-08-16. Since it's been over a year now
(almost 2 years), this is enough grace period for us to now rip it out.
2023-05-19 11:13:44 -05:00
Patrick Cloke 1e89976b26
Rename blacklist/whitelist internally. (#15620)
Avoid renaming configuration settings for now and rename internal code
to use blocklist and allowlist instead.
2023-05-19 12:25:25 +00:00
Patrick Cloke 89a23c9406
Do not allow deactivated users to login with JWT. (#15624)
To improve the organization of this code it moves the JWT login
checks to a separate handler and then fixes the bug (and a
deprecation warning).
2023-05-19 08:06:54 -04:00
Patrick Cloke 07771fa487
Remove experimental configuration flags & unstable values for faster joins (#15625)
Synapse will no longer send (or respond to) the unstable flags
for faster joins. These were only available behind a configuration
flag and handled in parallel with the stable flags.
2023-05-19 07:23:09 -04:00
Sean Quah d0de452d12
Fix `HomeServer`s leaking during `trial` test runs (#15630)
This change fixes two memory leaks during `trial` test runs.

Garbage collection is disabled during each test case and a gen-0 GC is
run at the end of each test. However, when the gen-0 GC is run, the
`TestCase` object usually still holds references to the `HomeServer`
used during the test. As a result, the `HomeServer` gets promoted to
gen-1 and then never garbage collected.

Fix this by periodically running full GCs.

Additionally, fix `HomeServer`s leaking after tests that touch inbound
federation due to `FederationRateLimiter`s adding themselves to a global
set, by turning the set into a `WeakSet`.

Resolves #15622.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-05-19 11:17:12 +01:00
Nick Mills-Barrett ad50510a06
Handle missing previous read marker event. (#15464)
If the previous read marker is pointing to an event that no longer exists
(e.g. due to retention) then assume that the newly given read marker
is newer.
2023-05-18 14:37:31 -04:00
Jonathan de Jong e5b4d93770
Update Mutual Rooms (MSC2666) implementation (#15621)
To track changes in MSC2666:

- The change from `/mutual_rooms/{user_id}` to `/mutual_rooms?user_id={user_id}`.
- The addition of `next_batch_token` (and logic).
- Unstable flag now being `uk.half-shot.msc2666.query_mutual_rooms`.
- The error code when your own user is requested.
2023-05-18 12:49:12 -04:00
Sean Quah 68dcd2cbcb
Re-type config paths in `ConfigError`s to be `StrSequence`s (#15615)
Part of #14809.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-05-18 11:11:30 +01:00
Sean Quah e15aa00bc0
Fix error message when `app_service_config_files` validation fails (#15614)
The second argument of `ConfigError` is a path, passed as an optional
`Iterable[str]` and not a `str`. If a string is passed directly,
Synapse unhelpfully emits "Error in configuration at
a.p.p._.s.e.r.v.i.c.e._.c.o.n.f.i.g._.f.i.l.e.s'" when the config
option has the wrong data type.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-05-18 10:58:13 +01:00
Quentin Gliech 41b9def9f2
Add a new admin API to create a new device for a user. (#15611)
This allows an external service (e.g. the matrix-authentication-service)
to create devices for users.
2023-05-17 14:39:06 +00:00
Patrick Cloke 4ee82c0576
Apply url_preview_url_blacklist to oEmbed and pre-cached images (#15601)
There are two situations which were previously not properly checked:

1. If the requested URL was replaced with an oEmbed URL, then the
   oEmbed URL was not checked against url_preview_url_blacklist.
2. Follow-up URLs (either via autodiscovery of oEmbed or to pre-cache
   images) were not checked against url_preview_url_blacklist.
2023-05-16 16:25:01 -04:00
Patrick Cloke 375b0a8a11
Update code to refer to "workers". (#15606)
A bunch of comments and variables are out of date and use
obsolete terms.
2023-05-16 15:56:38 -04:00
Shay 9f6ff6a0eb
Add not null constraint to column `full_user_id` of tables `profiles` and `user_filters` (#15537) 2023-05-16 10:57:39 -07:00
Eric Eastwood 77cda342be `traceback.format_exception(...)` usage that is compatible with Python 3.7 and 3.11 (#15599)
* Usage that is compatible with Python 3.8 and 3.11

> Since Python 3.10, instead of passing value and tb, an exception object can
  be passed as the first argument. If value and tb are provided, the first
  argument is ignored in order to provide backwards compatibility.
>
> -- https://docs.python.org/3/library/traceback.html

* Add changelog
2023-05-16 12:33:18 -05:00
Eric Eastwood c51d2e6199
Fix subscriptable type usage in Python <3.9 (#15604)
Fix the following `mypy` errors when running `mypy` with Python 3.7:
```
synapse/storage/controllers/stats.py:58: error: "Counter" is not subscriptable, use "typing.Counter" instead  [misc]

tests/test_state.py:267: error: "dict" is not subscriptable, use "typing.Dict" instead  [misc]
```

Part of https://github.com/matrix-org/synapse/issues/15603

In Python 3.9, `typing` is deprecated and the types are subscriptable (generics) by default, https://peps.python.org/pep-0585/#implementation
2023-05-16 12:19:46 -05:00
Eric Eastwood b6a7d49b6f
`traceback.format_exception(...)` usage that is compatible with Python 3.7 and 3.11 (#15599)
* Usage that is compatible with Python 3.8 and 3.11

> Since Python 3.10, instead of passing value and tb, an exception object can
  be passed as the first argument. If value and tb are provided, the first
  argument is ignored in order to provide backwards compatibility.
>
> -- https://docs.python.org/3/library/traceback.html

* Add changelog
2023-05-16 14:56:42 +01:00
Shay ba572647b2
Export `run_as_background_process` from the module API (#15577) 2023-05-15 13:11:21 -07:00
Patrick Cloke f2905d827f
Implement MSC3821 to update redaction rules (`third_party_invite.signed`) (#15563)
Updates the redaction rules to protect enough information that the
event can still be properly verified.
2023-05-15 15:02:24 -04:00
Patrick Cloke eb3c1823d8
Reject instead of erroring on invalid membership events. (#15564)
Instead of resulting in an internal server error for invalid events,
return that the event is invalid.
2023-05-15 15:01:29 -04:00
Patrick Cloke ba6b21c81e
Implement MSC3389 to protect relations from redaction. (#15565)
MSC3389 proposes protecting the relation type & parent event ID
from redaction. This keeps the relation information intact after
redaction which helps with some UX flaws (e.g. deleting an
event causes it to no longer be in a thread, which is confusing).
2023-05-15 12:58:09 +00:00
Michael Weimann 3690d5bd89
Add an unstable feature flag for MSC3981 to the /versions endpoint (#15558)
Signed-off-by: Michael Weimann <michaelw@matrix.org>
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2023-05-15 10:54:49 +02:00
Patrick Cloke def480442d
Declare support for Matrix 1.6 (#15559)
Adds logging for key server requests which include a key ID.
This is technically in violation of the 1.6 spec, but is the only
way to remain backwards compatibly with earlier versions of
Synapse (and possibly other homeservers) which *did* include
the key ID.
2023-05-12 07:31:50 -04:00
Erik Johnston 808105bd31
Revert "Set thread_id column to non-null for event_push_{actions,actions_staging,summary} (#15437)" (#15580)
This reverts commit a7b3e9ce65.
2023-05-12 11:38:16 +01:00
Eric Eastwood d19d1edbcf
Print full startup/initialization error (#15569)
I found the error in the **Before** really vague and obtuse and didn't realize port `5432` corresponded to the Postgres port until searching the codebase. It says to check the logs but that wasn't my first instinct. It's just more obvious if we just print the full thing which gives context of the error type and the traceback to the relevant area of code.

#### Before

```
$ poetry run python -m synapse.app.homeserver -c homeserver.yaml
**********************************************************************************
 Error during initialisation:
    connection to server at "localhost" (::1), port 5432 failed: Connection refused
 	Is the server running on that host and accepting TCP/IP connections?
 connection to server at "localhost" (127.0.0.1), port 5432 failed: Connection refused
 	Is the server running on that host and accepting TCP/IP connections?
 
 There may be more information in the logs.
**********************************************************************************
```

#### After

```sh
$ poetry run python -m synapse.app.homeserver -c homeserver.yaml
**********************************************************************************
 Error during initialisation:
     Traceback (most recent call last):
       File "/home/eric/Documents/github/element/synapse/synapse/app/homeserver.py", line 352, in setup
         hs.setup()
       File "/home/eric/Documents/github/element/synapse/synapse/server.py", line 337, in setup
         self.datastores = Databases(self.DATASTORE_CLASS, self)
       File "/home/eric/Documents/github/element/synapse/synapse/storage/databases/__init__.py", line 65, in __init__
         with make_conn(database_config, engine, "startup") as db_conn:
       File "/home/eric/Documents/github/element/synapse/synapse/storage/database.py", line 161, in make_conn
         native_db_conn = engine.module.connect(**db_params)
       File "/home/eric/.cache/pypoetry/virtualenvs/matrix-synapse-xCtC9ulO-py3.10/lib/python3.10/site-packages/psycopg2/__init__.py", line 122, in connect
         conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
     psycopg2.OperationalError: connection to server at "localhost" (::1), port 5432 failed: Connection refused
     	Is the server running on that host and accepting TCP/IP connections?
     connection to server at "localhost" (127.0.0.1), port 5432 failed: Connection refused
     	Is the server running on that host and accepting TCP/IP connections?
 
 
 There may be more information in the logs.
**********************************************************************************
```
2023-05-11 11:50:46 -05:00
Roel ter Maat 2611433b70
Add redis SSL configuration options (#15312)
* Add SSL options to redis config

* fix lint issues

* Add documentation and changelog file

* add missing . at the end of the changelog

* Move client context factory to new file

* Rename ssl to tls and fix typo

* fix lint issues

* Added when redis attributes were added
2023-05-11 13:02:51 +01:00
Jason Little e4f545c452
Remove `worker_replication_*` settings (#15491)
* Add master to the instance_map as part of Complement, have ReplicationEndpoint look at instance_map for master.

* Fix typo in drive by.

* Remove unnecessary worker_replication_* bits from unit tests and add master to instance_map(hopefully in the right place)

* Several updates:

1. Switch from master to main for naming the main process in the instance_map. Add useful constants for easier adjustment of names in the future.
2. Add backwards compatibility for worker_replication_* to allow time to transition to new style. Make sure to prioritize declaring main directly on the instance_map.
3. Clean up old comments/commented out code.
4. Adjust unit tests to match with new code.
5. Adjust Complement setup infrastructure to only add main to the instance_map if workers are used and remove now unused options from the worker.yaml template.

* Initial Docs upload

* Changelog

* Missed some commented out code that can go now

* Remove TODO comment that no longer holds true.

* Fix links in docs

* More docs

* Remove debug logging

* Apply suggestions from code review

Co-authored-by: reivilibre <olivier@librepush.net>

* Apply suggestions from code review

Co-authored-by: reivilibre <olivier@librepush.net>

* Update version to latest, include completeish before/after examples in upgrade notes.

* Fix up and docs too

---------

Co-authored-by: reivilibre <olivier@librepush.net>
2023-05-11 11:30:56 +01:00
Andrew Morgan 722ccc30b5
Add an unstable feature flag for MSC3391 to the /versions endpoint (#15562) 2023-05-11 10:38:32 +01:00
Tulir Asokan 86d541f37c
Stabilize MSC2659 support for AS ping endpoint. (#15528) 2023-05-09 15:02:36 -04:00
Jason Little d3bd03559b
HTTP Replication Client (#15470)
Separate out a HTTP client for replication in preparation for
also supporting using UNIX sockets. The major difference from
the base class is that this does not use treq to handle HTTP
requests.
2023-05-09 14:25:20 -04:00
Travis Ralston ab4535b608
Add config option to prevent media downloads from listed domains. (#15197)
This stops media (and thumbnails) from being accessed from the
listed domains. It does not delete any already locally cached media,
but will prevent accessing it.

Note that admin APIs are unaffected by this change.
2023-05-09 14:08:51 -04:00
Patrick Cloke 4b4e0dc3ce
Error if attempting to set m.push_rules account data, per MSC4010. (#15555)
m.push_rules, like m.fully_read, is a special account data type that cannot
be set using the normal /account_data endpoint. Return an error instead
of allowing data that will not be used to be stored.
2023-05-09 10:34:10 -04:00
Patrick Cloke 2bfe3f0b81
Use account data constants in more places. (#15554) 2023-05-09 07:23:27 -04:00
Patrick Cloke 28bceef84e
Check appservices for devices during a /user/devices query. (#15539)
MSC3984 proxies /keys/query requests to appservices, but servers will
can also requests devices / keys from the /user/devices endpoint.

The formats are close enough that we can "proxy" that /user/devices to
appservices (by calling /keys/query) and then change the format of the
returned data before returning it over federation.
2023-05-05 15:18:47 -04:00
Patrick Cloke 36df9c5e36
Implement MSC4009 to widen the allowed Matrix ID grammar (#15536)
Behind a configuration flag this adds + to the list of allowed
characters in Matrix IDs. The main feature this enables is
using full E.164 phone numbers as Matrix IDs.
2023-05-05 12:13:50 -04:00
Zdziszek a0f53afd62
Handle `DNSNotImplementedError` in SRV resolver (#15523)
Signed-off-by: Zdzichu <zdzichu.rks@protonmail.com>
2023-05-05 15:54:32 +01:00
Andrew Morgan 7c95b65873
Clean up and clarify "Create or modify Account" Admin API documentation (#15544) 2023-05-05 15:51:46 +01:00
Sean Quah e46d5f3586
Factor out an `is_mine_server_name` method (#15542)
Add an `is_mine_server_name` method, similar to `is_mine_id`.

Ideally we would use this consistently, instead of sometimes comparing
against `hs.hostname` and other times reaching into
`hs.config.server.server_name`.

Also fix a bug in the tests where `hs.hostname` would sometimes differ
from `hs.config.server.server_name`.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-05-05 15:06:22 +01:00
Andrew Morgan 2e59e97ebd
Move ThirdPartyEventRules into module_api/callbacks (#15535) 2023-05-04 14:18:22 +00:00
Patrick Cloke ded8f3d349
Update the base rules to remove the dont_notify action. (MSC3987) (#15534)
A dont_notify action is a no-op (and coalesce is undefined). These are
both considered no-ops by the spec, per MSC3987 and the predefined
push rules were updated to remove dont_notify from the list of actions.
2023-05-04 11:54:13 +00:00
Sandro 5f8822854d
Use oEmbed for YouTube Shorts (#15025)
It seems that YouTube Short previews do not work in some
regions, but the oEmbed information for those areas is still
valid.

This causes YouTube Shorts to always use (only) the oEmbed
endpoint which is a minor regression for regions where the URL
preview was already working -- some of the additional video
metadata is lost. It is not likely that clients are using this today
and it is more beneficial to have a limited preview working everywhere
than unused metadata in the Open Graph response.
2023-05-03 12:54:42 -04:00
Sean Quah 8aee823393 Merge branch 'release-v1.83' into develop 2023-05-03 15:23:16 +01:00
Erik Johnston 28ac1a1a91
Speed up deleting of old rows in `event_push_actions` (#15531)
Enforce that we use index scans (rather than seq scans), which we also do for state queries. The reason to enforce this is that we can't correctly get PostgreSQL to understand the distribution of `stream_ordering` depends on `highlight`, and so it always defaults (on matrix.org) to sequential scans.
2023-05-03 13:42:43 +00:00
Erik Johnston fc3a878220
Speed up rebuilding of the user directory for local users (#15529)
The idea here is to batch up the work.
2023-05-03 13:41:37 +00:00
Sean Quah 3b837d856c
Revert "Reduce the size of the HTTP connection pool for non-pushers" (#15530)
#15514 introduced a regression where Synapse would encounter
`PartialDownloadError`s when fetching OpenID metadata for certain
providers on startup. Due to #8088, this prevents Synapse from starting
entirely.

Revert the change while we decide what to do about the regression.
2023-05-03 13:09:20 +01:00
Patrick Cloke a7b3e9ce65
Set thread_id column to non-null for event_push_{actions,actions_staging,summary} (#15437)
Updates the database schema to require a thread_id (by adding a
constraint that the column is non-null) for event_push_actions,
event_push_actions_staging, and event_push_actions_summary.

For PostgreSQL we add the constraint as NOT VALID, then
VALIDATE the constraint a background job to avoid locking
the table during an upgrade.

For SQLite we simply rebuild the table & copy the data.
2023-05-03 07:49:03 -04:00
Sean Quah 04e79e6a18
Add config option to forget rooms automatically when users leave them (#15224)
This is largely based off the stats and user directory updater code.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-05-03 12:27:33 +01:00
Shay 0e8aa2a1b2
Remove references to supporting per-user flag for msc2654 (#15522) 2023-05-02 14:21:36 -07:00
Erik Johnston 4de271a7fc
Allow adding random delay to push (#15516)
This is to discourage timing based profiling on the push gateways.
2023-05-02 16:45:44 +00:00
Patrick Cloke 6aca4e7cb8
Reduce the size of the HTTP connection pool for non-pushers. (#15514)
Pushers tend to make many connections to the same HTTP host
(e.g. a new event comes in, causes events to be pushed, and then
the homeserver connects to the same host many times). Due to this
the per-host HTTP connection pool size was increased, but this does
not make sense for other SimpleHttpClients.

Add a parameter for the connection pool and override it for pushers
(making a separate SimpleHttpClient for pushers with the increased
configuration).

This returns the HTTP connection pool settings to the default Twisted
ones for non-pusher HTTP clients.
2023-05-02 09:29:40 -04:00
Patrick Cloke 07b1c70d6b
Initial implementation of MSC3981: recursive relations API (#15315)
Adds an optional keyword argument to the /relations API which
will recurse a limited number of event relationships.

This will cause the API to return not just the events related to the
parent event, but also events related to those related to the parent
event, etc.

This is disabled by default behind an experimental configuration
flag and is currently implemented using prefixed parameters.
2023-05-02 07:59:55 -04:00
Shay 89f6fb0d5a
Add an admin API endpoint to support per-user feature flags (#15344) 2023-04-28 11:33:45 -07:00
Patrick Cloke 57aeeb308b
Add support for claiming multiple OTKs at once. (#15468)
MSC3983 provides a way to request multiple OTKs at once from appservices,
this extends this concept to the Client-Server API.

Note that this will likely be spit out into a separate MSC, but is currently part of
MSC3983.
2023-04-27 12:57:46 -04:00
Patrick Cloke 6efa674004
Add type hints to schema deltas (#15497)
Cleans-up the schema delta files:

* Removes no-op functions.
* Adds missing type hints to function parameters.
* Fixes any issues with type hints.

This also renames one (very old) schema delta to avoid a conflict
that mypy complains about.
2023-04-27 12:44:53 +00:00
Patrick Cloke a346b43837
Check databases/__init__ and main/cache with mypy. (#15496) 2023-04-27 07:59:14 -04:00
mcalinghee 486c059479
Disable push rule evaluation for rooms excluded from sync (#15361)
* no push for excluded room from sync

* add changelog
Signed-off-by: Maghen Calinghee <maghen.calinghee@beta.gouv.fr>

* correct changelog
2023-04-27 11:32:02 +01:00
Shay 301b4156d5
Add column `full_user_id` to tables `profiles` and `user_filters`. (#15458) 2023-04-26 16:03:26 -07:00
Mathieu Velten 247e6a8a78
Add a module API to send an HTTP push notification (#15387)
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2023-04-26 21:10:51 +02:00
Erik Johnston 9900f7c231
Add admin endpoint to query room sizes (#15482) 2023-04-26 16:00:11 +00:00
Patrick Cloke 8e9739449d
Add unstable /keys/claim endpoint which always returns fallback keys. (#15462)
It can be useful to always return the fallback key when attempting to
claim keys. This adds an unstable endpoint for `/keys/claim` which
always returns fallback keys in addition to one-time-keys.

The fallback key(s) are not marked as "used" unless there are no
corresponding OTKs.

This is currently defined in MSC3983 (although likely to be split out
to a separate MSC). The endpoint shape may change or be requested
differently (i.e. a keyword parameter on the current endpoint), but the
core logic should be reasonable.
2023-04-25 13:30:41 -04:00
Nick Mills-Barrett c55293c230
Re re introduce membership tables event stream ordering (#15356) 2023-04-25 09:44:29 +01:00
Quentin Gliech 8b3a502996
Experimental support for MSC3970: per-device transaction IDs (#15318) 2023-04-25 09:37:09 +01:00
Patrick Cloke ea5c3ede4f
Finish type hints for federation client HTTP code. (#15465) 2023-04-24 13:12:06 -04:00
Alok Kumar Singh 197fbb123b
Remove legacy code of single user device resync api (#15418)
* Removed single-user resync usage and updated it to use multi-user counterpart

Signed-off-by: Alok Kumar Singh alokaks601@gmail.com
2023-04-21 12:06:39 +01:00
Patrick Cloke 5e024a0645
Modify StoreKeyFetcher to read from server_keys_json. (#15417)
Before this change:

* `PerspectivesKeyFetcher` and `ServerKeyFetcher` write to `server_keys_json`.
* `PerspectivesKeyFetcher` also writes to `server_signature_keys`.
* `StoreKeyFetcher` reads from `server_signature_keys`.

After this change:

* `PerspectivesKeyFetcher` and `ServerKeyFetcher` write to `server_keys_json`.
* `PerspectivesKeyFetcher` also writes to `server_signature_keys`.
* `StoreKeyFetcher` reads from `server_keys_json`.

This results in `StoreKeyFetcher` now using the results from `ServerKeyFetcher`
in addition to those from `PerspectivesKeyFetcher`, i.e. keys which are directly
fetched from a server will now be pulled from the database instead of refetched.

An additional minor change is included to avoid creating a `PerspectivesKeyFetcher`
(and checking it) if no `trusted_key_servers` are configured.

The overall impact of this should be better usage of cached results:

* If a server has no trusted key servers configured then it should reduce how often keys
  are fetched.
* if a server's trusted key server does not have a requested server's keys cached then it
  should reduce how often keys are directly fetched.
2023-04-20 12:30:32 -04:00
Andrew Morgan aec639e3e3
Move Spam Checker callbacks to a dedicated file (#15453) 2023-04-18 00:57:40 +00:00
Jason Little e12d788bb7
Switch `InstanceLocationConfig` to a pydantic `BaseModel` (#15431)
* Switch InstanceLocationConfig to a pydantic BaseModel, apply Strict* types and add a few helper methods(that will make more sense in follow up work).

Co-authored-by: David Robertson <davidr@element.io>
2023-04-17 23:53:43 +00:00
Jason Little c9326140dc
Refactor `SimpleHttpClient` to pull out reusable methods (#15427)
Pulls out some methods to `BaseHttpClient` to eventually be
reused in other contexts.
2023-04-14 20:46:04 +00:00
David Robertson 8a47d6e3a6
More precise type for LoggingTransaction.execute (#15432)
* More precise type for LoggingTransaction.execute
* Add an annotation for stream_ordering_month_ago

This would have spotted the error that was fixed in "Add comma missing from #15382. (#15429)"
2023-04-14 18:04:49 +00:00
Dirk Klimpel 24b61f32ff
Disable directory listing for `StaticResource` (#15438) 2023-04-14 13:49:47 -04:00
Dirk Klimpel e4a25d022c
Load `/capabilities` endpoint on workers (#15436) 2023-04-14 12:26:07 -04:00
Erik Johnston b5192355f6
User directory background update speedup (#15435)
c.f. #15264

The two changes are:
1. Add indexes so that the select / deletes don't do sequential scans
2. Don't repeatedly call `SELECT count(*)` each iteration, as that's slow
2023-04-14 16:10:32 +01:00
Mathieu Velten dabbb94faf
Delete pushers after calling on_logged_out module hook on device delete (#15410) 2023-04-14 14:12:37 +02:00
Dirk Klimpel 4af0aec54d
Load `/directory/room/{roomAlias}` endpoint on workers (#15333)
* Enable `directory`

* move to worker store

* newsfile

* disable `ClientDirectoryListServer` and `ClientAppserviceDirectoryListServer` for workers
2023-04-14 10:24:06 +01:00
Patrick Cloke d751f65e71
Remove registration fallback code. (#15405)
The registration fallback is broken and unspecced. This removes it
since there is no plan to spec it.

Note that this does not modify the login fallback code.
2023-04-13 11:36:29 -04:00
reivilibre edae20f926
Improve robustness when handling a perspective key response by deduplicating received server keys. (#15423)
* Change `store_server_verify_keys` to take a `Mapping[(str, str), FKR]`

This is because we already can't handle duplicate keys — leads to cardinality violation

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-04-13 15:35:03 +01:00
reivilibre 38272be037
Add comma missing from #15382. (#15429)
* Add missing comma

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-04-13 15:06:25 +01:00
Patrick Cloke 2503126d52
Implement MSC2174: move redacts to a content property. (#15395)
This moves `redacts` from being a top-level property to
a `content` property in a new room version.

MSC2176 (which was previously implemented) states to not
`redact` this property.
2023-04-13 13:47:07 +00:00
Dirk Klimpel c9723a1c1f
Only load the SSO redirect servlet if SSO is enabled. (#15421) 2023-04-13 13:08:00 +00:00
Dirk Klimpel be36600327
Disable loading `RefreshTokenServlet` on workers (#15428) 2023-04-13 13:28:55 +02:00
Will Hunt 253e86a72e
Throw if the appservice config list is the wrong type (#15425)
* raise a ConfigError on an invalid app_service_config_files

* changelog

* Move config check to read_config

* Add test

* Ensure list also contains strings
2023-04-12 11:28:46 +00:00
Patrick Cloke d07d255830
Implement MSC2175: remove the creator field from create events. (#15394) 2023-04-06 16:26:28 -04:00
Erik Johnston 485b9fdefb
Don't keep old stream_ordering_to_exterm around (#15382) 2023-04-06 16:42:39 +00:00
Patrick Cloke 72b43bec8b Merge remote-tracking branch 'origin/release-v1.81' into develop 2023-04-06 11:44:26 -04:00
Patrick Cloke 83649b891d
Implement MSC3989 to redact the origin field. (#15393)
This will be done in a future room version, for now an unstable
room version is added which redacts the origin field.
2023-04-05 14:42:46 -04:00
Quentin Gliech 6eb3edec47
Fix the 'set_device_id_for_pushers_txn' background update. (#15391)
Refer to the correct field from the response when updating
the background update progress.
2023-04-05 07:49:15 -04:00
Shay 6b23d74ad1
Delete server-side backup keys when deactivating an account. (#15181) 2023-04-04 20:16:08 +00:00
Erik Johnston 79d2e2e79c
Speed up membership queries for users with forgotten rooms (#15385) 2023-04-04 14:11:34 +01:00
Sean Quah 89a71e7390
Fix a rare bug where initial /syncs would fail (#15383)
This change fixes a rare bug where initial /syncs would fail with a
`KeyError` under the following circumstances:
 1. A user fast joins a remote room.
 2. The user is kicked from the room before the room's full state has
    been synced.
 3. A second local user fast joins the room.
 4. Events are backfilled into the room with a higher topological
    ordering than the original user's leave. They are assigned a
    negative stream ordering. It's not clear how backfill happened here,
    since it is expected to be equivalent to syncing the full state.
 5. The second local user leaves the room before the room's full state
    has been synced. The homeserver does not complete the sync.
 6. The original user performs an initial /sync with lazy_load_members
    enabled.
     * Because they were kicked from the room, the room is included in
       the /sync response even though the include_leave option is not
       specified.
     * To populate the room's timeline, `_load_filtered_recents` /
       `get_recent_events_for_room` fetches events with a lower stream
       ordering than the leave event and picks the ones with the highest
       topological orderings (which are most recent). This captures the
       backfilled events after the leave, since they have a negative
       stream ordering. These events are filtered out of the timeline,
       since the user was not in the room at the time and cannot view
       them. The sync code ends up with an empty timeline for the room
       that notably does not include the user's leave event.
       This seems buggy, but at least we don't disclose events the user
       isn't allowed to see.
     * Normally, `compute_state_delta` would fetch the state at the
       start and end of the room's timeline to generate the sync
       response. Since the timeline is empty, it fetches the state at
       `min(now, last event in the room)`, which corresponds with the
       second user's leave. The state during the entirety of the second
       user's membership does not include the membership for the first
       user because of partial state.
       This part is also questionable, since we are fetching state from
       outside the bounds of the user's membership.
     * `compute_state_delta` then tries and fails to find the user's
       membership in the auth events of timeline events. Because there
       is no timeline event whose auth events are expected to contain
       the user's membership, a `KeyError` is raised.

Also contains a drive-by fix for a separate unlikely race condition.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-04-04 13:10:25 +01:00
Patrick Cloke cf2f2934ad
Call appservices on modern paths, falling back to legacy paths. (#15317)
This uses the specced /_matrix/app/v1/... paths instead of the
"legacy" paths. If the homeserver receives an error it will retry
using the legacy path.
2023-04-03 13:20:32 -04:00
Jason Little 56efa9b167
Experimental Unix socket support (#15353)
* Add IReactorUNIX to ISynapseReactor type hint.

* Create listen_unix().

Two options, 'path' to the file and 'mode' of permissions(not umask, recommend 666 as default as
nginx/other reverse proxies write to it and it's setup as user www-data)

For the moment, leave the option to always create a PID lockfile turned on by default

* Create UnixListenerConfig and wire it up.

Rename ListenerConfig to TCPListenerConfig, then Union them together into ListenerConfig.
This spidered around a bit, but I think I got it all. Metrics and manhole have been placed
behind a conditional in case of accidental putting them onto a unix socket.

Use new helpers to get if a listener is configured for TLS, and to help create a site tag
for logging.

There are 2 TODO things in parse_listener_def() to finish up at a later point.

* Refactor SynapseRequest to handle logging correctly when using a unix socket.

This prevents an exception when an IP address can not be retrieved for a request.

* Make the 'Synapse now listening on Unix socket' log line a little prettier.

* No silent failures on generic workers when trying to use a unix socket with metrics or manhole.

* Inline variables in app/_base.py

* Update docstring for listen_unix() to remove reference to a hardcoded permission of 0o666 and add a few comments saying where the default IS declared.

* Disallow both a unix socket and a ip/port combo on the same listener resource

* Linting

* Changelog

* review: simplify how listen_unix returns(and get rid of a type: ignore)

* review: fix typo from ConfigError in app/homeserver.py

* review: roll conditional for http_options.tag into get_site_tag() helper(and add docstring)

* review: enhance the conditionals for checking if a port or path is valid, remove a TODO line

* review: Try updating comment in get_client_ip_if_available to clarify what is being retrieved and why

* Pretty up how 'Synapse now listening on Unix Socket' looks by decoding the byte string.

* review: In parse_listener_def(), raise ConfigError if neither socket_path nor port is declared(and fix a typo)
2023-04-03 10:27:51 +01:00
Jason Robinson 157092d97a
Fix copyright year in SSO footer template (#15358) 2023-03-31 18:20:40 +01:00
Erik Johnston 6204c3663e
Revert pruning of old devices (#15360)
* Revert "Fix registering a device on an account with lots of devices (#15348)"

This reverts commit f0d8f66eaa.

* Revert "Delete stale non-e2e devices for users, take 3 (#15183)"

This reverts commit 78cdb72cd6.
2023-03-31 13:51:51 +01:00
Olivier Wilkinson (reivilibre) 72d2ceaa9a Revert "Set thread_id column to non-null for event_push_{actions,actions_staging,summary} (#15350)"
This reverts commit 2a234b788e.

See #15359 for context.
2023-03-31 12:10:10 +01:00
Patrick Cloke 2a234b788e
Set thread_id column to non-null for event_push_{actions,actions_staging,summary} (#15350)
Clean-up from adding the thread_id column, which was initially
null but backfilled with values. It is desirable to require it to now
be non-null.

In addition to altering this column to be non-null, we clean up
obsolete background jobs, indexes, and just-in-time updating
code.
2023-03-30 15:11:31 -04:00
Mathieu Velten 6f68e32bfb
to_device updates could be dropped when consuming the replication stream (#15349)
Co-authored-by: reivilibre <oliverw@matrix.org>
2023-03-30 19:41:14 +02:00
Erik Johnston 91c3f32673
Speed up SQLite unit test CI (#15334)
Tests now take 40% of the time.
2023-03-30 16:21:12 +01:00
Patrick Cloke ae4acda1bb
Implement MSC3984 to proxy /keys/query requests to appservices. (#15321)
If enabled, for users which are exclusively owned by an application
service then the appservice will be queried for devices in addition
to any information stored in the Synapse database.
2023-03-30 08:39:38 -04:00
Sean Quah d9f694932c
Fix spinloop during partial state sync when a prev event is in backoff (#15351)
Previously, we would spin in a tight loop until
`update_state_for_partial_state_event` stopped raising
`FederationPullAttemptBackoffError`s. Replace the spinloop with a wait
until the backoff period has expired.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-03-30 13:36:41 +01:00
Warren Bailey a3bad89d57
Add the ability to enable/disable registrations when in the OIDC flow (#14978)
Signed-off-by: Warren Bailey <warren@warrenbailey.net>
2023-03-30 11:09:41 +00:00
Mathieu Velten 9228ae633f
Add some clarification to the doc/comments regarding TCP replication (#15354) 2023-03-30 12:51:35 +02:00
Cyberes 9d641d88b7
Fix missing app variable in mail subject for password resets (#15352)
* Update mailer.py

Fix `KeyError: 'app'`

* Create 15352.bugfix

Signed-off-by: Cyberes <cyberes@evulid.cc>

---------

Signed-off-by: Cyberes <cyberes@evulid.cc>
2023-03-30 11:44:53 +01:00
Erik Johnston f0d8f66eaa
Fix registering a device on an account with lots of devices (#15348)
Fixes up #15183
2023-03-29 13:37:06 +00:00
Erik Johnston 5350b5d04d
Revert "Reintroduce membership tables event stream ordering (#15128)" (#15347)
This reverts commit e6af49fbea.
2023-03-29 13:24:28 +01:00
Erik Johnston 78cdb72cd6
Delete stale non-e2e devices for users, take 3 (#15183)
This should help reduce the number of devices e.g. simple bots the repeatedly login rack up.

We only delete non-e2e devices as they should be safe to delete, whereas if we delete e2e devices for a user we may accidentally break their ability to receive e2e keys for a message.
2023-03-29 12:07:14 +01:00
DeepBlueV7.X 753d1d9cde
Fix joining rooms you have been unbanned from (#15323)
* Fix joining rooms you have been unbanned from

Since forever synapse did not allow you to join a room after you have
been unbanned from it over federation. This was not actually because of
the unban event not federating. Synapse simply used outdated state to
validate the join transition. This skips the validation if we are not in
the room and for that reason won't have the current room state.

Fixes #1563

Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>

* Add changelog

Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>

* Update changelog.d/15323.bugfix

---------

Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
2023-03-29 08:37:27 +00:00
Patrick Cloke 5282ba1e2b
Implement MSC3983 to proxy /keys/claim queries to appservices. (#15314)
Experimental support for MSC3983 is behind a configuration flag.
If enabled, for users which are exclusively owned by an application
service then the appservice will be queried for one-time keys *if*
there are none uploaded to Synapse.
2023-03-28 18:26:27 +00:00
dependabot[bot] bd4d958aaf
Bump ruff from 0.0.252 to 0.0.259 (#15328)
* Bump ruff from 0.0.252 to 0.0.259

Bumps [ruff](https://github.com/charliermarsh/ruff) from 0.0.252 to 0.0.259.
- [Release notes](https://github.com/charliermarsh/ruff/releases)
- [Changelog](https://github.com/charliermarsh/ruff/blob/main/BREAKING_CHANGES.md)
- [Commits](https://github.com/charliermarsh/ruff/compare/v0.0.252...v0.0.259)

---
updated-dependencies:
- dependency-name: ruff
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Fix new warnings

* Mypy

* Newsfile

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Erik Johnston <erik@matrix.org>
2023-03-28 09:46:47 +01:00
Erik Johnston 96f163d932
Prune old typing notifications (#15332)
Rather than keeping them around forever in memory, slowing things down.

Fixes #11750.
2023-03-27 14:32:36 +01:00
Dirk Klimpel 4fc85e5a92
Load `/password_policy` endpoint on workers. (#15331) 2023-03-27 07:37:17 -04:00
reivilibre d5324ee111
Add developer documentation for the Federation Sender and add a documentation mechanism using Sphinx. (#15265)
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2023-03-24 16:41:10 +00:00
reivilibre 5f7c908280
As an optimisation, use `TRUNCATE` on Postgres when clearing the user directory tables. (#15316) 2023-03-24 15:31:12 +00:00
Quentin Gliech 5b70f240cf
Make cleaning up pushers depend on the device_id instead of the token_id (#15280)
This makes it so that we rely on the `device_id` to delete pushers on logout,
instead of relying on the `access_token_id`. This ensures we're not removing
pushers on token refresh, and prepares for a world without access token IDs
(also known as the OIDC).

This actually runs the `set_device_id_for_pushers` background update, which
was forgotten in #13831.

Note that for backwards compatibility it still deletes pushers based on the
`access_token` until the background update finishes.
2023-03-24 11:09:39 -04:00
Patrick Cloke 68a6717312
Reject mentions on the C-S API which are invalid. (#15311)
Invalid mentions data received over the Client-Server API should
be rejected with a 400 error. This will hopefully stop clients from
sending invalid data, although does not help with data received
over federation.
2023-03-24 08:31:14 -04:00
Nick Mills-Barrett e6af49fbea
Reintroduce membership tables event stream ordering (#15128)
* Add `event_stream_ordering` column to membership state tables

Specifically this adds the column to `current_state_events`,
`local_current_membership` and `room_memberships`. Each of these tables
is regularly joined with the `events` table to get the stream ordering
and denormalising this into each table will yield significant query
performance improvements once used.

* Make denormalised `event_stream_ordering` columns foreign keys
* Add comment in schema file explaining new denormalised columns
* Add triggers to enforce consistency of `event_stream_ordering` columns
* Re-order purge room tables to account for foreign keys
* Bump schema version to 75

Co-authored-by: David Robertson <david.m.robertson1@gmail.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2023-03-24 11:44:01 +00:00
reivilibre 98fd558382
Add a primitive helper script for listing worker endpoints. (#15243)
Co-authored-by: Patrick Cloke <patrickc@matrix.org>
2023-03-23 12:11:14 +00:00
David Robertson 3b0083c92a
Use immutabledict instead of frozendict (#15113)
Additionally:

* Consistently use `freeze()` in test

---------

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: 6543 <6543@obermui.de>
2023-03-22 17:15:34 +00:00
Shay 7f02fafa28
Add a check to SQLite port DB script to ensure that the sqlite database passed to the script exists before trying to port from it (#15306) 2023-03-22 08:36:42 -07:00
David Robertson 1bc9985eb7
Have replication clients remove _INT_STREAM_POS (#15309)
* Have replication clients remove _INT_STREAM_POS

Suppose worker A makes an internal http request from worker B. B may
make changes that A later learns about over replication. We want A's
request to block until it has seen those changes—mainly to ensure A's
caches are invalidated promptly. This helps provide read-after-write
consistency, eliminating entire categories of races and test flakes.

To implement this, B includes a top-level field `_INT_STREAM_POS` in its
response JSON. Roughly speaking, the field's value tells A what to wait
for. But we weren't removing that internal field before A's request
completed!

Introduced in https://github.com/matrix-org/synapse/pull/14820.
Fixes #15308.

* Changelog
2023-03-22 12:53:55 +00:00
Shay 72f3f23c4d
Change the parameter `immediate` of `send_device_messages` to default to `True` (#15297) 2023-03-21 17:59:55 -07:00
Patrick Cloke 1bc4feb6c9
Apply & bundle edits for non-message events. (#15295) 2023-03-21 14:19:54 -04:00
Shay 96bcc5d902
Revert "check sqlite database file exists before porting/#14692" (#15301) 2023-03-21 10:49:25 -07:00
Andrew Morgan ec9224bf9a
Make `POST /_matrix/client/v3/rooms/{roomId}/report/{eventId}` endpoint return 404 if event exists, but the user lacks access (#15300) 2023-03-21 13:24:03 +00:00
Andrew Morgan b6aef59334
Make `EventHandler.get_event` return `None` when the requested event is not found (#15298) 2023-03-21 13:23:47 +00:00
Erik Johnston 827f198177
Fix error when sending message into deleted room. (#15235)
When a room is deleted in Synapse we remove the event forward
extremities in the room, so if (say a bot) tries to send a message into
the room we error out due to not being able to calculate prev events for
the new event *before* we check if the sender is in the room.

Fixes #8094
2023-03-21 09:13:43 +00:00
Patrick Cloke a5fb382a29
Separate HTTP preview code and URL previewer. (#15269)
Separates REST layer code from the actual URL previewing.
2023-03-20 14:32:26 -04:00
Shay 5ab7146e19
Add Synapse-Trace-Id to access-control-expose-headers header (#14974) 2023-03-20 11:14:05 -07:00
Patrick Cloke 25006acc17
Add /versions flag for MSC3952. (#15293) 2023-03-20 11:47:21 -04:00
Jason Little 3d70cc393f
Load `/register/available` endpoint on workers (#15268) 2023-03-17 09:50:31 -04:00
Patrick Cloke afb216c202
Remove no-op send_command for Redis replication. (#15274)
With Redis commands do not need to be re-issued by the main
process (they fan-out to all processes at once) and thus it is no
longer necessary to worry about them reflecting recursively forever.
2023-03-16 11:13:30 -04:00
Tulir Asokan b0a0fb5c97
Implement MSC2659: application service ping endpoint (#15249)
Signed-off-by: Tulir Asokan <tulir@maunium.net>
2023-03-16 15:00:03 +01:00
reivilibre 1f5473465d
Refresh remote profiles that have been marked as stale, in order to fill the user directory. [rei:userdirpriv] (#14756)
* Scaffolding for background process to refresh profiles

* Add scaffolding for background process to refresh profiles for a given server

* Implement the code to select servers to refresh from

* Ensure we don't build up multiple looping calls

* Make `get_profile` able to respect backoffs

* Add logic for refreshing users

* When backing off, schedule a refresh when the backoff is over

* Wake up the background processes when we receive an interesting state event

* Add tests

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

* Add comment about 1<<62

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-03-16 11:44:11 +00:00
Andrew Morgan 4953cd71df
Move Account Validity callbacks to a dedicated file (#15237) 2023-03-16 10:35:31 +00:00
reivilibre f54f877f27
Preparatory work to fix the user directory assuming that any remote membership state events represent a profile change. [rei:userdirpriv] (#14755)
* Remove special-case method for new memberships only, use more generic method

* Only collect profiles from state events in public rooms

* Add a table to track stale remote user profiles

* Add store methods to set and delete rows in this new table

* Mark remote profiles as stale when a member state event comes in to a private room

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

* Simplify by removing Optionality of `event_id`

* Replace names and avatars with None if they're set to dodgy things

I think this makes more sense anyway.

* Move schema delta to 74 (I missed the boat?)

* Turns out these can be None after all

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-03-16 09:55:19 +00:00
Patrick Cloke 3bf973edc7
Remove unused class: DirectTcpReplicationClientFactory. (#15272) 2023-03-15 15:42:20 -04:00
reivilibre 63d87c08c8
Add schema comments about the `destinations` and `destination_rooms` tables. (#15247) 2023-03-15 09:25:58 +00:00
reivilibre d0fe417f5c
Remove unused store method `_set_destination_retry_timings_emulated`. (#15266) 2023-03-14 17:32:46 +00:00
Patrick Cloke e7b559d2ca
Avoid unneeded work if auto-join rooms aren't configured. (#15262)
It is not necessary to reach out to the database to check some
parameters if the auto-join rooms are not configured, or (in some cases)
if auto-create rooms is not configured.
2023-03-14 08:18:49 -04:00
David Robertson a1c9869394
Merge branch 'release-v1.79' into develop 2023-03-13 18:35:21 +00:00
David Robertson c071cd5a0e
Ensure fed-sender catchup does not block for full state (#15248)
* Reproduce bad scenario in test
* Avoid catchup optimisation for partial state rooms
2023-03-13 12:31:19 +00:00
David Robertson 4bb26c95a9
Refactor `filter_events_for_server` (#15240)
* Tweak docstring and type hint

* Flip logic and provide better name

* Separate decision from action

* Track a set of strings, not EventBases

* Require explicit boolean options from callers

* Add explicit option for partial state rooms

* Changelog

* Rename param
2023-03-10 15:31:25 +00:00
Andrew Morgan e157c63f68
Fix missing conditional for registering `on_remove_user_third_party_identifier` module api callbacks (#15227 2023-03-10 10:35:18 +00:00
David Robertson ce54477f6f
Give PyCharm some help with `@cache_in_self` (#15238)
* Give PyCharm some help with `@cache_in_self`

* Changelog

* Fix import for old python versions
2023-03-09 19:12:09 +00:00
Sean Quah caf43c3d7c
Faster joins: Fix spurious errors on incremental sync (#15232)
When pushing events in partial state rooms down incremental /sync, we
try to find the `m.room.member` state event for their senders by digging
through their auth events, so that we can present the membership to the
client. Events usually have a membership event in their auth events,
with the exception of the `m.room.create` event and a user's first join
into the room.

When implementing #13477, we took the case of a user's first join into
account, but forgot to handle the `m.room.create` case. This change
fixes that.

Signed-off-by: Sean Quah <seanq@matrix.org>
2023-03-09 14:18:39 +00:00
Patrick Cloke 3d060eae6c
Add missing type hints to `synapse.storage.database`. (#15230) 2023-03-09 07:10:09 -05:00
Patrick Cloke e7c3832ba6
Pull in netaddr type hints. (#15231)
And fix any issues from having those type hints.
2023-03-09 07:09:49 -05:00
Shay be4ea209e8
Add topic and name events to group of events that are batch persisted when creating a room. (#15229) 2023-03-08 19:27:20 -08:00
Patrick Cloke 88efc75bab
Include the room ID in more purge room log lines. (#15222) 2023-03-08 20:08:56 +00:00
Shay a368d30c1c
More speedups/fixes to creating batched events (#15195) 2023-03-07 13:54:39 -08:00