Commit Graph

19663 Commits (2735b3e6f203813e72ca1845225dedd7d791dbb7)

Author SHA1 Message Date
Patrick Cloke c3ccad7785
Only do restricted join rules signature checks for room versions 8/9. (#10927)
Otherwise the presence of a (bogus, unused) field could cause
auth checks to fail.
2021-09-28 08:44:19 -04:00
Erik Johnston 3c50192d3f 1.44.0rc1 2021-09-28 13:42:21 +01:00
Erik Johnston a8bbf08576
Fix debian package builds. (#10931)
This was due to dh-virtualenv builds being broken due to Shpinx removing
deprecated APIs.
2021-09-28 12:13:51 +01:00
Erik Johnston 707d5e4e48
Encode JSON responses on a thread in C, mk2 (#10905)
Currently we use `JsonEncoder.iterencode` to write JSON responses, which ensures that we don't block the main reactor thread when encoding huge objects. The downside to this is that `iterencode` falls back to using a pure Python encoder that is *much* less efficient and can easily burn a lot of CPU for huge responses. To fix this, while still ensuring we don't block the reactor loop, we encode the JSON on a threadpool using the standard `JsonEncoder.encode` functions, which is backed by a C library.

Doing so, however, requires `respond_with_json` to have access to the reactor, which it previously didn't. There are two ways of doing this:

1. threading through the reactor object, which is a bit fiddly as e.g. `DirectServeJsonResource` doesn't currently take a reactor, but is exposed to modules and so is a PITA to change; or
2. expose the reactor in `SynapseRequest`, which requires updating a bunch of servlet types.

I went with the latter as that is just a mechanical change, and I think makes sense as a request already has a reactor associated with it (via its http channel).
2021-09-28 09:37:58 +00:00
Erik Johnston d37841787a
Sign the git tag in release script (#10925) 2021-09-27 15:39:49 +01:00
Sean Quah f7768f62cb
Avoid storing URL cache files in storage providers (#10911)
URL cache files are short-lived and it does not make sense to offload
them (eg. to the cloud) or back them up.
2021-09-27 12:55:27 +01:00
Sean Quah 6c83c27107
Fix race conditions when creating media store and config directories (#10913) 2021-09-27 11:29:23 +01:00
Eric Eastwood d138187045
Document changes to schema version 61 - 64 (#10917)
As pointed out by @richvdh, https://github.com/matrix-org/synapse/pull/10838#discussion_r715424244

Retroactively summarize `61` - `64`
2021-09-24 17:09:12 -05:00
Brendan Abolivier b10257e879
Add a spamchecker callback to allow or deny room creation based on invites (#10898)
This is in the context of creating new module callbacks that modules in https://github.com/matrix-org/synapse-dinsic can use, in an effort to reconcile the spam checker API in synapse-dinsic with the one in mainline.

This adds a callback that's fairly similar to user_may_create_room except it also allows processing based on the invites sent at room creation.
2021-09-24 16:38:23 +02:00
David Robertson ea01d4c2de
Update postgresql testing script (#10906)
- Use sytest:bionic. Sytest:latest is two years old (do we want
  CI to push out latest at all?) and comes with Python 3.5, which we
  explictly no longer support. The script now runs under PostgreSQL 10
  as a result.
- Advertise script in the docs
- Move pg testing script to scripts-dev directory
- Write to host as the script's exector, not root

A few changes to make it speedier to re-run the tests:

- Create blank DB in the container, not the script, so we don't have to
  `initdb` each time
- Use a named volume to persist the tox environment, so we don't have to
  fetch and install a bunch of packages from PyPI each time

Co-authored-by: reivilibre <olivier@librepush.net>
2021-09-24 14:27:09 +00:00
Richard van der Hoff 0420d4e6a5
Stop trying to auth/persist events whose auth events we do not have. (#10907) 2021-09-24 14:01:45 +01:00
Patrick Cloke bb7fdd821b
Use direct references for configuration variables (part 5). (#10897) 2021-09-24 07:25:21 -04:00
Richard van der Hoff 85551b7a85
Factor out common code for persisting fetched auth events (#10896)
* Factor more stuff out of `_get_events_and_persist`

It turns out that the event-sorting algorithm in `_get_events_and_persist` is
also useful in other circumstances. Here we move the current
`_auth_and_persist_fetched_events` to `_auth_and_persist_fetched_events_inner`,
and then factor the sorting part out to `_auth_and_persist_fetched_events`.

* `_get_remote_auth_chain_for_event`: remove redundant `outlier` assignment

`get_event_auth` returns events with the outlier flag already set, so this is
redundant (though we need to update a test where `get_event_auth` is mocked).

* `_get_remote_auth_chain_for_event`: move existing-event tests earlier

Move a couple of tests outside the loop. This is a bit inefficient for now, but
a future commit will make it better. It should be functionally identical.

* `_get_remote_auth_chain_for_event`: use `_auth_and_persist_fetched_events`

We can use the same codepath for persisting the events fetched as part of an
auth chain as for those fetched individually by `_get_events_and_persist` for
building the state at a backwards extremity.

* `_get_remote_auth_chain_for_event`: use a dict for efficiency

`_auth_and_persist_fetched_events` sorts the events itself, so we no longer
need to care about maintaining the ordering from `get_event_auth` (and no
longer need to sort by depth in `get_event_auth`).

That means that we can use a map, making it easier to filter out events we
already have, etc.

* changelog

* `_auth_and_persist_fetched_events`: improve docstring
2021-09-24 11:56:33 +01:00
Richard van der Hoff 261c9763c4
Simplify `_auth_and_persist_fetched_events` (#10901)
Combine the two loops over the list of events, and hence get rid of
`_NewEventInfo`. Also pass the event back alongside the context, so that it's
easier to process the result.
2021-09-24 11:56:13 +01:00
Erik Johnston 50022cff96
Add reactor to `SynapseRequest` and fix up types. (#10868) 2021-09-24 11:01:25 +01:00
Jason Robinson fa74536384
Fix AuthBlocking check when requester is appservice (#10881)
If the MAU count had been reached, Synapse incorrectly blocked appservice users even though they've been explicitly configured not to be tracked (the default). This was due to bypassing the relevant if as it was chained behind another earlier hit if as an elif.

Signed-off-by: Jason Robinson <jasonr@matrix.org>
2021-09-24 10:41:18 +01:00
David Robertson 7f3352743e
Improve typing in user_directory files (#10891)
* Improve typing in user_directory files

This makes the user_directory.py in storage pass most of mypy's
checks (including `no-untyped-defs`). Unfortunately that file is in the
tangled web of Store class inheritance so doesn't pass mypy at the moment.

The handlers directory has already been mypyed.

Co-authored-by: reivilibre <olivier@librepush.net>
2021-09-24 10:38:22 +01:00
Kokokokoka e704cc2a48
In `_purge_history_txn`, ensure that txn.fetchall has elements before accessing rows (#10690)
This change adds a check for row existence before accessing row element, this should fix issue #10669
Signed-off-by: Vasya Boytsov vasiliy.boytsov@phystech.edu
2021-09-24 09:19:51 +00:00
Callum Brown 90d9fc7505
Allow `.` and `~` chars in registration tokens (#10887)
Per updates to MSC3231 in order to use the same grammar
as other identifiers.
2021-09-23 17:58:12 +00:00
Richard van der Hoff a7304adc7d
Factor out `_get_remote_auth_chain_for_event` from `_update_auth_events_and_context_for_auth` (#10884)
* Reload auth events from db after fetching and persisting

In `_update_auth_events_and_context_for_auth`, when we fetch the remote auth
tree and persist the returned events: load the missing events from the database
rather than using the copies we got from the remote server.

This is mostly in preparation for additional refactors, but does have an
advantage in that if we later get around to checking the rejected status, we'll
be able to make use of it.

* Factor out `_get_remote_auth_chain_for_event` from `_update_auth_events_and_context_for_auth`

* changelog
2021-09-23 17:34:33 +01:00
Patrick Cloke 47854c71e9
Use direct references for configuration variables (part 4). (#10893) 2021-09-23 12:03:01 -04:00
David Robertson a10988983a
Break down cache expiry reasons in grafana (#10880)
A follow-up to #10829
2021-09-23 14:45:32 +01:00
David Robertson dcfd864970
Fix reactivated users not being added to the user directory (#10782)
Co-authored-by: Dirk Klimpel <5740567+dklimpel@users.noreply.github.com>
Co-authored-by: reivilibre <olivier@librepush.net>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2021-09-23 12:02:13 +00:00
Patrick Cloke e584534403
Use direct references for some configuration variables (part 3) (#10885)
This avoids the overhead of searching through the various
configuration classes by directly referencing the class that
the attributes are in.

It also improves type hints since mypy can now resolve the
types of the configuration variables.
2021-09-23 07:13:34 -04:00
Andrew Morgan aa2c027792
Remove unnecessary parentheses around tuples returned from methods (#10889) 2021-09-23 11:59:07 +01:00
Richard van der Hoff 26f2bfedbf
Factor out a separate `EventContext.for_outlier` (#10883)
Constructing an EventContext for an outlier is actually really simple, and
there's no sense in going via an `async` method in the `StateHandler`.

This also means that we can resolve a bunch of FIXMEs.
2021-09-22 17:58:57 +01:00
Hillery Shay f78b68a96b
Treat "\u0000" as "\u0020" for the purposes of message search (message indexing) (#10820)
* add test to check if null code points are being inserted

* add logic to detect and replace null code points before insertion into db

* lints

* add license to test

* change approach to null substitution

* add type hint for SearchEntry

* Add changelog entry

Signed-off-by: H.Shay <shaysquared@gmail.com>

* updated changelog

* update chanelog message

* remove duplicate changelog

* Update synapse/storage/databases/main/events.py remove extra space

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>

* rename and move test file, update tests, delete old test file

* fix typo in comments

* update _find_highlights_in_postgres to replace null byte with space

* replace null byte in sqlite search insertion

* beef up and reorganize test for this pr

* update changelog

* add type hints and update docstring

* check db engine directly vs using env variable

* refactor tests to be less repetetive

* move rplace logic into seperate function

* requested changes

* Fix typo.

* Update synapse/storage/databases/main/search.py

Co-authored-by: reivilibre <olivier@librepush.net>

* Update changelog.d/10820.misc

Co-authored-by: Aaron Raimist <aaron@raim.ist>

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: reivilibre <olivier@librepush.net>
Co-authored-by: Aaron Raimist <aaron@raim.ist>
2021-09-22 08:25:26 -07:00
Tulir Asokan 03db6701d5
Fix invalidating OTK count cache after claim (#10875)
The invalidation was missing in `_claim_e2e_one_time_key_returning`,
which is used on SQLite 3.24+ and Postgres. This could break e2ee if
nothing else happened to invalidate the caches before the keys ran out.

Signed-off-by: Tulir Asokan <tulir@beeper.com>
2021-09-22 15:31:05 +01:00
Richard van der Hoff 8f2a52766b
Ensure we mark sent knocks as outliers (#10873) 2021-09-22 15:20:18 +01:00
Patrick Cloke 6fc8be9a1b
Include more information in oEmbed previews. (#10819)
* Improved titles (fall back to the author name if there's not title) and include the site name.
* Handle photo/video payloads.
* Include the original URL in the Open Graph response.
* Fix the expiration time (by properly converting from seconds to milliseconds).
2021-09-22 09:45:20 -04:00
Sean Quah 9391de3f37
Fix /initialSync error due to unhashable `RoomStreamToken` (#10827)
The deprecated /initialSync endpoint maintains a cache of responses,
using parameter values as part of the cache key. When a `from` or `to`
parameter is specified, it gets converted into a `StreamToken`, which
contains a `RoomStreamToken` and forms part of the cache key.
`RoomStreamToken`s need to be made hashable for this to work.
2021-09-22 14:43:26 +01:00
Patrick Cloke 52913d56a5
Add documentation for experimental feature flags. (#10865) 2021-09-22 13:41:42 +00:00
David Robertson 724aef9a87
Opt out of cache expiry for `get_users_who_share_room_with_user` (#10826)
* Allow LruCaches to opt out of time-based expiry
* Don't expire `get_users_who_share_room` & friends
2021-09-22 14:21:58 +01:00
David Teller 80828eda06
Extend ModuleApi with the methods we'll need to reject spam based on …IP - resolves #10832 (#10833)
Extend ModuleApi with the methods we'll need to reject spam based on IP - resolves #10832

Signed-off-by: David Teller <davidt@element.io>
2021-09-22 13:09:43 +00:00
Richard van der Hoff 4ecf51812e
Include outlier status in `str(event)` for V2/V3 events (#10879)
I meant to do this before, in #10591, but because I'm stupid I forgot to do it
for V2 and V3 events.

I've factored the common code out to `EventBase` to save us having two copies
of it.

This means that for `FrozenEvent` we replace `self.get("event_id", None)` with
`self.event_id`, which I think is safe. `get()` is an alias for
`self._dict.get()`, whereas `event_id()` is an `@property` method which looks
up `self._event_id`, which is populated during construction from the same
dict. We don't seem to rely on the fallback, because if the `event_id` key is
absent from the dict then construction of the `EventBase` object will
fail.

Long story short, the only way this could change behaviour is if
`event_dict["event_id"]` is changed *after* the `EventBase` object is
constructed without updating the `_event_id` field, or vice versa - either of
which would be very problematic anyway and the behavior of `str(event)` is the
least of our worries.
2021-09-22 12:30:59 +01:00
David Robertson a2d7195e01
Track why we're evicting from caches (#10829)
So we can see distinguish between "evicting because the cache is too big" and "evicting because the cache entries haven't been recently used".
2021-09-22 10:59:52 +01:00
Eric Eastwood 51e2db3598
Rename MSC2716 things from `chunk` to `batch` to match `/batch_send` endpoint (#10838)
See https://github.com/matrix-org/matrix-doc/pull/2716#discussion_r684574497

Dropping support for older MSC2716 room versions so we don't have to worry about
supporting both chunk and batch events.
2021-09-21 15:06:28 -05:00
Patrick Cloke 4054dfa409
Add type hints for event streams. (#10856) 2021-09-21 13:34:26 -04:00
Erik Johnston b25a494779
Add types to http.site (#10867) 2021-09-21 16:41:27 +00:00
Patrick Cloke ebd8baf61f
Clear our destination directories before copying files to GitHub pages. (#10869)
This should fix stale deleted files being still accessible.
2021-09-21 16:32:46 +00:00
Patrick Cloke ba7a91aea5
Refactor oEmbed previews (#10814)
The major change is moving the decision of whether to use oEmbed
further up the call-stack. This reverts the _download_url method to
being a "dumb" functionwhich takes a single URL and downloads it
(as it was before #7920).

This also makes more minor refactorings:

* Renames internal variables for clarity.
* Factors out shared code between the HTML and rich oEmbed
  previews.
* Fixes tests to preview an oEmbed image.
2021-09-21 16:09:57 +00:00
Brendan Abolivier 2843058a8b
Test that state events sent by modules correctly end up in the room's state (#10835)
Test for #10830

Ideally the test would also make sure the new state event comes down sync, but this is probably good enough.
2021-09-21 17:40:20 +02:00
Hillery Shay 5fca3c8ae6
Allow Synapse Admin API's Room Search to accept non-ASCII characters (#10859)
* add tests for checking if room search works with non-ascii char

* change encoding on parse_string to UTF-8

* lints

* properly encode search term

* lints

* add changelog file

* update changelog number

* set changelog entry filetype to .bugfix

* Revert "set changelog entry filetype to .bugfix"

This reverts commit be8e5a3142.

* update changelog message and file type

* change parse_string default encoding back to ascii and update room search admin api calll to parse string

* refactor tests

* Update tests/rest/admin/test_room.py

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2021-09-21 08:04:35 -07:00
Eric Eastwood ee557b5375
Rename `/batch_send` query parameter from `?prev_event` to more obvious usage with `?prev_event_id` (MSC2716) (#10839)
As mentioned in https://github.com/matrix-org/matrix-doc/pull/2716#discussion_r705872887
and https://github.com/matrix-org/synapse/issues/10737
2021-09-21 14:10:01 +01:00
David Robertson 706b0e41a1
Merge tag 'v1.43.0' into develop 2021-09-21 14:05:00 +01:00
David Robertson 60453315bd
Always add local users to the user directory (#10796)
It's a simplification, but one that'll help make the user directory logic easier
to follow with the other changes upcoming. It's not strictly required for those
changes, but this will help simplify the resulting logic that listens for
`m.room.member` events and generally make the logic easier to follow.

This means the config option `search_all_users` ends up controlling the
search query only, and not the data we store. The cost of doing so is an
extra row in the `user_directory` and `user_directory_search` tables for
each local user which

- belongs to no public rooms
- belongs to no private rooms of size ≥ 2

I think the cost of this will be marginal (since they'll already have entries
 in `users` and `profiles` anyway).

As a small upside, a homeserver whose directory was built with this
change can toggle `search_all_users` without having to rebuild their
directory.

Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2021-09-21 12:02:34 +00:00
David Robertson 9ffa787eb2
Fix typo again 2021-09-21 12:24:47 +01:00
David Robertson 9b5782d51d
Specify MSC name; fix typo
one day I'll learn how to spell hierarchy
2021-09-21 12:10:50 +01:00
David Robertson c17e698e1b
Point to upgrade notes 2021-09-21 12:01:54 +01:00
David Robertson 6c92ba3eac
Move deprecation notice from 1.43 rc to release 2021-09-21 11:52:37 +01:00