Commit Graph

70 Commits (c1ac2a81350f3b5b86f4c53a585eccd17e3b8e75)

Author SHA1 Message Date
Denis Kasak 337f38cac3
Implement a content type allow list for URL previews (#11936)
This implements an allow list for content types for which Synapse will attempt URL preview. If a URL resolves to a resource with a content type which isn't in the list, the download will terminate immediately.

This makes sense given that Synapse would never successfully generate a URL preview for such files in the first place, and helps prevent issues with streaming media servers, such as #8302.

Signed-off-by: Denis Kasak dkasak@termina.org.uk
2022-02-10 15:43:01 +00:00
Patrick Cloke 807efd26ae
Support rendering previews with data: URLs in them (#11767)
Images which are data URLs will no longer break URL
previews and will properly be "downloaded" and
thumbnailed.
2022-01-24 08:58:18 -05:00
Patrick Cloke eb39da6782
Move HTML parsing to a separate file for URL previews. (#11566)
* Splits the logic for parsing HTML from the resource handling code.
* Fix a circular import in the oEmbed code (which uses the HTML parsing code).
* Renames some of the HTML parsing methods to:
  * Make it clear which methods are "internal" to the module.
  * Clarify what the methods do.
2021-12-13 17:55:07 +00:00
Sean Quah 858d80bf0f
Fix media repository failing when media store path contains symlinks (#11446) 2021-12-02 16:05:24 +00:00
Sean Quah 91f2bd0907 Prevent the media store from writing outside of the configured directory
Also tighten validation of server names by forbidding invalid characters
in IPv6 addresses and empty domain labels.
2021-11-19 13:39:15 +00:00
Shay f5c6a80886
Handle missing Content-Type header when accessing remote media (#11200)
* add code to handle missing content-type header and a test to verify that it works

* add handling for missing content-type in the /upload endpoint as well

* slightly refactor test code to put private method in approriate place

* handle possible null value for content-type when pulling from the local db

* add changelog

* refactor test and add code to handle missing content-type in cached remote media

* requested changes

* Update changelog.d/11200.bugfix

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>

Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
2021-11-01 10:26:02 -07:00
Patrick Cloke 732bbf6737
Be more lenient when parsing the version for oEmbed responses. (#11065) 2021-10-13 07:00:07 -04:00
Sean Quah 84f5d83257
Add tests for `MediaFilePaths` (#11057) 2021-10-12 18:19:35 +01:00
Patrick Cloke 1b112840d2
Autodiscover oEmbed endpoint from returned HTML (#10822)
Searches the returned HTML for an oEmbed endpoint using the
autodiscovery mechanism (`<link rel=...>`), and will request it
to generate the preview.
2021-10-08 14:14:42 -04:00
Sean Quah 2be0fde3d6
Fix empty `url_cache_thumbnails/yyyy-mm-dd/` directories being left behind (#10924) 2021-09-29 10:24:37 +01:00
Sean Quah f7768f62cb
Avoid storing URL cache files in storage providers (#10911)
URL cache files are short-lived and it does not make sense to offload
them (eg. to the cloud) or back them up.
2021-09-27 12:55:27 +01:00
Patrick Cloke bb7fdd821b
Use direct references for configuration variables (part 5). (#10897) 2021-09-24 07:25:21 -04:00
Erik Johnston 50022cff96
Add reactor to `SynapseRequest` and fix up types. (#10868) 2021-09-24 11:01:25 +01:00
Patrick Cloke 6fc8be9a1b
Include more information in oEmbed previews. (#10819)
* Improved titles (fall back to the author name if there's not title) and include the site name.
* Handle photo/video payloads.
* Include the original URL in the Open Graph response.
* Fix the expiration time (by properly converting from seconds to milliseconds).
2021-09-22 09:45:20 -04:00
Patrick Cloke ba7a91aea5
Refactor oEmbed previews (#10814)
The major change is moving the decision of whether to use oEmbed
further up the call-stack. This reverts the _download_url method to
being a "dumb" functionwhich takes a single URL and downloads it
(as it was before #7920).

This also makes more minor refactorings:

* Renames internal variables for clarity.
* Factors out shared code between the HTML and rich oEmbed
  previews.
* Fixes tests to preview an oEmbed image.
2021-09-21 16:09:57 +00:00
Patrick Cloke bfb4b858a9
Create a constant for a small png image in tests. (#10834)
To avoid duplicating it between a few tests.
2021-09-16 12:01:14 -04:00
Patrick Cloke 580a15e039
Request JSON for oEmbed requests (and ignore XML only providers). (#10759)
This adds the format to the request arguments / URL to
ensure that JSON data is returned (which is all that
Synapse supports).

This also adds additional error checking / filtering to the
configuration file to ignore XML-only providers.
2021-09-08 07:17:52 -04:00
Patrick Cloke e2481dbe93
Allow configuration of the oEmbed URLs. (#10714)
This adds configuration options (under an `oembed` section) to
configure which URLs are matched to use oEmbed for URL
previews.
2021-08-31 18:37:07 -04:00
Sean 7367473f96
Fix error when selecting between thumbnails with the same quality (#10684)
Fixes #10318
2021-08-25 09:51:08 +00:00
reivilibre 642a42edde
Flatten the synapse.rest.client package (#10600) 2021-08-17 11:57:58 +00:00
Jonathan de Jong 89cfc3dd98
[pyupgrade] `tests/` (#10347) 2021-07-13 11:43:15 +01:00
Brendan Abolivier 1b3e398bea
Standardise the module interface (#10062)
This PR adds a common configuration section for all modules (see docs). These modules are then loaded at startup by the homeserver. Modules register their hooks and web resources using the new `register_[...]_callbacks` and `register_web_resource` methods of the module API.
2021-06-18 12:15:52 +01:00
Jonathan de Jong 4b965c862d
Remove redundant "coding: utf-8" lines (#9786)
Part of #9744

Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.

`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
2021-04-14 15:34:27 +01:00
Patrick Cloke 0b3112123d
Use mock from the stdlib. (#9772) 2021-04-09 13:44:38 -04:00
Patrick Cloke 075c16b410
Handle image transparency better when thumbnailing. (#9473)
Properly uses RGBA mode for 1- and 8-bit images with transparency
(instead of RBG mode).
2021-03-09 07:37:09 -05:00
Erik Johnston 3a2fe5054f Add test 2021-02-19 15:52:04 +00:00
Eric Eastwood 0a00b7ff14
Update black, and run auto formatting over the codebase (#9381)
- Update black version to the latest
 - Run black auto formatting over the codebase
    - Run autoformatting according to [`docs/code_style.md
`](80d6dc9783/docs/code_style.md)
 - Update `code_style.md` docs around installing black to use the correct version
2021-02-16 22:32:34 +00:00
Erik Johnston 7e8083eb48 Add check_media_file_for_spam spam checker hook 2021-02-04 17:01:30 +00:00
Patrick Cloke a7882f9887
Return a 404 if no valid thumbnail is found. (#9163)
If no thumbnail of the requested type exists, return a 404 instead
of erroring. This doesn't quite match the spec (which does not define
what happens if no thumbnail can be found), but is consistent with
what Synapse already does.
2021-01-21 14:53:58 -05:00
Richard van der Hoff 8d3d264052
Skip unit tests which require optional dependencies (#9031)
If we are lacking an optional dependency, skip the tests that rely on it.
2021-01-07 11:41:28 +00:00
Richard van der Hoff 394516ad1b Remove spurious "SynapseRequest" result from `make_request"
This was never used, so let's get rid of it.
2020-12-15 22:35:40 +00:00
Aaron Raimist cd9e72b185
Add X-Robots-Tag header to stop crawlers from indexing media (#8887)
Fixes / related to: https://github.com/matrix-org/synapse/issues/6533

This should do essentially the same thing as a robots.txt file telling robots to not index the media repo. https://developers.google.com/search/reference/robots_meta_tag

Signed-off-by: Aaron Raimist <aaron@raim.ist>
2020-12-08 22:51:03 +00:00
Richard van der Hoff f347f0cd58
remove unused FakeResponse (#8864) 2020-12-02 18:58:25 +00:00
Patrick Cloke 30fba62108
Apply an IP range blacklist to push and key revocation requests. (#8821)
Replaces the `federation_ip_range_blacklist` configuration setting with an
`ip_range_blacklist` setting with wider scope. It now applies to:

* Federation
* Identity servers
* Push notifications
* Checking key validitity for third-party invite events

The old `federation_ip_range_blacklist` setting is still honored if present, but
with reduced scope (it only applies to federation and identity servers).
2020-12-02 11:09:24 -05:00
Richard van der Hoff 129ae841e5 Make `make_request` actually render the request
remove the stubbing out of `request.process`, so that `requestReceived` also renders the request via the appropriate resource.

Replace render() with a stub for now.
2020-11-16 18:24:00 +00:00
Richard van der Hoff 1f41422c98 Fix the URL in the URL preview tests
the preview resource is mointed at preview_url, not url_preview
2020-11-16 18:24:00 +00:00
Richard van der Hoff cfd895a22e use global make_request() directly where we have a custom Resource
Where we want to render a request against a specific Resource, call the global
make_request() function rather than the one in HomeserverTestCase, allowing us
to pass in an appropriate `Site`.
2020-11-15 23:09:03 +00:00
Patrick Cloke b312769c0e
Do not error when thumbnailing invalid files (#8236)
If a file cannot be thumbnailed for some reason (e.g. the file is empty), then
catch the exception and convert it to a reasonable error message for the client.
2020-09-09 12:59:41 -04:00
Patrick Cloke c619253db8
Stop sub-classing object (#8249) 2020-09-04 06:54:56 -04:00
Patrick Cloke 3fc8fdd150
Support oEmbed for media previews. (#7920)
Fixes previews of Twitter URLs by using their oEmbed endpoint to grab content.
2020-07-27 07:50:44 -04:00
Patrick Cloke 5ea29d7f85
Convert more of the media code to async/await (#7873) 2020-07-24 09:39:02 -04:00
Will Hunt 62b1ce8539
isort 5 compatibility (#7786)
The CI appears to use the latest version of isort, which is a problem when isort gets a major version bump. Rather than try to pin the version, I've done the necessary to make isort5 happy with synapse.
2020-07-05 16:32:02 +01:00
Patrick Cloke 434716e1d3
Fetch from the r0 media path instead of the unspecced v1. (#7714) 2020-06-17 08:36:46 -04:00
Dagfinn Ilmari Mannsåker a3f11567d9
Replace all remaining six usage with native Python 3 equivalents (#7704) 2020-06-16 08:51:47 -04:00
WGH e55ee7c32f
Add support for webp thumbnailing (#7586)
Closes #4382

Signed-off-by: Maxim Plotnikov <wgh@torlan.ru>
2020-06-05 11:54:27 +01:00
Andrew Morgan a48138784e
Allow specifying the value of Accept-Language header for URL previews (#7265) 2020-04-15 13:35:29 +01:00
Brendan Abolivier 6ae0c8db33
Lint + changelog 2020-01-22 12:38:18 +00:00
Brendan Abolivier d9a8728b11
Remove unused import 2020-01-22 12:30:49 +00:00
Brendan Abolivier 67aa18e8dc
Add tests for thumbnailing 2020-01-22 12:28:07 +00:00
Richard van der Hoff e78167c94b
Apply suggestions from code review
Co-Authored-By: Brendan Abolivier <babolivier@matrix.org>
Co-Authored-By: Erik Johnston <erik@matrix.org>
2019-11-05 16:46:39 +00:00