MatrixSynapse

Commit Graph

Author	SHA1	Message	Date
Denis Kasak	337f38cac3	Implement a content type allow list for URL previews (#11936 ) This implements an allow list for content types for which Synapse will attempt URL preview. If a URL resolves to a resource with a content type which isn't in the list, the download will terminate immediately. This makes sense given that Synapse would never successfully generate a URL preview for such files in the first place, and helps prevent issues with streaming media servers, such as #8302. Signed-off-by: Denis Kasak dkasak@termina.org.uk	2022-02-10 15:43:01 +00:00
Patrick Cloke	314ca4c86d	Pass the proper type when uploading files. (#11927 ) The Content-Length header should be treated as an int, not a string. This shouldn't have any user-facing change.	2022-02-07 10:06:52 -05:00
Patrick Cloke	807efd26ae	Support rendering previews with data: URLs in them (#11767 ) Images which are data URLs will no longer break URL previews and will properly be "downloaded" and thumbnailed.	2022-01-24 08:58:18 -05:00
Philippe Daouadi	15ffc4143c	Fix preview of imgur and Tenor URLs. (#11669 ) By scraping Open Graph information from the HTML even when an autodiscovery endpoint is found. The results are then combined to capture as much information as possible from the page.	2022-01-18 13:20:24 -05:00
Patrick Cloke	10a88ba91c	Use auto_attribs/native type hints for attrs classes. (#11692 )	2022-01-13 13:49:28 +00:00
Patrick Cloke	cbd82d0b2d	Convert all namedtuples to attrs. (#11665 ) To improve type hints throughout the code.	2021-12-30 18:47:12 +00:00
Patrick Cloke	eb39da6782	Move HTML parsing to a separate file for URL previews. (#11566 ) * Splits the logic for parsing HTML from the resource handling code. * Fix a circular import in the oEmbed code (which uses the HTML parsing code). * Renames some of the HTML parsing methods to: * Make it clear which methods are "internal" to the module. * Clarify what the methods do.	2021-12-13 17:55:07 +00:00
Sean Quah	858d80bf0f	Fix media repository failing when media store path contains symlinks (#11446 )	2021-12-02 16:05:24 +00:00
Sean Quah	454c3d7694	Merge branch 'master' into develop	2021-11-23 13:06:56 +00:00
Sean Quah	91f2bd0907	Prevent the media store from writing outside of the configured directory Also tighten validation of server names by forbidding invalid characters in IPv6 addresses and empty domain labels.	2021-11-19 13:39:15 +00:00
Patrick Cloke	9b90b9454b	Add type hints to media repository storage module (#11311 )	2021-11-12 11:05:26 -05:00
Neeeflix	6ce19b94e8	Fix error in thumbnail generation (#11288 ) Signed-off-by: Jonas Zeunert <jonas@zeunert.org>	2021-11-10 20:49:43 +00:00
Erik Johnston	237f7eb87a	Merge remote-tracking branch 'origin/master' into develop	2021-11-02 14:28:27 +00:00
Shay	f5c6a80886	Handle missing Content-Type header when accessing remote media (#11200 ) * add code to handle missing content-type header and a test to verify that it works * add handling for missing content-type in the /upload endpoint as well * slightly refactor test code to put private method in appropriate place * handle possible null value for content-type when pulling from the local db * add changelog * refactor test and add code to handle missing content-type in cached remote media * requested changes * Update changelog.d/11200.bugfix Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com> Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>	2021-11-01 10:26:02 -07:00
Patrick Cloke	b3e843be88	Fix URL preview errors when previewing XML documents. (#11196 )	2021-10-27 14:48:02 +00:00
Patrick Cloke	efd0074ab7	Ensure each charset is attempted only once during media preview. (#11089 ) There's no point in trying more than once since it is guaranteed to continually fail.	2021-10-14 18:51:44 +00:00
Patrick Cloke	e2f0b49b3f	Attempt different character encodings when previewing a URL. (#11077 ) This follows similar logic to BeautifulSoup where we attempt different character encodings until we find one which works.	2021-10-14 10:17:20 -04:00
Sean Quah	b59f3281d5	Remove dead code from `MediaFilePaths` (#11056 )	2021-10-13 13:41:24 +01:00
Patrick Cloke	732bbf6737	Be more lenient when parsing the version for oEmbed responses. (#11065 )	2021-10-13 07:00:07 -04:00
Patrick Cloke	9abc5f2a05	Merge remote-tracking branch 'origin/release-v1.45' into develop	2021-10-12 14:21:05 -04:00
Sean Quah	8eaffe013c	Update `_wrap_in_base_path` type hints to preserve function arguments (#11055 )	2021-10-12 18:19:21 +01:00
Patrick Cloke	1db9282dfa	Fix formatting string when oEmbed errors occur. (#11061 )	2021-10-12 17:15:42 +00:00
Patrick Cloke	1b112840d2	Autodiscover oEmbed endpoint from returned HTML (#10822 ) Searches the returned HTML for an oEmbed endpoint using the autodiscovery mechanism (`<link rel=...>`), and will request it to generate the preview.	2021-10-08 14:14:42 -04:00
David Robertson	797ee7812d	Relax `ignore-missing-imports` for modules that have stubs now and update mypy (#11006 ) Updating mypy past version 0.9 means that third-party stubs are no-longer distributed with typeshed. See http://mypy-lang.blogspot.com/2021/06/mypy-0900-released.html for details. We therefore pull in stub packages in setup.py Additionally, some modules that we were previously ignoring import failures for now have stubs. So let's use them. The rest of this change consists of fixups to make the newer mypy + stubs pass CI. Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>	2021-10-08 14:49:41 +01:00
Sean Quah	2be0fde3d6	Fix empty `url_cache_thumbnails/yyyy-mm-dd/` directories being left behind (#10924 )	2021-09-29 10:24:37 +01:00
Sean Quah	f7768f62cb	Avoid storing URL cache files in storage providers (#10911 ) URL cache files are short-lived and it does not make sense to offload them (eg. to the cloud) or back them up.	2021-09-27 12:55:27 +01:00
Sean Quah	6c83c27107	Fix race conditions when creating media store and config directories (#10913 )	2021-09-27 11:29:23 +01:00
Patrick Cloke	bb7fdd821b	Use direct references for configuration variables (part 5). (#10897 )	2021-09-24 07:25:21 -04:00
Erik Johnston	50022cff96	Add reactor to `SynapseRequest` and fix up types. (#10868 )	2021-09-24 11:01:25 +01:00
Patrick Cloke	47854c71e9	Use direct references for configuration variables (part 4). (#10893 )	2021-09-23 12:03:01 -04:00
Patrick Cloke	6fc8be9a1b	Include more information in oEmbed previews. (#10819 ) * Improved titles (fall back to the author name if there's no title) and include the site name. * Handle photo/video payloads. * Include the original URL in the Open Graph response. * Fix the expiration time (by properly converting from seconds to milliseconds).	2021-09-22 09:45:20 -04:00
Patrick Cloke	ba7a91aea5	Refactor oEmbed previews (#10814 ) The major change is moving the decision of whether to use oEmbed further up the call-stack. This reverts the _download_url method to being a "dumb" function which takes a single URL and downloads it (as it was before #7920). This also makes more minor refactorings: * Renames internal variables for clarity. * Factors out shared code between the HTML and rich oEmbed previews. * Fixes tests to preview an oEmbed image.	2021-09-21 16:09:57 +00:00
Patrick Cloke	b93259082c	Add missing type hints to non-client REST servlets. (#10817 ) Including admin, consent, key, synapse, and media. All REST servlets (the synapse.rest module) now require typed method definitions.	2021-09-15 08:45:32 -04:00
Patrick Cloke	b996782df5	Convert media repo's FileInfo to attrs. (#10785 ) This is mostly an internal change, but improves type hints in the media code.	2021-09-14 07:09:38 -04:00
Patrick Cloke	580a15e039	Request JSON for oEmbed requests (and ignore XML only providers). (#10759 ) This adds the format to the request arguments / URL to ensure that JSON data is returned (which is all that Synapse supports). This also adds additional error checking / filtering to the configuration file to ignore XML-only providers.	2021-09-08 07:17:52 -04:00
Patrick Cloke	89ba834818	Use attrs internally for the URL preview code & add documentation. (#10753 )	2021-09-07 13:10:34 +00:00
Patrick Cloke	e2481dbe93	Allow configuration of the oEmbed URLs. (#10714 ) This adds configuration options (under an `oembed` section) to configure which URLs are matched to use oEmbed for URL previews.	2021-08-31 18:37:07 -04:00
Sean	7367473f96	Fix error when selecting between thumbnails with the same quality (#10684 ) Fixes #10318	2021-08-25 09:51:08 +00:00
Dirk Klimpel	915b37e5ef	Admin API to delete media for a specific user (#10558 )	2021-08-11 19:29:59 +00:00
sri-vidyut	8e1febc6a1	Support underscores (in addition to hyphens) for charset detection. (#10410 )	2021-07-27 17:29:42 +00:00
Denis Kasak	2476d5373c	Mitigate media repo XSSs on IE11. (#10468 ) IE11 doesn't support Content-Security-Policy but it has support for a non-standard X-Content-Security-Policy header, which only supports the sandbox directive. This prevents script execution, so it at least offers some protection against media repo-based attacks. Signed-off-by: Denis Kasak <dkasak@termina.org.uk>	2021-07-27 13:45:10 +02:00
Patrick Cloke	5db118626b	Add a return type to parse_string. (#10438 ) And set the required attribute in a few places which will error if a parameter is not provided.	2021-07-21 09:47:56 -04:00
Jonathan de Jong	95e47b2e78	[pyupgrade] `synapse/` (#10348 ) This PR is tantamount to running ``` pyupgrade --py36-plus --keep-percent-format `find synapse/ -type f -name "*.py"` ``` Part of #9744	2021-07-19 15:28:05 +01:00
Jonathan de Jong	98aec1cc9d	Use inline type hints in `handlers/` and `rest/`. (#10382 )	2021-07-16 18:22:36 +01:00
Patrick Cloke	9e4610cc27	Correct type hints for parse_string(s)_from_args. (#10137 )	2021-06-08 08:30:48 -04:00
Michael Telatynski	e8ac9ac8ca	Fix /upload 500'ing when presented a very large image (#10029 ) * Fix /upload 500'ing when presented a very large image Catch DecompressionBombError and re-raise as ThumbnailErrors * Set PIL's MAX_IMAGE_PIXELS to match homeserver.yaml to get it to bomb out quicker, to load less into memory in the case of super large images * Add changelog entry for 10029	2021-05-21 18:31:59 +02:00
Andrew Morgan	fe604a022a	Remove various bits of compatibility code for Python <3.6 (#9879 ) I went through and removed a bunch of cruft that was lying around for compatibility with old Python versions. This PR also will now prevent Synapse from starting unless you're running Python 3.6+.	2021-04-27 13:13:07 +01:00
Richard van der Hoff	3ff2251754	Improved validation for received requests (#9817 ) * Simplify `start_listening` callpath * Correctly check the size of uploaded files	2021-04-23 19:20:44 +01:00
Richard van der Hoff	5a153772c1	remove `HomeServer.get_config` (#9815 ) Every single time I want to access the config object, I have to remember whether or not we use `get_config`. Let's just get rid of it.	2021-04-14 19:09:08 +01:00
rkfg	c9a2b5d402	More robust handling of the Content-Type header for thumbnail generation (#9788 ) Signed-off-by: Sergey Shpikin <rkfg@rkfg.me>	2021-04-14 16:30:59 +01:00
Jonathan de Jong	4b965c862d	Remove redundant "coding: utf-8" lines (#9786 ) Part of #9744 Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now. `Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`	2021-04-14 15:34:27 +01:00
Patrick Cloke	44bb881096	Add type hints to expiring cache. (#9730 )	2021-04-06 08:58:18 -04:00
Erik Johnston	b5efcb577e	Make it possible to use dmypy (#9692 ) Running `dmypy run` will do a `mypy` check while spinning up a daemon that makes rerunning `dmypy run` a lot faster. `dmypy` doesn't support `follow_imports = silent` and has `local_partial_types` enabled, so this PR enables those options and fixes the issues that were newly raised. Note that `local_partial_types` will be enabled by default in upcoming mypy releases.	2021-03-26 16:49:46 +00:00
Patrick Cloke	b7748d3c00	Import HomeServer from the proper module. (#9665 )	2021-03-23 07:12:48 -04:00
Patrick Cloke	55da8df078	Fix additional type hints from Twisted 21.2.0. (#9591 )	2021-03-12 11:37:57 -05:00
Richard van der Hoff	a7a3790066	Convert Requester to attrs (#9586 ) ... because namedtuples suck Fix up a couple of other annotations to keep mypy happy.	2021-03-10 18:15:56 +00:00
Patrick Cloke	075c16b410	Handle image transparency better when thumbnailing. (#9473 ) Properly uses RGBA mode for 1- and 8-bit images with transparency (instead of RGB mode).	2021-03-09 07:37:09 -05:00
Patrick Cloke	a0bc9d387e	Use the proper Request in type hints. (#9515 ) This also pins the Twisted version in the mypy job for CI until proper type hints are fixed throughout Synapse.	2021-03-01 12:23:46 -05:00
Tim Leung	ddb240293a	Add support for no_proxy and case insensitive env variables (#9372 ) ### Changes proposed in this PR - Add support for the `no_proxy` and `NO_PROXY` environment variables - Internally rely on urllib's [`proxy_bypass_environment`](`bdb941be42/Lib/urllib/request.py (L2519)`) - Extract env variables using urllib's `getproxies`/[`getproxies_environment`](`bdb941be42/Lib/urllib/request.py (L2488)`) which supports lowercase + uppercase, preferring lowercase, except for `HTTP_PROXY` in a CGI environment This does contain behaviour changes for consumers so making sure these are called out: - `no_proxy`/`NO_PROXY` is now respected - lowercase `https_proxy` is now allowed and taken over `HTTPS_PROXY` Related to #9306 which also uses `ProxyAgent` Signed-off-by: Timothy Leung tim95@hotmail.co.uk	2021-02-26 17:37:57 +00:00
Erik Johnston	3d2acc930f	Return a 404 if we don't have the original file	2021-02-19 10:46:18 +00:00
Erik Johnston	b106080fb4	Regenerate exact thumbnails if missing	2021-02-18 17:05:32 +00:00
Eric Eastwood	0a00b7ff14	Update black, and run auto formatting over the codebase (#9381 ) - Update black version to the latest - Run black auto formatting over the codebase - Run autoformatting according to [`docs/code_style.md `](`80d6dc9783/docs/code_style.md`) - Update `code_style.md` docs around installing black to use the correct version	2021-02-16 22:32:34 +00:00
Patrick Cloke	7950aa8a27	Fix some typos.	2021-02-12 11:14:12 -05:00
Patrick Cloke	0963d39ea6	Handle additional errors when previewing URLs. (#9333 ) * Handle the case of lxml not finding a document tree. * Parse the document encoding from the XML tag.	2021-02-08 12:33:30 -05:00
Erik Johnston	7e8083eb48	Add check_media_file_for_spam spam checker hook	2021-02-04 17:01:30 +00:00
Patrick Cloke	4937fe3d6b	Try to recover from unknown encodings when previewing media. (#9164 ) Treat unknown encodings (according to lxml) as UTF-8 when generating a preview for HTML documents. This isn't fully accurate, but will hopefully give a reasonable title and summary.	2021-01-26 07:32:17 -05:00
Patrick Cloke	a7882f9887	Return a 404 if no valid thumbnail is found. (#9163 ) If no thumbnail of the requested type exists, return a 404 instead of erroring. This doesn't quite match the spec (which does not define what happens if no thumbnail can be found), but is consistent with what Synapse already does.	2021-01-21 14:53:58 -05:00
Patrick Cloke	d34c6e1279	Add type hints to media rest resources. (#9093 )	2021-01-15 10:57:37 -05:00
David Teller	f14428b25c	Allow spam-checker modules to provide async methods. (#8890 ) Spam checker modules can now provide async methods. This is implemented in a backwards-compatible manner.	2020-12-11 14:05:15 -05:00
Aaron Raimist	cd9e72b185	Add X-Robots-Tag header to stop crawlers from indexing media (#8887 ) Fixes / related to: https://github.com/matrix-org/synapse/issues/6533 This should do essentially the same thing as a robots.txt file telling robots to not index the media repo. https://developers.google.com/search/reference/robots_meta_tag Signed-off-by: Aaron Raimist <aaron@raim.ist>	2020-12-08 22:51:03 +00:00
Patrick Cloke	1f3748f033	Do not raise a 500 exception when previewing empty media. (#8883 )	2020-12-07 10:00:08 -05:00
Patrick Cloke	df3e6a23a7	Do not 500 if the content-length is not provided when uploading media. (#8862 ) Instead return the proper 400 error.	2020-12-04 10:26:09 -05:00
Patrick Cloke	30fba62108	Apply an IP range blacklist to push and key revocation requests. (#8821 ) Replaces the `federation_ip_range_blacklist` configuration setting with an `ip_range_blacklist` setting with wider scope. It now applies to: * Federation * Identity servers * Push notifications * Checking key validity for third-party invite events The old `federation_ip_range_blacklist` setting is still honored if present, but with reduced scope (it only applies to federation and identity servers).	2020-12-02 11:09:24 -05:00
Erik Johnston	46f4be94b4	Fix race for concurrent downloads of remote media. (#8682 ) Fixes #6755	2020-10-30 10:55:24 +00:00
Dirk Klimpel	49d72dea2a	Add an admin api to delete local media. (#8519 ) Related to: #6459, #3479 Add `DELETE /_synapse/admin/v1/media/<server_name>/<media_id>` to delete a single file from server.	2020-10-26 17:02:28 +00:00
Andrew Morgan	3e58ce72b4	Don't bother responding to client requests that have already disconnected (#8465 ) This PR ports the quick fix from https://github.com/matrix-org/synapse/pull/2796 to further methods which handle media, URL preview and `/key/v2/server` requests. This prevents a harmless `ERROR` that comes up in the logs when we were unable to respond to a client request when the client had already disconnected. In this case we simply bail out if the client has already done so. This is the 'simple fix' as suggested by https://github.com/matrix-org/synapse/issues/5304#issuecomment-574740003. Fixes https://github.com/matrix-org/synapse/issues/6700 Fixes https://github.com/matrix-org/synapse/issues/5304	2020-10-06 10:03:39 +01:00
Richard van der Hoff	73d93039ff	Fix bug in remote thumbnail search (#8438 ) #7124 changed the behaviour of remote thumbnails so that the thumbnailing method was included in the filename of the thumbnail. To support existing files, it included a fallback so that we would check the old filename if the new filename didn't exist. Unfortunately, it didn't apply this logic to storage providers, so any thumbnails stored on such a storage provider were broken.	2020-10-02 12:29:29 +01:00
Richard van der Hoff	b1f4e6e4fc	fix a logging error in thumbnailer (#8435 ) Introduced in #8236	2020-10-01 13:34:24 +01:00
Will Hunt	c2bdf040aa	Discard an empty upload_name before persisting an uploaded file (#7905 )	2020-09-29 12:15:27 -04:00
Richard van der Hoff	11c9e17738	Add type annotations to SimpleHttpClient (#8372 )	2020-09-24 15:47:20 +01:00
Patrick Cloke	aec294ee0d	Use slots in attrs classes where possible (#8296 ) slots use less memory (and attribute access is faster) while slightly limiting the flexibility of the class attributes. This focuses on objects which are instantiated "often" and for short periods of time.	2020-09-14 12:50:06 -04:00
Patrick Cloke	d2a3eb04a4	Fix typos in comments.	2020-09-14 11:46:58 -04:00
Patrick Cloke	b312769c0e	Do not error when thumbnailing invalid files (#8236 ) If a file cannot be thumbnailed for some reason (e.g. the file is empty), then catch the exception and convert it to a reasonable error message for the client.	2020-09-09 12:59:41 -04:00
DeepBlueV7.X	560f3b8609	Include method in thumbnail media name (#7124 ) This fixes an issue where different methods (crop/scale) overwrite each other. This first tries the new path. If that fails and we are looking for a remote thumbnail, it tries the old path. If that still isn't found, it continues as normal. This should probably be removed in the future, after some of the newer thumbnails were generated with the new path on most deployments. Then the overhead should be minimal if the other thumbnails need to be regenerated. Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>	2020-09-08 17:19:50 +01:00
Patrick Cloke	c619253db8	Stop sub-classing object (#8249 )	2020-09-04 06:54:56 -04:00
Patrick Cloke	4e874ed593	Remove unnecessary maybeDeferred calls (#8044 )	2020-08-07 09:44:48 -04:00
David Vo	4dd27e6d11	Reduce unnecessary whitespace in JSON. (#7372 )	2020-08-07 08:02:55 -04:00
Erik Johnston	a7bdf98d01	Rename database classes to make some sense (#8033 )	2020-08-05 21:38:57 +01:00
Patrick Cloke	8ff2deda72	Fix async/await calls for broken media providers. (#8027 )	2020-08-04 09:44:25 -04:00
Patrick Cloke	68626ff8e9	Convert the remaining media repo code to async / await. (#7947 )	2020-07-27 14:40:11 -04:00
Patrick Cloke	3fc8fdd150	Support oEmbed for media previews. (#7920 ) Fixes previews of Twitter URLs by using their oEmbed endpoint to grab content.	2020-07-27 07:50:44 -04:00
Patrick Cloke	5ea29d7f85	Convert more of the media code to async/await (#7873 )	2020-07-24 09:39:02 -04:00
Will Hunt	62b1ce8539	isort 5 compatibility (#7786 ) The CI appears to use the latest version of isort, which is a problem when isort gets a major version bump. Rather than try to pin the version, I've done the necessary to make isort5 happy with synapse.	2020-07-05 16:32:02 +01:00
Erik Johnston	5cdca53aa0	Merge different Resource implementation classes (#7732 )	2020-07-03 19:02:19 +01:00
Erik Johnston	b44bdd7f7b	Support running multiple media repos. (#7706 ) This requires a new config option to specify which media repo should be responsible for running background jobs to e.g. clear out expired URL preview caches.	2020-06-17 14:13:30 +01:00
Patrick Cloke	434716e1d3	Fetch from the r0 media path instead of the unspecced v1. (#7714 )	2020-06-17 08:36:46 -04:00
Dagfinn Ilmari Mannsåker	a3f11567d9	Replace all remaining six usage with native Python 3 equivalents (#7704 )	2020-06-16 08:51:47 -04:00
Patrick Cloke	bd6dc17221	Replace iteritems/itervalues/iterkeys with native versions. (#7692 )	2020-06-15 07:03:36 -04:00
Richard van der Hoff	d4676910c9	remove miscellaneous PY2 code	2020-05-15 19:37:41 +01:00
Michael Kaye	5308239d5d	Reduce logging verbosity of URL cache cleanup. (#7295 )	2020-04-22 07:45:16 -04:00


467 Commits (4d6b80038588f273a95d8d705aca885dc61768cc)