Commit Graph

407 Commits (b76337fdf825282b7c75ebc38e9a64f0b00f7cf0)

Author SHA1 Message Date
Erik Johnston b106080fb4 Regenerate exact thumbnails if missing 2021-02-18 17:05:32 +00:00
Eric Eastwood 0a00b7ff14
Update black, and run auto formatting over the codebase (#9381)
- Update black version to the latest
 - Run black auto formatting over the codebase
    - Run autoformatting according to [`docs/code_style.md
`](80d6dc9783/docs/code_style.md)
 - Update `code_style.md` docs around installing black to use the correct version
2021-02-16 22:32:34 +00:00
Patrick Cloke 7950aa8a27 Fix some typos. 2021-02-12 11:14:12 -05:00
Patrick Cloke 0963d39ea6
Handle additional errors when previewing URLs. (#9333)
* Handle the case of lxml not finding a document tree.
* Parse the document encoding from the XML tag.
2021-02-08 12:33:30 -05:00
Erik Johnston 7e8083eb48 Add check_media_file_for_spam spam checker hook 2021-02-04 17:01:30 +00:00
Patrick Cloke 4937fe3d6b
Try to recover from unknown encodings when previewing media. (#9164)
Treat unknown encodings (according to lxml) as UTF-8
when generating a preview for HTML documents. This
isn't fully accurate, but will hopefully give a reasonable
title and summary.
2021-01-26 07:32:17 -05:00
Patrick Cloke a7882f9887
Return a 404 if no valid thumbnail is found. (#9163)
If no thumbnail of the requested type exists, return a 404 instead
of erroring. This doesn't quite match the spec (which does not define
what happens if no thumbnail can be found), but is consistent with
what Synapse already does.
2021-01-21 14:53:58 -05:00
Patrick Cloke d34c6e1279
Add type hints to media rest resources. (#9093) 2021-01-15 10:57:37 -05:00
David Teller f14428b25c
Allow spam-checker modules to be provide async methods. (#8890)
Spam checker modules can now provide async methods. This is implemented
in a backwards-compatible manner.
2020-12-11 14:05:15 -05:00
Aaron Raimist cd9e72b185
Add X-Robots-Tag header to stop crawlers from indexing media (#8887)
Fixes / related to: https://github.com/matrix-org/synapse/issues/6533

This should do essentially the same thing as a robots.txt file telling robots to not index the media repo. https://developers.google.com/search/reference/robots_meta_tag

Signed-off-by: Aaron Raimist <aaron@raim.ist>
2020-12-08 22:51:03 +00:00
Patrick Cloke 1f3748f033
Do not raise a 500 exception when previewing empty media. (#8883) 2020-12-07 10:00:08 -05:00
Patrick Cloke df3e6a23a7
Do not 500 if the content-length is not provided when uploading media. (#8862)
Instead return the proper 400 error.
2020-12-04 10:26:09 -05:00
Patrick Cloke 30fba62108
Apply an IP range blacklist to push and key revocation requests. (#8821)
Replaces the `federation_ip_range_blacklist` configuration setting with an
`ip_range_blacklist` setting with wider scope. It now applies to:

* Federation
* Identity servers
* Push notifications
* Checking key validitity for third-party invite events

The old `federation_ip_range_blacklist` setting is still honored if present, but
with reduced scope (it only applies to federation and identity servers).
2020-12-02 11:09:24 -05:00
Erik Johnston 46f4be94b4
Fix race for concurrent downloads of remote media. (#8682)
Fixes #6755
2020-10-30 10:55:24 +00:00
Dirk Klimpel 49d72dea2a
Add an admin api to delete local media. (#8519)
Related to: #6459, #3479

Add `DELETE /_synapse/admin/v1/media/<server_name>/<media_id>` to delete
a single file from server.
2020-10-26 17:02:28 +00:00
Andrew Morgan 3e58ce72b4
Don't bother responding to client requests that have already disconnected (#8465)
This PR ports the quick fix from https://github.com/matrix-org/synapse/pull/2796 to further methods which handle media, URL preview and `/key/v2/server` requests. This prevents a harmless `ERROR` that comes up in the logs when we were unable to respond to a client request when the client had already disconnected. In this case we simply bail out if the client has already done so.

This is the 'simple fix' as suggested by https://github.com/matrix-org/synapse/issues/5304#issuecomment-574740003.

Fixes https://github.com/matrix-org/synapse/issues/6700
Fixes https://github.com/matrix-org/synapse/issues/5304
2020-10-06 10:03:39 +01:00
Richard van der Hoff 73d93039ff
Fix bug in remote thumbnail search (#8438)
#7124 changed the behaviour of remote thumbnails so that the thumbnailing method was included in the filename of the thumbnail. To support existing files, it included a fallback so that we would check the old filename if the new filename didn't exist.

Unfortunately, it didn't apply this logic to storage providers, so any thumbnails stored on such a storage provider was broken.
2020-10-02 12:29:29 +01:00
Richard van der Hoff b1f4e6e4fc
fix a logging error in thumbnailer (#8435)
Introduced in #8236
2020-10-01 13:34:24 +01:00
Will Hunt c2bdf040aa
Discard an empty upload_name before persisting an uploaded file (#7905) 2020-09-29 12:15:27 -04:00
Richard van der Hoff 11c9e17738
Add type annotations to SimpleHttpClient (#8372) 2020-09-24 15:47:20 +01:00
Patrick Cloke aec294ee0d
Use slots in attrs classes where possible (#8296)
slots use less memory (and attribute access is faster) while slightly
limiting the flexibility of the class attributes. This focuses on objects
which are instantiated "often" and for short periods of time.
2020-09-14 12:50:06 -04:00
Patrick Cloke d2a3eb04a4 Fix typos in comments. 2020-09-14 11:46:58 -04:00
Patrick Cloke b312769c0e
Do not error when thumbnailing invalid files (#8236)
If a file cannot be thumbnailed for some reason (e.g. the file is empty), then
catch the exception and convert it to a reasonable error message for the client.
2020-09-09 12:59:41 -04:00
DeepBlueV7.X 560f3b8609
Include method in thumbnail media name (#7124)
This fixes an issue where different methods (crop/scale) overwrite each other.

This first tries the new path. If that fails and we are looking for a
remote thumbnail, it tries the old path. If that still isn't found, it
continues as normal.

This should probably be removed in the future, after some of the newer
thumbnails were generated with the new path on most deployments. Then
the overhead should be minimal if the other thumbnails need to be
regenerated.

Signed-off-by: Nicolas Werner <nicolas.werner@hotmail.de>
2020-09-08 17:19:50 +01:00
Patrick Cloke c619253db8
Stop sub-classing object (#8249) 2020-09-04 06:54:56 -04:00
Patrick Cloke 4e874ed593
Remove unnecessary maybeDeferred calls (#8044) 2020-08-07 09:44:48 -04:00
David Vo 4dd27e6d11
Reduce unnecessary whitespace in JSON. (#7372) 2020-08-07 08:02:55 -04:00
Erik Johnston a7bdf98d01
Rename database classes to make some sense (#8033) 2020-08-05 21:38:57 +01:00
Patrick Cloke 8ff2deda72
Fix async/await calls for broken media providers. (#8027) 2020-08-04 09:44:25 -04:00
Patrick Cloke 68626ff8e9
Convert the remaining media repo code to async / await. (#7947) 2020-07-27 14:40:11 -04:00
Patrick Cloke 3fc8fdd150
Support oEmbed for media previews. (#7920)
Fixes previews of Twitter URLs by using their oEmbed endpoint to grab content.
2020-07-27 07:50:44 -04:00
Patrick Cloke 5ea29d7f85
Convert more of the media code to async/await (#7873) 2020-07-24 09:39:02 -04:00
Will Hunt 62b1ce8539
isort 5 compatibility (#7786)
The CI appears to use the latest version of isort, which is a problem when isort gets a major version bump. Rather than try to pin the version, I've done the necessary to make isort5 happy with synapse.
2020-07-05 16:32:02 +01:00
Erik Johnston 5cdca53aa0
Merge different Resource implementation classes (#7732) 2020-07-03 19:02:19 +01:00
Erik Johnston b44bdd7f7b
Support running multiple media repos. (#7706)
This requires a new config option to specify which media repo should be
responsible for running background jobs to e.g. clear out expired URL
preview caches.
2020-06-17 14:13:30 +01:00
Patrick Cloke 434716e1d3
Fetch from the r0 media path instead of the unspecced v1. (#7714) 2020-06-17 08:36:46 -04:00
Dagfinn Ilmari Mannsåker a3f11567d9
Replace all remaining six usage with native Python 3 equivalents (#7704) 2020-06-16 08:51:47 -04:00
Patrick Cloke bd6dc17221
Replace iteritems/itervalues/iterkeys with native versions. (#7692) 2020-06-15 07:03:36 -04:00
Richard van der Hoff d4676910c9 remove miscellaneous PY2 code 2020-05-15 19:37:41 +01:00
Michael Kaye 5308239d5d
Reduce logging verbosity of URL cache cleanup. (#7295) 2020-04-22 07:45:16 -04:00
Andrew Morgan a48138784e
Allow specifying the value of Accept-Language header for URL previews (#7265) 2020-04-15 13:35:29 +01:00
Dionysis Grigoropoulos 96071eea8f
Set Referrer-Policy to no-referrer for media (#7009) 2020-03-23 09:48:28 +00:00
Patrick Cloke caec7d4fa0
Convert some of the media REST code to async/await (#7110) 2020-03-20 07:20:02 -04:00
The Stranjer 5e477c1deb
Set charset to utf-8 when adding headers for certain text content types (#7044)
Fixes #7043
2020-03-17 13:29:09 +00:00
Patrick Cloke 509e381afa
Clarify list/set/dict/tuple comprehensions and enforce via flake8 (#6957)
Ensure good comprehension hygiene using flake8-comprehensions.
2020-02-21 07:15:07 -05:00
Richard van der Hoff 6b7462a13f
a bit of debugging for media storage providers (#6757)
* a bit of debugging for media storage providers

* changelog
2020-01-23 12:11:44 +00:00
Brendan Abolivier ed83c3a018
Fix typo in _select_thumbnail 2020-01-22 12:27:42 +00:00
Erik Johnston b0a66ab83c
Fixup synapse.rest to pass mypy (#6732) 2020-01-20 17:38:21 +00:00
Richard van der Hoff 98247c4a0e
Remove unused, undocumented "content repo" resource (#6628)
This looks like it got half-killed back in #888.

Fixes #6567.
2020-01-03 17:10:52 +00:00
Erik Johnston 4a33a6dd19 Move background update handling out of store 2019-12-05 11:11:26 +00:00
Filip Štědronský 81731c6e75 Fix: Pillow error when uploading RGBA image (#3325) (#6241)
Signed-Off-By: Filip Štědronský <g@regnarg.cz>
2019-12-02 12:12:55 +00:00
Richard van der Hoff ef1a85e773
Fix startup error when http proxy is defined. (#6421)
Guess I only tested this on python 2 :/

Fixes #6419.
2019-11-26 18:10:50 +00:00
Andrew Morgan 3916e1b97a
Clean up newline quote marks around the codebase (#6362) 2019-11-21 12:00:14 +00:00
Richard van der Hoff 5570d1c93f
Merge pull request #6334 from matrix-org/rav/url_preview_limit_title_2
Fix exception when OpenGraph tag values are ints
2019-11-05 17:28:11 +00:00
Richard van der Hoff 81d49cbb07 Fix exception when OpenGraph tag values are ints 2019-11-05 17:22:58 +00:00
Richard van der Hoff 55a7da247a
Merge branch 'develop' into rav/url_preview_limit_title 2019-11-05 17:08:07 +00:00
Richard van der Hoff e78167c94b
Apply suggestions from code review
Co-Authored-By: Brendan Abolivier <babolivier@matrix.org>
Co-Authored-By: Erik Johnston <erik@matrix.org>
2019-11-05 16:46:39 +00:00
Richard van der Hoff e9bfe719ba Strip overlong OpenGraph data from url preview
... to stop people causing DoSes with malicious web pages
2019-11-05 15:51:18 +00:00
Richard van der Hoff 1cb84c6486
Support for routing outbound HTTP requests via a proxy (#6239)
The `http_proxy` and `HTTPS_PROXY` env vars can be set to a `host[:port]` value which should point to a proxy.

The address of the proxy should be excluded from IP blacklists such as the `url_preview_ip_range_blacklist`.

The proxy will then be used for
 * push
 * url previews
 * phone-home stats
 * recaptcha validation
 * CAS auth validation

It will *not* be used for:
 * Application Services
 * Identity servers
 * Outbound federation
 * In worker configurations, connections from workers to masters

Fixes #4198.
2019-11-01 14:07:44 +00:00
Andrew Morgan 54fef094b3
Remove usage of deprecated logger.warn method from codebase (#6271)
Replace every instance of `logger.warn` with `logger.warning` as the former is deprecated.
2019-10-31 10:23:24 +00:00
Michael Kaye e4d98188da Address codestyle concerns 2019-10-24 18:43:13 +01:00
Michael Kaye 8f4a808d9d Delay printf until logging is required.
Using % will cause the string to be generated even if debugging
is off.
2019-10-24 18:31:53 +01:00
Erik Johnston ca3e01e50d Fix store_url_cache using bytes 2019-10-10 14:52:29 +01:00
Anshul Angaria 474abf1eb6 add M_TOO_LARGE error code for uploading a too large file (#6151)
Fixes #6109
2019-10-08 13:55:16 +01:00
Michael Kaye dc795ba709 Log responder we are using. (#6139)
This prevents us logging "Responding to media request with responder %s".
2019-10-07 15:41:25 +01:00
Robert Swain 39b40d6d99 media/thumbnailer: Better quality for 1-bit / 8-bit color palette images (#2142)
Pillow will use nearest neighbour as the resampling algorithm if the
source image is either 1-bit or a color palette using 8 bits. If we
convert to RGB before scaling, we'll probably get a better result.
2019-10-04 09:34:52 +01:00
Andrew Morgan 2a44782666
Remove double return statements (#5962)
Remove all the "double return" statements which were a result of us removing all the instances of

```
defer.returnValue(...)
return
```

statements when we switched to python3 fully.
2019-09-03 11:42:45 +01:00
L0ric0 ce7803b8b0 fix thumbnail storage location (#5915)
* fix thumbnail storage location

Signed-off-by: Lorenz Steinert <lorenz@steinerts.de>

* Add changelog file.

Signed-off-by: Lorenz Steinert <lorenz@steinerts.de>

* Update Changelog

Signed-off-by: Lorenz Steinert <lorenz@steinerts.de>
2019-09-02 12:18:41 +01:00
Andrew Morgan 4548d1f87e
Remove unnecessary parentheses around return statements (#5931)
Python will return a tuple whether there are parentheses around the returned values or not.

I'm just sick of my editor complaining about this all over the place :)
2019-08-30 16:28:26 +01:00
Amber Brown 0b6fbb28a8
Don't load the media repo when configured to use an external media repo (#5754) 2019-08-13 21:49:28 +10:00
Amber Brown 4806651744
Replace returnValue with return (#5736) 2019-07-23 23:00:55 +10:00
Andrew Morgan 24aa0e0a5b fix typo: backgroud -> background 2019-07-12 15:29:40 +01:00
Amber Brown 463b072b12
Move logging utilities out of the side drawer of util/ and into logging/ (#5606) 2019-07-04 00:07:04 +10:00
Amber Brown 0ee9076ffe Fix media repo breaking (#5593) 2019-07-02 19:01:28 +01:00
Amber Brown f40a7dc41f
Make the http server handle coroutine-making REST servlets (#5475) 2019-06-29 17:06:55 +10:00
Amber Brown 32e7c9e7f2
Run Black. (#5482) 2019-06-20 19:32:02 +10:00
Erik Johnston 95d38afe96 Don't log exception when failing to fetch remote content.
In particular, let's not log stack traces when we stop processing
becuase the response body was too large.
2019-06-07 12:39:10 +01:00
Aaron Raimist 30858ff461 Fix error when downloading thumbnail with width/height param missing (#5258)
Fix error when downloading thumbnail with width/height param missing

Fixes #2748

Signed-off-by: Aaron Raimist <aaron@raim.ist>
2019-05-29 14:27:41 +01:00
PauRE f89f688a55 Fix image orientation when generating thumbnail (#5039) 2019-05-16 19:04:26 +01:00
Amber Brown df2ebd75d3
Migrate all tests to use the dict-based config format instead of hanging items off HomeserverConfig (#5171) 2019-05-13 15:01:14 -05:00
Andrew Morgan 2f48c4e1ae
URL preview blacklisting fixes (#5155)
Prevents a SynapseError being raised inside of a IResolutionReceiver and instead opts to just return 0 results. This thus means that we have to lump a failed lookup and a blacklisted lookup together with the same error message, but the substitute should be generic enough to cover both cases.
2019-05-10 10:32:44 -07:00
Amber Brown 6b2b9a58c4 Prevent "producer not unregistered" message (#5009) 2019-04-24 17:37:32 +01:00
Andrew Morgan caa76e6021
Remove periods from copyright headers (#5046) 2019-04-11 17:08:13 +01:00
Matthew Hodgson 2326e00bc4 fix incorrect encoding of filenames with spaces in (#2090)
fixes https://github.com/vector-im/riot-web/issues/3155
2019-03-11 09:53:45 +00:00
Richard van der Hoff 68f47d6744 Fix parsing of Content-Disposition headers (#4763)
* Fix parsing of Content-Disposition headers

TIL: filenames in content-dispostion headers can contain semicolons, and aren't
%-encoded.

* fix python2 incompatibility

* Fix docstrings
2019-02-27 14:29:10 -08:00
Erik Johnston 899a119c2b Don't log stack trace when client has gone away during media download (#4738)
* Don't log stack trace when client has gone away during media download

* Newsfile

* Fixup newsfile
2019-02-25 11:17:22 -08:00
Erik Johnston b970cb0e96 Refactor request sending to have better excpetions (#4358)
* Correctly retry and back off if we get a HTTPerror response

* Refactor request sending to have better excpetions

MatrixFederationHttpClient blindly reraised exceptions to the caller
without differentiating "expected" failures (e.g. connection timeouts
etc) versus more severe problems (e.g. programming errors).

This commit adds a RequestSendFailed exception that is raised when
"expected" failures happen, allowing the TransactionQueue to log them as
warnings while allowing us to log other exceptions as actual exceptions.
2019-01-08 11:04:28 +00:00
Amber Brown ea6abf6724
Fix IP URL previews on Python 3 (#4215) 2018-12-22 01:56:13 +11:00
David Baker 89ac2a5bdb Add 'sandbox' to CSP for media repo (#4284)
* Add 'sandbox' to the CSP for media repo

* Changelog
2018-12-11 04:05:02 +11:00
Will Hunt fee831c040 Move imports to one line 2018-12-10 13:52:33 +00:00
Will Hunt 466c1f3e01
Use `send_cors` 2018-12-10 13:11:37 +00:00
Will Hunt 91206e09f2 changelog & isort 2018-12-09 17:39:44 +00:00
Will Hunt dbf736ba66
Make /config more CORS-y 2018-12-09 13:27:22 +00:00
Amber Brown 8b1affe7d5
Fix Content-Disposition in media repository (#4176) 2018-11-15 15:55:58 -06:00
Amber Brown df758e155d
Use <meta> tags to discover the per-page encoding of html previews (#4183) 2018-11-15 11:05:08 -06:00
Amber Brown b3708830b8
Fix URL preview bugs (type error when loading cache from db, content-type including quotes) (#4157) 2018-11-08 01:37:43 +11:00
Amber Brown 4cd1c9f2ff
Delete the disused & unspecced identicon functionality (#4106) 2018-10-29 23:57:24 +11:00
Richard van der Hoff ef771cc4c2 Fix a number of flake8 errors
Broadly three things here:

* disable W504 which seems a bit whacko
* remove a bunch of `as e` expressions from exception handlers that don't use
  them
* use `r""` for strings which include backslashes

Also, we don't use pep8 any more, so we can get rid of the duplicate config
there.
2018-10-24 10:39:03 +01:00
Richard van der Hoff 5c445114d3
Correctly account for cpu usage by background threads (#4074)
Wrap calls to deferToThread() in a thing which uses a child logcontext to
attribute CPU usage to the right request.

While we're in the area, remove the logcontext_tracer stuff, which is never
used, and afaik doesn't work.

Fixes #4064
2018-10-23 13:12:32 +01:00
Erik Johnston f6a0a02a62 Fix bug where we raised StopIteration in a generator
This made python 3.7 unhappy
2018-10-17 16:10:52 +01:00