Refactor and convert `Linearizer` to async. This makes a `Linearizer`
cancellation bug easier to fix.
Also refactor to use an async context manager, which eliminates an
unlikely footgun where code that doesn't immediately use the context
manager could forget to release the lock.
Signed-off-by: Sean Quah <seanq@element.io>
This implements an allow list for content types for which Synapse will attempt URL preview. If a URL resolves to a resource with a content type which isn't in the list, the download will terminate immediately.
This makes sense given that Synapse would never successfully generate a URL preview for such files in the first place, and helps prevent issues with streaming media servers, such as #8302.
Signed-off-by: Denis Kasak dkasak@termina.org.uk
By scraping Open Graph information from the HTML even
when an autodiscovery endpoint is found. The results are
then combined to capture as much information as possible
from the page.
* Splits the logic for parsing HTML from the resource handling code.
* Fix a circular import in the oEmbed code (which uses the HTML parsing code).
* Renames some of the HTML parsing methods to:
* Make it clear which methods are "internal" to the module.
* Clarify what the methods do.
* add code to handle missing content-type header and a test to verify that it works
* add handling for missing content-type in the /upload endpoint as well
* slightly refactor test code to put private method in approriate place
* handle possible null value for content-type when pulling from the local db
* add changelog
* refactor test and add code to handle missing content-type in cached remote media
* requested changes
* Update changelog.d/11200.bugfix
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Updating mypy past version 0.9 means that third-party stubs are no-longer distributed with typeshed. See http://mypy-lang.blogspot.com/2021/06/mypy-0900-released.html for details.
We therefore pull in stub packages in setup.py
Additionally, some modules that we were previously ignoring import failures for now have stubs. So let's use them.
The rest of this change consists of fixups to make the newer mypy + stubs pass CI.
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
* Improved titles (fall back to the author name if there's not title) and include the site name.
* Handle photo/video payloads.
* Include the original URL in the Open Graph response.
* Fix the expiration time (by properly converting from seconds to milliseconds).
The major change is moving the decision of whether to use oEmbed
further up the call-stack. This reverts the _download_url method to
being a "dumb" functionwhich takes a single URL and downloads it
(as it was before #7920).
This also makes more minor refactorings:
* Renames internal variables for clarity.
* Factors out shared code between the HTML and rich oEmbed
previews.
* Fixes tests to preview an oEmbed image.
This adds the format to the request arguments / URL to
ensure that JSON data is returned (which is all that
Synapse supports).
This also adds additional error checking / filtering to the
configuration file to ignore XML-only providers.