14 Commits (features_csvimport)

Author SHA1 Message Date
Raphaël Vinot 8fc5b1fd1f fix: Make pep8 happy 2018-12-11 15:29:09 +01:00
seamus tuohy 40c71af637 Added support for malformed internationalized email headers
When an email contains headers that use Unicode without properly crafting
them to conform to RFC-6323, the email import module would crash.
(See issue #119 & issue #93)

To address this I have added additional layers of encoding/decoding to
any possibly internationalized email headers. This decodes properly
formed and malformed UTF-8, UTF-16, and UTF-32 headers appropriately.
When an unknown encoding is encountered it is returned as an 'encoded-word'
per RFC2047.

This commit also adds unit tests that test properly formed and malformed
UTF-8, UTF-16, UTF-32, and CJK encoded strings in all header fields; UTF-8,
UTF-16, and UTF-32 encoded message bodies; and emoji testing for headers
and attachment file names.
2017-07-02 18:03:14 -04:00
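
The decoding behaviour described in the commit above can be sketched roughly as follows. This is a minimal illustration and not the module's actual code: the function name `decode_header_value` is hypothetical, and it assumes only the standard library `email.header.decode_header`. Byte sequences that are malformed for a known charset are decoded leniently, and a header with an unknown charset is returned unchanged as its RFC 2047 encoded-word.

```python
from email.errors import HeaderParseError
from email.header import decode_header


def decode_header_value(raw):
    """Hypothetical helper: decode a header value, tolerating bad input."""
    try:
        chunks = decode_header(raw)
    except HeaderParseError:
        # The encoded-word itself is broken (e.g. invalid base64): keep it as-is.
        return raw

    parts = []
    for chunk, charset in chunks:
        if isinstance(chunk, str):
            # No encoded-word was found; decode_header returns the text untouched.
            parts.append(chunk)
            continue
        try:
            # Malformed byte sequences become U+FFFD instead of raising.
            parts.append(chunk.decode(charset or "ascii", errors="replace"))
        except LookupError:
            # Unknown charset: return the original encoded-word unchanged.
            return raw
    return "".join(parts)
```

Returning the raw value on an unknown charset keeps the original bytes available to the analyst instead of guessing at an encoding.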
seamus tuohy 83a9d695ea Email import no longer unzips major compressed text document formats.
Let this commit serve as a warning about the perils of duck typing.
Word processor documents (docx, odt, etc.) were being uncompressed when they were
attached to emails. The email importer now checks a list of well known
extensions and will not attempt to unzip them.

It is stuck using a list of extensions instead of using file magic because
many of these formats produce an application/zip mimetype when scanned.
2017-01-10 09:55:33 -05:00
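
A rough sketch of the extension check described in the commit above, under stated assumptions: the names `ZIPPED_DOCUMENT_EXTENSIONS` and `should_unzip` are hypothetical, and the extension list is illustrative, not the list the importer actually ships. The idea is to consult the file name first, because content sniffing alone reports these document formats as plain zip archives.

```python
import io
import os
import zipfile

# Illustrative only: zip-based document formats that should stay packed.
ZIPPED_DOCUMENT_EXTENSIONS = {".docx", ".xlsx", ".pptx", ".odt", ".ods", ".odp"}


def should_unzip(filename, payload):
    """Return True if the attachment looks like a plain zip archive."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in ZIPPED_DOCUMENT_EXTENSIONS:
        # Zip container, but really a document: leave it intact.
        return False
    # Otherwise fall back to checking the content itself.
    return zipfile.is_zipfile(io.BytesIO(payload))
```

An extension list can be fooled by a renamed file, which is the trade-off the commit message accepts in exchange for not unpacking legitimate documents.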
Raphaël Vinot 1051e2210b Keep zip content as binary 2017-01-07 19:30:00 -05:00
Raphaël Vinot 9f84db3659 Fix tests, cleanup 2017-01-07 18:36:08 -05:00
Raphaël Vinot 2db845c45c Improve support of email attachments
Related to #90
2017-01-07 14:39:52 -05:00
Raphaël Vinot b51806ac9f Improve support of email importer if headers are missing
Fix #88
2017-01-07 10:25:38 -05:00
Raphaël Vinot 02f5e95a98 Fix python 3.6 support 2017-01-06 20:36:09 -05:00
Raphaël Vinot 93a49c3c1d Make PEP8 happy 2017-01-06 19:01:19 -05:00
Raphaël Vinot 3f83357a2d Fix failing test (bug in the mail parser?) 2017-01-06 18:56:29 -05:00
seamus tuohy 1a7973bc06 Add additional email parsing and tests
Added additional attribute parsing and corresponding unit-tests.
E-mail attachment and URL extraction added in this commit. This includes
unpacking zipfiles and simple password cracking of encrypted zipfiles.
2017-01-04 10:21:36 -08:00
seamus tuohy 0ff270a3be Fixed basic errors 2016-12-26 14:33:10 -08:00
seamus tuohy 86ae72c444 Added attachment and url support 2016-12-26 13:55:54 -08:00
seamus tuohy 5033b1a9ca Added email meta-data import module.
This email meta-data import module collects basic meta-data from an e-mail
and populates an event with it. It populates the email subject, source
addresses, destination addresses, and any attachment file names.
This commit also contains unit-tests for this module as well as updates to
the readme. Readme updates are additions aimed to make it easier for
outsiders to build modules.
2016-10-22 17:13:20 -04:00