119 Commits (27ca31ffd220e19fe2b5faba9e2be1fd28f7b123)

Author SHA1 Message Date
Dan Puttick 12d5624b4d Change FileBase.log_details to FileBase._file_props
* _file_props is a dict that will hold all information about the file
* Updated filecheck.py to reflect this
* The contents of _file_props may become attributes on the file in the
future. That change would be easy, since all access to _file_props now goes
through the set_property and get_property methods.
* Add filename to _file_props
2017-03-15 21:06:06 -04:00
Dan Puttick 9832101c85 Identify TODOs that are log related 2017-03-15 21:06:06 -04:00
Dan Puttick 8d7dd1197f Move run_process back to Groomer object 2017-03-15 21:06:06 -04:00
Dan Puttick 781d0a76af First working version with methods in File object
- All tests now passing with file handling methods on File object
instead of Groomer object.
- Logging functionality still isn't finished.
2017-03-15 21:06:06 -04:00
Dan Puttick 9aafe6e518 Remove cur_file from methods in File object 2017-03-15 21:06:06 -04:00
Dan Puttick 53c1598af8 Move file processing methods into File object
- It seems like filecheck will be easier to reason about if all of
the file processing stuff happens in the File object. The Groomer
object will now be responsible only for enumerating the files to
be processed.
- Tests won't pass for this commit, but wanted to make the diff
cleaner by committing this before making changes.
2017-03-15 21:04:57 -04:00
Dan Puttick 3d36c90d66 Make list_all_files a public method 2017-03-15 21:04:57 -04:00
Dan Puttick c6ecc5e3a3 Fix process_dir bug in filecheck tests 2017-03-15 21:04:57 -04:00
Dan Puttick cfeccc2561 Move main() into filecheck
- Also change the name of processdir to process_dir
2017-03-15 21:04:57 -04:00
Dan Puttick 61aa14c98d Change _write_log to _print_log 2017-03-15 21:04:57 -04:00
Dan Puttick a450fe6b96 Add config object to filecheck
- Grouped all configuration options for filecheck into a Config object
- Makes the code easier to read, since there are no longer many references to
separate configuration globals
2017-03-15 21:04:57 -04:00
Dan Puttick 7d62238270 Hacks to make tests pass before fixing 2017-03-15 21:04:57 -04:00
Dan Puttick 1cf8a62f46 First commit with Logger object
- Made logger object
- Moved some logger related code from Groomer to Logger
- Changed logging related tests
- Filecheck tests still do not pass
2017-03-15 21:04:57 -04:00
Dan Puttick 92d1b1cd93 Refactor metadata processing code 2017-03-15 21:01:28 -04:00
Dan Puttick e2af701ac9 Remove several pieces of unused code
* Remove python 2 KittenGroomerBase.tree
* Remove default source and dest from KittenGroomerFileCheck
* Remove unused sys import
2017-03-15 21:01:28 -04:00
Raphaël Vinot a3cad2c21e Fix forgotten copy 2017-03-14 10:47:20 +01:00
Dan Puttick fd30fb3e08 Change _run_process() to use builtin timeout parameter
NOTE: this change breaks Python 2 compatibility: subprocess.check_call does not
take a timeout argument in Python 2.7
2017-01-19 17:00:10 -05:00
Dan Puttick 21cc175867 Move non-filecheck.py binaries into examples directory
Tests for these scripts also removed from /tests and from .travis.yml
Two .zip archives accidentally deleted from /tests/src_invalid, re-added them
and changed .gitignore to prevent the problem
2017-01-19 15:25:08 -05:00
Dan Puttick 3dad4faa61 Reorganize tests making them easier to run
- The tests now automatically run depending on whether you have the dependencies
installed, instead of failing and throwing exceptions.
- CONTRIBUTING.md has more information on how to run the tests.
- When the tests run, they will save their logs to /test_logs instead
of printing them so you can read them later.
- Change names of source file directories to make them more descriptive
2017-01-18 15:51:54 -05:00
Dan Puttick 70a73dc292 Move process_file code into its own method 2017-01-18 15:51:50 -05:00
Dan Puttick 173a844b69 Some reorganization of filecheck.py, adding docstrings 2016-12-20 12:35:30 -05:00
Dan Puttick ecb4f56710 Small fixes to bin/README.md 2016-12-16 17:38:40 -05:00
Dan Puttick 7bafff3699 Fixed several small filecheck.py bugs for python3 compat 2016-12-16 16:06:12 -05:00
Dan Puttick 4f851435e5 Updated path traversal link, changed pip installs 2016-12-14 15:49:29 -05:00
Dan Puttick 0364e038e1 Added unit tests for KittenGroomerBase 2016-12-14 15:44:59 -05:00
Dan Puttick 26366cdb6d Added dev-requirements 2016-11-30 14:08:11 -05:00
Dan Puttick 8eac0d25f9 Small change to bin/README.md 2016-11-30 13:53:41 -05:00
Raphaël Vinot 04f2185b5d Improve RTF support 2016-05-17 14:10:14 +02:00
Raphaël Vinot fda900afc8 Cleanup unused mimetypes 2016-05-17 11:50:34 +02:00
Raphaël Vinot 61f519edb4 Reduce default recursive archives 2016-05-16 14:23:56 +02:00
Raphaël Vinot d05f8e9665 Fix Archive bomb 2016-05-16 12:25:52 +02:00
Raphaël Vinot 3eecd9cc16 Fix winoffice file processing with olefile 2016-05-14 20:44:16 +02:00
Raphaël Vinot 4deb73d245 Use own version of officedissector. 2016-05-09 17:38:32 +02:00
Raphaël Vinot 51615f8887 Handle invalid docx properly 2016-02-01 14:19:51 +01:00
Raphaël Vinot e8de330d34 Proper handling of OOXML docs 2016-02-01 12:34:47 +01:00
Raphaël Vinot 34e7075609 Merge pull request #2 from Dymaxion00/master
Initial working version of EXIF splitting and image format validation…
2015-12-21 00:31:39 +01:00
Eleanor Saitta 53e4570356 Switch back to exifread; PIL's EXIF support sucks. 2015-12-16 16:12:27 -05:00
Eleanor Saitta 53b61d487e Move to PIL for EXIF; add PNG metadata extractor; modularize metadata extraction
Switch back to exifread; PIL's EXIF support sucks.
2015-12-16 16:09:57 -05:00
Raphaël Vinot ecfdeb7b79 Add missing '.' 2015-12-15 10:46:11 +01:00
Eleanor Saitta ca90a08159 Initial working version of EXIF splitting and image format validation by round-trip conversion. 2015-12-10 00:06:36 -05:00
Raphaël Vinot 6bc83f947d Improve readme 2015-11-24 18:03:51 +01:00
Raphaël Vinot 936fc2c2a2 Proper handling of symlinks 2015-11-24 17:45:06 +01:00
Raphaël Vinot f2233aeae1 Improve doc, use trusty in travis. 2015-11-24 15:03:57 +01:00
Raphaël Vinot f44aedac17 Print FS tree for unpacked archives 2015-11-24 11:41:45 +01:00
Raphaël Vinot daec0cd689 Add forbidden extensions 2015-11-24 11:40:56 +01:00
Raphaël Vinot 1a2637b252 Use default python-magic, escape filenames 2015-11-05 16:27:48 +01:00
Raphaël Vinot 03f1d90f33 Code de-duplication
Raphaël Vinot b0d0912ff9 Skip the known extension check if mimetypes fails. 2015-11-05 10:34:03 +01:00
Raphaël Vinot 9079eac90a try to fix magic 2015-11-05 08:57:24 +01:00
Raphaël Vinot 531ab43dae Improve debug, add list of malicious ext 2015-11-05 00:10:30 +01:00
Raphaël Vinot 2669e80ca9 Unpack all archives, debug invalid mimetype 2015-11-03 17:56:42 +01:00
Raphaël Vinot c122ef9db8 Better support of ODF 2015-11-03 15:30:59 +01:00
Raphaël Vinot 5f080e7323 fix call pdfid 2015-11-03 13:04:14 +01:00
Raphaël Vinot d1f1c4fe16 Add new file to travis 2015-11-03 11:12:29 +01:00
Raphaël Vinot cb38f004e1 Initial version of the script to do sanity checks on files
In (pure) python
2015-11-02 18:00:40 +01:00
Raphaël Vinot 7f15b60539 Avoid error on unknown variable 2015-10-27 14:45:12 +01:00
Raphaël Vinot 74fe05cbe1 Do not use subprocess. 2015-10-27 10:24:45 +01:00
Raphaël Vinot 5d848f4787 Add script for specific purposes, add testcase 2015-10-26 17:11:36 +01:00
Raphaël Vinot dc098dd9a8 Make GS conversion safer 2015-06-17 16:50:20 +02:00
Raphaël Vinot a678a1c9f7 Force path to PDFA_def.ps 2015-06-02 16:44:57 +02:00
Raphaël Vinot 84b004c8a9 Better support of PDF/PS docs 2015-05-31 15:36:36 +02:00
Raphaël Vinot fb7e47b10e Fix bug with media processing 2015-05-29 18:00:48 +02:00
Raphaël Vinot 32d70efe29 Merge branch 'master' of github.com:CIRCL/PyCIRCLean 2015-05-29 17:35:14 +02:00
Raphaël Vinot 3b759eb9ab Fix typo, force overwrite on extract 2015-05-29 17:34:55 +02:00
Raphaël Vinot 420e87cbba Do not process a file that has been marked as dangerous. 2015-05-26 18:56:18 +02:00
Raphaël Vinot dcc3c7eda8 WIP: Start unoconv as a listener. 2015-05-26 18:08:57 +02:00
Raphaël Vinot 5d419f711a Python 3 support, run libreoffice headless. 2015-05-18 01:34:41 +02:00
Raphaël Vinot ac372dc59d Fix completely buggy mimetype/extension xcheck 2015-05-17 15:58:59 +02:00
Raphaël Vinot e9d76adb42 Initial commit 2015-05-11 14:32:59 +02:00
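
Note on fd30fb3e08: the message says _run_process() now relies on the builtin timeout parameter of subprocess.check_call, which exists only in Python 3. The snippet below is a minimal sketch of that pattern, not the project's actual code. The function name run_process, its return convention, and the example commands are assumptions made for illustration.

```python
import subprocess
import sys


def run_process(command, timeout=10):
    """Hypothetical helper: run a command, return True on success.

    Returns False if the command exits non-zero, cannot be started,
    or does not finish within `timeout` seconds.
    """
    try:
        # The timeout keyword is available on Python 3.3+ only; on
        # Python 2.7 this call raises TypeError for the unexpected argument.
        subprocess.check_call(command, timeout=timeout)
    except subprocess.TimeoutExpired:
        # check_call kills the child process before raising this.
        return False
    except (subprocess.CalledProcessError, OSError):
        return False
    return True


# A command that exits immediately, and one that outlives its timeout.
quick = run_process([sys.executable, "-c", "pass"])
slow = run_process(
    [sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5
)
```

Under these assumptions, quick reports success and slow reports failure because the half-second timeout expires before the child process finishes sleeping.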