* ObjectPool winoffice files are now marked with make_dangerous
* safe_copy now catches IOErrors only
* Use os.makedirs(exist_ok=True) instead of checking for existence in safe_copy
and create_metadata_file
* Added stubs for two tests related to safe_copy
* This is kind of a big refactoring - I realized that storing file
props in a dict was causing some subtle problems, and that just having
them as attributes makes things a lot simpler
* I considered making a separate FileProps object and nesting it
inside FileBase, but almost all FileBase methods concern manipulating
file props, so it didn't really make sense.
* Tests are almost passing with this commit, but a few more changes and
fixes are needed for full test coverage and a fully passing suite.
* Previously, filecheck.py removed the RTL character from the destination path only.
* Now, the RTL character is replaced in file.filename and in the filename file
property.
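As a rough illustration of the safe_copy changes listed at the top of these notes (catching only IOError, and calling os.makedirs with exist_ok=True instead of checking for existence first), here is a minimal sketch. The signature, the boolean return value and the print call are assumptions made for the example, not the project's actual code.

```python
import os
import shutil


def safe_copy(src_path, dst_path):
    """Copy src_path to dst_path, creating the destination directory if needed.

    Hypothetical sketch: returns True on success, False if an IOError occurred.
    """
    try:
        dst_dir = os.path.dirname(dst_path)
        if dst_dir:
            # exist_ok=True makes a separate os.path.exists() check unnecessary
            os.makedirs(dst_dir, exist_ok=True)
        shutil.copy(src_path, dst_path)
        return True
    except IOError as e:
        # In Python 3, IOError is an alias of OSError
        print('Could not copy {}: {}'.format(src_path, e))
        return False
```

Any exception that is not an IOError still propagates to the caller, which is the point of narrowing the except clause.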
ObjectStream isn't necessarily malicious, but it can be. This patch could
be improved by unpacking the content of the stream, but that requires
third-party libraries we don't have for now.
Final fix for PCL-01-002
* Same problem we've had before - Linux filenames can have non-unicode chars
in them
* We need to write the filename as raw bytes to the log
* os.fsencode lets us convert the filename string back to bytes, preserving
any bytes that can't be represented as unicode
* Still not clear if the log generated this way will be human-readable
* Wrote a new text-based logger that displays all file information in the tree
instead of using two separate logs
* Stopped using twiggy since it wasn't giving us anything useful
* Moved a lot of the logging code to filecheck, since it didn't really seem
appropriate as an API. Left a Logging stub in kittengroomer to hold methods
that might be useful for implementing other loggers.
* For the new logger, had to change the way that we traverse the items in the
source file tree.
* Realized that the API consumer might want to write their own logging tool.
* FileBase and KittenGroomerBase will have no logging code.
* If the API consumer likes, they can import GroomerLogger and use it in their
implementation.
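The filename logging approach from the os.fsencode entry above can be sketched as follows. The helper name write_filename_to_log and the one-name-per-line log layout are assumptions for illustration only; the real logger writes more than the filename.

```python
import os


def write_filename_to_log(log_path, filename):
    """Append a filename to the log as raw bytes (hypothetical helper)."""
    # os.fsencode uses the filesystem encoding with the 'surrogateescape'
    # error handler on POSIX, so bytes that could not be decoded when the
    # name was read are restored as the original raw bytes.
    raw_name = os.fsencode(filename)
    # Open in binary append mode so no text encoding step is applied.
    with open(log_path, 'ab') as log_file:
        log_file.write(raw_name + b'\n')
    return raw_name
```

Because the log is written in binary mode, a viewer may still show undecodable bytes as replacement characters, which matches the open question about human readability.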
The Unicode right-to-left override character can be used for various attacks.
This commit:
* Detects this character in the filename on the source key
* Strips it from the path before copying it to the dest key
* Marks the file as dangerous (this character doesn't belong in a filename)
* Each test folder now copies files into its own test directory
* Change gitignore due to dst dir changes
* Make sure logger.tree is called for every directory
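The right-to-left override handling described a few entries above (detect the character, strip it from the destination path, mark the file as dangerous) might look roughly like this. ExampleFile and check_rtl_override are hypothetical stand-ins written for this sketch; they are not the project's FileBase or its actual method names, apart from make_dangerous, which is mentioned in these notes.

```python
import os

RTL_OVERRIDE = '\u202E'  # Unicode RIGHT-TO-LEFT OVERRIDE


class ExampleFile:
    """Hypothetical stand-in for a file object; not the project's FileBase."""

    def __init__(self, src_path):
        self.src_path = src_path
        self.filename = os.path.basename(src_path)
        self.is_dangerous = False

    def make_dangerous(self):
        self.is_dangerous = True


def check_rtl_override(file, dst_dir):
    """Return the destination path with the override character stripped."""
    if RTL_OVERRIDE in file.filename:
        # This character doesn't belong in a filename, so flag the file
        file.make_dangerous()
        # Update the filename itself, not only the destination path
        file.filename = file.filename.replace(RTL_OVERRIDE, '')
    return os.path.join(dst_dir, file.filename)
```

Updating file.filename rather than only the computed path keeps later steps, such as logging, consistent with what ends up on the destination key.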