Commit Graph

247 Commits (7b8b175a9397d43c039aee344dd840a6c59d4d33)

Author SHA1 Message Date
Dan Puttick 7b8b175a93 Fixup comments, add stubs for new tests 2017-07-11 15:48:40 -04:00
Dan Puttick 966c577ff8 Add uncategorized 2017-07-11 15:47:48 -04:00
Dan Puttick fed4f75cd7 Filecheck tests work with file catalog 2017-07-07 16:09:06 -04:00
Dan Puttick e977966480 Reorganize filecheck tests
* Move all SampleFile code into test_filecheck.py
* Make a stub for logging tests in test_filecheck_logging.py
2017-07-07 16:02:20 -04:00
Dan Puttick 0bec2a8345 Add PyYAML to dev-requirements.txt 2017-07-06 18:00:04 -04:00
Dan Puttick 871eea7a68 Working version of file-by-file testing 2017-07-06 15:41:38 -04:00
Dan Puttick 3f9be48cbd Initial version of individual file test harness
* Parametrized test for normal and dangerous files
* Still needs a way to handle "complex" files (ones that require a full groomer
due to having temporary folders), as well as a way to read .expect test description
files
2017-07-05 18:09:10 -04:00
Dan Puttick e88ec8a474 Rename src_valid and src_invalid
* In preparation for having source files be a submodule instead of bundled with PyCIRCLean
* First, going to get the tests running with the new set of files
* Then, going to make the tests run file by file instead of on the whole directory
* Also, need to rewrite the KittenGroomer tests to mock the libmagic methods
2017-06-30 15:25:53 -04:00
Dan Puttick a73b746f5a Removing unneeded test files 2017-06-28 16:01:42 -04:00
Dan Puttick 7ead07814a Delete testfile_catalog
* Test files are being moved to separate repo
2017-06-27 14:36:58 -04:00
Dan Puttick 141a64cf33 Remove extra comments from helpers.py
* Was thinking about removing processdir and main, but somebody might still
want to use them + they could be a helpful reference for a new person.
2017-06-27 12:13:29 -04:00
Dan Puttick e51a503c33 Remove PIL.PngImagePlugin import
* This dependency isn't being used anymore: when I test, pngs are still
processed normally
2017-06-27 12:13:29 -04:00
Dan Puttick fa1df8e67f Remove rtl character from filename in log
* Previously, filecheck.py removed rtl character from the destination path only.
* Now, the rtl character is replaced in file.filename and the filename file
property.
2017-06-27 12:13:29 -04:00
Dan Puttick 11ab194232 Add sample .rar and .pdf 2017-06-27 12:13:29 -04:00
Dan Puttick b77fa4b625 Remove sample mpeg and mp3 files in Travis
* Links to these files were broken
* Will add new samples directly to the repo
2017-06-27 12:13:29 -04:00
Dan Puttick 4e4ab74efd Move sample rtf from travis build into repo
* Sample file available at http://thewalter.net/stef/software/rtfx/sample.rtf
* Note: didn't confirm that there aren't any copyright problems with this sample
if there are, should revert this commit.

diff --git a/.travis.yml b/.travis.yml
index 0cf3beb..329729d 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -70,7 +70,6 @@ install:
     # Some random samples
     # - wget http://www.sample-videos.com/audio/mp3/india-national-anthem.mp3
     # - wget http://www.sample-videos.com/video/mp4/720/big_buck_bunny_720p_1mb.mp4
-    - wget http://thewalter.net/stef/software/rtfx/sample.rtf
     - popd

 script:
diff --git a/tests/src_valid/sample.rtf b/tests/src_valid/sample.rtf
new file mode 100644
index 0000000..ede29b3
--- /dev/null
+++ b/tests/src_valid/sample.rtf
@@ -0,0 +1,68 @@
+{\rtf1\ansi\ansicpg1252\uc1\deff0\stshfdbch0\stshfloch0\stshfhich0\stshfbi0\deflang1033\deflangfe1033{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f1\fswiss\fcharset0\fprq2{\*\panose 020b0604020202020204}Arial;}
+{\f2\fmodern\fcharset0\fprq1{\*\panose 02070309020205020404}Courier New;}{\f3\froman\fcharset2\fprq2{\*\panose 05050102010706020507}Symbol;}{\f10\fnil\fcharset2\fprq2{\*\panose 05000000000000000000}Wingdings;}
+{\f121\froman\fcharset238\fprq2 Times New Roman CE;}{\f122\froman\fcharset204\fprq2 Times New Roman Cyr;}{\f124\froman\fcharset161\fprq2 Times New Roman Greek;}{\f125\froman\fcharset162\fprq2 Times New Roman Tur;}
+{\f126\froman\fcharset177\fprq2 Times New Roman (Hebrew);}{\f127\froman\fcharset178\fprq2 Times New Roman (Arabic);}{\f128\froman\fcharset186\fprq2 Times New Roman Baltic;}{\f129\froman\fcharset163\fprq2 Times New Roman (Vietnamese);}
+{\f131\fswiss\fcharset238\fprq2 Arial CE;}{\f132\fswiss\fcharset204\fprq2 Arial Cyr;}{\f134\fswiss\fcharset161\fprq2 Arial Greek;}{\f135\fswiss\fcharset162\fprq2 Arial Tur;}{\f136\fswiss\fcharset177\fprq2 Arial (Hebrew);}
+{\f137\fswiss\fcharset178\fprq2 Arial (Arabic);}{\f138\fswiss\fcharset186\fprq2 Arial Baltic;}{\f139\fswiss\fcharset163\fprq2 Arial (Vietnamese);}{\f141\fmodern\fcharset238\fprq1 Courier New CE;}{\f142\fmodern\fcharset204\fprq1 Courier New Cyr;}
+{\f144\fmodern\fcharset161\fprq1 Courier New Greek;}{\f145\fmodern\fcharset162\fprq1 Courier New Tur;}{\f146\fmodern\fcharset177\fprq1 Courier New (Hebrew);}{\f147\fmodern\fcharset178\fprq1 Courier New (Arabic);}
+{\f148\fmodern\fcharset186\fprq1 Courier New Baltic;}{\f149\fmodern\fcharset163\fprq1 Courier New (Vietnamese);}}{\colortbl;\red0\green0\blue0;\red0\green0\blue255;\red0\green255\blue255;\red0\green255\blue0;\red255\green0\blue255;\red255\green0\blue0;
+\red255\green255\blue0;\red255\green255\blue255;\red0\green0\blue128;\red0\green128\blue128;\red0\green128\blue0;\red128\green0\blue128;\red128\green0\blue0;\red128\green128\blue0;\red128\green128\blue128;\red192\green192\blue192;}{\stylesheet{
+\ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \fs24\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 \snext0 Normal;}{\s1\ql \li0\ri0\sb240\sa60\keepn\widctlpar\aspalpha\aspnum\faauto\outlinelevel0\adjustright\rin0\lin0\itap0
+\b\f1\fs32\lang1033\langfe1033\kerning32\cgrid\langnp1033\langfenp1033 \sbasedon0 \snext0 \styrsid2294299 heading 1;}{\*\cs10 \additive \ssemihidden Default Paragraph Font;}{\*
+\ts11\tsrowd\trftsWidthB3\trpaddl108\trpaddr108\trpaddfl3\trpaddft3\trpaddfb3\trpaddfr3\tscellwidthfts0\tsvertalt\tsbrdrt\tsbrdrl\tsbrdrb\tsbrdrr\tsbrdrdgl\tsbrdrdgr\tsbrdrh\tsbrdrv
+\ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \fs20\lang1024\langfe1024\cgrid\langnp1024\langfenp1024 \snext11 \ssemihidden Normal Table;}{\*\ts15\tsrowd\trbrdrt\brdrs\brdrw10 \trbrdrl\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10
+\trbrdrr\brdrs\brdrw10 \trbrdrh\brdrs\brdrw10 \trbrdrv\brdrs\brdrw10 \trftsWidthB3\trpaddl108\trpaddr108\trpaddfl3\trpaddft3\trpaddfb3\trpaddfr3\tscellwidthfts0\tsvertalt\tsbrdrt\tsbrdrl\tsbrdrb\tsbrdrr\tsbrdrdgl\tsbrdrdgr\tsbrdrh\tsbrdrv
+\ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \fs20\lang1024\langfe1024\cgrid\langnp1024\langfenp1024 \sbasedon11 \snext15 \styrsid2294299 Table Grid;}{
+\s16\ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \fs20\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 \sbasedon0 \snext16 \ssemihidden \styrsid1792631 footnote text;}{\*\cs17 \additive \super
+\sbasedon10 \ssemihidden \styrsid1792631 footnote reference;}}{\*\latentstyles\lsdstimax156\lsdlockeddef0}{\*\listtable{\list\listtemplateid-767292450\listhybrid{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360
+\levelindent0{\leveltext\leveltemplateid67698689\'01\u-3913 ?;}{\levelnumbers;}\f3\fbias0 \fi-360\li720\jclisttab\tx720\lin720 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\leveltext
+\leveltemplateid67698691\'01o;}{\levelnumbers;}\f2\fbias0 \fi-360\li1440\jclisttab\tx1440\lin1440 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\leveltext\leveltemplateid67698693
+\'01\u-3929 ?;}{\levelnumbers;}\f10\fbias0 \fi-360\li2160\jclisttab\tx2160\lin2160 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\leveltext\leveltemplateid67698689\'01\u-3913 ?;}{\levelnumbers
+;}\f3\fbias0 \fi-360\li2880\jclisttab\tx2880\lin2880 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\leveltext\leveltemplateid67698691\'01o;}{\levelnumbers;}\f2\fbias0 \fi-360\li3600
+\jclisttab\tx3600\lin3600 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\leveltext\leveltemplateid67698693\'01\u-3929 ?;}{\levelnumbers;}\f10\fbias0 \fi-360\li4320\jclisttab\tx4320\lin4320 }
+{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\leveltext\leveltemplateid67698689\'01\u-3913 ?;}{\levelnumbers;}\f3\fbias0 \fi-360\li5040\jclisttab\tx5040\lin5040 }{\listlevel\levelnfc23
+\levelnfcn23\leveljc0\leveljcn0\levelfollow0\levelstartat1\levelspace360\levelindent0{\leveltext\leveltemplateid67698691\'01o;}{\levelnumbers;}\f2\fbias0 \fi-360\li5760\jclisttab\tx5760\lin5760 }{\listlevel\levelnfc23\levelnfcn23\leveljc0\leveljcn0
+\levelfollow0\levelstartat1\levelspace360\levelindent0{\leveltext\leveltemplateid67698693\'01\u-3929 ?;}{\levelnumbers;}\f10\fbias0 \fi-360\li6480\jclisttab\tx6480\lin6480 }{\listname ;}\listid687222349}}{\*\listoverridetable{\listoverride\listid687222349
+\listoverridecount0\ls1}}{\*\rsidtbl \rsid1792631\rsid2294299}{\*\generator Microsoft Word 11.0.6113;}{\info{\title This is a test RTF}{\author Nate}{\operator Nate}{\version2}}\widowctrl\ftnbj\aenddoc\noxlattoyen\expshrtn\noultrlspc\dntblnsbdb\nospaceforul\hyphcaps0\formshade\horzdoc\dgmargin\dghspace180\dgvspace180\dghorigin1800\dgvorigin1440
+\dghshow1\dgvshow1\jexpand\viewkind1\viewscale80\pgbrdrhead\pgbrdrfoot\splytwnine\ftnlytwnine\htmautsp\nolnhtadjtbl\useltbaln\alntblind\lytcalctblwd\lyttblrtgr\lnbrkrule\nobrkwrptbl\snaptogridincell\allowfieldendsel\wrppunct
+\asianbrkrule\rsidroot2294299\newtblstyruls\nogrowautofit \fet0{\*\ftnsep \pard\plain \ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \fs24\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\insrsid1792631 \chftnsep
+\par }}{\*\ftnsepc \pard\plain \ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \fs24\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\insrsid1792631 \chftnsepc
+\par }}{\*\aftnsep \pard\plain \ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \fs24\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\insrsid1792631 \chftnsep
+\par }}{\*\aftnsepc \pard\plain \ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \fs24\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\insrsid1792631 \chftnsepc
+\par }}\sectd \linex0\endnhere\sectlinegrid360\sectdefaultcl\sftnbj {\*\pnseclvl1\pnucrm\pnstart1\pnindent720\pnhang {\pntxta .}}{\*\pnseclvl2\pnucltr\pnstart1\pnindent720\pnhang {\pntxta .}}{\*\pnseclvl3\pndec\pnstart1\pnindent720\pnhang {\pntxta .}}
+{\*\pnseclvl4\pnlcltr\pnstart1\pnindent720\pnhang {\pntxta )}}{\*\pnseclvl5\pndec\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl6\pnlcltr\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl7\pnlcrm\pnstart1\pnindent720\pnhang
+{\pntxtb (}{\pntxta )}}{\*\pnseclvl8\pnlcltr\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl9\pnlcrm\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}\pard\plain
+\s1\ql \li0\ri0\sb240\sa60\keepn\widctlpar\aspalpha\aspnum\faauto\outlinelevel0\adjustright\rin0\lin0\itap0\pararsid2294299 \b\f1\fs32\lang1033\langfe1033\kerning32\cgrid\langnp1033\langfenp1033 {\insrsid2294299 This is a test RTF
+\par }\pard\plain \ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid2294299 \fs24\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\insrsid2294299 Hi! I\rquote m a test file. This is some }{\b\insrsid2294299 bold}{
+\insrsid2294299  text, and some }{\i\insrsid2294299 italic}{\insrsid2294299  text, as well as some }{\ul\insrsid2294299 underline}{\insrsid2294299  text. And a bit of }{\v\insrsid2294299\charrsid2294299 hidden}{\insrsid2294299  text. So we\rquote
+re going to end this paragraph here and go on to a nice little list:
+\par
+\par {\listtext\pard\plain\f3\insrsid2294299 \loch\af3\dbch\af0\hich\f3 \'b7\tab}}\pard \ql \fi-360\li720\ri0\widctlpar\jclisttab\tx720\aspalpha\aspnum\faauto\ls1\adjustright\rin0\lin720\itap0\pararsid2294299 {\insrsid2294299 Item 1
+\par {\listtext\pard\plain\f3\insrsid2294299 \loch\af3\dbch\af0\hich\f3 \'b7\tab}Item 2
+\par {\listtext\pard\plain\f3\insrsid2294299 \loch\af3\dbch\af0\hich\f3 \'b7\tab}Item 3
+\par {\listtext\pard\plain\f3\insrsid2294299 \loch\af3\dbch\af0\hich\f3 \'b7\tab}Item 4
+\par }\pard \ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid2294299 {\insrsid2294299
+\par And now comes a fun table:
+\par
+\par }\trowd \irow0\irowband0\ts15\trgaph108\trleft-108\trbrdrt\brdrs\brdrw10 \trbrdrl\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrh\brdrs\brdrw10 \trbrdrv\brdrs\brdrw10
+\trftsWidth1\trftsWidthB3\trautofit1\trpaddl108\trpaddr108\trpaddfl3\trpaddft3\trpaddfb3\trpaddfr3\tbllkhdrrows\tbllklastrow\tbllkhdrcols\tbllklastcol \clvertalt\clbrdrt\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10
+\cltxlrtb\clftsWidth3\clwWidth2952\clshdrawnil \cellx2844\clvertalt\clbrdrt\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth2952\clshdrawnil \cellx5796\clvertalt\clbrdrt\brdrs\brdrw10
+\clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth2952\clshdrawnil \cellx8748\pard\plain \ql \li0\ri0\widctlpar\intbl\aspalpha\aspnum\faauto\adjustright\rin0\lin0\pararsid2294299\yts15
+\fs24\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\insrsid2294299 Cell 1\cell Cell 2
+\par More in cell 2\cell Cell 3\cell }\pard\plain \ql \li0\ri0\widctlpar\intbl\aspalpha\aspnum\faauto\adjustright\rin0\lin0 \fs24\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\insrsid2294299 \trowd \irow0\irowband0\ts15\trgaph108\trleft-108\trbrdrt
+\brdrs\brdrw10 \trbrdrl\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrh\brdrs\brdrw10 \trbrdrv\brdrs\brdrw10
+\trftsWidth1\trftsWidthB3\trautofit1\trpaddl108\trpaddr108\trpaddfl3\trpaddft3\trpaddfb3\trpaddfr3\tbllkhdrrows\tbllklastrow\tbllkhdrcols\tbllklastcol \clvertalt\clbrdrt\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10
+\cltxlrtb\clftsWidth3\clwWidth2952\clshdrawnil \cellx2844\clvertalt\clbrdrt\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth2952\clshdrawnil \cellx5796\clvertalt\clbrdrt\brdrs\brdrw10
+\clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth2952\clshdrawnil \cellx8748\row }\pard\plain \ql \li0\ri0\widctlpar\intbl\aspalpha\aspnum\faauto\adjustright\rin0\lin0\pararsid2294299\yts15
+\fs24\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\insrsid2294299 Next row\cell Next row \cell Next row\cell }\pard\plain \ql \li0\ri0\widctlpar\intbl\aspalpha\aspnum\faauto\adjustright\rin0\lin0
+\fs24\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\insrsid2294299 \trowd \irow1\irowband1\lastrow \ts15\trgaph108\trleft-108\trbrdrt\brdrs\brdrw10 \trbrdrl\brdrs\brdrw10 \trbrdrb\brdrs\brdrw10 \trbrdrr\brdrs\brdrw10 \trbrdrh\brdrs\brdrw10 \trbrdrv
+\brdrs\brdrw10 \trftsWidth1\trftsWidthB3\trautofit1\trpaddl108\trpaddr108\trpaddfl3\trpaddft3\trpaddfb3\trpaddfr3\tbllkhdrrows\tbllklastrow\tbllkhdrcols\tbllklastcol \clvertalt\clbrdrt\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr
+\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth2952\clshdrawnil \cellx2844\clvertalt\clbrdrt\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth2952\clshdrawnil \cellx5796\clvertalt\clbrdrt
+\brdrs\brdrw10 \clbrdrl\brdrs\brdrw10 \clbrdrb\brdrs\brdrw10 \clbrdrr\brdrs\brdrw10 \cltxlrtb\clftsWidth3\clwWidth2952\clshdrawnil \cellx8748\row }\pard \ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\pararsid2294299 {
+\insrsid2294299
+\par A page break:
+\par \page And here we\rquote re on the next page.}{\insrsid1792631  }{\insrsid2294299
+\par }{\insrsid1792631 This para has a }{\cs17\super\insrsid1792631 \chftn {\footnote \pard\plain \s16\ql \li0\ri0\widctlpar\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \fs20\lang1033\langfe1033\cgrid\langnp1033\langfenp1033 {\cs17\super\insrsid1792631
+\chftn }{\insrsid1792631  This is the actual content of the footnote.}}}{\insrsid1792631 footnote.
+\par And here\rquote s yet another paragraph. }{\insrsid1792631\charrsid2294299
+\par }}
2017-06-27 12:13:29 -04:00
Dan Puttick a76b0df543 Add more docstrings to filecheck.py 2017-06-27 12:13:29 -04:00
Dan Puttick 76467e420e Add work-in-progress support for errors in log 2017-06-27 11:54:57 -04:00
Dan Puttick 3c1fcda29e Add different file size units to log
* Format file sizes with B, KB, MB, or GB depending on size
* Clean up logic and variable names in add_file()
2017-06-27 11:54:57 -04:00
Dan Puttick e27d397496 Change is_recursive to is_archive
* is_archive is clearer and more descriptive than is_recursive
2017-06-27 11:54:57 -04:00
Dan Puttick 74a7244fbd Add png file to src_valid 2017-06-27 11:52:21 -04:00
Raphaël Vinot 079e8d30a3 Add support for ObjectStream in PDF
ObjectStream isn't necessarely malicious, but can be. This patch could
be improved by unpacking the content of the stream, but it requires 3rd
party libraries we don't have for now.

Final fix for PCL-01-002
2017-06-19 11:24:47 +02:00
Raphaël Vinot 7d38ec3d32 Remove extensions processed by the script 2017-06-16 17:57:07 +02:00
Raphaël Vinot f44719b83e Add list of malicious extensions used in Google Chrome
Fix PCL-01-009
2017-06-16 17:26:39 +02:00
Raphaël Vinot 40f71e758f Fix logging and symlinks
Fix PCL-01-006
2017-06-16 14:47:53 +02:00
Raphaël Vinot 8c007e28cf Add support for XFA structure un PDF
Partial fix for PCL-01-002
2017-06-16 14:47:19 +02:00
Raphaël Vinot e276f33c29 Merge pull request #15 from aschet01/norepeats
Check value against repeats in description_string.
2017-05-28 06:41:35 +02:00
Adam Schettenhelm 7b990df436 Check against value for duplicates when inserting into description_string 2017-05-26 20:37:36 -04:00
Raphaël Vinot b6fb4c80bf Add travis tests on python 3.6 2017-04-13 23:03:50 +02:00
Raphaël Vinot f5cc3d7533 Merge pull request #14 from dputtick/logging
Logging format improvements
2017-04-13 22:44:39 +02:00
Dan Puttick d470d6bb21 Open test log in bytes mode 2017-04-12 17:19:21 -04:00
Dan Puttick a8179d0688 Read test logs in bytes mode 2017-04-12 15:32:57 -04:00
Dan Puttick 45d71cb362 Fix unicode filename issues using fsencode
* Same problem we've had before - linux filenames can have non-unicode chars
in them
* We need to write the filename as raw bytes to the log
* os.fsencode lets us convert a utf-8 encoded string to bytes and ignore those
that can't be printed as unicode
* Still not clear if the log generated this way will be human-readable
2017-04-10 13:39:28 +02:00
Dan Puttick 053f30db93 Partial update to changelog 2017-04-10 13:33:00 +02:00
Dan Puttick 13460d643d Remove twiggy from install requirements 2017-04-10 13:25:51 +02:00
Dan Puttick 5865ddd94d Make root paths abspaths
* src_root_path and dst_root_path are now converted to abspaths when
initializing kittengroomer to avoid any weird issues with relative or misformed
paths
2017-04-10 13:22:36 +02:00
Dan Puttick 3e56787686 Update tests for new logger 2017-04-10 13:22:36 +02:00
Dan Puttick c43ac0697a Add comments/notes to helpers.py 2017-04-10 13:22:36 +02:00
Dan Puttick 67c90087ba Add time information to test logs 2017-04-10 13:22:36 +02:00
Dan Puttick 3f49612a23 Add new logger, move logging to filecheck
* Wrote a new text-based logger that displays all file information in the tree
instead of using two separate logs
* Stopped using twiggy since it wasn't giving us anything useful
* Moved a lot of the logging code to filecheck, since it didn't really seem
appropriate as an API. Left a Logging stub in kittengroomer to hold methods
that might be useful for implementing other loggers.
* For the new logger, had to change the way that we traverse the items in the
source file tree.
2017-04-10 13:22:20 +02:00
Dan Puttick f0e7607a3f Improve description strings in filecheck
* Description strings that appear in the log improved in filecheck for various
file types
* Added various comments
2017-04-10 13:00:34 +02:00
Dan Puttick c85ad27221 New test files 2017-04-10 12:54:07 +02:00
Dan Puttick ba407219d3 Open log file in text mode instead of bytes mode 2017-03-22 21:49:48 -04:00
Dan Puttick 52bb566cc3 Write basic log to log file 2017-03-22 12:07:43 -04:00
Dan Puttick 6f9e36a578 Change filecheck for new file description method
* self.add_file_string -> self.add_description
2017-03-22 12:04:22 -04:00
Dan Puttick 265f32ad6b Change the way description strings are handled 2017-03-22 10:28:00 -04:00
Dan Puttick 3e7b38c5d4 Improve doc strings on FileBase 2017-03-21 18:58:17 -04:00
Dan Puttick 6851461755 Change from two separate logs to one 2017-03-20 16:10:57 -04:00
Dan Puttick 51760ebbb1 Move default log setup back into filecheck
* Realized that the API consumer might want to write their own logging tool.
* FileBase and KittenGroomerBase will have no logging code.
* If the API consumer likes, they can import GroomerLogger and use it in their
implementation.
2017-03-20 16:10:57 -04:00
Dan Puttick 18857c8cf7 Change names of KittenGroomerBase root dir paths 2017-03-20 16:10:57 -04:00