diff --git a/doc/NOTES-RPI2.md b/doc/NOTES-RPI2.md
deleted file mode 100644
index 18f9983..0000000
--- a/doc/NOTES-RPI2.md
+++ /dev/null
@@ -1,12 +0,0 @@
-USB Ports
----------
-
-If you connect multiple keys to the Rpi, they will be listed that way:
-
-First: Top left
-Second: Top right
-Third: Bottom left
-Forth: Bottom right
-
-tl;dr: as long as the source key (sda) is connected to the top left port, the destination (sdb) can be connected on any other port.
-
diff --git a/doc/NOTES.md b/doc/NOTES.md
index c7b4220..a191e86 100644
--- a/doc/NOTES.md
+++ b/doc/NOTES.md
@@ -1,15 +1,15 @@
-Notes
-=====
+Usage Notes
+===========
 
-* don't plug in USB devices with a hub because there's no way to tell it which
-  is source and target - its the first drive enumerated (top port) that is the
+* Don't plug in USB devices with a hub because there's no way to tell it which
+  is source and target - its the first drive detected (top port) that is the
   source and the second (bottom port) is the target
-* don't turn it off without shutting down the system, when grooming is done it
+* Don't turn it off without shutting down the system, when grooming is done it
   shuts down automatically: losing power while it's running can trash the OS
   on the SD cards because SD cards don't always like dirty shutdowns (ie power loss)
 * Using a target usb stick that has a status light as long as the device has power
   is a really useful thing as there the other status lights on the groomer
-  are less than indicative at times: because the 'OK' led on the rPi toggles on activity
+  are less than indicative at times: because the 'OK' led on the RPi toggles on activity
   it can be off for a long time while processing something and only comes back on
   when that process finishes - hence why a USB that has some sort of LED activity
   when just plugged in (even if not reading or writing but while the USB port is
@@ -18,46 +18,32 @@ Notes
   then be off on the USB device
 * Use a larger target device as all zip files get
   unpacked and processed onto the target
-* if you have an hdmi monitor plugged in you can watch what's happening for about
+* If you have an hdmi monitor plugged in you can watch what's happening for about
   30 minutes until the rPI's power saving kicks in and turns off the monitor
-* if only one usb stick is present at power up, it doesn't groom and looks like
+* If only one usb stick is present at power up, it doesn't groom and looks like
   a normal rPi
-* if you want to ssh into the rPi username is 'pi' password 'raspberry' as per defaults
+* If you want to ssh into the RPi username is 'pi' password 'raspberry' as per defaults
 
 Technical notes
 ===============
 
-* groomer script is in /opt/groomer/ with the other required files
-* dependencies are libre-office and OpenJRE
-* and the ip address is 192.168.1.89
-* the groomer process is kicked off in /etc/rc.local
-* the heavy lifting takes place or is dispatched from /opt/groomer/groomer.sh
-  in that script file is what file types get processed (or if not listed there,
-  get ignored)
-* there are two ways pdf's can get handled -right now they have their text extracted
-  to the target device, the other way copies it and extracts the text
-* the pdf text extraction isn't perfect and is the slowest part of it, but should
-  be able to handle unicode stuff and currently doesn't do image extraction from
-  pdf's but could do that too
+* Groomer script is in /opt/groomer/ with the other required files and the ip
+  address is 192.168.1.89
+* The groomer process is kicked off in /etc/rc.local
+* The heavy lifting is dispatched from /opt/groomer/groomer.sh
+* All files processing is in filecheck.py
 
-Discussion
-==========
+USB Ports
+=========
 
-* however image exports of pdf pages only have the images and no text so it's not
-  like saving each page to a jpg which would be a really handy and safe way of
-  converting pdf's
-* spread sheets and presentations get converted to pdfs to kill off any embedded
-  macros and it's assumed that it's not producing evil pdf's on export but does
-  nothing to sanitize any embedded links within those documents
-* for spreadsheets, if they are longer than a page, only a page worth from that
-  sheet is exported right from the middle of the sheet (ie the top and bottom of
-  that sheet will get cut off and only the contents in the middle exported to pdf)
-  dumb but i figure if you want to go back to the source because it's interesting
-  enough on the groomed side of it, then you can take the extra precautions
-* the groomed target only copies "safe" files, and does its best to convert any
-  potential unsafe files to a safer format
-* safe files being one that I know of that can't contain malicious embedded macros
-  or other crap like that, and those than can get converted to something that wont
-  contain code after conversion
+If you connect multiple keys to the RPi, they will be detected in this order:
+
+First: Top left
+Second: Top right
+Third: Bottom left
+Forth: Bottom right
+
+* As long as the source key (sda) is connected to the top left port, the
+destination (sdb) can be connected on any other port.
diff --git a/doc/TODO.md b/doc/TODO.md
deleted file mode 100644
index ee8be35..0000000
--- a/doc/TODO.md
+++ /dev/null
@@ -1,13 +0,0 @@
-TODO
-====
-
-* strip exif data and leave it in a .txt file next to the image it came from
-  => exiftool
-* Scripts to generate a SD card automatically (win/mac/linux)
-* Extract metadata from all the files => https://mat.boum.org/
-
-HTML Files
-==========
-
-- disable JS?
-- cleanup external imports (js/css/images)