Raphaël Vinot
|
6ba019ec83
|
chg: Improve somewhat the useragents available for capturing
Fix #416
|
2022-06-09 18:58:17 +02:00 |
Raphaël Vinot
|
1817a3e13b
|
chg: sunday cleanup
|
2022-05-23 00:15:52 +02:00 |
Raphaël Vinot
|
d222ae04aa
|
new: Keep capture even if we have a network error
|
2022-05-03 12:23:16 +02:00 |
Raphaël Vinot
|
463d1d2d1a
|
new: autosubmit to FOX, bump deps
|
2022-05-02 13:04:55 +02:00 |
Raphaël Vinot
|
ef1094a331
|
chg: Bump deps, fix cookie issue
Fix #404
|
2022-04-29 00:44:03 +02:00 |
Raphaël Vinot
|
1679ccf90f
|
chg: Improve capture, ignore ssl issues.
|
2022-04-26 13:49:24 +02:00 |
Raphaël Vinot
|
77fbf47e73
|
fix: capture cleanup
|
2022-04-26 10:25:11 +02:00 |
Raphaël Vinot
|
147bc65992
|
fix: Mypy, docker
|
2022-04-26 00:59:57 +02:00 |
Raphaël Vinot
|
41c7e87458
|
fix: docker, improve error catching
|
2022-04-26 00:33:50 +02:00 |
Raphaël Vinot
|
5af278f84d
|
fix: issue in playwrightcapture module
|
2022-04-25 15:20:05 +02:00 |
Raphaël Vinot
|
4ad898a375
|
chg: Use packaged playwright capture module
|
2022-04-25 13:34:01 +02:00 |
Raphaël Vinot
|
c93a6c307d
|
chg: properly set cookies
|
2022-04-24 20:17:54 +03:00 |
Raphaël Vinot
|
680eb1b309
|
fix: better handling if capture fails.
|
2022-04-21 15:48:28 +03:00 |
Raphaël Vinot
|
8d159ffba0
|
new: Switch away from splash to use playwright
|
2022-04-21 14:55:07 +03:00 |
Raphaël Vinot
|
83fc0bd8f4
|
fix: shutil.move wants str (not Path) for python<3.9
|
2022-04-10 12:43:56 +02:00 |
Kimmo Linnavuo
|
a80b6a31e4
|
Use shutil.move instead of path rename when moving discarded captures
|
2022-04-08 15:28:06 +03:00 |
Raphaël Vinot
|
cf46dde1ed
|
chg: Add basic pre-hook config
|
2022-03-31 11:30:53 +02:00 |
Raphaël Vinot
|
ae9cb3e81c
|
chg: Bump deps
|
2022-03-29 21:13:02 +02:00 |
Raphaël Vinot
|
c9307b5159
|
chg: Improve start/stop for DBs
|
2021-12-02 14:39:32 +01:00 |
Raphaël Vinot
|
a55fb5380a
|
chg: Sync stop script with template
|
2021-11-26 14:16:22 -05:00 |
Raphaël Vinot
|
d7c9892957
|
fix: Wait for DBs to be down before returning in stop script
|
2021-11-26 13:48:46 -05:00 |
Raphaël Vinot
|
daca988f3f
|
chg: better handling of broken indexes in archiver
|
2021-11-26 12:36:35 -05:00 |
Raphaël Vinot
|
cef1088984
|
chg: programmatically shutdown DBs
|
2021-11-26 12:35:15 -05:00 |
Raphaël Vinot
|
58b50f2b24
|
new: Pass optional arbitrary HTTP headers to capture
|
2021-11-23 12:59:56 -08:00 |
Raphaël Vinot
|
bfb1e6b181
|
fix: Use default_public for all capture, including if submitted via the API
|
2021-11-02 14:58:31 -07:00 |
Raphaël Vinot
|
1f998b457f
|
chg: use template
|
2021-10-18 13:06:43 +02:00 |
Raphaël Vinot
|
6e9e3990c4
|
fix: Indexes not updated on tree rebuild, better handling of tree cache
|
2021-09-24 16:16:41 +02:00 |
Raphaël Vinot
|
48fc807e7d
|
new: Add monitoring for pickle cache status
|
2021-09-24 12:02:28 +02:00 |
Raphaël Vinot
|
32ee474be2
|
chg: Improve tree creation and cache
|
2021-09-22 17:09:04 +02:00 |
Raphaël Vinot
|
d1f673f3a7
|
chg: Cleanup passing listing key to and from bool in redis
|
2021-09-10 14:20:58 +02:00 |
Raphaël Vinot
|
9c7929569e
|
fix: The captures are visible on the index by default.
|
2021-09-08 20:43:56 +02:00 |
Raphaël Vinot
|
48b632aa1e
|
fix: Incorrect matching for listing key in capture (always false)
|
2021-09-08 10:53:31 +02:00 |
Raphaël Vinot
|
902c8f81b6
|
chg: Improve error message if the capture fails
Fix #257
|
2021-09-07 18:16:01 +02:00 |
Raphaël Vinot
|
dfbe40a52e
|
chg: reorder imports
|
2021-09-07 16:00:07 +02:00 |
Raphaël Vinot
|
c09adec333
|
chg: Improve logging.
|
2021-09-01 14:08:25 +02:00 |
Raphaël Vinot
|
797de9ddb3
|
fix: remove datefmt from logging.basicConfig, it was a bad idea.
|
2021-09-01 10:40:59 +02:00 |
Raphaël Vinot
|
2e5a5f3aff
|
fix: unlink indexes pointing to unknown directories
|
2021-08-30 14:45:44 +02:00 |
Raphaël Vinot
|
e56c70d1a1
|
chg: out of safety, do not remove a capture dir.
|
2021-08-30 12:54:17 +02:00 |
Raphaël Vinot
|
117500b777
|
chg: Make archiver an index generator
|
2021-08-30 12:48:13 +02:00 |
Raphaël Vinot
|
324736f62c
|
fix: Use proper exception on redis start
|
2021-08-27 18:08:34 +02:00 |
Raphaël Vinot
|
ae76cb77be
|
fix: Uncomment website start
|
2021-08-27 17:49:27 +02:00 |
Raphaël Vinot
|
8a51383d7a
|
chg: Move the process management methods to the proper class
|
2021-08-27 17:28:26 +02:00 |
Raphaël Vinot
|
85e43fc677
|
chg: Make the website start a normal start script
|
2021-08-27 16:45:16 +02:00 |
Raphaël Vinot
|
d41b7735dd
|
chg: Improve storage, support both modes.
|
2021-08-26 15:49:19 +02:00 |
Raphaël Vinot
|
407e78ae7f
|
chg: More cleanup, support clean shutdown of multiple async captures
|
2021-08-25 16:40:51 +02:00 |
Raphaël Vinot
|
bf700e7a7b
|
chg: Major refactoring, move capture code to external script.
|
2021-08-25 13:36:48 +02:00 |
Raphaël Vinot
|
c732e38395
|
chg: Add logging in BG processing
|
2021-08-24 18:44:00 +02:00 |
Raphaël Vinot
|
81390d5ea0
|
chg: cleanup in the main lookyloo class
|
2021-08-24 18:32:54 +02:00 |
Raphaël Vinot
|
8433cbcc1b
|
chg: Cleanup archiver, initialize index captures in start
|
2021-08-24 17:10:14 +02:00 |
Raphaël Vinot
|
ece30a33eb
|
chg: Fix typo in archiver
|
2021-08-23 16:56:17 +02:00 |
Raphaël Vinot
|
fb1685cedc
|
add: reset recent captures in archiving process
|
2021-08-23 16:19:50 +02:00 |
Raphaël Vinot
|
8f28335010
|
fix: properly match cut time
|
2021-08-23 15:51:06 +02:00 |
Raphaël Vinot
|
2c1971311a
|
chg: Make the cut-off date for archiving the 1st of the month
|
2021-08-23 15:36:59 +02:00 |
Raphaël Vinot
|
5c9b88a3ca
|
fix: Make sure all the archived UUIDs are removed
|
2021-08-23 15:29:21 +02:00 |
Raphaël Vinot
|
67e6571145
|
chg: Force init the archived indexes
|
2021-08-23 15:14:08 +02:00 |
Raphaël Vinot
|
53ceb9c329
|
chg: Cleanup when dir is moved, months on 2 digits
|
2021-08-23 14:53:19 +02:00 |
Raphaël Vinot
|
d359bc7521
|
chg: Better use of cache, sanity checks
|
2021-08-23 12:17:44 +02:00 |
Raphaël Vinot
|
58b837cb6c
|
new: Archiver, refactoring.
|
2021-08-20 17:46:22 +02:00 |
Raphaël Vinot
|
6be9b69d95
|
chg: Use connection pool whenever possible
|
2021-08-18 18:01:04 +02:00 |
Raphaël Vinot
|
59f2a510c0
|
fix: properly catch broken capture, bump deps
|
2021-07-14 11:34:10 +02:00 |
Raphaël Vinot
|
1117ab6371
|
chg: add stats, avoid building big trees twice, bump deps
|
2021-05-26 18:25:06 -07:00 |
Raphaël Vinot
|
335ab662cf
|
new: Auto trigger modules in the bg process
|
2021-05-19 15:12:35 -07:00 |
Raphaël Vinot
|
f865ec912a
|
fix: Move set/unset running to abstract
Avoid issues when a script fails unexpectedly.
|
2021-04-09 14:33:42 +02:00 |
Raphaël Vinot
|
7707d638cf
|
new: Use async capture for the UI.
Add a method to make sure splash is up before trying to capture.
|
2021-04-08 19:15:53 +02:00 |
Raphaël Vinot
|
4847fdb670
|
fix: Windows path in update
|
2021-04-06 17:43:45 +02:00 |
Raphaël Vinot
|
c38ec90bb1
|
fix: Make update script windows compatible
|
2021-04-06 17:27:59 +02:00 |
Raphaël Vinot
|
fa6b4701c0
|
chg: update the cache at the right place.
|
2021-03-20 21:54:46 +01:00 |
Raphaël Vinot
|
13d34421dc
|
chg: Improve BG indexer
|
2021-03-20 01:13:37 +01:00 |
Raphaël Vinot
|
648d4d5b5b
|
chg: Add background ingester to the start script
|
2021-03-18 01:00:27 +01:00 |
Raphaël Vinot
|
b3541e0e78
|
new: background indexer
|
2021-03-12 16:53:00 +01:00 |
Raphaël Vinot
|
6059cb5219
|
chg: Remove useless code
|
2021-03-12 16:49:04 +01:00 |
Raphaël Vinot
|
82d9cc7b2f
|
fix: Properly rebuild indexed captures
|
2021-03-07 13:25:27 +01:00 |
Raphaël Vinot
|
3ec8015e14
|
chg: Better messages if website does not start
|
2021-02-21 23:40:47 +01:00 |
Raphaël Vinot
|
6149df06eb
|
chg: Make the cache entries a dataclass
Fix #99
|
2021-01-14 17:12:23 +01:00 |
Raphaël Vinot
|
354f269218
|
new: Integrate categorization in indexing
|
2020-11-09 16:02:54 +01:00 |
Raphaël Vinot
|
ea052c7c12
|
fix: Rename scrape -> capture in async
|
2020-11-05 14:14:33 +01:00 |
Raphaël Vinot
|
8b1e3585ea
|
chg: Improve initial caching.
|
2020-10-29 23:25:20 +01:00 |
Raphaël Vinot
|
39f88e9121
|
new: API to query URLs
|
2020-10-27 00:02:18 +01:00 |
Raphaël Vinot
|
c6c4da981c
|
chg: Improve start/stop
|
2020-10-22 16:41:00 +02:00 |
Raphaël Vinot
|
733e030839
|
fix: make flake8 happy
|
2020-10-13 16:38:56 +02:00 |
Fafner [_KeyZee_]
|
81fb9db3cf
|
Generating correct hashes
|
2020-10-13 16:18:01 +02:00 |
Fafner [_KeyZee_]
|
3d17a98799
|
Restart lookyloo after update
|
2020-10-13 15:05:36 +02:00 |
Raphaël Vinot
|
90a9ff9bb5
|
chg: Refactoring, add get_hashes
|
2020-10-09 18:05:25 +02:00 |
Raphaël Vinot
|
98eac69d1f
|
new: Add self check in update script.
|
2020-10-03 21:32:30 +02:00 |
Raphaël Vinot
|
c0ec0d7a50
|
chg: Bump minimal version of poetry, bump deps, fix pyproject
|
2020-10-03 21:19:43 +02:00 |
Raphaël Vinot
|
26cb2f1d53
|
chg: make 3rd party dl a python script
|
2020-09-28 13:57:21 +02:00 |
Raphaël Vinot
|
d33698357c
|
new: Update script.
|
2020-09-28 13:32:19 +02:00 |
Raphaël Vinot
|
8c97701ed7
|
fix: Force kill 3rdparty.sh
|
2020-09-22 16:33:55 +02:00 |
Raphaël Vinot
|
7a34095d9c
|
new: Config option for Flask IP and Port, reorganize config loading
|
2020-09-21 16:41:30 +02:00 |
Raphaël Vinot
|
9f4c77d5d2
|
chg: Cleanups, allow to add context from ressources page
|
2020-09-03 16:32:53 +02:00 |
Raphaël Vinot
|
1c5f4f5710
|
fix: Do not index private captures on public instance
|
2020-07-20 13:39:08 +02:00 |
Raphaël Vinot
|
dab2c53269
|
chg: More reasonable rebuild cache
|
2020-07-08 18:28:07 +02:00 |
Raphaël Vinot
|
0c5501016c
|
fix: Rebuild caches when tree doesn't exist
|
2020-07-08 15:52:26 +02:00 |
Raphaël Vinot
|
23419a31b9
|
fix: cleanup
|
2020-07-08 15:52:26 +02:00 |
Raphaël Vinot
|
29c78d3485
|
chg: Cleanup and improve index rendering
|
2020-07-08 15:51:45 +02:00 |
Raphaël Vinot
|
67b41ca8fb
|
chg: Improve integration of cookies indexing
|
2020-07-08 15:51:01 +02:00 |
Raphaël Vinot
|
5ae7f0f7e4
|
new: Initial version of cookies indexing
|
2020-07-08 15:42:13 +02:00 |
Raphaël Vinot
|
d18f5f4f88
|
fix: Docker, capture form, error message.
|
2020-07-08 02:25:15 +02:00 |
Raphaël Vinot
|
05de56022f
|
chg: Use capture UUID as a reference everywhere
|
2020-06-29 12:01:31 +02:00 |
Raphaël Vinot
|
ce200717ec
|
chg: Update call to Lookyloo in async scrape
|
2020-03-31 16:57:16 +02:00 |
Raphaël Vinot
|
ba8574ff83
|
chg: Do not use eventlet/gevent anymore.
|
2020-01-23 11:32:36 +01:00 |
Raphaël Vinot
|
d34233ad5c
|
chg: Use poetry instead of pipenv
|
2020-01-21 17:39:18 +01:00 |
Raphaël Vinot
|
cfa300082f
|
fix: docker-compose should now work.
|
2019-11-01 21:05:08 -07:00 |
Raphaël Vinot
|
1dc71c4f0b
|
fix: Allow to disable scraping private IPs (async module).
|
2019-07-05 16:37:57 +02:00 |
Raphaël Vinot
|
306081281f
|
fix: reload cache on start, bump dependencies
|
2019-04-16 16:04:58 +02:00 |
Raphaël Vinot
|
d951d55367
|
fix: Missing import
|
2019-04-05 16:14:30 +02:00 |
Raphaël Vinot
|
12b8e4f949
|
chg: Improve async processing
|
2019-04-05 16:12:54 +02:00 |
Raphaël Vinot
|
da3d1fe392
|
fix: Avoid loading the cache multiple times
|
2019-04-05 15:07:22 +02:00 |
Raphaël Vinot
|
9ef9fef655
|
chg: Bump configs to be in-line with prod
|
2019-04-05 14:13:07 +02:00 |
Raphaël Vinot
|
35f4292ab0
|
fix: Systemd service, add proper stop script
|
2019-04-05 14:01:36 +02:00 |
Raphaël Vinot
|
1d244ef456
|
chg: Refactor code organisation
|
2019-01-30 14:30:01 +01:00 |
Raphaël Vinot
|
6bc316ebcf
|
new: Initial commit for client and async scraping
|
2019-01-29 18:37:13 +01:00 |
Raphaël Vinot
|
bbb8c5343f
|
chg: Cleanup, use pipfile
|
2019-01-23 15:13:29 +01:00 |