Overview
========

Redis and ARDB overview
--------------------------

* Redis on TCP port 6379
    - DB 0 - Cache hostname/dns
    - DB 1 - Paste meta-data
* Redis on TCP port 6380 - Redis Log only
* Redis on TCP port 6381
    - DB 0 - PubSub + Queue and Paste content LRU cache
    - DB 1 - _Mixer_ Cache
* ARDB on TCP port 6382
    - DB 1 - Curve
    - DB 2 - TermFreq
    - DB 3 - Trending/Trackers
    - DB 4 - Sentiments
    - DB 5 - TermCred
    - DB 6 - Tags
    - DB 7 - Metadata
    - DB 8 - Statistics
    - DB 9 - Crawler
    - DB 10 - Objects
* ARDB on TCP port <year>
    - DB 0 - Lines duplicate
    - DB 1 - Hashes

# Database Map:

### Redis cache

##### Brute force protection:
| Set Key | Value | Expiry |
| ------ | ------ | ------ |
| failed_login_ip:**ip** | **nb login failed** | TTL |
| failed_login_user_id:**user_id** | **nb login failed** | TTL |

##### Item Import:

| Key | Value | Expiry |
| ------ | ------ | ------ |
| **uuid**:nb_total | **nb total** | TTL *(if imported)* |
| **uuid**:nb_end | **nb** | TTL *(if imported)* |
| **uuid**:nb_sucess | **nb success** | TTL *(if imported)* |
| **uuid**:end | **0 (in progress) or (item imported)** | TTL *(if imported)* |
| **uuid**:processing | **process status: 0 or 1** | TTL *(if imported)* |
| **uuid**:error | **error message** | TTL *(if imported)* |

| Set Key | Value | Expiry |
| ------ | ------ | ------ |
| **uuid**:paste_submit_link | **item_path** | TTL *(if imported)* |
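
As an illustration of the naming scheme above, the per-submission keys can be derived from the submission UUID. This is only a minimal sketch: `import_keys` is a hypothetical helper, not part of the project, and the suffix list is copied from the tables above (including the documented `nb_sucess` spelling).

```python
import uuid

# Suffixes copied from the tables above; "nb_sucess" is spelled as documented.
IMPORT_SUFFIXES = (
    "nb_total", "nb_end", "nb_sucess", "end",
    "processing", "error", "paste_submit_link",
)


def import_keys(submission_uuid: str) -> dict:
    """Map each documented suffix to its full '<uuid>:<suffix>' key (hypothetical helper)."""
    return {suffix: f"{submission_uuid}:{suffix}" for suffix in IMPORT_SUFFIXES}


if __name__ == "__main__":
    # Print the key names for a freshly generated submission UUID.
    for suffix, key in import_keys(str(uuid.uuid4())).items():
        print(f"{suffix:>18} -> {key}")
```

Building the names in one place keeps the documented spelling of each suffix from drifting between the code that writes the keys and the code that reads them.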

## DB0 - Core:

##### Update keys:
| Key | Value |
| ------ | ------ |
| | |
| ail:version | **current version** |
| | |
| ail:update_**update_version** | **background update name** |
| | **background update name** |
| | **...** |
| | |
| ail:update_error | **update error message** |
| | |
| ail:update_in_progress | **update version in progress** |
| ail:current_background_update | **current update version** |
| | |
| ail:current_background_script | **name of the background script currently executed** |
| ail:current_background_script_stat | **progress in % of the background script** |

| Hset Key | Field | Value |
| ------ | ------ | ------ |
| ail:update_date | **update tag** | **update date** |

##### User Management:
| Hset Key | Field | Value |
| ------ | ------ | ------ |
| user:all | **user id** | **password hash** |
| | | |
| user:tokens | **token** | **user id** |
| | | |
| user_metadata:**user id** | token | **token** |
| | change_passwd | **boolean** |
| | role | **role** |

| Set Key | Value |
| ------ | ------ |
| user_role:**role** | **user id** |

| Zset Key | Field | Value |
| ------ | ------ | ------ |
| ail:all_role | **role** | **int, role priority (1=admin)** |
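
The role priority stored in `ail:all_role` could be used to decide whether one role covers another. The following is a minimal sketch, assuming that a smaller integer means a more privileged role (the table only states that 1 is admin). The role names other than `admin` and the `role_covers` helper are hypothetical, and a plain dict stands in for the sorted set.

```python
# Stand-in for the ail:all_role sorted set: role -> priority.
# Only "admin" = 1 comes from the table; the other roles are made up.
ALL_ROLES = {"admin": 1, "analyst": 2, "user": 3}


def role_covers(user_role: str, required_role: str, roles: dict = ALL_ROLES) -> bool:
    """Return True if user_role is at least as privileged as required_role.

    Assumption: a lower priority number means more privileges.
    Unknown roles are never granted anything.
    """
    if user_role not in roles or required_role not in roles:
        return False
    return roles[user_role] <= roles[required_role]


if __name__ == "__main__":
    print(role_covers("admin", "user"))
    print(role_covers("user", "admin"))
```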

##### MISP Modules:

| Set Key | Value |
| ------ | ------ |
| enabled_misp_modules | **module name** |

| Key | Value |
| ------ | ------ |
| misp_module:**module name** | **module dict** |

##### Item Import:
| Key | Value |
| ------ | ------ |
| **uuid**:isfile | **boolean** |
| **uuid**:paste_content | **item_content** |

## DB2 - TermFreq:

| Set Key | Value |
| ------ | ------ |
| submitted:uuid | **uuid** |
| **uuid**:ltags | **tag** |
| **uuid**:ltagsgalaxies | **tag** |

## DB3 - Leak Hunter:

##### Tracker metadata:
| Hset - Key | Field | Value |
| ------ | ------ | ------ |
| tracker:**uuid** | tracker | **tracked word/set/regex** |
| | type | **word/set/regex** |
| | date | **date added** |
| | user_id | **created by user_id** |
| | dashboard | **0/1 Display alert on dashboard** |
| | description | **Tracker description** |
| | level | **0/1 Tracker visibility** |

##### Tracker by user_id (visibility level: user only):
| Set - Key | Value |
| ------ | ------ |
| user:tracker:**user_id** | **uuid - tracker uuid** |
| user:tracker:**user_id**:**word/set/regex - tracker type** | **uuid - tracker uuid** |

##### Global Tracker (visibility level: all users):
| Set - Key | Value |
| ------ | ------ |
| gobal:tracker | **uuid - tracker uuid** |
| gobal:tracker:**word/set/regex - tracker type** | **uuid - tracker uuid** |

##### All Tracker by type:
| Set - Key | Value |
| ------ | ------ |
| all:tracker:**word/set/regex - tracker type** | **tracked item** |

| Set - Key | Value |
| ------ | ------ |
| all:tracker_uuid:**tracker type**:**tracked item** | **uuid - tracker uuid** |
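
The two "by type" sets above are addressed by tracker type and, for the second one, by the tracked item itself. The following is a minimal sketch of how those key names could be assembled. Both helper functions are hypothetical, and the type list is taken from the `word/set/regex` values in the tables.

```python
TRACKER_TYPES = ("word", "set", "regex")  # tracker types listed in the tables above


def _check_type(tracker_type: str) -> None:
    """Reject tracker types that the tables do not list."""
    if tracker_type not in TRACKER_TYPES:
        raise ValueError(f"unknown tracker type: {tracker_type!r}")


def tracker_type_key(tracker_type: str) -> str:
    """Key of the set holding every tracked item of one type."""
    _check_type(tracker_type)
    return f"all:tracker:{tracker_type}"


def tracker_uuid_key(tracker_type: str, tracked: str) -> str:
    """Key of the set holding the tracker uuids registered for one tracked item."""
    _check_type(tracker_type)
    return f"all:tracker_uuid:{tracker_type}:{tracked}"


if __name__ == "__main__":
    print(tracker_type_key("regex"))
    print(tracker_uuid_key("word", "password"))
```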

##### All Tracked items:
| Set - Key | Value |
| ------ | ------ |
| tracker:item:**uuid**:**date** | **item_id** |

##### All Tracked tags:
| Set - Key | Value |
| ------ | ------ |
| tracker:tags:**uuid** | **tag** |

##### All Tracked mail:
| Set - Key | Value |
| ------ | ------ |
| tracker:mail:**uuid** | **mail** |

##### Refresh Tracker:
| Key | Value |
| ------ | ------ |
| tracker:refresh:word | **last refreshed epoch** |
| tracker:refresh:set | - |
| tracker:refresh:regex | - |

##### Zset Stat Tracker:
| Key | Field | Value |
| ------ | ------ | ------ |
| tracker:stat:**uuid** | **date** | **nb_seen** |

##### Stat token:
| Key | Field | Value |
| ------ | ------ | ------ |
| stat_token_total_by_day:**date** | **word** | **nb_seen** |
| | | |
| stat_token_per_item_by_day:**date** | **word** | **nb_seen** |

| Set - Key | Value |
| ------ | ------ |
| stat_token_history | **date** |

## DB6 - Tags:

##### Hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| tag_metadata:**tag** | first_seen | **date** |
| tag_metadata:**tag** | last_seen | **date** |

##### Set:
| Key | Value |
| ------ | ------ |
| list_tags | **tag** |
| list_tags:**object_type** | **tag** |
| list_tags:domain | **tag** |
| | |
| active_taxonomies | **taxonomy** |
| active_galaxies | **galaxy** |
| active_tag_**taxonomy or galaxy** | **tag** |
| synonym_tag_misp-galaxy:**galaxy** | **tag synonym** |
| list_export_tags | **user_tag** |
| | |
| **tag**:**date** | **paste** |
| **object_type**:**tag** | **object_id** |
| | |
| DB7 | |
| tag:**object_id** | **tag** |

##### old:
| Key | Value |
| ------ | ------ |
| *tag* | *paste* |

## DB7 - Metadata:

#### Crawled Items:
##### Hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| paste_metadata:**item path** | super_father | **first url crawled** |
| | father | **item father** |
| | domain | **crawled domain**:**domain port** |
| | screenshot | **screenshot hash** |

##### Set:
| Key | Value |
| ------ | ------ |
| tag:**item path** | **tag** |
| | |
| paste_children:**item path** | **item path** |
| | |
| hash_paste:**item path** | **hash** |
| base64_paste:**item path** | **hash** |
| hexadecimal_paste:**item path** | **hash** |
| binary_paste:**item path** | **hash** |

##### Zset:
| Key | Field | Value |
| ------ | ------ | ------ |
| nb_seen_hash:**hash** | **item** | **nb_seen** |
| base64_hash:**hash** | **item** | **nb_seen** |
| binary_hash:**hash** | **item** | **nb_seen** |
| hexadecimal_hash:**hash** | **item** | **nb_seen** |

#### PgpDump

##### Hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| pgpdump_metadata_key:*key id* | first_seen | **date** |
| | last_seen | **date** |
| | | |
| pgpdump_metadata_name:*name* | first_seen | **date** |
| | last_seen | **date** |
| | | |
| pgpdump_metadata_mail:*mail* | first_seen | **date** |
| | last_seen | **date** |

##### set:
| Key | Value |
| ------ | ------ |
| set_pgpdump_key:*key id* | *item_path* |
| | |
| set_pgpdump_name:*name* | *item_path* |
| | |
| set_pgpdump_mail:*mail* | *item_path* |
| | |
| | |
| set_domain_pgpdump_**pgp_type**:**key** | **domain** |

##### Hset date:
| Key | Field | Value |
| ------ | ------ | ------ |
| pgpdump:key:*date* | *key* | *nb seen* |
| | | |
| pgpdump:name:*date* | *name* | *nb seen* |
| | | |
| pgpdump:mail:*date* | *mail* | *nb seen* |

##### zset:
| Key | Field | Value |
| ------ | ------ | ------ |
| pgpdump_all:key | *key* | *nb seen* |
| | | |
| pgpdump_all:name | *name* | *nb seen* |
| | | |
| pgpdump_all:mail | *mail* | *nb seen* |

##### set:
| Key | Value |
| ------ | ------ |
| item_pgpdump_key:*item_path* | *key* |
| | |
| item_pgpdump_name:*item_path* | *name* |
| | |
| item_pgpdump_mail:*item_path* | *mail* |
| | |
| | |
| domain_pgpdump_**pgp_type**:**domain** | **key** |

#### SimpleCorrelation:
##### zset:
| Key | Field | Value |
| ------ | ------ | ------ |
| s_correl:*correlation name*:all | *object_id* | *nb_seen* |
| s_correl:date:*correlation name*:*date_day* | *object_id* | *nb_seen* |

##### set:
| Key | Value |
| ------ | ------ |
| s_correl:set_*object type*_*correlation name*:*object_id* | *item_id* |
| *object type*:s_correl:*correlation name*:*object_id* | *correlation_id* |

object type: item + domain

##### hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| s_correl:*correlation name*:metadata:*obj_id* | first_seen | *first_seen* |
| s_correl:*correlation name*:metadata:*obj_id* | last_seen | *last_seen* |

#### Cryptocurrency

Supported cryptocurrencies:
- bitcoin
- bitcoin-cash
- dash
- ethereum
- litecoin
- monero
- zcash

##### Hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| cryptocurrency_metadata_**cryptocurrency name**:**cryptocurrency address** | first_seen | **date** |
| | last_seen | **date** |

##### set:
| Key | Value | Object type |
| ------ | ------ | ------ |
| set_cryptocurrency_**cryptocurrency name**:**cryptocurrency address** | **item_path** | PASTE |
| domain_cryptocurrency_**cryptocurrency name**:**cryptocurrency address** | **domain** | DOMAIN |

##### Hset date:
| Key | Field | Value |
| ------ | ------ | ------ |
| cryptocurrency:**cryptocurrency name**:**date** | **cryptocurrency address** | **nb seen** |

##### zset:
| Key | Field | Value |
| ------ | ------ | ------ |
| cryptocurrency_all:**cryptocurrency name** | **cryptocurrency address** | **nb seen** |

##### set:
| Key | Value | Object type |
| ------ | ------ | ------ |
| item_cryptocurrency_**cryptocurrency name**:**item_path** | **cryptocurrency address** | PASTE |
| domain_cryptocurrency_**cryptocurrency name**:**item_path** | **cryptocurrency address** | DOMAIN |

#### HASH
| Key | Value |
| ------ | ------ |
| hash_domain:**domain** | **hash** |
| domain_hash:**hash** | **domain** |

## DB9 - Crawler:

##### Hset:
| Key | Field | Value |
| ------ | ------ | ------ |
| **service type**_metadata:**domain** | first_seen | **date** |
| | last_check | **date** |
| | ports | **port**;**port**;**port** ... |
| | paste_parent | **parent last crawling (can be auto or manual)** |

##### Zset:
| Key | Field | Value |
| ------ | ------ | ------ |
| crawler\_history\_**service type**:**domain**:**port** | **item root (first crawled item)** | **epoch (seconds)** |
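
The `ports` field packs several ports into one semicolon-separated string, and each port gets its own history sorted set. The following is a minimal sketch of how a reader could go from the metadata field to the history keys. Both helpers are hypothetical, and the `onion` service type in the demo is an assumption borrowed from the crawler queue names.

```python
def parse_ports(ports_field: str) -> list:
    """Split the 'ports' hash field ('80;443;...') into a list of ints."""
    return [int(port) for port in ports_field.split(";") if port.strip()]


def crawler_history_key(service_type: str, domain: str, port: int) -> str:
    """Build the 'crawler_history_<service type>:<domain>:<port>' key."""
    return f"crawler_history_{service_type}:{domain}:{port}"


if __name__ == "__main__":
    # One history key per port recorded for the domain.
    for port in parse_ports("80;443"):
        print(crawler_history_key("onion", "example.onion", port))
```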

##### Set:
| Key | Value |
| ------ | ------ |
| screenshot:**sha256** | **item path** |

##### crawler config:
| Key | Value |
| ------ | ------ |
| crawler\_config:**crawler mode**:**service type**:**domain** | **json config** |

##### automatic crawler config:
| Key | Value |
| ------ | ------ |
| crawler\_config:**crawler mode**:**service type**:**domain**:**url** | **json config** |

###### example json config:
```json
{
  "closespider_pagecount": 1,
  "time": 3600,
  "depth_limit": 0,
  "har": 0,
  "png": 0
}
```
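
Because the config is stored as a JSON string, a consumer has to decode it before use. The following is a minimal sketch using only the standard library. `load_crawler_config` is a hypothetical helper, and treating `har` and `png` as 0/1 flags is an assumption based on the example values.

```python
import json

# The example config from above, as it would be stored in the key's value.
RAW_CONFIG = (
    '{"closespider_pagecount": 1, "time": 3600, '
    '"depth_limit": 0, "har": 0, "png": 0}'
)


def load_crawler_config(raw: str) -> dict:
    """Decode a stored crawler config and expose har/png as booleans.

    Assumption: har and png are 0/1 switches; missing keys default to off.
    """
    config = json.loads(raw)
    config["har"] = bool(config.get("har", 0))
    config["png"] = bool(config.get("png", 0))
    return config


if __name__ == "__main__":
    print(load_crawler_config(RAW_CONFIG))
```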

### Splash containers and proxies:
| SET - Key | Value |
| ------ | ------ |
| all_proxy | **proxy name** |
| all_splash | **splash name** |

| HSET - Key | Field | Value |
| ------ | ------ | ------ |
| proxy:metadata:**proxy name** | host | **host** |
| proxy:metadata:**proxy name** | port | **port** |
| proxy:metadata:**proxy name** | type | **type** |
| proxy:metadata:**proxy name** | crawler_type | **crawler_type** |
| proxy:metadata:**proxy name** | description | **proxy description** |
| | | |
| splash:metadata:**splash name** | description | **splash description** |
| splash:metadata:**splash name** | crawler_type | **crawler_type** |
| splash:metadata:**splash name** | proxy | **splash proxy (None if null)** |

| SET - Key | Value |
| ------ | ------ |
| splash:url:**container name** | **splash url** |
| proxy:splash:**proxy name** | **container name** |

| Key | Value |
| ------ | ------ |
| splash:map:url:name:**splash url** | **container name** |

##### CRAWLER QUEUES:
| SET - Key | Value | Usage |
| ------ | ------ | ------ |
| onion_crawler_queue | **url**;**item_id** | RE-CRAWL |
| regular_crawler_queue | - | |
| | | |
| onion_crawler_priority_queue | **url**;**item_id** | USER |
| regular_crawler_priority_queue | - | |
| | | |
| onion_crawler_discovery_queue | **url**;**item_id** | DISCOVER |
| regular_crawler_discovery_queue | - | |
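
Each queue member combines a URL and an item id in a single `url;item_id` string. The following is a minimal sketch of how such a member could be split apart. `parse_queue_entry` is a hypothetical helper, and it assumes the item id itself never contains a semicolon, so that a `;` inside the URL is preserved.

```python
def parse_queue_entry(entry: str) -> tuple:
    """Split a 'url;item_id' queue member into (url, item_id).

    The split happens on the last ';' so that semicolons inside the URL
    survive (assumption: item_id contains no ';').
    """
    url, sep, item_id = entry.rpartition(";")
    if not sep:
        raise ValueError(f"expected 'url;item_id', got {entry!r}")
    return url, item_id


if __name__ == "__main__":
    print(parse_queue_entry("http://example.onion/a;b;crawled/2019/item1"))
```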

##### TO CHANGE:

ARDB overview

----------------------------------------- SENTIMENT ------------------------------------

    SET - 'Provider_set'        Provider

    KEY - 'UniqID'              INT

    SET - provider_timestamp    UniqID

    SET - UniqID                avg_score

* DB 7 - Metadata:

----------------------------------------------------------------------------------------
----------------------------------------- BASE64 ----------------------------------------

    HSET - 'metadata_hash:'+hash    'saved_path'               saved_path
                                    'size'                     size
                                    'first_seen'               first_seen
                                    'last_seen'                last_seen
                                    'estimated_type'           estimated_type
                                    'vt_link'                  vt_link
                                    'vt_report'                vt_report
                                    'nb_seen_in_all_pastes'    nb_seen_in_all_pastes
                                    'base64_decoder'           nb_encoded
                                    'binary_decoder'           nb_encoded

    SET - 'all_decoder'             decoder*

    SET - 'hash_all_type'           hash_type *
    SET - 'hash_base64_all_type'    hash_type *
    SET - 'hash_binary_all_type'    hash_type *

    ZADD - 'hash_date:'+20180622    hash *    nb_seen_this_day
    ZADD - 'base64_date:'+20180622  hash *    nb_seen_this_day
    ZADD - 'binary_date:'+20180622  hash *    nb_seen_this_day

    ZADD - 'base64_type:'+type      date      nb_seen
    ZADD - 'binary_type:'+type      date      nb_seen

    GET - 'base64_decoded:'+date    nb_decoded
    GET - 'binary_decoded:'+date    nb_decoded

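
The per-day sorted sets above append a `YYYYMMDD` date to a fixed prefix, as in `'hash_date:'+20180622`. The following is a minimal sketch of building such a key from a date object. `dated_key` is a hypothetical helper, and it only covers the three `*_date:` keys, whose date format is shown explicitly.

```python
from datetime import date


def dated_key(prefix: str, day: date) -> str:
    """Build '<prefix>:YYYYMMDD', e.g. 'hash_date:20180622'."""
    return f"{prefix}:{day.strftime('%Y%m%d')}"


if __name__ == "__main__":
    # Keys for the three per-day sorted sets on 22 June 2018.
    for prefix in ("hash_date", "base64_date", "binary_date"):
        print(dated_key(prefix, date(2018, 6, 22)))
```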