AIL

AIL framework - Framework for Analysis of Information Leaks

AIL is a modular framework to analyse potential information leaks from unstructured data sources, such as pastes from Pastebin or similar services, or from unstructured data streams. The AIL framework is flexible and can be extended with other functionality to mine or process sensitive information (e.g. data leak prevention).

Features

  • Modular architecture to handle streams of unstructured or structured information
  • Default support for external ZMQ feeds, such as those provided by CIRCL or other providers
  • Multiple feed support
  • Each module can process and reprocess the information already processed by AIL
  • Detecting and extracting URLs including their geographical location (e.g. IP address location)
  • Extracting and validating potential leaks of credit card numbers, credentials, ...
  • Extracting and validating leaked email addresses, including DNS MX validation
  • Module for extracting Tor .onion addresses (to be further processed for analysis)
  • Keeping track of duplicates (and diffing between each duplicate found)
  • Extracting and validating potential hostnames (e.g. to feed Passive DNS systems)
  • A full-text indexer module to index unstructured information
  • Statistics on modules and web
  • Real-time modules manager in terminal
  • Global sentiment analysis for each provider, based on the NLTK VADER module
  • Tracking of terms, sets of terms and regexes, and their occurrences
  • Many more modules for extracting phone numbers, credentials and others
  • Alerting to MISP to share found leaks within a threat intelligence platform using the MISP standard
  • Detect and decode encoded files (Base64, hex encoded or your own decoding scheme) and store files
  • Detect Amazon AWS and Google API keys
  • Detect Bitcoin addresses and Bitcoin private keys
  • Detect private keys, certificates, keys (including SSH, OpenVPN)
  • Detect IBAN bank accounts
  • Tagging system with MISP Galaxy and MISP Taxonomies tags
  • UI paste submission
  • Create events on MISP and cases on The Hive
  • Automatic paste export at detection to MISP (events) and The Hive (alerts) for selected tags
  • Extracted and decoded files can be searched by date range, type of file (mime-type) and encoding discovered
  • Graph relationships between decoded files (hashes)
  • Tor hidden services crawler to crawl and parse output
  • Tor onion availability is monitored to detect when hidden services go up or down
  • Crawled hidden services are screenshotted and the screenshots are integrated in the analysed output, including a blurring screenshot interface (to avoid "burning the eyes" of the security analyst with specific content)
  • The Tor hidden services crawler is part of the standard framework; all the AIL modules are available to the crawled hidden services
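
To make the "extracting and validating" idea in the list above concrete, here is a minimal Python sketch of how candidate credit card numbers could be pulled out of a paste and filtered with the Luhn checksum. It is an illustration only, not AIL's actual module: the pattern `CARD_CANDIDATE` and the functions `luhn_valid` and `find_card_numbers` are hypothetical names introduced for this example.

```python
import re
from typing import List

# Hypothetical pattern (an assumption for this sketch): 13 to 19 consecutive
# digits, which covers the usual length range of payment card numbers.
CARD_CANDIDATE = re.compile(r"\b\d{13,19}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk the digits from right to left, doubling every second digit.
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                # Same as summing the two digits of the doubled value.
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> List[str]:
    """Extract digit runs that look like card numbers and pass the Luhn check."""
    return [match for match in CARD_CANDIDATE.findall(text) if luhn_valid(match)]
```

For instance, `find_card_numbers(paste_content)` would return only the digit runs whose checksum is consistent, which discards most random numeric strings before any further analysis.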

Installation

Run these commands for a fully automated installation and to start the AIL framework:

git clone https://github.com/CIRCL/AIL-framework.git
cd AIL-framework
./installing_deps.sh
cd ~/AIL-framework/
. ./AILENV/bin/activate
cd bin/
./LAUNCH.sh

The default installing_deps.sh is for Debian and Ubuntu based distributions. For Arch Linux based distributions, you can replace it with installing_deps_archlinux.sh.

There is also a Travis file used for automating the installation that can be used to build and install AIL on other systems.

Installation Notes

In order to use AIL combined with ZFS or unprivileged LXC it's necessary to disable Direct I/O in $AIL_HOME/configs/6382.conf by changing the value of the directive use_direct_io_for_flush_and_compaction to false.
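
The change described above can also be scripted. The following Python sketch rewrites the directive in the text of a configuration file. It assumes (this is not confirmed by this README) that the directive name is followed by `=` or whitespace and then the value `true`; the function name `disable_direct_io` is hypothetical.

```python
import re

DIRECTIVE = "use_direct_io_for_flush_and_compaction"

# Assumption for this sketch: the directive is written as "<name>=true" or
# "<name> true", possibly with spaces around the separator.
_PATTERN = re.compile(r"(" + re.escape(DIRECTIVE) + r"\s*[=\s]\s*)true")


def disable_direct_io(config_text: str) -> str:
    """Return config_text with the Direct I/O directive switched to false."""
    # Keep the directive name and separator (group 1), replace only the value.
    return _PATTERN.sub(r"\g<1>false", config_text)
```

To apply it, read $AIL_HOME/configs/6382.conf, pass its content to `disable_direct_io`, and write the result back, ideally after making a backup copy of the file.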

Python 3 Upgrade

To upgrade an existing AIL installation, launch python3_upgrade.sh. This script deletes the current virtual environment and creates a new one. It upgrades the packages but does not keep your previous data in place (the data is nevertheless copied into a directory called old). If you install from scratch, you do not need to launch python3_upgrade.sh.

Docker Quick Start (Ubuntu 16.04 LTS)

  1. Install Docker:
sudo su
apt-get install -y curl
curl https://get.docker.com | /bin/bash
  2. Build the Docker image:
git clone https://github.com/CIRCL/AIL-framework.git
cd AIL-framework
docker build -t ail-framework .
  3. Start AIL on port 7000:
docker run -p 7000:7000 ail-framework
  4. To debug the running container, list the running containers and note the container name or identifier:
docker ps

After getting the name or identifier, type the following commands:

docker exec -it CONTAINER_NAME_OR_IDENTIFIER bash
cd /opt/ail

Install using Ansible

Please check the Ansible readme.

Starting AIL web interface

To start the web interface, you first need to fetch the required JavaScript/CSS files:

cd $AILENV
cd var/www/
bash update_thirdparty.sh

Then start the web interface Python script:

cd $AILENV
cd var/www/
./Flask_server.py

Finally, you can browse the status of the AIL framework in the web interface at the following URL:

http://localhost:7000/
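
If you want to confirm from a script that the web interface is answering, a small check along the following lines may help. This is a sketch using only the Python standard library, not part of AIL; the function name `web_interface_is_up` is hypothetical, and the default URL is the one given above.

```python
import urllib.error
import urllib.request

# Bypass any configured HTTP proxy, since the target is a local service.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def web_interface_is_up(url: str = "http://localhost:7000/", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at url, whatever the status code."""
    try:
        with _OPENER.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, but with an error status (e.g. 401 or 404).
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, name resolution failure, ...
        return False
```

Calling `web_interface_is_up()` shortly after starting Flask_server.py should return True once the server is listening. A False result suggests the server is not running or is bound to a different address or port.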

Training

CIRCL organises training on how to use or extend the AIL framework. The next training will be Thursday, 20 Dec in Luxembourg.

HOWTO

HOWTOs are available in HOWTO.md.

Privacy and GDPR

The document "AIL information leaks analysis and the GDPR in the context of collection, analysis and sharing information leaks" provides an overview of how to use AIL lawfully, especially within the scope of the General Data Protection Regulation.

Research using AIL

If you write an academic paper relying on or using AIL, you can cite it with the following BibTeX entry:

@inproceedings{mokaddem2018ail,
  title={AIL-The design and implementation of an Analysis Information Leak framework},
  author={Mokaddem, Sami and Wagener, G{\'e}rard and Dulaunoy, Alexandre},
  booktitle={2018 IEEE International Conference on Big Data (Big Data)},
  pages={5049--5057},
  year={2018},
  organization={IEEE}
}

Screenshots

Tor hidden service crawler

[Screenshot: Tor hidden service]

[Screenshots: Trending-Web, Trending-Modules]

Extracted encoded files from pastes

[Screenshots: Extracted files from pastes; Relationships between extracted files from encoded file in unstructured data]

Browsing

[Screenshot: Browse-Pastes]

Tagging system

[Screenshot: Tags]

MISP and The Hive, automatic events and alerts creation

[Screenshot: paste_submit]

Paste submission

[Screenshot: paste_submit]

Sentiment analysis

[Screenshot: Sentiment]

Terms manager and occurrence

[Screenshot: Term-Manager]

Top terms

[Screenshots: Term-Top, Term-Plot]

AIL framework screencast

Command line module manager

[Screenshot: Module-Manager]

License

    Copyright (C) 2014 Jules Debra
    Copyright (C) 2014-2019 CIRCL - Computer Incident Response Center Luxembourg (c/o smile, security made in Lëtzebuerg, Groupement d'Intérêt Economique)
    Copyright (c) 2014-2019 Raphaël Vinot
    Copyright (c) 2014-2019 Alexandre Dulaunoy
    Copyright (c) 2016-2019 Sami Mokaddem
    Copyright (c) 2018-2019 Thirion Aurélien

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU Affero General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU Affero General Public License for more details.

    You should have received a copy of the GNU Affero General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.