Lookyloo is a web interface that scrapes a website and then displays a tree of the domains calling each other. https://lookyloo.circl.lu/
Raphaël Vinot 6bc316ebcf new: Initial commit for client and async scraping 2019-01-29 18:37:13 +01:00

Lookyloo is a web interface that scrapes a website and then displays a tree of the domains calling each other.

What is that name?!

1. People who just come to look.
2. People who go out of their way to look at people or something often causing crowds and more disruption.
3. People who enjoy staring at watching other peoples misfortune. Oftentimes car onlookers to car accidents.
Same as Looky Lou; often spelled as Looky-loo (hyphen) or lookylou
In L.A. usually the lookyloo's cause more accidents by not paying full attention to what is ahead of them.

Source: Urban Dictionary

Screenshot

Screenshot of Lookyloo

Implementation details

This code is very heavily inspired by webplugin and adapted to use Flask as the backend.

The two core dependencies of this project are the following:

  • ETE Toolkit: A Python framework for the analysis and visualization of trees.
  • Splash: Lightweight, scriptable browser as a service with an HTTP API
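To make the "tree of domains calling each other" idea concrete, here is a minimal sketch using only the standard library. It is not Lookyloo's actual implementation (which relies on ETE Toolkit); the `(referer, url)` pair format and the function names `build_domain_tree` and `render` are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def build_domain_tree(requests, root_url):
    """Nest hostnames under the hostname that triggered the request.

    `requests` is an iterable of (referer_url, requested_url) pairs,
    a hypothetical simplification of what a browser capture provides.
    """
    root = urlsplit(root_url).hostname
    subtrees = {root: {}}  # hostname -> dict of its child hostnames
    tree = {root: subtrees[root]}
    for referer, url in requests:
        parent = urlsplit(referer).hostname
        child = urlsplit(url).hostname
        # Skip unknown parents, same-host requests, and hosts already placed
        # (the first caller of a host wins in this simplified model).
        if parent not in subtrees or child == parent or child in subtrees:
            continue
        subtrees[child] = {}
        subtrees[parent][child] = subtrees[child]
    return tree


def render(tree, depth=0):
    """Return the tree as a list of indented lines, one hostname per line."""
    lines = []
    for host, subtree in tree.items():
        lines.append("  " * depth + host)
        lines.extend(render(subtree, depth + 1))
    return lines


sample_requests = [
    ("https://example.com/", "https://cdn.example.net/app.js"),
    ("https://cdn.example.net/app.js", "https://tracker.example.org/pixel.gif"),
    ("https://example.com/", "https://example.com/style.css"),
]
sample_tree = build_domain_tree(sample_requests, "https://example.com/")
print("\n".join(render(sample_tree)))
```

In this simplified model each hostname appears once, attached to the first hostname that requested it; a real capture would also need to handle redirects and requests without a referer.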

Installation

IMPORTANT: Use pipenv

NOTE: Yes, it requires python3.6+. No, it will never support anything older.

Installation of Splash

You need a running Splash instance, preferably in Docker:

sudo apt install docker.io
sudo docker pull scrapinghub/splash
sudo docker run -p 8050:8050 -p 5023:5023 scrapinghub/splash --disable-ui --disable-lua
# On a server with a decent amount of RAM, you may want to run it this way:
# sudo docker run -p 8050:8050 -p 5023:5023 scrapinghub/splash --disable-ui -s 100 --disable-lua -m 50000
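Once the container is up, you may want to confirm that Splash answers before installing Lookyloo. The sketch below builds a request for Splash's `render.har` HTTP endpoint, which stays available with `--disable-lua`. The base URL assumes the `8050:8050` port mapping from the command above on the local machine, and the helper names `render_har_url` and `fetch_har` are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Splash is reachable locally on the port published by `docker run`.
SPLASH_URL = "http://127.0.0.1:8050"


def render_har_url(target, wait=1.0, base=SPLASH_URL):
    """Build the URL asking Splash to load `target` and return a HAR capture."""
    query = urlencode({"url": target, "wait": wait})
    return f"{base}/render.har?{query}"


def fetch_har(target, wait=1.0, base=SPLASH_URL, timeout=60):
    """Query Splash and return the decoded HAR document (needs a running instance)."""
    with urlopen(render_har_url(target, wait, base), timeout=timeout) as response:
        return json.load(response)


# Example, to be run only when Splash is up:
# har = fetch_har("https://example.com/")
# print(len(har["log"]["entries"]), "requests captured")
```

If `fetch_har` raises a connection error, the container is most likely not running or the port mapping differs from the one assumed here.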

Installation of Lookyloo

git clone https://github.com/CIRCL/lookyloo.git
cd lookyloo
pipenv install
echo LOOKYLOO_HOME="'`pwd`'" > .env
echo FLASK_APP="'`pwd`/lookyloo'" >> .env
wget https://d3js.org/d3.v5.min.js -O lookyloo/static/d3.v5.min.js
wget https://cdn.rawgit.com/eligrey/FileSaver.js/5733e40e5af936eb3f48554cf6a8a7075d71d18a/FileSaver.js -O lookyloo/static/FileSaver.js
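The two `echo` commands above write a `.env` file at the root of the checkout, which pipenv loads automatically. As a rough illustration of what that file ends up containing, the following sketch produces the same two lines for a given checkout path; the function name `env_lines` and the example path `/opt/lookyloo` are hypothetical.

```python
from pathlib import PurePosixPath


def env_lines(repo_root):
    """Return the two lines the echo commands write, for a given checkout path."""
    root = PurePosixPath(repo_root)
    app = root / "lookyloo"  # the Flask application package inside the checkout
    return [
        f"LOOKYLOO_HOME='{root}'",
        f"FLASK_APP='{app}'",
    ]


# Hypothetical checkout location, used only for illustration.
print("\n".join(env_lines("/opt/lookyloo")))
```

Comparing this output with your actual `.env` is a quick way to spot a file generated from the wrong directory.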

Run the app

flask run

Run the app in production

With a reverse proxy (Nginx)

pip install uwsgi

Config files

You have to configure the two following files:

  • etc/nginx/sites-available/lookyloo
  • etc/systemd/system/lookyloo.service

Then copy them to the appropriate directories and run the following command:

sudo ln -s /etc/nginx/sites-available/lookyloo /etc/nginx/sites-enabled

If needed, remove the default site

sudo rm /etc/nginx/sites-enabled/default

Make sure everything is working:

sudo systemctl start lookyloo
sudo systemctl enable lookyloo
sudo nginx -t
# If the configuration test passes:
sudo service nginx restart

You can now open http://<IP-or-domain>/

Now, you should configure TLS (Let's Encrypt and so on).

Run the app with Docker

Dockerfile

The repository includes a Dockerfile for building a containerized instance of the app.

Lookyloo stores the scraped data in /lookyloo/scraped. If you want to persist the scraped data between runs, it is sufficient to define a volume for this directory.

Running a complete setup with Docker Compose

Additionally, you can start a complete setup, including the necessary Docker instance of Splash, using Docker Compose and the service definition included in docker-compose.yml:

docker-compose up

After the build and startup are complete, Lookyloo should be available at http://localhost:5000/

If you want to persist the data between different runs, uncomment the "volumes" definition in the last two lines of docker-compose.yml and define a data storage directory on your Docker host system there.