Merge pull request #28 from stricaud/sightingdb-format

Sightingdb format
Alexandre Dulaunoy 2019-11-05 11:06:35 +01:00 committed by GitHub
commit cb08ca1c63
2 changed files with 146 additions and 0 deletions


@ -0,0 +1,8 @@
MMARK:=mmark -xml2 -page
docs = $(wildcard *.md)
all: $(docs)
	$(MMARK) $< > $<.xml
	xml2rfc --text $<.xml
	xml2rfc --html $<.xml

sightingdb-format/raw.md Executable file

@ -0,0 +1,138 @@
%%%
Title = "SightingDB format"
abbrev = "SightingDB format"
category = "info"
docName = "draft-tricaud-sightingdb-format"
ipr= "trust200902"
area = "Security"
date = 2019-11-03T00:00:00Z
[[author]]
initials="S."
surname="Tricaud"
fullname="Sebastien Tricaud"
abbrev="Devo Inc."
organization = "Devo Inc."
[author.address]
email = "sebastien.tricaud@devo.com"
phone = "+1 866-221-2254"
[author.address.postal]
street = "150 Cambridgepark Drive"
city = "Cambridge, MA"
code = "02140"
country = "USA"
%%%
.# Abstract
This document describes the format used by SightingDB to give automated context to a given Attribute
by counting occurrences and tracking times of observability.
SightingDB was designed to provide MISP with a scalable and fast way to store and retrieve Attributes.
{mainmatter}
# Introduction
Adding context to an Attribute is what makes it useful. While there are numerous ways of doing so,
SightingDB does it simply by counting.
Whenever somebody retrieves an Attribute, this count is provided, allowing anyone to understand whether something
was observed a few times or many times.
## Conventions and Terminology
The key words "**MUST**", "**MUST NOT**", "**REQUIRED**", "**SHALL**", "**SHALL NOT**",
"**SHOULD**", "**SHOULD NOT**", "**RECOMMENDED**", "**MAY**", and "**OPTIONAL**" in this
document are to be interpreted as described in RFC 2119 [@!RFC2119].
# Format
## Overview
The SightingDB format is expressed in JSON [@!RFC8259]. In SightingDB, a Sighting Object is a single JSON object. This object contains the following fields: value, first_seen, last_seen, count, tags, ttl, frequency and manifold.
### Attribute Storage
The fields listed above describe an Attribute and all of its required characteristics. Attributes are stored in a Namespace. A Namespace is similar to a path in a filesystem, where the same file can be stored in multiple places.
### Namespace
The levels of a multi-level Namespace MUST be separated with the slash '/' character. There is no specification on how Namespaces are structured, since this depends on the use case.
A Namespace starting with the underscore '_' character is private and internal to SightingDB. These Namespaces are all reserved for the engine and MUST NOT be used.
Reserved namespaces are:
_expired/<namespace>: Contains all the attributes that expired, preserving the original namespace
_shadow/<namespace>: When a value is searched for and does not exist, it is stored there
_stats: Statistics
_config: Configuration
_all: All the Attributes in one place, used to retrieve the 'manifold' property.
The Attribute Key MUST always be the last part of the Namespace.
#### Sample Namespaces
/Organization1/service/ipv4: Store values for ipv4 keys in /Organization1/service
/everything/domain: Store domains in /everything
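
The Namespace rules above can be illustrated with a short Python sketch. The helper names `split_namespace` and `is_reserved` are hypothetical and are not part of SightingDB. The sketch assumes that the last path component is the Attribute Key and that a Namespace is reserved when its first level starts with an underscore.

```python
from typing import List, Tuple


def _levels(namespace: str) -> List[str]:
    """Return the non-empty levels of a slash-separated Namespace."""
    return [level for level in namespace.split("/") if level]


def split_namespace(namespace: str) -> Tuple[List[str], str]:
    """Split a Namespace into (parent levels, Attribute Key).

    Hypothetical helper: the Attribute Key is taken to be the last level.
    """
    levels = _levels(namespace)
    if not levels:
        raise ValueError("empty Namespace")
    return levels[:-1], levels[-1]


def is_reserved(namespace: str) -> bool:
    """True when the first level starts with '_' (assumed reading of the rule)."""
    levels = _levels(namespace)
    return bool(levels) and levels[0].startswith("_")
```

With these helpers, `/Organization1/service/ipv4` splits into the levels `Organization1`, `service` and the Attribute Key `ipv4`, while `_expired/Organization1/service/ipv4` is reported as reserved.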
### Attribute fields
#### value
The attribute value, used to store and retrieve information about an attribute. Note that the value is not returned in the JSON object: since it is what was queried, it is already known.
#### first_seen
Time in UTC at which this value was first captured.
#### last_seen
Time in UTC at which this value was last captured.
#### count
How many times this value was written.
#### tags
Tags follow the format defined in MISP by the MISP Taxonomy. Tags are separated by the ';' character.
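
As an illustration of the ';' separator, a consumer could split the tags field as sketched below. The function name `parse_tags` is hypothetical, and stripping surrounding whitespace and dropping empty entries are assumptions of this sketch rather than requirements of the format.

```python
from typing import List


def parse_tags(tags: str) -> List[str]:
    """Split a ';'-separated tags string into a list of tags.

    Assumption: surrounding whitespace is ignored and empty entries are dropped,
    so the empty string yields an empty list.
    """
    return [tag.strip() for tag in tags.split(";") if tag.strip()]
```

For instance, the empty string used in the sample object yields an empty list, and `"tlp:white;tlp:green"` yields two tags.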
#### ttl
Time To Live: the expiration delay, in seconds, counted from the time the Attribute was created. Once an Attribute has expired, it is moved to the private Namespace _expired.
When an Attribute has this field set to 0, it never expires. This is the default behavior.
When an Attribute has this field set to a number greater than 0, the expiration status is computed only at retrieval time.
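
The retrieval-time check can be sketched as follows. This is only an illustration: the name `is_expired` is hypothetical, the creation time is assumed to be available as a Unix timestamp in seconds, and treating an Attribute as expired exactly when the TTL has fully elapsed is an assumption about the boundary.

```python
import time
from typing import Optional


def is_expired(created_at: int, ttl: int, now: Optional[int] = None) -> bool:
    """Decide at retrieval time whether an Attribute has expired.

    created_at: creation time as a Unix timestamp in seconds (assumed representation).
    ttl: Time To Live in seconds; 0 means the Attribute never expires.
    now: current time, injectable for testing; defaults to the system clock.
    """
    if ttl <= 0:
        return False
    if now is None:
        now = int(time.time())
    # Assumed boundary: expired once ttl seconds have fully elapsed.
    return now >= created_at + ttl
```

Because the status is computed only when the Attribute is read, no background job is needed in this sketch; an implementation would move the Attribute to _expired when the check returns true.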
#### frequency
Frequency is the average number of times an Attribute is seen per day. As this field can introduce latency, its implementation is OPTIONAL.
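
This document does not specify how the average is computed. One possible reading, shown here purely as an assumption, divides count by the number of days elapsed between first_seen and last_seen, with a minimum window of one day. The function name `average_per_day` is hypothetical, and the result is not claimed to match the value SightingDB itself would report.

```python
SECONDS_PER_DAY = 86400


def average_per_day(count: int, first_seen: int, last_seen: int) -> float:
    """Average sightings per day over the observed window.

    Assumptions: timestamps are Unix seconds, and windows shorter than one day
    are treated as one day to avoid dividing by zero.
    """
    elapsed_days = max((last_seen - first_seen) / SECONDS_PER_DAY, 1.0)
    return count / elapsed_days
```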
#### manifold
When a given Attribute Value is stored in different Namespaces, the manifold field keeps track of them and returns the number of different places in which this Attribute exists. This is a simple counter.
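
The counter can be illustrated with a toy in-memory store that maps each Namespace to the set of values it holds. Both the store layout and the function name `manifold` are hypothetical. This sketch scans every Namespace and does not model the reserved _all Namespace.

```python
from typing import Dict, Set


def manifold(store: Dict[str, Set[str]], value: str) -> int:
    """Count the Namespaces of a toy store in which the given value is present."""
    return sum(1 for values in store.values() if value in values)
```

A value stored under both `/Organization1/service/ipv4` and `/everything/ipv4` would have a manifold of 2 in this model.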
## SightingDB Format - One Attribute
~~~~
{
  "value": "127.0.0.1",
  "first_seen": 1530394819,
  "last_seen": 1572933618,
  "count": 578391,
  "tags": "",
  "ttl": 0,
  "frequency": 1185,
  "manifold": 17
}
~~~~
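
A consumer could load and sanity-check such an object as sketched below. The field types are inferred from the sample above and are an assumption, as is the reading of first_seen and last_seen as Unix timestamps in seconds. The names `load_sighting` and `to_utc` are hypothetical. Every field is treated as optional, since value may be omitted from responses and frequency is OPTIONAL.

```python
import json
from datetime import datetime, timezone

# Field types inferred from the sample object (assumption, not normative).
EXPECTED_TYPES = {
    "value": str,
    "first_seen": int,
    "last_seen": int,
    "count": int,
    "tags": str,
    "ttl": int,
    "frequency": int,
    "manifold": int,
}


def load_sighting(text: str) -> dict:
    """Parse a Sighting Object and check the type of each field that is present."""
    obj = json.loads(text)
    if not isinstance(obj, dict):
        raise ValueError("a Sighting Object must be a single JSON object")
    for name, expected in EXPECTED_TYPES.items():
        if name in obj and not isinstance(obj[name], expected):
            raise TypeError(f"field {name!r} should be of type {expected.__name__}")
    return obj


def to_utc(timestamp: int) -> datetime:
    """Convert a Unix timestamp in seconds (assumed) to an aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


SAMPLE = (
    '{"value":"127.0.0.1","first_seen":1530394819,"last_seen":1572933618,'
    '"count":578391,"tags":"","ttl":0,"frequency":1185,"manifold":17}'
)
```

Calling `load_sighting(SAMPLE)` returns a plain dictionary, and `to_utc` can then be applied to its first_seen and last_seen entries.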
# Acknowledgements
The author wishes to thank all the members of the MISP community who support the creation
of open standards in threat intelligence sharing. Thanks also go to those who provided the amazing
feedback gathered during the MISP Summit 2019 in Luxembourg, in particular Alexandre Dulaunoy and
Andras Iklody.
{backmatter}