chg: [sightingdb-format] added

pull/2/head
Alexandre Dulaunoy 2019-11-11 09:21:35 +01:00
parent d29737f589
commit 762393319e
No known key found for this signature in database
GPG Key ID: 09E2CD4944E6CBCD
2 changed files with 1049 additions and 0 deletions

657
rfc/sightingdb-format.html Normal file

@ -0,0 +1,657 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="en" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head profile="http://www.w3.org/2006/03/hcard http://dublincore.org/documents/2008/08/04/dc-html/">
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii" />
<title>SightingDB query format</title>
<style type="text/css" title="Xml2Rfc (sans serif)">
/*<![CDATA[*/
a {
text-decoration: none;
}
/* info code from SantaKlauss at http://www.madaboutstyle.com/tooltip2.html */
a.info {
/* This is the key. */
position: relative;
z-index: 24;
text-decoration: none;
}
a.info:hover {
z-index: 25;
color: #FFF; background-color: #900;
}
a.info span { display: none; }
a.info:hover span.info {
/* The span will display just on :hover state. */
display: block;
position: absolute;
font-size: smaller;
top: 2em; left: -5em; width: 15em;
padding: 2px; border: 1px solid #333;
color: #900; background-color: #EEE;
text-align: left;
}
a.smpl {
color: black;
}
a:hover {
text-decoration: underline;
}
a:active {
text-decoration: underline;
}
address {
margin-top: 1em;
margin-left: 2em;
font-style: normal;
}
body {
color: black;
font-family: verdana, helvetica, arial, sans-serif;
font-size: 10pt;
max-width: 55em;
}
cite {
font-style: normal;
}
dd {
margin-right: 2em;
}
dl {
margin-left: 2em;
}
ul.empty {
list-style-type: none;
}
ul.empty li {
margin-top: .5em;
}
dl p {
margin-left: 0em;
}
dt {
margin-top: .5em;
}
h1 {
font-size: 14pt;
line-height: 21pt;
page-break-after: avoid;
}
h1.np {
page-break-before: always;
}
h1 a {
color: #333333;
}
h2 {
font-size: 12pt;
line-height: 15pt;
page-break-after: avoid;
}
h3, h4, h5, h6 {
font-size: 10pt;
page-break-after: avoid;
}
h2 a, h3 a, h4 a, h5 a, h6 a {
color: black;
}
img {
margin-left: 3em;
}
li {
margin-left: 2em;
margin-right: 2em;
}
ol {
margin-left: 2em;
margin-right: 2em;
}
ol p {
margin-left: 0em;
}
p {
margin-left: 2em;
margin-right: 2em;
}
pre {
margin-left: 3em;
background-color: lightyellow;
padding: .25em;
}
pre.text2 {
border-style: dotted;
border-width: 1px;
background-color: #f0f0f0;
width: 69em;
}
pre.inline {
background-color: white;
padding: 0em;
}
pre.text {
border-style: dotted;
border-width: 1px;
background-color: #f8f8f8;
width: 69em;
}
pre.drawing {
border-style: solid;
border-width: 1px;
background-color: #f8f8f8;
padding: 2em;
}
table {
margin-left: 2em;
}
table.tt {
vertical-align: top;
}
table.full {
border-style: outset;
border-width: 1px;
}
table.headers {
border-style: outset;
border-width: 1px;
}
table.tt td {
vertical-align: top;
}
table.full td {
border-style: inset;
border-width: 1px;
}
table.tt th {
vertical-align: top;
}
table.full th {
border-style: inset;
border-width: 1px;
}
table.headers th {
border-style: none none inset none;
border-width: 1px;
}
table.left {
margin-right: auto;
}
table.right {
margin-left: auto;
}
table.center {
margin-left: auto;
margin-right: auto;
}
caption {
caption-side: bottom;
font-weight: bold;
font-size: 9pt;
margin-top: .5em;
}
table.header {
border-spacing: 1px;
width: 95%;
font-size: 10pt;
color: white;
}
td.top {
vertical-align: top;
}
td.topnowrap {
vertical-align: top;
white-space: nowrap;
}
table.header td {
background-color: gray;
width: 50%;
}
table.header a {
color: white;
}
td.reference {
vertical-align: top;
white-space: nowrap;
padding-right: 1em;
}
thead {
display:table-header-group;
}
ul.toc, ul.toc ul {
list-style: none;
margin-left: 1.5em;
margin-right: 0em;
padding-left: 0em;
}
ul.toc li {
line-height: 150%;
font-weight: bold;
font-size: 10pt;
margin-left: 0em;
margin-right: 0em;
}
ul.toc li li {
line-height: normal;
font-weight: normal;
font-size: 9pt;
margin-left: 0em;
margin-right: 0em;
}
li.excluded {
font-size: 0pt;
}
ul p {
margin-left: 0em;
}
.comment {
background-color: yellow;
}
.center {
text-align: center;
}
.error {
color: red;
font-style: italic;
font-weight: bold;
}
.figure {
font-weight: bold;
text-align: center;
font-size: 9pt;
}
.filename {
color: #333333;
font-weight: bold;
font-size: 12pt;
line-height: 21pt;
text-align: center;
}
.fn {
font-weight: bold;
}
.hidden {
display: none;
}
.left {
text-align: left;
}
.right {
text-align: right;
}
.title {
color: #990000;
font-size: 18pt;
line-height: 18pt;
font-weight: bold;
text-align: center;
margin-top: 36pt;
}
.vcardline {
display: block;
}
.warning {
font-size: 14pt;
background-color: yellow;
}
@media print {
.noprint {
display: none;
}
a {
color: black;
text-decoration: none;
}
table.header {
width: 90%;
}
td.header {
width: 50%;
color: black;
background-color: white;
vertical-align: top;
font-size: 12pt;
}
ul.toc a::after {
content: leader('.') target-counter(attr(href), page);
}
ul.ind li li a {
content: target-counter(attr(href), page);
}
.print2col {
column-count: 2;
-moz-column-count: 2;
column-fill: auto;
}
}
@page {
@top-left {
content: "Internet-Draft";
}
@top-right {
content: "December 2010";
}
@top-center {
content: "Abbreviated Title";
}
@bottom-left {
content: "Doe";
}
@bottom-center {
content: "Expires June 2011";
}
@bottom-right {
content: "[Page " counter(page) "]";
}
}
@page:first {
@top-left {
content: normal;
}
@top-right {
content: normal;
}
@top-center {
content: normal;
}
}
/*]]>*/
</style>
<link href="#rfc.toc" rel="Contents">
<link href="#rfc.section.1" rel="Chapter" title="1 Introduction">
<link href="#rfc.section.1.1" rel="Chapter" title="1.1 Conventions and Terminology">
<link href="#rfc.section.2" rel="Chapter" title="2 Format">
<link href="#rfc.section.2.1" rel="Chapter" title="2.1 Overview">
<link href="#rfc.section.2.1.1" rel="Chapter" title="2.1.1 Attribute Storage">
<link href="#rfc.section.2.1.2" rel="Chapter" title="2.1.2 Namespace">
<link href="#rfc.section.2.1.3" rel="Chapter" title="2.1.3 Attribute fields">
<link href="#rfc.section.2.2" rel="Chapter" title="2.2 SightingDB Format - One Attribute">
<link href="#rfc.section.2.3" rel="Chapter" title="2.3 Value">
<link href="#rfc.section.2.3.1" rel="Chapter" title="2.3.1 Configuring the value format for a Namespace">
<link href="#rfc.section.2.4" rel="Chapter" title="2.4 Bulk">
<link href="#rfc.section.2.4.1" rel="Chapter" title="2.4.1 Response">
<link href="#rfc.section.3" rel="Chapter" title="3 Security Considerations">
<link href="#rfc.section.4" rel="Chapter" title="4 Acknowledgements">
<link href="#rfc.references" rel="Chapter" title="5 Normative References">
<link href="#rfc.authors" rel="Chapter">
<meta name="generator" content="xml2rfc version 2.23.1 - https://tools.ietf.org/tools/xml2rfc" />
<link rel="schema.dct" href="http://purl.org/dc/terms/" />
<meta name="dct.creator" content="Tricaud, S." />
<meta name="dct.identifier" content="urn:ietf:id:" />
<meta name="dct.issued" scheme="ISO8601" content="2019-11-03" />
<meta name="dct.abstract" content="This document describes the format used by SightingDB to give automated context to a given Attribute by counting occurrences and tracking times of observability. SightingDB was designed to provide to MISP a Scalable and Fast way to store and retrieve Attributes." />
<meta name="description" content="This document describes the format used by SightingDB to give automated context to a given Attribute by counting occurrences and tracking times of observability. SightingDB was designed to provide to MISP a Scalable and Fast way to store and retrieve Attributes." />
</head>
<body>
<table class="header">
<tbody>
<tr>
<td class="left">Network Working Group</td>
<td class="right">S. Tricaud</td>
</tr>
<tr>
<td class="left">Internet-Draft</td>
<td class="right">Devo Inc.</td>
</tr>
<tr>
<td class="left">Expires: May 6, 2020</td>
<td class="right">November 3, 2019</td>
</tr>
</tbody>
</table>
<p class="title">SightingDB query format<br />
<span class="filename"></span></p>
<h1 id="rfc.abstract"><a href="#rfc.abstract">Abstract</a></h1>
<p>This document describes the format used by SightingDB to give automated context to a given Attribute by counting occurrences and tracking times of observability. SightingDB was designed to provide to MISP a Scalable and Fast way to store and retrieve Attributes.</p>
<h1 id="rfc.status"><a href="#rfc.status">Status of This Memo</a></h1>
<p>This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.</p>
<p>Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.</p>
<p>Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."</p>
<p>This Internet-Draft will expire on May 6, 2020.</p>
<h1 id="rfc.copyrightnotice"><a href="#rfc.copyrightnotice">Copyright Notice</a></h1>
<p>Copyright (c) 2019 IETF Trust and the persons identified as the document authors. All rights reserved.</p>
<p>This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.</p>
<hr class="noprint" />
<h1 class="np" id="rfc.toc"><a href="#rfc.toc">Table of Contents</a></h1>
<ul class="toc">
<li>1. <a href="#rfc.section.1">Introduction</a>
</li>
<ul><li>1.1. <a href="#rfc.section.1.1">Conventions and Terminology</a>
</li>
</ul><li>2. <a href="#rfc.section.2">Format</a>
</li>
<ul><li>2.1. <a href="#rfc.section.2.1">Overview</a>
</li>
<ul><li>2.1.1. <a href="#rfc.section.2.1.1">Attribute Storage</a>
</li>
<li>2.1.2. <a href="#rfc.section.2.1.2">Namespace</a>
</li>
<li>2.1.3. <a href="#rfc.section.2.1.3">Attribute fields</a>
</li>
</ul><li>2.2. <a href="#rfc.section.2.2">SightingDB Format - One Attribute</a>
</li>
<li>2.3. <a href="#rfc.section.2.3">Value</a>
</li>
<ul><li>2.3.1. <a href="#rfc.section.2.3.1">Configuring the value format for a Namespace</a>
</li>
</ul><li>2.4. <a href="#rfc.section.2.4">Bulk</a>
</li>
<ul><li>2.4.1. <a href="#rfc.section.2.4.1">Response</a>
</li>
</ul></ul><li>3. <a href="#rfc.section.3">Security Considerations</a>
</li>
<li>4. <a href="#rfc.section.4">Acknowledgements</a>
</li>
<li>5. <a href="#rfc.references">Normative References</a>
</li>
<li><a href="#rfc.authors">Author's Address</a>
</li>
</ul>
<h1 id="rfc.section.1">
<a href="#rfc.section.1">1.</a> <a href="#introduction" id="introduction">Introduction</a>
</h1>
<p id="rfc.section.1.p.1">Adding context to an Attribute is what makes it useful. While there are numerous ways of doing this, SightingDB does it simply by counting. Whenever somebody retrieves an Attribute, this count is returned with it, allowing anyone to understand whether something was observed a few times or many times.</p>
<h1 id="rfc.section.1.1">
<a href="#rfc.section.1.1">1.1.</a> <a href="#conventions-and-terminology" id="conventions-and-terminology">Conventions and Terminology</a>
</h1>
<p id="rfc.section.1.1.p.1">The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 <a href="#RFC2119" class="xref">[RFC2119]</a>.</p>
<h1 id="rfc.section.2">
<a href="#rfc.section.2">2.</a> <a href="#format" id="format">Format</a>
</h1>
<h1 id="rfc.section.2.1">
<a href="#rfc.section.2.1">2.1.</a> <a href="#overview" id="overview">Overview</a>
</h1>
<p id="rfc.section.2.1.p.1">The SightingDB format uses JSON <a href="#RFC8259" class="xref">[RFC8259]</a> and is used to query a SightingDB-compatible connector. In SightingDB, a Sighting Object is composed of a single JSON object. This object contains the following fields: value, first_seen, last_seen, count, tags, ttl and manifold.</p>
<h1 id="rfc.section.2.1.1">
<a href="#rfc.section.2.1.1">2.1.1.</a> <a href="#attribute-storage" id="attribute-storage">Attribute Storage</a>
</h1>
<p id="rfc.section.2.1.1.p.1">The fields listed above describe an Attribute and all of its required characteristics. Attributes are stored in a Namespace. A Namespace is similar to a path in a file system, where the same file can be stored in multiple places.</p>
<h1 id="rfc.section.2.1.2">
<a href="#rfc.section.2.1.2">2.1.2.</a> <a href="#namespace" id="namespace">Namespace</a>
</h1>
<p id="rfc.section.2.1.2.p.1">The levels of a multi-level Namespace MUST be separated by the slash '/' character. There is no specification on how Namespaces are structured, since this depends on the use case.</p>
<p id="rfc.section.2.1.2.p.2">A Namespace starting with the underscore '_' character is private and internal to SightingDB. Such Namespaces are all reserved for the engine and MUST NOT be used.</p>
<p id="rfc.section.2.1.2.p.3">Reserved namespaces are:</p>
<p id="rfc.section.2.1.2.p.4">_expired/&lt;namespace&gt;: Contains all the Attributes that have expired, preserving the original namespace</p>
<p id="rfc.section.2.1.2.p.5">_shadow/&lt;namespace&gt;: When a value is searched for and does not exist, it is stored there</p>
<p id="rfc.section.2.1.2.p.6">_stats: Statistics</p>
<p id="rfc.section.2.1.2.p.7">_config: Configuration</p>
<p id="rfc.section.2.1.2.p.8">_all: All the Attributes in one place, used to retrieve the 'manifold' property.</p>
<p id="rfc.section.2.1.2.p.9">The Attribute Key MUST always be the last part of the Namespace.</p>
<h1 id="rfc.section.2.1.2.1">
<a href="#rfc.section.2.1.2.1">2.1.2.1.</a> <a href="#sample-namespaces" id="sample-namespaces">Sample Namespaces</a>
</h1>
<p id="rfc.section.2.1.2.1.p.1">/Organization1/service/ipv4: Store values for ipv4 keys in /Organization1/service</p>
<p id="rfc.section.2.1.2.1.p.2">/everything/domain: Store domains in /everything</p>
<h1 id="rfc.section.2.1.3">
<a href="#rfc.section.2.1.3">2.1.3.</a> <a href="#attribute-fields" id="attribute-fields">Attribute fields</a>
</h1>
<h1 id="rfc.section.2.1.3.1">
<a href="#rfc.section.2.1.3.1">2.1.3.1.</a> <a href="#value" id="value">value</a>
</h1>
<p id="rfc.section.2.1.3.1.p.1">The attribute value, used to store and retrieve information about an attribute. Note that the value is not returned in the JSON object: since it was part of the query, it is already known. The Value is described in Section 2.3, as it is very specific and can be stored "as is", as a hash, encoded in base64 or using any other convenient mechanism.</p>
<p id="rfc.section.2.1.3.1.p.2">The value implementation MUST offer at least: 1) Raw value 2) Base64 URL Encoded 3) SHA256 Hash</p>
<h1 id="rfc.section.2.1.3.2">
<a href="#rfc.section.2.1.3.2">2.1.3.2.</a> <a href="#first-seen" id="first-seen">first_seen</a>
</h1>
<p id="rfc.section.2.1.3.2.p.1">Time in UTC of the first time this value was captured</p>
<h1 id="rfc.section.2.1.3.3">
<a href="#rfc.section.2.1.3.3">2.1.3.3.</a> <a href="#last-seen" id="last-seen">last_seen</a>
</h1>
<p id="rfc.section.2.1.3.3.p.1">Time in UTC of the last time this value was captured</p>
<h1 id="rfc.section.2.1.3.4">
<a href="#rfc.section.2.1.3.4">2.1.3.4.</a> <a href="#count" id="count">count</a>
</h1>
<p id="rfc.section.2.1.3.4.p.1">How many times this value was written</p>
<h1 id="rfc.section.2.1.3.5">
<a href="#rfc.section.2.1.3.5">2.1.3.5.</a> <a href="#tags" id="tags">tags</a>
</h1>
<p id="rfc.section.2.1.3.5.p.1">Tags are defined as in MISP, using the MISP Taxonomy. Tags are separated by the ';' character.</p>
<h1 id="rfc.section.2.1.3.6">
<a href="#rfc.section.2.1.3.6">2.1.3.6.</a> <a href="#ttl" id="ttl">ttl</a>
</h1>
<p id="rfc.section.2.1.3.6.p.1">Time To Live: the expiration delay, in seconds, counted from the time the Attribute was created. Once an Attribute has expired, it is moved to the private Namespace _expired.</p>
<p id="rfc.section.2.1.3.6.p.2">When an Attribute has this field set to 0, it never expires. This is the default behavior.</p>
<p id="rfc.section.2.1.3.6.p.3">When an Attribute has this field set to a number greater than 0, the expiration status is computed only at retrieval time.</p>
<h1 id="rfc.section.2.1.3.7">
<a href="#rfc.section.2.1.3.7">2.1.3.7.</a> <a href="#manifold" id="manifold">manifold</a>
</h1>
<p id="rfc.section.2.1.3.7.p.1">When a given Attribute Value is stored in different namespaces, the manifold field keeps track of them and returns the number of different places in which this Attribute exists. This is a simple counter.</p>
<h1 id="rfc.section.2.2">
<a href="#rfc.section.2.2">2.2.</a> <a href="#sightingdb-format-one-attribute" id="sightingdb-format-one-attribute">SightingDB Format - One Attribute</a>
</h1>
<pre>{
"value":"127.0.0.1",
"first_seen":1530394819,
"last_seen":1572933618,
"count":578391,
"tags":"",
"ttl":0,
"manifold": 17
}
</pre>
<h1 id="rfc.section.2.3">
<a href="#rfc.section.2.3">2.3.</a> <a href="#value-1" id="value-1">Value</a>
</h1>
<p id="rfc.section.2.3.p.1">The value submitted can be in multiple formats, depending on the use case. Any implementation MUST offer three alternatives:</p>
<p id="rfc.section.2.3.p.2">1) Raw value: nothing is encoded and the value is stored as is, as shown in the One Attribute JSON example above.</p>
<p id="rfc.section.2.3.p.3">2) SHA256: prevents the content from being seen (see Security Considerations), has a fixed size and is convenient for most requirements</p>
<p id="rfc.section.2.3.p.4">3) Base64 URL: the Base64 specification is followed, except that the characters conflicting with a URL argument are replaced</p>
<p id="rfc.section.2.3.p.5">The value is configured as part of the Namespace. The private "_config" Namespace prefix stores this value storage mechanism.</p>
<h1 id="rfc.section.2.3.1">
<a href="#rfc.section.2.3.1">2.3.1.</a> <a href="#configuring-the-value-format-for-a-namespace" id="configuring-the-value-format-for-a-namespace">Configuring the value format for a Namespace</a>
</h1>
<p id="rfc.section.2.3.1.p.1">If one has the Namespace "/Organization1/BU1/ip" and wants to store those IP addresses as SHA256 hashes, it is configured as follows: the Namespace is kept but prefixed with "_config", and holds a JSON object that sets the value format. "/_config/Organization1/BU1/ip"</p>
<pre>{
"value_format":"SHA256"
}
</pre>
<p id="rfc.section.2.3.1.p.2">Where "value_format" is either: "SHA256", "RAW" or "BASE64URL".</p>
<h1 id="rfc.section.2.4">
<a href="#rfc.section.2.4">2.4.</a> <a href="#bulk" id="bulk">Bulk</a>
</h1>
<p id="rfc.section.2.4.p.1">When data must be sent and received in large amounts, it is preferable to embed all the objects in a single JSON document. For both reading and writing, the format is the following:</p>
<pre>{
"items": [
{ "/your/namespace": "127.0.0.1" },
{ "/your/other/namespace": "110812f67fa1e1f0117f6f3d70241c1a42a7b07711a93c2477cc516d9042f9db" }
]
}
</pre>
<p id="rfc.section.2.4.p.2">This will either store or retrieve the requested data.</p>
<h1 id="rfc.section.2.4.1">
<a href="#rfc.section.2.4.1">2.4.1.</a> <a href="#response" id="response">Response</a>
</h1>
<p id="rfc.section.2.4.1.p.1">When retrieving sightings, the response also contains a list of items, one per requested item and in the same order:</p>
<pre>{
"items": [
{ "first_seen":1530337182, "last_seen":1573110615, "count":93021, "tags":"", "ttl":0, "manifold": 1 },
{ "first_seen":1562930418, "last_seen":1573110404, "count":1020492, "tags":"", "ttl":8912, "manifold": 3 }
]
}
</pre>
<h1 id="rfc.section.3">
<a href="#rfc.section.3">3.</a> <a href="#security-considerations" id="security-considerations">Security Considerations</a>
</h1>
<p id="rfc.section.3.p.1">While this document focuses solely on the format, the reference implementation is SightingDB. Authentication and data access control are not handled by SightingDB. A value can leak if access is too permissive.</p>
<p id="rfc.section.3.p.2">Even a hashed value can be discovered, since hashing known candidate values and comparing the results would produce a match.</p>
<h1 id="rfc.section.4">
<a href="#rfc.section.4">4.</a> <a href="#acknowledgements" id="acknowledgements">Acknowledgements</a>
</h1>
<p id="rfc.section.4.p.1">The author wishes to thank the whole MISP community, which supports the creation of open standards in threat intelligence sharing, and is grateful for the amazing feedback gathered during the MISP Summit 2019 in Luxembourg, in particular from Alexandre Dulaunoy and Andras Iklody.</p>
<h1 id="rfc.references">
<a href="#rfc.references">5.</a> Normative References</h1>
<table><tbody>
<tr>
<td class="reference"><b id="RFC2119">[RFC2119]</b></td>
<td class="top">
<a>Bradner, S.</a>, "<a href="https://tools.ietf.org/html/rfc2119">Key words for use in RFCs to Indicate Requirement Levels</a>", BCP 14, RFC 2119, DOI 10.17487/RFC2119, March 1997.</td>
</tr>
<tr>
<td class="reference"><b id="RFC8259">[RFC8259]</b></td>
<td class="top">
<a>Bray, T., Ed.</a>, "<a href="https://tools.ietf.org/html/rfc8259">The JavaScript Object Notation (JSON) Data Interchange Format</a>", STD 90, RFC 8259, DOI 10.17487/RFC8259, December 2017.</td>
</tr>
</tbody></table>
<h1 id="rfc.authors"><a href="#rfc.authors">Author's Address</a></h1>
<div class="avoidbreak">
<address class="vcard">
<span class="vcardline">
<span class="fn">Sebastien Tricaud</span>
<span class="n hidden">
<span class="family-name">Tricaud</span>
</span>
</span>
<span class="org vcardline">Devo Inc.</span>
<span class="adr">
<span class="vcardline">150 Cambridgepark Drive</span>
<span class="vcardline">
<span class="locality">Cambridge, MA</span>,
<span class="region"></span>
<span class="code">02140</span>
</span>
<span class="country-name vcardline">USA</span>
</span>
<span class="vcardline">Phone: +1 866-221-2254</span>
<span class="vcardline">EMail: <a href="mailto:sebastien.tricaud@devo.com">sebastien.tricaud@devo.com</a></span>
</address>
</div>
</body>
</html>

392
rfc/sightingdb-format.txt Normal file

@ -0,0 +1,392 @@
Network Working Group S. Tricaud
Internet-Draft Devo Inc.
Expires: May 6, 2020 November 3, 2019
SightingDB query format
Abstract
This document describes the format used by SightingDB to give
automated context to a given Attribute by counting occurrences and
tracking times of observability. SightingDB was designed to provide
to MISP a Scalable and Fast way to store and retrieve Attributes.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on May 6, 2020.
Copyright Notice
Copyright (c) 2019 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Tricaud Expires May 6, 2020 [Page 1]
Internet-Draft SightingDB query format November 2019
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1. Conventions and Terminology . . . . . . . . . . . . . . . 2
2. Format . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2.1. Overview . . . . . . . . . . . . . . . . . . . . . . . . 2
2.1.1. Attribute Storage . . . . . . . . . . . . . . . . . . 2
2.1.2. Namespace . . . . . . . . . . . . . . . . . . . . . . 3
2.1.3. Attribute fields . . . . . . . . . . . . . . . . . . 3
2.2. SightingDB Format - One Attribute . . . . . . . . . . . . 4
2.3. Value . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3.1. Configuring the value format for a Namespace . . . . 5
2.4. Bulk . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.4.1. Response . . . . . . . . . . . . . . . . . . . . . . 6
3. Security Considerations . . . . . . . . . . . . . . . . . . . 6
4. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 6
5. Normative References . . . . . . . . . . . . . . . . . . . . 6
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 6
1. Introduction
Adding context to an Attribute is what makes it useful. While there
are numerous ways of doing this, SightingDB does it simply by
counting. Whenever somebody retrieves an Attribute, this count is
returned with it, allowing anyone to understand whether something was
observed a few times or many times.
1.1. Conventions and Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
2. Format
2.1. Overview
The SightingDB format uses JSON [RFC8259] and is used to query a
SightingDB-compatible connector. In SightingDB, a Sighting Object is
composed of a single JSON object. This object contains the following
fields: value, first_seen, last_seen, count, tags, ttl and manifold.
2.1.1. Attribute Storage
The fields listed above describe an Attribute and all of its required
characteristics. Attributes are stored in a Namespace. A Namespace is
similar to a path in a file system, where the same file can be stored
in multiple places.
2.1.2. Namespace
The levels of a multi-level Namespace MUST be separated by the slash
'/' character. There is no specification on how Namespaces are
structured, since this depends on the use case.
A Namespace starting with the underscore '_' character is private and
internal to SightingDB. Such Namespaces are all reserved for the
engine and MUST NOT be used.
Reserved namespaces are:
_expired/<namespace>: Contains all the Attributes that have expired,
preserving the original namespace
_shadow/<namespace>: When a value is searched for and does not exist,
it is stored there
_stats: Statistics
_config: Configuration
_all: All the Attributes in one place, used to retrieve the
'manifold' property.
The Attribute Key MUST always be the last part of the Namespace.
2.1.2.1. Sample Namespaces
/Organization1/service/ipv4: Store values for ipv4 keys in
/Organization1/service
/everything/domain: Store domains in /everything
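The Namespace rules above can be illustrated with a minimal Python
sketch. The helper names (split_namespace, is_reserved) are
hypothetical and not part of SightingDB; the sketch also assumes that
a leading slash is optional and that a Namespace is reserved when its
first level starts with an underscore.

```python
from typing import List, Tuple

RESERVED_PREFIX = "_"


def _levels(path: str) -> List[str]:
    # Split on '/', ignoring a leading or trailing slash (assumption).
    return [part for part in path.strip("/").split("/") if part]


def split_namespace(path: str) -> Tuple[List[str], str]:
    """Return (levels, attribute_key); the key is the last part."""
    parts = _levels(path)
    if not parts:
        raise ValueError("empty Namespace")
    return parts[:-1], parts[-1]


def is_reserved(path: str) -> bool:
    """True when the first level starts with '_' (reserved for the engine)."""
    parts = _levels(path)
    return bool(parts) and parts[0].startswith(RESERVED_PREFIX)
```

With the sample Namespaces, split_namespace("/everything/domain")
yields the level "everything" and the Attribute Key "domain".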
2.1.3. Attribute fields
2.1.3.1. value
The attribute value, used to store and retrieve information about an
attribute. Note that the value is not returned in the JSON object:
since it was part of the query, it is already known. The Value is
described in Section 2.3, as it is very specific and can be stored
"as is", as a hash, encoded in base64 or using any other convenient
mechanism.
The value implementation MUST offer at least: 1) Raw value 2) Base64
URL Encoded 3) SHA256 Hash
2.1.3.2. first_seen
Time in UTC of the first time this value was captured
2.1.3.3. last_seen
Time in UTC of the last time this value was captured
2.1.3.4. count
How many times this value was written
2.1.3.5. tags
Tags are defined as in MISP, using the MISP Taxonomy. Tags are
separated by the ';' character.
2.1.3.6. ttl
Time To Live: the expiration delay, in seconds, counted from the time
the Attribute was created. Once an Attribute has expired, it is moved
to the private Namespace _expired.
When an Attribute has this field set to 0, it never expires. This is
the default behavior.
When an Attribute has this field set to a number greater than 0, the
expiration status is computed only at retrieval time.
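The ttl rules can be sketched as a check performed at retrieval time.
This is an illustration only: the function name is_expired and the
created_at parameter are assumptions (the format does not say which
timestamp marks creation), as is treating the exact boundary second as
already expired.

```python
import time
from typing import Optional


def is_expired(created_at: int, ttl: int, now: Optional[int] = None) -> bool:
    """Evaluate expiration lazily, when the Attribute is retrieved."""
    if ttl < 0:
        raise ValueError("ttl must be >= 0")
    if ttl == 0:
        # Default behavior: the Attribute never expires.
        return False
    if now is None:
        now = int(time.time())
    # Assumption: the Attribute is expired once ttl seconds have elapsed.
    return now >= created_at + ttl
```

An engine following this sketch would move the Attribute to the
_expired Namespace when the check returns True.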
2.1.3.7. manifold
When a given Attribute Value is stored in different namespaces, the
manifold field keeps track of them and returns the number of
different places in which this Attribute exists. This is a simple
counter.
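As a toy illustration of the manifold counter, the following
in-memory sketch counts the namespaces that hold a value. ToyStore is
a hypothetical class, not the SightingDB engine, and it scans every
namespace rather than using the reserved _all Namespace.

```python
from collections import defaultdict
from typing import Dict, Set


class ToyStore:
    """Hypothetical in-memory store used only to illustrate 'manifold'."""

    def __init__(self) -> None:
        self._namespaces: Dict[str, Set[str]] = defaultdict(set)

    def write(self, namespace: str, value: str) -> None:
        # Writing the same value twice in one namespace counts once here.
        self._namespaces[namespace].add(value)

    def manifold(self, value: str) -> int:
        # Number of distinct namespaces in which the value exists.
        return sum(1 for values in self._namespaces.values() if value in values)
```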
2.2. SightingDB Format - One Attribute
{
"value":"127.0.0.1",
"first_seen":1530394819,
"last_seen":1572933618,
"count":578391,
"tags":"",
"ttl":0,
"manifold": 17
}
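A client could load such an object as follows. This is a sketch under
two assumptions: first_seen and last_seen are Unix epoch seconds (the
sample values are consistent with that, but the format only says "Time
in UTC"), and an empty tags string means no tags. The Sighting class
and parse_sighting function are hypothetical names.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class Sighting:
    first_seen: datetime
    last_seen: datetime
    count: int
    tags: List[str]
    ttl: int
    manifold: int


def parse_sighting(text: str) -> Sighting:
    obj = json.loads(text)
    return Sighting(
        # Assumption: timestamps are seconds since the Unix epoch, in UTC.
        first_seen=datetime.fromtimestamp(obj["first_seen"], tz=timezone.utc),
        last_seen=datetime.fromtimestamp(obj["last_seen"], tz=timezone.utc),
        count=obj["count"],
        # Tags are separated by ';'; empty entries are dropped.
        tags=[tag for tag in obj["tags"].split(";") if tag],
        ttl=obj["ttl"],
        manifold=obj["manifold"],
    )
```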
2.3. Value
The value submitted can be in multiple formats, depending on the use
case. Any implementation MUST offer three alternatives:
1) Raw value: nothing is encoded and the value is stored as is, as
shown in the One Attribute JSON example above.
2) SHA256: prevents the content from being seen (see Security
Considerations), has a fixed size and is convenient for most
requirements
3) Base64 URL: the Base64 specification is followed, except that the
characters conflicting with a URL argument are replaced
The value is configured as part of the Namespace. The private
"_config" Namespace prefix stores this value storage mechanism.
2.3.1. Configuring the value format for a Namespace
If one has the Namespace "/Organization1/BU1/ip" and wants to store
those IP addresses as SHA256 hashes, it is configured as follows: the
Namespace is kept but prefixed with "_config", and holds a JSON object
that sets the value format. "/_config/Organization1/BU1/ip"
{
"value_format":"SHA256"
}
Where "value_format" is either: "SHA256", "RAW" or "BASE64URL".
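The three value formats could be produced on the client side with a
helper like the one below. It is a sketch, not the reference
implementation: encode_value is a hypothetical name, the SHA256 form
is assumed to be a lowercase hexadecimal digest of the UTF-8 bytes,
and BASE64URL is assumed to be the URL-safe Base64 alphabet with
padding kept.

```python
import base64
import hashlib


def encode_value(value: str, value_format: str) -> str:
    """Encode a value according to the Namespace's value_format."""
    raw = value.encode("utf-8")
    if value_format == "RAW":
        return value
    if value_format == "SHA256":
        # Assumption: hexadecimal digest of the UTF-8 encoded value.
        return hashlib.sha256(raw).hexdigest()
    if value_format == "BASE64URL":
        # URL-safe alphabet: '+' and '/' are replaced by '-' and '_'.
        return base64.urlsafe_b64encode(raw).decode("ascii")
    raise ValueError(f"unknown value_format: {value_format!r}")
```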
2.4. Bulk
When data must be sent and received in large amounts, it is
preferable to embed all the objects in a single JSON document. For
both reading and writing, the format is the following:
{
"items": [
{ "/your/namespace": "127.0.0.1" },
{ "/your/other/namespace": "110812f67fa1e1f0117f6f3d70241c1a42a7b07711a93c2477cc516d9042f9db" }
]
}
This will either store or retrieve the requested data.
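A bulk body of this shape could be assembled from (namespace, value)
pairs as sketched below. build_bulk_request is a hypothetical helper
name; how the body is sent to a connector is outside the scope of this
format and is not shown.

```python
import json
from typing import Iterable, Tuple


def build_bulk_request(pairs: Iterable[Tuple[str, str]]) -> str:
    """Serialize (namespace, value) pairs into the bulk JSON format."""
    # Each item is a single-key object mapping the Namespace to the value.
    items = [{namespace: value} for namespace, value in pairs]
    return json.dumps({"items": items})
```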
2.4.1. Response
When retrieving sightings, the response also contains a list of
items, one per requested item and in the same order:
{
"items": [
{ "first_seen":1530337182, "last_seen":1573110615, "count":93021, "tags":"", "ttl":0, "manifold": 1 },
{ "first_seen":1562930418, "last_seen":1573110404, "count":1020492, "tags":"", "ttl":8912, "manifold": 3 }
]
}
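Because the response items do not repeat the value, a client has to
match them to the request by position. The sketch below assumes the
response holds exactly one item per request item; pair_results is a
hypothetical helper name.

```python
from typing import Any, Dict, List, Tuple


def pair_results(
    request: Dict[str, Any], response: Dict[str, Any]
) -> List[Tuple[str, str, Dict[str, Any]]]:
    """Return (namespace, value, sighting) triples matched by position."""
    req_items = request["items"]
    resp_items = response["items"]
    if len(req_items) != len(resp_items):
        raise ValueError("request and response item counts differ")
    paired = []
    for req, resp in zip(req_items, resp_items):
        # Each request item is a single-key object: {namespace: value}.
        [(namespace, value)] = req.items()
        paired.append((namespace, value, resp))
    return paired
```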
3. Security Considerations
While this document focuses solely on the format, the reference
implementation is SightingDB. Authentication and data access control
are not handled by SightingDB. A value can leak if access is too
permissive.
Even a hashed value can be discovered, since hashing known candidate
values and comparing the results would produce a match.
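The weakness of hashing small value spaces can be shown with a short
sketch: every address in a candidate network is hashed and compared
with the stored digest. recover_ipv4 is a hypothetical name, and the
hexadecimal SHA256 digest of the dotted-quad string is an assumption
about how values are hashed.

```python
import hashlib
import ipaddress
from typing import Optional


def recover_ipv4(digest_hex: str, network: str) -> Optional[str]:
    """Try each address in 'network' until one hashes to digest_hex."""
    for addr in ipaddress.ip_network(network):
        candidate = str(addr)
        if hashlib.sha256(candidate.encode("ascii")).hexdigest() == digest_hex:
            return candidate
    return None
```

Since the whole IPv4 space is small enough to enumerate, SHA256 should
be seen as a way to avoid storing values in clear text, not as a
substitute for access control.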
4. Acknowledgements
The author wishes to thank the whole MISP community, which supports
the creation of open standards in threat intelligence sharing, and
is grateful for the amazing feedback gathered during the MISP Summit
2019 in Luxembourg, in particular from Alexandre Dulaunoy and Andras
Iklody.
5. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>.
[RFC8259] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
Interchange Format", STD 90, RFC 8259,
DOI 10.17487/RFC8259, December 2017,
<https://www.rfc-editor.org/info/rfc8259>.
Author's Address
Sebastien Tricaud
Devo Inc.
150 Cambridgepark Drive
Cambridge, MA 02140
USA
Phone: +1 866-221-2254
Email: sebastien.tricaud@devo.com