cti-python-stix2/docs/guide/patterns.ipynb

510 lines
17 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# STIX2 Patterns\n",
"\n",
"The Python ``stix2`` library supports STIX 2 patterning insofar that patterns may be used for the pattern property of Indicators, identical to the STIX 2 specification. ``stix2`` does not evaluate patterns against STIX 2 content; for that functionality see [cti-pattern-matcher](https://github.com/oasis-open/cti-pattern-matcher).\n",
"\n",
"Patterns in the ``stix2`` library are built compositely from the bottom up, creating subcomponent expressions first before those at higher levels.\n",
"\n",
"## API Tips\n",
"\n",
"### ObservationExpression\n",
"\n",
"Within the STIX 2 Patterning specification, Observation Expressions denote a complete expression to be evaluated against a discrete observation. In other words, an Observation Expression must be created to apply to a single Observation instance. This is further made clear by the visual brackets(```[]```) that encapsulate an Observation Expression. Thus, whatever sub expressions that are within the Observation Expression are meant to be matched against the same Observable instance.\n",
"\n",
"This requirement manifests itself within the ``stix2`` library via ```ObservationExpression```. When creating STIX 2 observation expressions, whenever the current expression is complete, wrap it with ```ObservationExpression()```. This allows the complete pattern expression - no matter its complexity - to be rendered as a proper specification-adhering string. __*Note: When pattern expressions are added to Indicator objects, the expression objects are implicitly converted to string representations*__. While the extra step may seem tedious in the construction of simple pattern expressions, this explicit marking of observation expressions becomes vital when converting the pattern expressions to strings. \n",
"\n",
"In all the examples, you can observe how in the process of building pattern expressions, when an Observation Expression is completed, it is wrapped with ```ObservationExpression()```.\n",
"\n",
"### ParentheticalExpression\n",
"\n",
"Do not be confused by the ```ParentheticalExpression``` object. It is not a distinct expression type but is also used to properly craft pattern expressions by denoting order priority and grouping of expression components. Use it in a similar manner as ```ObservationExpression```, wrapping completed subcomponent expressions with ```ParentheticalExpression()``` if explicit ordering is required. For usage examples with ```ParentheticalExpression```'s, see [here](#Compound-Observation-Expressions).\n",
"\n",
"### BooleanExpressions vs CompoundObservationExpressions\n",
"\n",
"Be careful to note the difference between these two very similar pattern components. \n",
"\n",
"__BooleanExpressions__\n",
"\n",
" - [AndBooleanExpression](../api/stix2.patterns.rst#stix2.patterns.AndBooleanExpression)\n",
" - [OrbooleanExpression](../api/stix2.patterns.rst#stix2.patterns.OrBooleanExpression)\n",
" \n",
" __Usage__: When the boolean sub-expressions refer to the *same* root object \n",
"\n",
" __Example__:\n",
" ```[domain-name:value = \"www.5z8.info\" AND domain-name:resolvess_to_refs[*].value = \"'198.51.100.1/32'\"]```\n",
" \n",
" __Rendering__: when pattern is rendered, brackets or parenthesis will encapsulate boolean expression\n",
" \n",
"__CompoundObservationExpressions__\n",
"\n",
" - [AndObservationExpression](../api/stix2.patterns.rst#stix2.patterns.AndObservationExpression)\n",
" - [OrObservationExpression](../api/stix2.patterns.rst#stix2.patterns.OrObservationExpression)\n",
" \n",
" __Usage__: When the boolean sub-expressions refer to *different* root objects\n",
"\n",
" __Example__:\n",
" ```[file:name=\"foo.dll\"] AND [process:name = \"procfoo\"]```\n",
" \n",
" __Rendering__: when pattern is rendered, brackets will encapsulate each boolean sub-expression\n",
"\n",
"\n",
"\n",
"## Examples\n",
"\n",
"### Comparison Expressions"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"from stix2 import DomainName, File, IPv4Address\n",
"from stix2 import (ObjectPath, EqualityComparisonExpression, ObservationExpression,\n",
" GreaterThanComparisonExpression, IsSubsetComparisonExpression,\n",
" FloatConstant, StringConstant)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Equality Comparison expressions"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\t[domain-name:value = 'site.of.interest.zaz']\n",
"\n",
"\t[file:parent_directory_ref.path = 'C:\\\\Windows\\\\System32']\n",
"\n"
]
}
],
"source": [
"lhs = ObjectPath(\"domain-name\", [\"value\"])\n",
"ece_1 = ObservationExpression(EqualityComparisonExpression(lhs, \"site.of.interest.zaz\"))\n",
"print(\"\\t{}\\n\".format(ece_1))\n",
"\n",
"lhs = ObjectPath(\"file\", [\"parent_directory_ref\",\"path\"])\n",
"ece_2 = ObservationExpression(EqualityComparisonExpression(lhs, \"C:\\\\Windows\\\\System32\"))\n",
"print(\"\\t{}\\n\".format(ece_2))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Greater-than Comparison expressions"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\t[file:extensions.windows-pebinary-ext.sections[*].entropy > 7.0]\n",
"\n"
]
}
],
"source": [
"lhs = ObjectPath(\"file\", [\"extensions\", \"windows-pebinary-ext\", \"sections[*]\", \"entropy\"])\n",
"gte = ObservationExpression(GreaterThanComparisonExpression(lhs, FloatConstant(\"7.0\")))\n",
"print(\"\\t{}\\n\".format(gte))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### IsSubset Comparison expressions"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\t[network-traffic:dst_ref.value ISSUBSET '2001:0db8:dead:beef:0000:0000:0000:0000/64']\n",
"\n"
]
}
],
"source": [
"lhs = ObjectPath(\"network-traffic\", [\"dst_ref\", \"value\"])\n",
"iss = ObservationExpression(IsSubsetComparisonExpression(lhs, StringConstant(\"2001:0db8:dead:beef:0000:0000:0000:0000/64\")))\n",
"print(\"\\t{}\\n\".format(iss))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Compound Observation Expressions"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from stix2 import (IntegerConstant, HashConstant, ObjectPath,\n",
" EqualityComparisonExpression, AndBooleanExpression,\n",
" OrBooleanExpression, ParentheticalExpression,\n",
" AndObservationExpression, OrObservationExpression,\n",
" FollowedByObservationExpression, ObservationExpression)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### AND boolean"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(AND)\n",
"[email-message:sender_ref.value = 'stark@example.com' AND email-message:subject = 'Conference Info']\n",
"\n"
]
}
],
"source": [
"ece3 = EqualityComparisonExpression(ObjectPath(\"email-message\", [\"sender_ref\", \"value\"]), \"stark@example.com\")\n",
"ece4 = EqualityComparisonExpression(ObjectPath(\"email-message\", [\"subject\"]), \"Conference Info\")\n",
"abe = ObservationExpression(AndBooleanExpression([ece3, ece4]))\n",
"print(\"(AND)\\n{}\\n\".format(abe))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### OR boolean"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(OR)\n",
"[url:value = 'http://example.com/foo' OR url:value = 'http://example.com/bar']\n",
"\n"
]
}
],
"source": [
"ece5 = EqualityComparisonExpression(ObjectPath(\"url\", [\"value\"]), \"http://example.com/foo\")\n",
"ece6 = EqualityComparisonExpression(ObjectPath(\"url\", [\"value\"]), \"http://example.com/bar\")\n",
"obe = ObservationExpression(OrBooleanExpression([ece5, ece6]))\n",
"print(\"(OR)\\n{}\\n\".format(obe))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### ( OR ) AND boolean"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(OR,AND)\n",
"[(file:name = 'pdf.exe' OR file:size = 371712) AND file:created = 2014-01-13 07:03:17+00:00]\n",
"\n"
]
}
],
"source": [
"ece7 = EqualityComparisonExpression(ObjectPath(\"file\", [\"name\"]), \"pdf.exe\")\n",
"ece8 = EqualityComparisonExpression(ObjectPath(\"file\", [\"size\"]), IntegerConstant(\"371712\"))\n",
"ece9 = EqualityComparisonExpression(ObjectPath(\"file\", [\"created\"]), \"2014-01-13T07:03:17Z\")\n",
"obe1 = OrBooleanExpression([ece7, ece8])\n",
"pobe = ParentheticalExpression(obe1)\n",
"abe1 = ObservationExpression(AndBooleanExpression([pobe, ece9]))\n",
"print(\"(OR,AND)\\n{}\\n\".format(abe1))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### ( AND ) OR ( OR ) observation"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(AND,OR,OR)\n",
"([file:name = 'foo.dll'] AND [win-registry-key:key = 'HKEY_LOCAL_MACHINE\\\\foo\\\\bar']) OR [process:name = 'fooproc' OR process:name = 'procfoo']\n",
"\n"
]
}
],
"source": [
"ece20 = ObservationExpression(EqualityComparisonExpression(ObjectPath(\"file\", [\"name\"]), \"foo.dll\"))\n",
"ece21 = ObservationExpression(EqualityComparisonExpression(ObjectPath(\"win-registry-key\", [\"key\"]), \"HKEY_LOCAL_MACHINE\\\\foo\\\\bar\"))\n",
"ece22 = EqualityComparisonExpression(ObjectPath(\"process\", [\"name\"]), \"fooproc\")\n",
"ece23 = EqualityComparisonExpression(ObjectPath(\"process\", [\"name\"]), \"procfoo\")\n",
"# NOTE: we need to use AND/OR observation expression instead of just boolean \n",
"# expressions as the operands are not on the same object-type\n",
"aoe = ParentheticalExpression(AndObservationExpression([ece20, ece21]))\n",
"obe2 = ObservationExpression(OrBooleanExpression([ece22, ece23]))\n",
"ooe = OrObservationExpression([aoe, obe2])\n",
"print(\"(AND,OR,OR)\\n{}\\n\".format(ooe))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### FOLLOWED-BY"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(FollowedBy)\n",
"[file:hashes.MD5 = '79054025255fb1a26e4bc422aef54eb4'] FOLLOWEDBY [win-registry-key:key = 'HKEY_LOCAL_MACHINE\\\\foo\\\\bar']\n",
"\n"
]
}
],
"source": [
"ece10 = ObservationExpression(EqualityComparisonExpression(ObjectPath(\"file\", [\"hashes\", \"MD5\"]), HashConstant(\"79054025255fb1a26e4bc422aef54eb4\", \"MD5\")))\n",
"ece11 = ObservationExpression(EqualityComparisonExpression(ObjectPath(\"win-registry-key\", [\"key\"]), \"HKEY_LOCAL_MACHINE\\\\foo\\\\bar\"))\n",
"fbe = FollowedByObservationExpression([ece10, ece11])\n",
"print(\"(FollowedBy)\\n{}\\n\".format(fbe))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Qualified Observation Expressions"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"from stix2 import (TimestampConstant, HashConstant, ObjectPath, EqualityComparisonExpression,\n",
" AndBooleanExpression, WithinQualifier, RepeatQualifier, StartStopQualifier,\n",
" QualifiedObservationExpression, FollowedByObservationExpression,\n",
" ParentheticalExpression, ObservationExpression)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### WITHIN"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(WITHIN)\n",
"([file:hashes.MD5 = '79054025255fb1a26e4bc422aef54eb4'] FOLLOWEDBY [win-registry-key:key = 'HKEY_LOCAL_MACHINE\\\\foo\\\\bar']) WITHIN 300 SECONDS\n",
"\n"
]
}
],
"source": [
"ece10 = ObservationExpression(EqualityComparisonExpression(ObjectPath(\"file\", [\"hashes\", \"MD5\"]), HashConstant(\"79054025255fb1a26e4bc422aef54eb4\", \"MD5\")))\n",
"ece11 = ObservationExpression(EqualityComparisonExpression(ObjectPath(\"win-registry-key\", [\"key\"]), \"HKEY_LOCAL_MACHINE\\\\foo\\\\bar\"))\n",
"fbe = FollowedByObservationExpression([ece10, ece11])\n",
"par = ParentheticalExpression(fbe)\n",
"qoe = QualifiedObservationExpression(par, WithinQualifier(300))\n",
"print(\"(WITHIN)\\n{}\\n\".format(qoe))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### REPEATS, WITHIN"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(REPEAT, WITHIN)\n",
"[network-traffic:dst_ref.type = 'domain-name' AND network-traffic:dst_ref.value = 'example.com'] REPEATS 5 TIMES WITHIN 180 SECONDS\n",
"\n"
]
}
],
"source": [
"ece12 = EqualityComparisonExpression(ObjectPath(\"network-traffic\", [\"dst_ref\", \"type\"]), \"domain-name\")\n",
"ece13 = EqualityComparisonExpression(ObjectPath(\"network-traffic\", [\"dst_ref\", \"value\"]), \"example.com\")\n",
"abe2 = ObservationExpression(AndBooleanExpression([ece12, ece13]))\n",
"qoe1 = QualifiedObservationExpression(QualifiedObservationExpression(abe2, RepeatQualifier(5)), WithinQualifier(180))\n",
"print(\"(REPEAT, WITHIN)\\n{}\\n\".format(qoe1))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### START, STOP"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(START-STOP)\n",
"[file:name = 'foo.dll'] START t'2016-06-01T00:00:00Z' STOP t'2016-07-01T00:00:00Z'\n",
"\n"
]
}
],
"source": [
"ece14 = ObservationExpression(EqualityComparisonExpression(ObjectPath(\"file\", [\"name\"]), \"foo.dll\"))\n",
"ssq = StartStopQualifier(TimestampConstant('2016-06-01T00:00:00Z'), TimestampConstant('2016-07-01T00:00:00Z'))\n",
"qoe2 = QualifiedObservationExpression(ece14, ssq)\n",
"print(\"(START-STOP)\\n{}\\n\".format(qoe2))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Attaching patterns to STIX2 Domain objects\n",
"\n",
"\n",
"### Example"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\n",
" \"type\": \"indicator\",\n",
" \"id\": \"indicator--219bc5fc-fdbf-4b54-a2fc-921be7ab3acb\",\n",
" \"created\": \"2018-08-29T23:58:00.548Z\",\n",
" \"modified\": \"2018-08-29T23:58:00.548Z\",\n",
" \"name\": \"Cryptotorch\",\n",
" \"pattern\": \"[file:name = '$$t00rzch$$.elf']\",\n",
" \"valid_from\": \"2018-08-29T23:58:00.548391Z\",\n",
" \"labels\": [\n",
" \"malware\",\n",
" \"ransomware\"\n",
" ]\n",
"}\n"
]
}
],
"source": [
"from stix2 import Indicator, EqualityComparisonExpression, ObservationExpression\n",
"\n",
"ece14 = ObservationExpression(EqualityComparisonExpression(ObjectPath(\"file\", [\"name\"]), \"$$t00rzch$$.elf\"))\n",
"ind = Indicator(name=\"Cryptotorch\", labels=[\"malware\", \"ransomware\"], pattern=ece14)\n",
"print(ind)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}