Welcome to Firepit’s documentation!

Firepit - STIX Columnar Storage


Columnar storage for STIX 2.0 observations.

Features

  • Transforms STIX Observation SDOs to a columnar format

  • Inserts those transformed observations into SQL (currently sqlite3 and PostgreSQL)

Motivation

STIX 2.0 JSON is a graph-like data format. There aren’t many popular tools for working with graph-like data, but there are numerous tools for working with data from SQL databases. Firepit attempts to make those tools usable with STIX data obtained from stix-shifter.

Firepit also supports STIX 2.1.

Firepit is primarily designed for use with the Kestrel Threat Hunting Language.

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.

Installation

Stable release

To install firepit, run this command in your terminal:

$ pip install firepit

This is the preferred method to install firepit, as it will always install the most recent stable release.

If you don’t have pip installed, this Python installation guide can guide you through the process.

From sources

The sources for firepit can be downloaded from the Github repo.

You can either clone the public repository:

$ git clone git://github.com/opencybersecurityalliance/firepit

Or download the tarball:

$ curl -OJL https://github.com/opencybersecurityalliance/firepit/tarball/master

Once you have a copy of the source, you can install it with:

$ python setup.py install

Usage

As a package

To use firepit in a project:

from firepit import get_storage

db = get_storage('observations.db', session_id)
db.cache('my_query_id', ['bundle1.json', 'bundle2.json'])

After caching your STIX bundles, your database will contain the data.

Passing a file path to get_storage will use sqlite3. Passing in a PostgreSQL connection URI (e.g. postgresql://…) will instead attempt to connect to the PostgreSQL instance specified.

As a tool

You can use the firepit command line tool to ingest and query your data.

To make things easier, you can set a pair of environment variables:

export FIREPITDB=my_dbname
export FIREPITID=my_session_id

FIREPITDB is your DB filename (sqlite3) or connection URI (PostgreSQL). FIREPITID is a “session” ID you can use to keep your data organized.

$ firepit --help
Usage: firepit [OPTIONS] COMMAND [ARGS]...

  Columnar storage for STIX observations

Options:
  --dbname TEXT   Path/name of database
  --session TEXT  Session ID for data separation  [default: test-id]
  --help          Show this message and exit.

Commands:
  assign           Perform an operation on a column and name the result
  cache            Cache STIX observation data in SQL
  columns          Get the columns names of a view/table
  count            Get the count of rows in a view/table
  delete           Delete STIX observation data in SQL
  extract          Create a view of a subset of cached data
  filter           Create a filtered view of a subset of cached data
  get-appdata      Get the app-specific data for a view
  join             Join two views
  load             Cache STIX observation data in SQL
  lookup           Retrieve a view
  merge            Merge 2 or more views into a new view
  number-observed  Retrieve the count of values of a column from a view
  reassign         Update/replace STIX observation data in SQL
  remove           Remove a view
  rename           Rename a view
  schema           Get the schema of a view/table
  set-appdata      Set the app-specific data for a view
  sql              Run a SQL statement on the database [DANGEROUS!]
  summary          Retrieve timeframe and count from a view
  tables           Get all view/table names
  timestamped      Retrieve the timestamped values of a column from a view
  type             Get the SCO type of a view/table
  value-counts     Retrieve the value counts of a column from a view
  values           Retrieve the values of a STIX object path from a view
  viewdata         Get view data for views [default is all views]
  views            Get all view names

$ firepit cache --help
Usage: firepit cache [OPTIONS] QUERY_ID FILENAMES...

  Cache STIX observation data in SQL

Arguments:
  QUERY_ID      An identifier for this set of data  [required]
  FILENAMES...  STIX bundle files of query results  [required]

Options:
  --help  Show this message and exit.

splint

Firepit also includes a utility called splint. This tool provides multiple commands for manipulating STIX 2.0 bundles (2.1 is not yet supported), including randomizing IDs, shifting timestamps, and converting other log formats to STIX observations.

The convert command currently supports the JSON format at https://securitydatasets.com as well as Zeek conn.log and dns.log.

$ splint
Usage: splint [OPTIONS] COMMAND [ARGS]...

  STIX processing and linting

Options:
  --help  Show this message and exit.

Commands:
  convert        Convert various log files to STIX 2.0 bundles
  dedup-ids      Replace duplicate IDs with random IDs
  limit          Truncate STIX bundle
  randomize-ids  Randomize STIX observation IDs in a bundle
  timeshift      Timeshift STIX observations in a bundle
  upgrade        Upgrade a STIX 2.0 bundle to 2.1

Database

Supported Databases

Firepit supports sqlite3 and PostgreSQL.

Database Tables

STIX observation data is inserted into multiple tables within a “session” (a database file for sqlite3 and a “schema” in PostgreSQL). A table is created for each object type. Since STIX data is a graph (i.e. nodes and edges), Firepit also creates some special “edge” tables:

  • __contains: tracks which SCOs were contained in which observed-data SDOs

  • __reflist: models 1:N reference lists like process:opened_connection_refs

  • __queries: records which objects were inserted in which cache operations

  • __symtable: records the name and type of “views” created by firepit calls

These tables are prefixed with __ and considered “private” by firepit.

The STIX id property is used as the unique key for each table.

The observed-data Table

This table contains the actual STIX Observed Data SDO that “conveys information about cyber security related entities such as files, systems, and networks using the STIX Cyber-observable Objects (SCOs).” [STIX-v2_1]

This SDO (and therefore table) holds the timestamps and count of actual observations, whereas SCOs (and their firepit tables) only contain the properties (columns) of their respective object types.

The examples below show how to link observed-data with SCOs via the “private” __contains table.

SCO Tables

Each SCO table (ipv4-addr, network-traffic, file, etc.) contains the properties present from the cached bundles. Firepit does not require any specific properties (though STIX does). Columns are only created for properties found.

For example, the network-traffic table should have a src_ref property (a reference to an object in either the ipv4-addr or ipv6-addr table) representing the connection’s source address, along with dst_ref, src_port, dst_port, and protocols. The port properties are simple integers and are stored in integer columns. The protocols property is a list of strings; it’s stored as a JSON-encoded string.
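As a sketch of that storage convention (a toy schema for illustration, not firepit’s actual table definitions), a list-valued property can be round-tripped through a TEXT column as a JSON string:

```python
import json
import sqlite3

# Toy illustration: store a list-valued STIX property such as
# network-traffic:protocols as a JSON-encoded string column.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "network-traffic" '
    "(id TEXT PRIMARY KEY, src_port INTEGER, dst_port INTEGER, protocols TEXT)"
)
conn.execute(
    'INSERT INTO "network-traffic" VALUES (?, ?, ?, ?)',
    ("network-traffic--0001", 60842, 47413, json.dumps(["ipv4", "tcp"])),
)

# Reading the row back, the list is recovered by decoding the JSON string.
row = conn.execute('SELECT protocols FROM "network-traffic"').fetchone()
protocols = json.loads(row[0])
print(protocols)  # ['ipv4', 'tcp']
```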

STIX Object Paths

STIX object paths (e.g. network-traffic:src_ref.value) are a key part of STIX patterning, which (from Firepit’s perspective) is equivalent to a WHERE clause. They can contain implicit JOINs: network-traffic is a table, src_ref is the id property for an ipv4-addr (or ipv6-addr) which is the unique key for that table. value is a column in that referenced table.

Firepit operations will (in most cases) accept STIX object paths and create the required JOIN.
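The idea of turning a reference in a STIX object path into a JOIN can be sketched as follows. This is a hypothetical illustration, not firepit’s implementation: the referenced table is hard-coded here, whereas firepit infers it (e.g. src_ref resolves to ipv4-addr or ipv6-addr).

```python
def path_to_sql(path: str) -> str:
    """Translate e.g. 'network-traffic:src_ref.value' into a SELECT with a JOIN."""
    table, _, prop = path.partition(":")
    if "_ref." in prop:
        ref, _, column = prop.partition(".")
        # Assumed target type for this sketch; firepit infers the real one.
        target = "ipv4-addr"
        return (
            f'SELECT t.{column} FROM "{table}" AS s '
            f'JOIN "{target}" AS t ON s.{ref} = t.id'
        )
    return f'SELECT {prop} FROM "{table}"'

print(path_to_sql("network-traffic:src_ref.value"))
```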

Example SQL queries

Full Network Traffic Information

The network-traffic SCO only contains references to the source and destination addresses. To see the actual addresses, you need to join the ipv4-addr table:

sqlite> select
   ...>   src.value as "src_ref.value",
   ...>   nt.src_port as "src_port",
   ...>   dst.value as "dst_ref.value",
   ...>   nt.dst_port as "dst_port",
   ...>   nt.protocols
   ...> from
   ...>   "network-traffic" as nt
   ...>   join "ipv4-addr" as src on nt.src_ref = src.id
   ...>   join "ipv4-addr" as dst on nt.dst_ref = dst.id
   ...> ;
src_ref.value  src_port    dst_ref.value  dst_port    protocols
-------------  ----------  -------------  ----------  ----------
192.168.1.156  60842       192.168.1.1    47413       ["tcp"]
127.0.0.1      60843       127.0.0.1      5357        ["tcp"]

The firepit CLI makes this easier; for example, using the lookup command:

$ firepit lookup network-traffic --columns src_ref.value,src_port,dst_ref.value,dst_port,protocols
src_ref.value      src_port  dst_ref.value      dst_port  protocols
---------------  ----------  ---------------  ----------  -----------
192.168.1.156         60842  192.168.1.1           47413  ["tcp"]
127.0.0.1             60843  127.0.0.1              5357  ["tcp"]

Most CLI commands have an API function of the same name in the SqlStorage class.

Timestamped SCOs

To see the first 3 IP addresses observed, join the special __contains and observed-data tables:

sqlite> select obs.first_observed as time, sco.value as 'IP'
   ...>  from "ipv4-addr" as sco
   ...>  join "__contains" as c on sco.id = c.target_ref
   ...>  join "observed-data" as obs on c.source_ref = obs.id
   ...>  order by time limit 3;
time                      IP
------------------------  -------------
2019-11-16T12:55:28.101Z  192.168.1.156
2019-11-16T12:55:28.101Z  192.168.1.1
2019-11-16T12:55:28.883Z  127.0.0.1

This is effectively equivalent to the CLI’s timestamped command or the API’s timestamped function:

$ firepit timestamped ipv4-addr value | head -5
first_observed            value
------------------------  ---------------
2019-11-16T12:55:28.101Z  192.168.1.156
2019-11-16T12:55:28.101Z  192.168.1.1
2019-11-16T12:55:28.883Z  127.0.0.1

Value counts

To get a count of observations of each IP address (the sqlite3 CLI truncates the value column):

sqlite> select sco.value, count(*) from "ipv4-addr" as sco
   ...>  join "__contains" as c on sco.id = c.target_ref
   ...>  join "observed-data" as obs on c.source_ref = obs.id
   ...>  group by sco.value;
value       count(*)
----------  ----------
127.0.0.1   413
172.16.0.1  33
172.16.0.1  7
172.16.0.1  8
172.16.0.1  24
172.16.0.2  13
192.168.1.  166
192.168.1.  138
192.168.1.  1
192.168.1.  3
192.168.1.  4
192.168.17  8
192.168.17  1
192.168.17  4
192.168.23  10
192.168.23  2
192.168.23  1
192.168.23  4

Again, this operation is provided by the CLI’s value-counts command or the API’s value_counts function:

$ firepit value-counts ipv4-addr value
value              count
---------------  -------
127.0.0.1            413
172.16.0.100          33
172.16.0.101           7
172.16.0.104           8
172.16.0.112          24
172.16.0.255          13
192.168.1.1          166
192.168.1.156        138
192.168.1.163          1
192.168.1.169          3
192.168.1.255          4
192.168.175.1          8
192.168.175.254        1
192.168.175.255        4
192.168.232.1         10
192.168.232.2          2
192.168.232.254        1
192.168.232.255        4
[STIX-v2_1]

STIX Version 2.1. Edited by Bret Jordan, Rich Piazza, and Trey Darley. 10 June 2021. OASIS Standard. https://docs.oasis-open.org/cti/stix/v2.1/os/stix-v2.1-os.html. Latest stage: https://docs.oasis-open.org/cti/stix/v2.1/stix-v2.1.html.

firepit

firepit package

Subpackages

Submodules

firepit.deref module

firepit.deref.auto_deref(store, view, ignore=None, paths=None)[source]

Automatically resolve refs for backward compatibility.

If paths is specified, only follow/deref those specific paths/properties.

Use auto_deref_cached if you already have col_dict in memory.

firepit.deref.auto_deref_cached(view, cols, col_dict, ignore=None, paths=None)[source]

Automatically resolve refs for backward compatibility.

If paths is specified, only follow/deref those specific paths/properties.

firepit.deref.unresolve(objects)[source]

Do the opposite of auto_deref: split out reference objects

firepit.exceptions module

exception firepit.exceptions.DatabaseMismatch(dbversion, expected)[source]

Bases: Exception

exception firepit.exceptions.DuplicateTable[source]

Bases: Exception

exception firepit.exceptions.IncompatibleType[source]

Bases: Exception

exception firepit.exceptions.InvalidAttr(msg)[source]

Bases: Exception

exception firepit.exceptions.InvalidObject(msg)[source]

Bases: Exception

exception firepit.exceptions.InvalidStixPath[source]

Bases: Exception

exception firepit.exceptions.InvalidViewname[source]

Bases: Exception

exception firepit.exceptions.SessionExists[source]

Bases: Exception

exception firepit.exceptions.SessionNotFound[source]

Bases: Exception

exception firepit.exceptions.StixPatternError(stix)[source]

Bases: Exception

exception firepit.exceptions.UnexpectedError[source]

Bases: Exception

exception firepit.exceptions.UnknownViewname[source]

Bases: Exception

firepit.generator module

firepit.pgcommon module

Some common PostgreSQL stuff used by both pgstorage.py (the normal interface) and aio.asyncpgstorage.py (the async interface).

firepit.pgcommon.pg_shorten(key)[source]

firepit.pgstorage module

firepit.props module

Utility functions for STIX properties

firepit.props.auto_agg(sco_type, prop, col_type)[source]

Infer an aggregation function based on column name and type

firepit.props.auto_agg_tuple(sco_type, prop, col_type)[source]

Infer an aggregation function based on column name and type

firepit.props.get_last(prop)[source]
firepit.props.is_ref(name)[source]
firepit.props.parse_path(path)[source]
firepit.props.parse_prop(sco_type, prop)[source]
firepit.props.path_metadata(path)[source]

Get metadata for a STIX object path

firepit.props.primary_prop(sco_type)[source]

Returns the “primary” property name for each SCO type

firepit.props.prop_metadata(sco_type, prop)[source]

Get metadata for a STIX object property

firepit.props.ref_type(sco_type, part)[source]

Get STIX SCO type for reference prop part

firepit.query module

Utilities for generating SQL while avoiding SQL injection vulns

class firepit.query.Aggregation(aggs)[source]

Bases: object

Aggregate rows

render(_placeholder, _dialect=None)[source]
class firepit.query.BinnedColumn(prop: str, n: int, unit: Optional[str] = None, table: Optional[str] = None, alias: Optional[str] = None)[source]

Bases: Column

Bin (or “bucket”) column values, presumably for easier grouping

render(_placeholder, dialect=None)[source]
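The general idea of binning can be sketched in plain Python (an illustration of the concept, not firepit’s implementation): round each timestamp down to the start of its N-minute bucket so rows can be grouped by bucket.

```python
from datetime import datetime, timezone

def bin_timestamp(ts: datetime, minutes: int) -> datetime:
    """Round a timestamp down to the start of its N-minute bucket."""
    total = ts.hour * 60 + ts.minute
    binned = (total // minutes) * minutes
    return ts.replace(hour=binned // 60, minute=binned % 60,
                      second=0, microsecond=0)

t = datetime(2019, 11, 16, 12, 55, 28, tzinfo=timezone.utc)
print(bin_timestamp(t, 15))  # 2019-11-16 12:45:00+00:00
```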
class firepit.query.CoalescedColumn(names, alias)[source]

Bases: object

First non-null column from a list - used after a JOIN

class firepit.query.Column(name, table=None, alias=None)[source]

Bases: object

SQL Column name

endswith(s)[source]
class firepit.query.Count[source]

Bases: object

Count the rows in a result set

render(_placeholder, _dialect=None)[source]
class firepit.query.CountUnique(cols=None)[source]

Bases: object

Unique count of the rows in a result set

render(_placeholder, _dialect=None)[source]
class firepit.query.Filter(preds, op=' AND ')[source]

Bases: object

Alternative SQL WHERE clause

AND = ' AND '
OR = ' OR '
render(placeholder, _dialect=None)[source]
set_table(table)[source]

Specify table for ALL Predicates in Filter

class firepit.query.Group(cols)[source]

Bases: object

SQL GROUP clause

render(_placeholder, _dialect=None)[source]
exception firepit.query.InvalidAggregateFunction[source]

Bases: Exception

exception firepit.query.InvalidComparisonOperator[source]

Bases: Exception

exception firepit.query.InvalidJoinOperator[source]

Bases: Exception

exception firepit.query.InvalidPredicateOperand[source]

Bases: Exception

exception firepit.query.InvalidPredicateOperator[source]

Bases: Exception

exception firepit.query.InvalidQuery[source]

Bases: Exception

class firepit.query.Join(name, left_col=None, op=None, right_col=None, preds=None, how='INNER', alias=None, lhs=None)[source]

Bases: object

Join 2 tables

render(placeholder, _dialect=None)[source]
class firepit.query.Limit(num)[source]

Bases: object

SQL row count

render(_placeholder, _dialect=None)[source]
class firepit.query.Offset(num)[source]

Bases: object

SQL row offset

render(_placeholder, _dialect=None)[source]
class firepit.query.Order(cols)[source]

Bases: object

SQL ORDER BY clause

ASC = 'ASC'
DESC = 'DESC'
render(_placeholder, _dialect=None)[source]
class firepit.query.Predicate(lhs, op, rhs)[source]

Bases: object

Row value predicate

render(placeholder, _dialect=None)[source]
set_table(table)[source]

Specify table for ALL columns in Predicate

class firepit.query.Projection(cols)[source]

Bases: object

SQL SELECT (really projection - pick column subset) clause

render(placeholder, dialect=None)[source]
class firepit.query.Query(arg=None)[source]

Bases: object

SQL Query statement

SQL order of evaluation: FROM (including JOINs), WHERE, GROUP BY, HAVING, WINDOW functions, SELECT (projection), DISTINCT, UNION, ORDER BY, LIMIT and OFFSET

append(stage)[source]
extend(stages)[source]
render(placeholder, dialect=None)[source]
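That evaluation order can be seen in an ordinary SQL statement (stdlib sqlite3 is used here purely for illustration): the clauses are written SELECT-first but evaluated FROM, then WHERE, then GROUP BY, then SELECT, then ORDER BY and LIMIT, which is why the aggregate alias n is usable in ORDER BY but not in WHERE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (ip TEXT, port INTEGER)")
conn.executemany(
    "INSERT INTO obs VALUES (?, ?)",
    [("127.0.0.1", 5357), ("127.0.0.1", 5357), ("192.168.1.1", 47413)],
)

# GROUP BY runs before the projection; ORDER BY and LIMIT run last.
rows = conn.execute(
    "SELECT ip, COUNT(*) AS n FROM obs "
    "WHERE port > 0 GROUP BY ip ORDER BY n DESC LIMIT 1"
).fetchall()
print(rows)  # [('127.0.0.1', 2)]
```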
class firepit.query.Table(name)[source]

Bases: object

SQL Table selection

render(_placeholder, _dialect=None)[source]
class firepit.query.Unique[source]

Bases: object

Reduce the rows in a result set to unique tuples

render(placeholder, dialect=None)[source]

firepit.schemas module

firepit.sqlitestorage module

class firepit.sqlitestorage.SQLiteStorage(dbname)[source]

Bases: SqlStorage

columns(viewname)[source]

Get the column names (properties) of viewname

delete()[source]

Delete ALL data in this store

schema(viewname=None)[source]

Get the schema (names and types) of table/view viewname or all tables if not specified

tables()[source]

Get all table names

types(private=False)[source]

Get all table names that correspond to SCO types

firepit.sqlitestorage.get_storage(path)[source]
firepit.sqlitestorage.row_factory(cursor, row)[source]

firepit.sqlstorage module

class firepit.sqlstorage.SqlStorage[source]

Bases: object

assign(viewname, on, op=None, by=None, ascending=True, limit=None)[source]

DEPRECATED: Perform (unary) operation op on on and store result as viewname

assign_query(viewname, query, sco_type=None)[source]

Create a new view viewname defined by query

cache(query_id, bundles, batchsize=2000, **kwargs)[source]

Cache the result of a query/dataset

Takes the observed-data SDOs from bundles and “flattens” them, splits out SCOs by type, and inserts into a database with 1 table per type.

Accepts some keyword args for runtime options, some of which may depend on what database type is in use (e.g. sqlite3, postgresql, …)

Args:

query_id (str): a unique identifier for this set of bundles

bundles (list): STIX bundles (either in-memory Python objects or filename paths)

batchsize (int): number of objects to insert in 1 batch (defaults to 2000)

close()[source]
columns(viewname)[source]

Get the column names (properties) of viewname

count(viewname)[source]

Get the count of objects (rows) in viewname

delete()[source]

Delete ALL data in this store

extract(viewname, sco_type, query_id, pattern)[source]

Extract all sco_type object from the results of query_id and store as viewname

extract_observeddata_attribute(viewname, name_of_attribute, path=None, value=None, limit=None, run=True)[source]

Get the observations of value in viewname.path. Returns a list of dicts like {‘name_of_attribute’: ‘…’, ‘{column}’: ‘…’}. name_of_attribute can be a str or a list of str (to get multiple attrs).

filter(viewname, sco_type, input_view, pattern)[source]

Extract all sco_type object from input_view and store as viewname

finish(index=True)[source]

Do any DB-specific post-caching/insertion activity, such as indexing

get_appdata(viewname)[source]

Retrieve app-specific data for a viewname

get_view_data(viewnames=None)[source]

Retrieve information about one or more viewnames

group(newname, viewname, by, aggs=None)[source]

Create new view newname defined by grouping viewname by by

join(viewname, l_var, l_on, r_var, r_on)[source]

Join vars l_var and r_var and store result as viewname

load(viewname, objects, sco_type=None, query_id=None, preserve_ids=True)[source]

Import objects as type sco_type and store as viewname

lookup(viewname, cols='*', limit=None, offset=None, col_dict=None)[source]

Get the value of viewname

merge(viewname, input_views)[source]
number_observed(viewname, path, value=None)[source]

Get the count of observations of value in viewname.path. Returns an integer count.

path_joins(viewname, sco_type, column)[source]

Determine if column has implicit Joins and return them if so

reassign(viewname, objects)[source]

Replace objects (or insert them if they’re not there)

remove_view(viewname)[source]

Remove view viewname

rename_view(oldname, newname)[source]

Rename view oldname to newname

run_query(query)[source]
schema(viewname=None)[source]

Get the schema (names and types) of table/view viewname or all tables if not specified

set_appdata(viewname, data)[source]

Attach app-specific data to a viewname

summary(viewname, path=None, value=None)[source]

Get the first and last observed time and number observed for observations of viewname, optionally specifying path and value. Returns list of dicts like {‘first_observed’: ‘2021-10-…’, ‘last_observed’: ‘2021-10-…’, ‘number_observed’: N}

table_type(viewname)[source]

Get the SCO type for table/view viewname

tables()[source]

Get all table names

timestamped(viewname, path=None, value=None, timestamp='first_observed', limit=None, run=True)[source]

Get the timestamped observations of value in viewname.path. Returns a list of dicts like {‘timestamp’: ‘2021-10-…’, ‘{column}’: ‘…’}.

types(private=False)[source]

Get all table names that correspond to SCO types

upsert(cursor, tablename, obj, query_id, schema)[source]
upsert_many(cursor, tablename, objs, query_id, schema)[source]
value_counts(viewname, path)[source]

Get the count of observations of each value in viewname.path. Returns a list of dicts like {‘{column}’: ‘…’, ‘count’: 1}.

values(path, viewname)[source]

Get the values of STIX object path path (a column) from viewname

views()[source]

Get all view names

firepit.sqlstorage.get_path_joins(viewname, sco_type, column)[source]

Determine if column has implicit Joins and return them if so

firepit.sqlstorage.infer_type(key, value)[source]

firepit.stix20 module

firepit.stix20.comp2sql(sco_type, prop, op, value, dialect)[source]
firepit.stix20.get_grammar()[source]
firepit.stix20.path2sql(sco_type, path)[source]
firepit.stix20.stix2sql(pattern, sco_type, dialect='sqlite3')[source]
firepit.stix20.summarize_pattern(pattern)[source]

firepit.stix21 module

firepit.stix21.makeid(sco, obs=None)[source]

firepit.timestamp module

firepit.timestamp.timefmt(t, prec=3)[source]

Format Python datetime t in RFC 3339-format

firepit.timestamp.to_datetime(timestamp)[source]

Convert RFC 3339-format timestamp to Python datetime
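A stdlib sketch of the round trip these two helpers provide (a reimplementation for illustration, assuming millisecond precision and a trailing Z, which may differ from firepit’s exact behavior):

```python
from datetime import datetime, timezone

def timefmt(t: datetime, prec: int = 3) -> str:
    """Format a datetime as an RFC 3339-style UTC timestamp with a 'Z' suffix."""
    s = t.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")
    return s[: s.index(".") + 1 + prec] + "Z"

def to_datetime(timestamp: str) -> datetime:
    """Parse an RFC 3339 'Z'-suffixed timestamp back to an aware datetime."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ").replace(
        tzinfo=timezone.utc
    )

t = datetime(2019, 11, 16, 12, 55, 28, 101000, tzinfo=timezone.utc)
print(timefmt(t))  # 2019-11-16T12:55:28.101Z
```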

firepit.validate module

STIX and SQL identifier validators

firepit.validate.validate_name(name)[source]

Make sure name is a valid (SQL) identifier

firepit.validate.validate_path(path)[source]

Make sure path is a valid STIX object path or property name
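A minimal sketch of what such identifier validation might look like (illustrative regexes only; firepit’s actual rules may differ), rejecting anything that could smuggle SQL into an interpolated name:

```python
import re

# Illustrative patterns; not firepit's actual validation rules.
NAME_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_-]*$")
PATH_RE = re.compile(r"^([a-z0-9-]+:)?[A-Za-z0-9_]+(\.[A-Za-z0-9_']+)*$")

def validate_name(name: str) -> None:
    """Raise ValueError unless name is a safe identifier."""
    if not NAME_RE.match(name):
        raise ValueError(f"invalid name: {name!r}")

def validate_path(path: str) -> None:
    """Raise ValueError unless path looks like a STIX object path or property."""
    if not PATH_RE.match(path):
        raise ValueError(f"invalid path: {path!r}")

validate_name("network-traffic")                 # accepted
validate_path("network-traffic:src_ref.value")   # accepted
```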

firepit.woodchipper module

class firepit.woodchipper.FlatJsonMapper[source]

Bases: Mapper

convert(event)[source]
detect(event)[source]
class firepit.woodchipper.IscHoneypotJsonMapper[source]

Bases: Mapper

convert(event)[source]
detect(event)[source]
mapping = {'dest': 'network-traffic:dst_ref.value', 'dport': 'network-traffic:dst_port', 'proto': 'network-traffic:protocols', 'source': 'network-traffic:src_ref.value', 'sport': 'network-traffic:src_port', 'ts': ['first_observed', 'last_observed'], 'url': 'url:value', 'user_agent': "network-traffic:extensions.'http-request-ext'.request_header.'User-Agent'"}
class firepit.woodchipper.Mapper[source]

Bases: object

convert(event)[source]
detect(event)[source]
class firepit.woodchipper.SdsMapper[source]

Bases: Mapper

common_mapping = {'@timestamp': ['first_observed', 'last_observed'], 'Application': <function split_image>, 'Category': <function to_cat_list>, 'Channel': 'x-oca-event:module', 'EventID': <function to_action_code>, 'Hostname': 'x-oca-asset:hostname', 'Message': <function SdsMapper.<lambda>>, 'ProcessGuid': 'process:x_unique_id', 'ProcessId': 'process:pid', 'ProcessName': <function split_image>, 'SourceName': 'x-oca-event:provider', 'TimeCreated': ['first_observed', 'last_observed']}
convert(event)[source]
detect(event)[source]
static enhanced_action(message)[source]
event_types = {'ConnectPipe': 18, 'CreateKey': 12, 'CreatePipe': 17, 'DeleteKey': 12, 'DeleteValue': 12, 'SetValue': 13}
class firepit.woodchipper.ZeekCsvMapper[source]

Bases: Mapper

convert(event)[source]
detect(event)[source]
zeek_mapping = {'id.orig_h': 'network-traffic:src_ref.value', 'id.orig_p': 'network-traffic:src_port', 'id.resp_h': 'network-traffic:dst_ref.value', 'id.resp_p': 'network-traffic:dst_port', 'orig_ip_bytes': 'network-traffic:src_byte_count', 'orig_pkts': 'network-traffic:src_packets', 'proto': 'network-traffic:protocols', 'resp_ip_bytes': 'network-traffic:dst_byte_count', 'resp_pkts': 'network-traffic:dst_packets', 'ts': <function from_unix_time>}
class firepit.woodchipper.ZeekJsonMapper[source]

Bases: Mapper

common_mapping = {'id_orig_h': 'network-traffic:src_ref.value', 'id_orig_p': 'network-traffic:src_port', 'id_resp_h': 'network-traffic:dst_ref.value', 'id_resp_p': 'network-traffic:dst_port', 'proto': 'network-traffic:protocols', 'ts': <function from_unix_time>}
convert(event)[source]
detect(event)[source]
static process_answers(answers)[source]
zeek_mapping = {'conn': {'orig_ip_bytes': 'network-traffic:src_byte_count', 'orig_l2_addr': 'network-traffic:src_ref.resolves_to_refs[0].value', 'orig_pkts': 'network-traffic:src_packets', 'resp_ip_bytes': 'network-traffic:dst_byte_count', 'resp_l2_addr': 'network-traffic:dst_ref.resolves_to_refs[0].value', 'resp_pkts': 'network-traffic:dst_packets'}, 'dns': {'answers': <function ZeekJsonMapper.<lambda>>, 'query': 'domain-name:value'}}
firepit.woodchipper.convert(input_file, output_file=None)[source]
firepit.woodchipper.convert_to_stix(input_file)[source]
firepit.woodchipper.detect_filetype(input_file)[source]
firepit.woodchipper.dict2observation(creator, row)[source]
firepit.woodchipper.fixup_hashes(hashes: dict)[source]
firepit.woodchipper.format_val(sco_type, prop, val)[source]
firepit.woodchipper.from_unix_time(ts)[source]
firepit.woodchipper.guess_ref_type(sco_type, prop, val)[source]

Get data type for sco_type:prop reference

firepit.woodchipper.is_file_event(event_id)[source]
firepit.woodchipper.merge_mappings(common, specific, key=None)[source]

Merge common mapping into specific[key] mapping

firepit.woodchipper.process_event(event, mapping, event_id=None)[source]
firepit.woodchipper.process_events(events, mappers, ident)[source]
firepit.woodchipper.process_mapping(event, mapping)[source]
firepit.woodchipper.read_csv(fp, mappers, ident)[source]
firepit.woodchipper.read_json(fp, mappers, ident)[source]
firepit.woodchipper.read_log(fp, mappers, ident)[source]
firepit.woodchipper.recreate_dict(obj, prop, rest, val)[source]
firepit.woodchipper.set_obs_prop(observable, path, val, scos, key)[source]
firepit.woodchipper.split_file_hash(hash_string: str)[source]
firepit.woodchipper.split_file_path(abs_name: str, prefix='file:')[source]
firepit.woodchipper.split_hash(hash_string: str, prefix: str, tag: str = '')[source]
firepit.woodchipper.split_image(abs_name: str, prefix='process:')[source]
firepit.woodchipper.split_image_hash(hash_string: str)[source]
firepit.woodchipper.split_image_loaded(abs_name: str)[source]
firepit.woodchipper.split_loaded_hash(hash_string: str)[source]
firepit.woodchipper.split_parent_image(abs_name: str)[source]
firepit.woodchipper.split_reg_key_value(path: str)[source]
firepit.woodchipper.to_action_code(event_id)[source]

Convert windows event ID to x-oca-event action and code

firepit.woodchipper.to_cat_list(category)[source]
firepit.woodchipper.to_payload_bin(value)[source]
firepit.woodchipper.to_protocol(value)[source]

Module contents

Top-level package for STIX Columnar Storage.

firepit.get_storage(url, session_id=None)[source]

Get a storage object for firepit. url will determine the type; a file path means sqlite3. session_id is used in the case of postgresql to partition your data.

Contributing

Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.

You can contribute in many ways:

Types of Contributions

Report Bugs

Report bugs at https://github.com/opencybersecurityalliance/firepit/issues.

If you are reporting a bug, please include:

  • Your operating system name and version.

  • Any details about your local setup that might be helpful in troubleshooting.

  • Detailed steps to reproduce the bug.

Fix Bugs

Look through the GitHub issues for bugs. Anything tagged with “bug” and “help wanted” is open to whoever wants to implement it.

Implement Features

Look through the GitHub issues for features. Anything tagged with “enhancement” and “help wanted” is open to whoever wants to implement it.

Write Documentation

firepit could always use more documentation, whether as part of the official firepit docs, in docstrings, or even on the web in blog posts, articles, and such.

Submit Feedback

The best way to send feedback is to file an issue at https://github.com/opencybersecurityalliance/firepit/issues.

If you are proposing a feature:

  • Explain in detail how it would work.

  • Keep the scope as narrow as possible, to make it easier to implement.

  • Remember that this is a volunteer-driven project, and that contributions are welcome :)

Get Started!

Ready to contribute? Here’s how to set up firepit for local development.

  1. Fork the firepit repo on GitHub.

  2. Clone your fork locally:

    $ git clone git@github.com:your_name_here/firepit.git
    
  3. Install your local copy into a virtualenv. Assuming you have pyenv installed, this is how you set up your fork for local development:

    $ cd firepit/
    $ pyenv virtualenv python-3.9.2 firepit
    $ pyenv local firepit
    $ make setup
    
  4. Create a branch for local development:

    $ git checkout -b name-of-your-bugfix-or-feature
    

    Now you can make your changes locally.

  5. When you’re done making changes, check that your changes pass linting and the tests (this will be done automatically with a git pre-commit hook):

    $ make lint
    $ make test
    
  6. Commit your changes and push your branch to GitHub:

    $ git add .
    $ git commit -m "Your detailed description of your changes."
    $ git push origin name-of-your-bugfix-or-feature
    
  7. Submit a pull request through the GitHub website.

  8. Request a review from one of the maintainers.

Pull Request Guidelines

Before you submit a pull request, check that it meets these guidelines:

  1. The pull request should include tests.

  2. If the pull request adds functionality, the docs should be updated. Put your new functionality into a function with a docstring, and add the feature to the list in README.rst.

  3. The pull request should work for Python 3.8, 3.9, 3.10, and 3.11. There is a GitHub workflow that will check for this automatically.

All contributions must be covered by a Contributor’s License Agreement (CLA) and an Entity CLA (if you are contributing on behalf of your employer). You will be prompted to sign the CLA when you submit your first PR.

Tips

To run a subset of tests:

$ pytest tests/test_storage.py

Or:

$ pytest -k test_something...

Releasing

A reminder for the maintainers on how to release. Make sure all your changes are committed (including an entry in HISTORY.rst). Then run:

$ bump2version patch # possible: major / minor / patch
$ git push
$ git push --tags

Credits

Development Lead

Contributors

History

2.3.0 (2022-06-15)

  • Added query.BinnedColumn so you can group by time buckets

2.2.0 (2022-06-08)

  • Better STIX extension property support:

      - Add a new __columns “private” table to store the mapping from object path to column name

      - New path/prop metadata functions to supply metadata about STIX properties

  • Improved STIX process “deterministic” id generation:

      - Use a unique ID from extension properties, if found

      - Use related x-oca-asset hostname or ID if available

2.1.0 (2022-05-18)

  • Add splint convert command to convert some logs files to STIX bundles

2.0.0 (2022-04-01)

  • Use a “normalized” SQL database

  • Initial STIX 2.1 support

1.3.0 (2021-10-04)

New assign_query API, minor query API improvements

  • new way to create views via assign_query

  • can now init a Query with a list instead of calling append

  • Some SQL injection protection in query classes

1.2.0 (2021-08-18)

  • Better support for grouped data

1.1.0 (2021-07-18)

  • First stable release

  • Concurrency fixes in cache()

1.0.0 (2021-05-18)

  • First release on PyPI.
