Supported Databases

Firepit supports sqlite3 and PostgreSQL.

Database Tables

STIX observation data is inserted into multiple tables within a “session” (a database file for sqlite3 and a “schema” in PostgreSQL). A table is created for each object type. Since STIX data is a graph (i.e. nodes and edges), Firepit also creates some special “edge” tables:

  • __contains: tracks which SCOs were contained in which observed-data SDOs

  • __reflist: models 1:N reference lists like process:opened_connection_refs

  • __queries: records which objects were inserted in which cache operations

  • __symtable: records the name and type of “views” created by firepit calls

These tables are prefixed with __ and considered “private” by firepit.

The STIX id property is used as the unique key for each table.

The observed-data Table

This tables contains the actual STIX Observed Data SDO that “conveys information about cyber security related entities such as files, systems, and networks using the STIX Cyber-observable Objects (SCOs).” [STIX-v2_1]

This SDO (and therefore table) holds the timestamps and count of actual observations, whereas SCOs (and their firepit tables) only contain the properties (columns) of their respective object types.

The examples below show how to link observed-data with SCOs via the “private” __contains table.

SCO Tables

Each SCO table (ipv4-addr, network-traffic, file, etc.) contains the properties present from the cached bundles. Firepit does not require any specific properties (though STIX does). Columns are only created for properties found.

For example, the network-traffic table should have properties src_ref (a reference to an object in either the ipv4-addr or ipv6-addr table) which represents the connection’s source address, dst_ref, src_port, dst_port, and protocols. The port properties are simple integers, and stored in integer columns. The protocols column is a list of strings; it’s stored as a JSON-encoded string.

STIX Object Paths

STIX object paths (e.g. network-traffic:src_ref.value) are a key part of STIX patterning, which (from Firepit’s perspective) is equivalent to a WHERE clause. They can contain implicit JOINs: network-traffic is a table, src_ref is the id property for an ipv4-addr (or ipv6-addr) which is the unique key for that table. value is a column in that referenced table.

Firepit operations will (in most cases) accept STIX object paths and create the required JOIN.

Example SQL queries

Full Network Traffic Information

The network-traffic SCO only contains references to the source and destination addresses. To see the actual addresses, you need to join the ipv4-addr table:

sqlite> select
   ...>   src.value as "src_ref.value",
   ...>   nt.src_port as "src_port",
   ...>   dst.value as "dst_ref.value",
   ...>   nt.dst_port as "dst_port",
   ...>   nt.protocols
   ...> from
   ...>   "network-traffic" as nt
   ...>   join "ipv4-addr" as src on nt.src_ref =
   ...>   join "ipv4-addr" as dst on nt.dst_ref =
   ...> ;
src_ref.value  src_port    dst_ref.value  dst_port    protocols
-------------  ----------  -------------  ----------  ----------  60842    47413       ["tcp"]      60843      5357        ["tcp"]

The firepit CLI makes this easier; for example, using the lookup command:

$ firepit lookup network-traffic --columns src_ref.value,src_port,dst_ref.value,dst_port,protocols
src_ref.value      src_port  dst_ref.value      dst_port  protocols
---------------  ----------  ---------------  ----------  -----------         60842           47413  ["tcp"]             60843              5357  ["tcp"]

Most CLI commands have an API function of the same name in the SqlStorage class.

Timestamped SCOs

To see the first 3 IP addresses observed, join the special __contains and observed-data tables:

sqlite> select obs.first_observed as time, sco.value as 'IP'
   ...>  from "ipv4-addr" as sco
   ...>  join "__contains" as c on = c.target_ref
   ...>  join "observed-data" as obs on c.source_ref =
   ...>  order by time limit 3;
time                      IP
------------------------  -------------

This is effectively equivalent to the CLI’s timestamped command or the API’s timestamped function:

$ firepit timestamped ipv4-addr value | head -5
first_observed            value
------------------------  ---------------

Value counts

To get a count of observations of each IP address (the sqlite3 CLI truncates the value column):

sqlite> select sco.value, count(*) from "ipv4-addr" as sco
   ...>  join "__contains" as c on = c.target_ref
   ...>  join "observed-data" as obs on c.source_ref =
   ...>  group by sco.value;
value       count(*)
----------  ----------   413  33  7  8  24  13
192.168.1.  166
192.168.1.  138
192.168.1.  1
192.168.1.  3
192.168.1.  4
192.168.17  8
192.168.17  1
192.168.17  4
192.168.23  10
192.168.23  2
192.168.23  1
192.168.23  4

Again, this operation is provided by the CLI’s value-counts command or the API’s value_counts function:

$ firepit value-counts ipv4-addr value
value              count
---------------  -------            413          33           7           8          24          13          166        138          1          3          4          8        1        4         10          2        1        4

STIX Version 2.1. Edited by Bret Jordan, Rich Piazza, and Trey Darley. 10 June 2021. OASIS Standard. Latest stage: