As you type a query, it renders a graph of the nodes and their relationships;
this visualizes the query itself, not the results
Query Debugger Vis:
As you type a query, it gets continuously evaluated and Hod returns a DOT representation of
the class structure of the returned results
along with this, we also get the relationships/classes 'one hop' away from the returned nodes
allows you to "explore" the building graph
Results display:
when you click a row, you get those items, their relationships, and the "degree 1" links
out from each of those nodes (relationship + node)
- or have clicking the nodes expand that out
Structure
TODO Items
Write out the 10 most common queries:
longitudinal study
dashboard
control loop
etc
Infrastructure
load database from disk
save the predicate index
make sure to load in Brick.ttl as well
need object pool to reduce allocations:
btree
investigate others
easy backups:
periodic zips?
explicit command?
leveldb should make this easy
periodic compact range before dumping?
Operators
Action Operators:
SELECT:
retrieves a list of the resolved tuples
maybe add:
ability to select key/value pairs on returned nodes
COUNT
counts the number of resolved tuples
LIMIT
limit the number of returned rows
Tests:
full query tests on known dataset
in progress
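The action operators above all reduce to post-processing over the resolved tuples; a minimal sketch, where Tuple and the sample rows are placeholders rather than Hod's real types:

```go
package main

import "fmt"

// A Tuple is one binding of query variables to entity IDs.
type Tuple []string

// count implements COUNT: the number of resolved tuples.
func count(rows []Tuple) int { return len(rows) }

// limit implements LIMIT: truncate the result set to at most n rows.
func limit(rows []Tuple, n int) []Tuple {
	if n < len(rows) {
		return rows[:n]
	}
	return rows
}

func main() {
	rows := []Tuple{{"vav1", "sensor1"}, {"vav2", "sensor2"}, {"vav3", "sensor3"}}
	fmt.Println(count(rows))         // 3
	fmt.Println(len(limit(rows, 2))) // 2
}
```

SELECT is then just returning the rows themselves (plus, per the note above, possibly attached key/value pairs for each node).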
Filters:
path predicates:
path (matches path)
path1/path2 (matches path1 followed by path2)
path+ (matches 1 or more of path)
path* (matches 0 or more of path)
path? (matches 0 or 1 of path)
path1|path2 (matches path1 OR path2):
can be combined with other path predicates
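The path+ predicate amounts to a transitive closure over one edge label. A toy sketch over an in-memory adjacency map; the graph type and the hasPart edges are illustrative, not Hod's actual index:

```go
package main

import "fmt"

// graph maps node -> predicate -> neighbors.
type graph map[string]map[string][]string

// followPlus returns all nodes reachable from start via one or more
// hops of predicate pred (the path+ operator), breadth-first.
func followPlus(g graph, start, pred string) []string {
	seen := map[string]bool{}
	frontier := []string{start}
	var out []string
	for len(frontier) > 0 {
		var next []string
		for _, n := range frontier {
			for _, m := range g[n][pred] {
				if !seen[m] {
					seen[m] = true
					out = append(out, m)
					next = append(next, m)
				}
			}
		}
		frontier = next
	}
	return out
}

func main() {
	g := graph{
		"floor1": {"hasPart": {"room1"}},
		"room1":  {"hasPart": {"vav1"}},
	}
	fmt.Println(followPlus(g, "floor1", "hasPart")) // [room1 vav1]
}
```

path* is the same closure with start included in the result, and path? is a single optional hop.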
UNION/OR:
implicitly, all triples in a query are ANDed together
Specify URLs in the query
Features:
key/value pairs:
plan out the structure and how these fit into database:
maybe want to call these 'links'? They are really just pointers
to other data sources, e.g. URI or UUID
can also be timestamp (date added, etc)
plan out filters on these:
where timestamp >/</= timestamp?
maybe we can just retrieve these when we get a node; they are not part of
the query engine
will be associated with some generation of the node
generations:
logical timestamping of entities:
should be a COW structure
having more generations shouldn't impact the latency of the common
case (most recent generation)
idea: prefix all entity IDs with the generation (another 4 bytes?). Most current
generation is [0 0 0 0]; atomically need to change the generation on updates
remember inserts should be transactional; we should not see any intermediate forms
of the database
latency of inserts isn't important:
want to make sure that we are consistent within a generation
don't want the control loop to get different results WITHIN an iteration. Rather,
we should see changes reflected in between iterations.
Consider adding a type of "generation lock":
query the most recent generation, and keep me querying on that generation
until I release the lock
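The "prefix all entity IDs with the generation" idea might look like the sketch below, assuming big-endian 4-byte generation numbers so that generation [0 0 0 0] (the most current) sorts first and keys of one generation cluster together:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// genKey prefixes a 4-byte entity ID with a 4-byte generation number.
// Big-endian encoding keeps each generation's keys contiguous in the
// btree, and generation 0 (the current one) sorts first.
func genKey(gen uint32, entity [4]byte) [8]byte {
	var k [8]byte
	binary.BigEndian.PutUint32(k[:4], gen)
	copy(k[4:], entity[:])
	return k
}

func main() {
	e := [4]byte{0xde, 0xad, 0xbe, 0xef}
	fmt.Printf("%x\n", genKey(0, e)) // 00000000deadbeef
}
```

A "generation lock" then pins every lookup in a session to one fixed gen prefix until released, so a control loop sees a single consistent snapshot within an iteration.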
Key-Value Pairs on Nodes
Called 'links':
do we 'type' these, e.g. URI, UUID, etc? or just leave as text:
probably want to leave as text? But there are standard values
UUID
BW_URI
HTTP_URI
also support timestamp type w/ key-value
stored in their own database:
struct is:
type Link struct {
    Entity [4]byte
    Key    []byte
    Value  []byte
}
btree key is entity + keyname, so we can easily do prefix iteration over the entity prefix to get the keys for that entity
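A sketch of that composite key, using the 4-byte entity ID from the Link struct; the helper names are illustrative:

```go
package main

import (
	"bytes"
	"fmt"
)

// linkKey builds the btree key: the 4-byte entity ID followed by the
// link's key name, so a range scan over the entity prefix yields all
// of that entity's links.
func linkKey(entity [4]byte, name string) []byte {
	return append(entity[:], name...)
}

// hasEntityPrefix reports whether a stored key belongs to the entity,
// i.e. whether a prefix iteration would visit it.
func hasEntityPrefix(key []byte, entity [4]byte) bool {
	return bytes.HasPrefix(key, entity[:])
}

func main() {
	e := [4]byte{0, 0, 0, 7}
	k := linkKey(e, "uri")
	fmt.Println(hasEntityPrefix(k, e)) // true
}
```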
links are not integrated into the selection clause:
they are not a way to distinguish between nodes; only to retrieve extra information about the nodes
links are retrieved upon the resolution of the select clause
select clause syntax:
// select the URI for the sensor
SELECT ?sensor[uri] WHERE
// select the time-added for the vav and uri for the sensor
SELECT ?vav[added] ?sensor[uri] WHERE
// select the uri and uuid of the sensor
SELECT ?sensor[uri,uuid] WHERE
// get all links for the sensor
SELECT ?sensor[*] WHERE
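The bracket syntax above could be recognized with a small regular expression; this is a sketch of the grammar as written in the examples, not Hod's actual parser:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// selVar matches select terms like ?sensor[uri], ?sensor[uri,uuid],
// and ?sensor[*].
var selVar = regexp.MustCompile(`^\?(\w+)\[([\w*,]+)\]$`)

// parseSelect splits a select term into the variable name and the
// list of requested link keys ("*" meaning all links).
func parseSelect(term string) (variable string, links []string, ok bool) {
	m := selVar.FindStringSubmatch(term)
	if m == nil {
		return "", nil, false
	}
	return m[1], strings.Split(m[2], ","), true
}

func main() {
	v, links, ok := parseSelect("?sensor[uri,uuid]")
	fmt.Println(v, links, ok) // sensor [uri uuid] true
}
```

A plain ?var with no brackets would fall through to the existing variable handling, keeping links purely additive to the select clause.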
how are the links added? These are not part of TTL:
idea 1: interpret a special TTL relationship (bf:hasLink, for example) as a "link"