LENTS (Local Extendable Nested Tagging System)

Date: 2025 Apr 14

Words: 483

LENTS is a close-to-the-metal implementation of ENTS written in C with a custom binary file format. It is the most hackerish version of ENTS, has the smallest database size, and was made as a low-level learning project. It is not very extendable and the number of tags it can hold is limited, so other ENTS versions are also an option.

LENTS usage

`process`

Running `lents process tags.yaml` always writes its output to a hidden dot-file named `.lents`. The output location cannot be changed, for the sake of parsing.

The LENTS file format

Terminology and stylistic choices
- Some set theory will be used in combination with colloquial English because it is the best way to communicate some things clearly and unambiguously
- Hex-line: refers to a standard 16-byte row in a hex dump
- Metadata: the common term meaning “data about the data”.
- Ipsodata: the counterpart to metadata, meaning “data that is the data itself”. Common English terms such as “core data”, “primary data”, “source data”, and “raw data” do not capture the intended meaning: the data itself, as opposed to the data about the data. The prefix comes from the Latin “ipso”, meaning “itself”.
- Offset: here, “offset” refers only to the length of the offset, not to the content stored at it.

Over-arching design choices for LENTS
- LENTS is a directed acyclic graph structure where hierarchically organized tags (nodes with parent-child relationships) map to files (destination nodes)
- The file format is designed for human-readability in a hex dump
- The file format is structured as a two-level index into two offset lookup tables (tags and file-to-tags)

File Header (32 bytes, or 2 hex-lines)
1. LENTS magic number in Unicode
2. Version number
3. Offsets
  a. Start of tags ipsodata
  b. Start of file-to-tags metadata
  c. Start of file-to-tags ipsodata
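
As a rough illustration of the header fields listed above, here is a minimal C sketch of a parser. The text only fixes the total size (32 bytes) and the order of the fields; everything else is an assumption made for this example: the 8-byte zero-padded UTF-8 magic, the 32-bit little-endian integers, the reserved padding bytes, and all names (`lents_header`, `lents_parse_header`). The real LENTS layout may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LENTS_HEADER_SIZE 32 /* from the text: 2 hex-lines of 16 bytes */

/* Hypothetical decoded header. All field widths are assumptions. */
typedef struct {
    char     magic[8];                 /* assumed: "LENTS" in UTF-8, zero-padded */
    uint32_t version;                  /* assumed 32-bit */
    uint32_t tags_ipsodata_start;      /* offset (a) */
    uint32_t file_tags_metadata_start; /* offset (b) */
    uint32_t file_tags_ipsodata_start; /* offset (c) */
} lents_header;

/* Read a 32-bit little-endian integer (endianness is an assumption). */
static uint32_t read_u32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the magic number does not match. */
int lents_parse_header(const uint8_t buf[LENTS_HEADER_SIZE], lents_header *out)
{
    static const char expected[8] = "LENTS"; /* remaining bytes are zero */

    assert(buf != NULL && out != NULL);
    if (memcmp(buf, expected, sizeof expected) != 0)
        return -1;

    /* First hex-line: magic (bytes 0..7), version (8..11), reserved (12..15). */
    memcpy(out->magic, buf, sizeof out->magic);
    out->version = read_u32_le(buf + 8);

    /* Second hex-line: three offsets (bytes 16..27), reserved (28..31). */
    out->tags_ipsodata_start      = read_u32_le(buf + 16);
    out->file_tags_metadata_start = read_u32_le(buf + 20);
    out->file_tags_ipsodata_start = read_u32_le(buf + 24);
    return 0;
}
```

Reading the integers byte by byte, rather than casting the buffer to a struct, keeps the sketch independent of compiler padding and host endianness. Splitting the fields across the two hex-lines is only one way to keep the header readable in a hex dump.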

Tags Data

Each tag's ipsodata can vary significantly in length, so an offset table is used to save file space at a minimal cost in lookup time. The attributes within the ipsodata also vary in length independently of each other, so a “tag individual ipsodata part length” field is needed in the metadata. The tag ipsodata offset is the distance measured from the start of tags ipsodata (3.3.a). The tag UUID is not necessary in the tag ipsodata, but is included for debugging purposes.

Tags Metadata = {∀ Tag Metadata}
- Tag Metadata = {Tag UUID, Tag individual ipsodata part length, Tag ipsodata offset}
Tags Ipsodata = {∀ Tag Ipsodata}
- Tag Ipsodata = {Tag UUID, Tag name, Tag Ancestry UUIDs, Tag Children UUIDs}
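
The offset lookup described above can be sketched in C as follows. This is a hedged illustration, not the actual LENTS code: it assumes 16-byte raw UUIDs, 32-bit fields, and that the “individual ipsodata part length” is stored as one length per variable-length part (name, ancestry, children). The names `lents_tag_meta`, `lents_tag_span`, and `lents_locate_tag` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LENTS_UUID_SIZE 16 /* assumption: 128-bit UUIDs stored as raw bytes */

/* Hypothetical decoded Tag Metadata entry; field widths are assumptions. */
typedef struct {
    uint8_t  uuid[LENTS_UUID_SIZE];
    uint32_t name_len;        /* bytes in Tag name */
    uint32_t ancestry_len;    /* bytes in Tag Ancestry UUIDs */
    uint32_t children_len;    /* bytes in Tag Children UUIDs */
    uint32_t ipsodata_offset; /* distance from the start of tags ipsodata */
} lents_tag_meta;

/* Absolute file positions of each part of one Tag Ipsodata record. */
typedef struct {
    uint32_t uuid_pos;
    uint32_t name_pos;
    uint32_t ancestry_pos;
    uint32_t children_pos;
    uint32_t end_pos; /* one past the last byte of the record */
} lents_tag_span;

/* Resolve a metadata entry to positions, given the header's
 * "start of tags ipsodata" offset. */
lents_tag_span lents_locate_tag(uint32_t tags_ipsodata_start,
                                const lents_tag_meta *m)
{
    lents_tag_span s;

    assert(m != NULL);
    s.uuid_pos     = tags_ipsodata_start + m->ipsodata_offset;
    s.name_pos     = s.uuid_pos + LENTS_UUID_SIZE;
    s.ancestry_pos = s.name_pos + m->name_len;
    s.children_pos = s.ancestry_pos + m->ancestry_len;
    s.end_pos      = s.children_pos + m->children_len;
    return s;
}
```

Because the metadata entries have a fixed size, a reader can scan or index the metadata table directly and then jump straight to the variable-length record, which is the trade-off the paragraph above describes.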

File-to-tags Relationship Data

The same reasoning about length variation and UUID inclusion from Tags Data (4) applies here. File-to-tags metadata starts at the offset given in (3.3.b), and the file ipsodata offset is the distance measured from the start of file-to-tags ipsodata (3.3.c).

Files Metadata = {∀ File Metadata}
- File Metadata = {File UUID, File individual ipsodata part length, File ipsodata offset}
Files Ipsodata = {∀ File Ipsodata}
- File Ipsodata = {File UUID, Tags UUIDs}
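
To show how a File Ipsodata record might be read once located, here is a small C sketch. It assumes 16-byte raw UUIDs and that the record length from the metadata covers the whole record (File UUID followed by the Tags UUIDs). Both are assumptions for illustration, and `lents_file_tag_count` and `lents_file_has_tag` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LENTS_UUID_SIZE 16 /* assumption: 128-bit UUIDs stored as raw bytes */

/* Number of tag UUIDs in a record of record_len bytes, or -1 if the
 * length cannot hold a File UUID plus a whole number of tag UUIDs. */
long lents_file_tag_count(size_t record_len)
{
    size_t tags_bytes;

    if (record_len < LENTS_UUID_SIZE)
        return -1;
    tags_bytes = record_len - LENTS_UUID_SIZE;
    if (tags_bytes % LENTS_UUID_SIZE != 0)
        return -1;
    return (long)(tags_bytes / LENTS_UUID_SIZE);
}

/* Returns 1 if the record lists tag_uuid, 0 otherwise (including when
 * the record is malformed). */
int lents_file_has_tag(const uint8_t *record, size_t record_len,
                       const uint8_t tag_uuid[LENTS_UUID_SIZE])
{
    long n, i;

    assert(record != NULL && tag_uuid != NULL);
    n = lents_file_tag_count(record_len);
    for (i = 0; i < n; i++) {
        /* Skip the leading File UUID, then step one UUID per tag. */
        const uint8_t *candidate = record + LENTS_UUID_SIZE * (size_t)(i + 1);
        if (memcmp(candidate, tag_uuid, LENTS_UUID_SIZE) == 0)
            return 1;
    }
    return 0;
}
```

Since every element of the record is a fixed-size UUID under these assumptions, a single length is enough to recover the number of tags attached to a file.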


Diego Cabello
