The Nomenclator
Date: 2025 Jun 08
Words: 970
Draft: 1 (Most recent)
THIS IS AN EXPLORATORY POST
the emergent problem
with the technological advances of the 2020s, the amount of research that researchers have to pore through has never been higher.1 at the same time, certain broad stand-alone fields like physics have hit apparent “dead ends” and gone without major breakthroughs, while cross-domain researchers and programs have become paramount to the emerging fields of the decade.2 the newer coding agents like AlphaEvolve3 have yielded cross-domain breakthroughs and shown what is possible with “agentic” computer-assisted human research teams. cross-domain breakthroughs are increasingly the main driver of academic progress in the 2020s.
one of the main obstacles actively preventing cross-domain breakthroughs is ambiguous terminology across disciplines; terminological chaos is becoming a bigger and bigger bottleneck. and if computer agents are getting better at cross-domain synthesis, humans need better coordination tools to keep up. so, to unlock the next steps in academic research methodology, to build even better computer systems such as AlphaEvolve, and to keep humans on the same cutting edge as these systems, resolving these ambiguities is crucial. this brings up the necessity for a nomenclator.
proposal
the nomenclator is a proposed human committee of academic leaders across disciplines, whose purpose is to name new concepts and to ensure that future terminologies do not conflict.
further on the problem
researchers waste enormous amounts of time figuring out if they’re talking about the same concepts when they use words like “complexity”4 or “emergence.” sometimes apparent disagreements are actually just people using the same word for completely different things. sometimes genuine insights get buried because they don’t have clear terminology to express them.
there are a lot of foundational concepts floating around academia that are genuinely irreducible - things like “monad,” “quark,” “axiom,” “datum” - but they’re scattered across fields without systematic organization. meanwhile, people keep borrowing everyday english words (“turtle,” “patch,” “observer” in netlogo) to describe fundamental computational concepts, creating namespace collisions and conceptual confusion - this poor terminology and these unusable concepts from netlogo5 fed roughly 20 years of research into “agents” that has proven largely useless for the agentic use cases of the 2020s.
the nomenclator would systematically identify these irreducible concepts and give them proper names, while cleaning up terminological chaos in existing fields.
precedents
it would be similar in function to:
- Academic Organizations – IUPAC, International Astronomical Union, the International Committee on Taxonomy of Viruses, Mathematics Subject Classification (MSC)
- Language Authorities – Académie française, Real Academia Española, Icelandic Language Committee, Pontifical Academy for Latin
- Standards Bodies – IEEE, ISO
- Computer Standards Bodies and Conventions – IETF, Unicode Consortium, POSIX, language specs
naming the naming committee
nomenclator was the title of the ancient roman attendants whose job was to supply names - announcing people by name to their patron.
nomenclator organization and structure
the nomenclator’s human committee would use codified, systematic criteria and a monolithic computer program (the “monolith,” described below) to decide which new concepts are important enough to name.
governance and procedures
~160 members total. small enough that everyone knows everyone (dunbar number considerations), large enough for field representation.
core committees:
- maintainer committee – handles the open source monolith system
- linguistics and cognitive science committee – phonetics, semantics, historical linguistics expertise, how language shapes thought, terminology adoption patterns
- field representatives – rotating 6-year terms, ~5-7 seats per major discipline, elected by professional societies (aps, acs, etc.); one way the rotation could be staggered is sketched below
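as a sketch only: one way the rotation could be staggered so that a discipline’s seats don’t all turn over at once. the 2-year cycle, the cohort scheme, and the seat names are my illustrative assumptions, not part of the proposal.

```python
# hypothetical staggering of field-representative seats: shortened first terms
# of 2/4/6 years so a third of a discipline's seats turn over every 2 years,
# with every renewal afterwards running the full 6-year term.
def initial_terms(seats: list[str], term_years: int = 6, cycle_years: int = 2) -> dict[str, int]:
    """assign each seat the length (in years) of its shortened first term."""
    n_cohorts = term_years // cycle_years  # 3 cohorts for 6-year terms
    return {seat: cycle_years * (i % n_cohorts + 1) for i, seat in enumerate(seats)}

print(initial_terms(["aps-1", "aps-2", "aps-3", "acs-1", "acs-2", "acs-3"]))
# {'aps-1': 2, 'aps-2': 4, 'aps-3': 6, 'acs-1': 2, 'acs-2': 4, 'acs-3': 6}
```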
naming criteria
- category 1: compound existing words for concepts that can be naturally represented with them (e.g. machine learning, social network)
- category 2: classical roots for important concepts or breakthroughs (e.g. genome, cybernetics, meme)
- category 3: entirely new coinages from unused phonic space, for foundational breakthroughs (e.g. quark, qubit); a toy decision rule for these three categories is sketched below
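a toy encoding of this routing as a decision rule; the two boolean inputs compress what would really be committee judgment:

```python
# hypothetical decision rule routing a concept to one of the three categories.
# the two boolean inputs are oversimplifications of committee judgment.
from enum import Enum

class NamingCategory(Enum):
    COMPOUND = 1      # compose existing words ("machine learning")
    CLASSICAL = 2     # coin from classical roots ("genome", "cybernetics")
    NEW_COINAGE = 3   # entirely new form from unused phonic space ("quark")

def assign_category(compoundable: bool, foundational: bool) -> NamingCategory:
    if compoundable:
        return NamingCategory.COMPOUND     # category 1 when plain words suffice
    if foundational:
        return NamingCategory.NEW_COINAGE  # reserve category 3 for foundations
    return NamingCategory.CLASSICAL        # category 2 for the important middle

print(assign_category(compoundable=False, foundational=True))  # NamingCategory.NEW_COINAGE
```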
pure anglosphere participation for now - coordinating across languages adds massive complexity. can expand internationally later.
the monolith
the monolith is an open-source computer program whose algorithmic infrastructure is handled by the core maintainer committee. it would be used both to score the importance of proposed new terminologies and to search for unused phonic space for category 3 terms (a toy sketch of that search appears after the oversight list below).
algorithmic importance calculation (one way to combine these signals is sketched after the list):
- network analysis in the field (co-citation patterns)
- citation velocity for concepts and authors
- nlp vector similarity search to map conceptual relationships
- automated detection of terminological conflicts
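as a sketch only, here is one plausible way to fold those four signals into a single score. every weight, squashing constant, and field name below is a placeholder assumption, not a specification of the monolith:

```python
# hypothetical composite importance score for a proposed concept.
# all weights, constants, and field names are placeholder assumptions.
import math
from dataclasses import dataclass

@dataclass
class ConceptSignals:
    cocitation_centrality: float  # 0-1, e.g. normalized centrality in the co-citation graph
    citation_velocity: float      # citations per month over a recent window
    cross_field_count: int        # distinct fields using the concept (e.g. from embedding maps)
    conflict_score: float         # 0-1, fraction of usages whose embeddings disagree

def importance(s: ConceptSignals,
               w_central=0.35, w_velocity=0.25,
               w_breadth=0.25, w_conflict=0.15) -> float:
    """weighted composite in [0, 1]; weights are illustrative, not calibrated."""
    velocity = 1 - math.exp(-s.citation_velocity / 50)  # squash unbounded velocity into [0, 1)
    breadth = min(s.cross_field_count / 10, 1.0)        # saturate at 10 fields
    # note: terminological conflict *raises* priority - confused terms need naming most
    return (w_central * s.cocitation_centrality
            + w_velocity * velocity
            + w_breadth * breadth
            + w_conflict * s.conflict_score)

print(importance(ConceptSignals(0.6, 120.0, 7, 0.8)))  # ≈ 0.73 → strong candidate
```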
human oversight for edge cases:
- appeals process for contested decisions
- evaluation of concepts the algorithm can’t handle
- final approval for category 3 (new coinage) assignments
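and the unused-phonic-space search mentioned above, as a toy sketch: enumerate pronounceable single-syllable forms from a small inventory and keep the ones no field already uses. the syllable inventories and the taken-terms set are illustrative stand-ins; a real system would need a proper phonotactic model and a full terminology database.

```python
# toy "unused phonic space" search for category 3 coinages.
# the inventories and the taken-terms set are illustrative stand-ins.
from itertools import product

ONSETS = ["b", "d", "fl", "gr", "k", "kw", "m", "pl", "t", "v", "z"]
NUCLEI = ["a", "e", "i", "o", "u", "ar"]
CODAS  = ["", "k", "n", "rk", "t"]

def candidate_coinages(taken: set[str], max_results: int = 8) -> list[str]:
    """enumerate single-syllable onset+nucleus+coda forms not already in use."""
    found = []
    for onset, nucleus, coda in product(ONSETS, NUCLEI, CODAS):
        form = onset + nucleus + coda
        if form not in taken:
            found.append(form)
            if len(found) >= max_results:
                break
    return found

print(candidate_coinages({"ba", "bak", "ban", "bark", "bat", "be"}))
```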
application process for new terms
researchers submit concepts that need terminology through a formal application (a minimal record sketch follows the list):
- provide evidence of conceptual importance (citations, cross-field usage)
- demonstrate existing terminological confusion or gaps
- submit to algorithmic evaluation + human review
- appeals process for contested decisions
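a minimal sketch of what such an application record might look like; every field name here is hypothetical, not a defined schema:

```python
# hypothetical application record; field names are illustrative, not a defined schema.
from dataclasses import dataclass

@dataclass
class TermApplication:
    concept_summary: str           # what the concept is, in one paragraph
    evidence_citations: list[str]  # DOIs / arXiv IDs demonstrating importance
    cross_field_usage: list[str]   # fields where the concept already circulates
    confusion_examples: list[str]  # clashing existing usages, if any
    requested_category: int        # 1, 2, or 3 per the naming criteria
    status: str = "submitted"      # submitted → algorithmic review → human review → decided

app = TermApplication(
    concept_summary="a shared notion of 'emergence' across physics and computer science",
    evidence_citations=["10.0000/placeholder-doi"],  # placeholder, not a real reference
    cross_field_usage=["physics", "computer science", "biology"],
    confusion_examples=["'emergence' in complexity theory vs. in philosophy of mind"],
    requested_category=2,
)
```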
necessity
this is a one-shot decision with permanent consequences. once a nomenclator exists and gains authority, it becomes the standard. any competitor would be like the avignon papacy - a schism that weakens the whole system.
the governance structure choice is permanent and shapes academic discourse forever. there won’t be a “nomenclator 2.0” if the first implementation has flawed governance - the network effects and institutional adoption make the initial design irreversible.
this means getting the governance structure right immediately is crucial. too democratic → paralysis and bikeshedding. too small → vulnerable to capture. the balance has to be perfect from day one.
most “impossible” coordination problems are actually just missing infrastructure. once you build the right tools, the “impossible” stuff becomes trivial.
arxiv has been rising in popularity. this link could simply illustrate that rise, but i use it here to show that the volume individual researchers consume is also rising: arXiv Monthly Downloads
AlphaEvolve: A Gemini-powered coding agent for designing advanced algorithms
a previous discussion the author took part in about “complexodynamics” spent most of its time clearing up term ambiguities: https://scottaaronson.blog/?p=762
NetLogo: Design and Implementation of a Multi-Agent Modeling Environment