Semantic Analysis

After parsing, CDTk constructs a SemanticTable — a category-theoretic representation of the program as objects, morphisms, and two-cells. This table is the universal intermediate form that CDTk uses to translate between any two grammars.

SemanticTable Model

The SemanticTable class and its row types are defined in the CDTk namespace:

public record ObjectRow   (string Name, string Type, string? Value);
public record MorphismRow (string Name, string Domain, string Codomain, string Body);
public record TwoCellRow (string Name, string Source, string Target, string Transform);

public class SemanticTable {
    public List<ObjectRow>   Objects   { get; init; } = [];
    public List<MorphismRow> Morphisms { get; init; } = [];
    public List<TwoCellRow>  TwoCells  { get; init; } = [];
}
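As a minimal sketch of how these types fit together, the snippet below fills a table by hand. The type definitions are repeated so the example compiles on its own (with `new()` in place of `[]` so it also builds before C# 12); the `Square` row and its field values are made up for illustration and are not output from CDTk.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Nested collection initializers call Add on the lists the table already owns,
// so they work with the init-only properties.
var table = new SemanticTable
{
    Objects   = { new ObjectRow("x", "int", "42") },
    Morphisms = { new MorphismRow("Square", "int n", "int", "return n * n;") },
};

// Rows are records, so two rows with the same field values compare equal.
bool same = table.Objects[0] == new ObjectRow("x", "int", "42");
Console.WriteLine($"objects={table.Objects.Count}, morphisms={table.Morphisms.Count}, same={same}");

// Local copies of the types shown above (illustrative, not the CDTk assembly).
public record ObjectRow(string Name, string Type, string? Value);
public record MorphismRow(string Name, string Domain, string Codomain, string Body);
public record TwoCellRow(string Name, string Source, string Target, string Transform);

public class SemanticTable
{
    public List<ObjectRow>   Objects   { get; init; } = new();
    public List<MorphismRow> Morphisms { get; init; } = new();
    public List<TwoCellRow>  TwoCells  { get; init; } = new();
}
```

Because the rows are immutable records, a changed row is produced with a `with` expression (for example `row with { Value = "43" }`) rather than by mutating the original.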

Objects (ObjectRow)

An ObjectRow represents a named value in the program — a variable, constant, or literal:

Field | Description                      | Example
Name  | Identifier name or token text    | "x", "42"
Type  | Type annotation string           | "int", "string"
Value | Initial/literal value (optional) | "42", null

💡 Terminal Text Contract
Terminal ObjectRow.Name contains the scanned token text (e.g., "if", "42"), not the token name (e.g., "KW_IF"). Pipeline.runWithText substitutes matched text into terminal nodes after parsing.
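The contract can be illustrated with a small sketch. `TerminalNode`, `TerminalRows.ToObjectRow`, and the "keyword" and "INT_LITERAL" labels are hypothetical names invented for this example; they are not part of the CDTk API, and the real substitution happens inside the pipeline.

```csharp
#nullable enable
using System;

// A terminal carries both the grammar's token name and the text the scanner matched.
var ifNode  = new TerminalNode(TokenName: "KW_IF",       Text: "if");
var numNode = new TerminalNode(TokenName: "INT_LITERAL", Text: "42");

Console.WriteLine(TerminalRows.ToObjectRow(ifNode, "keyword").Name); // prints: if
Console.WriteLine(TerminalRows.ToObjectRow(numNode, "int").Name);    // prints: 42

// Hypothetical parse-tree leaf (assumption, for illustration only).
public record TerminalNode(string TokenName, string Text);
public record ObjectRow(string Name, string Type, string? Value);

public static class TerminalRows
{
    // Name receives the scanned text, never the token name.
    public static ObjectRow ToObjectRow(TerminalNode node, string type) =>
        new ObjectRow(Name: node.Text, Type: type, Value: null);
}
```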

Morphisms (MorphismRow)

A MorphismRow represents a function or method. In category theory terms, a morphism maps from a domain object to a codomain object:

Field    | Description                      | C# Example
Name     | Function name                    | "Factorial"
Domain   | Parameter list (comma-separated) | "int n"
Codomain | Return type                      | "int"
Body     | Function body as translated text | "return n <= 1 ? 1 : n * Factorial(n - 1);"

// The C# function below produces this MorphismRow:
// Name="Factorial", Domain="int n", Codomain="int",
// Body="return n <= 1 ? 1 : n * Factorial(n - 1);"
public static int Factorial(int n) {
    // Base case: 0! and 1! are both 1, which ends the recursion.
    return n <= 1 ? 1 : n * Factorial(n - 1);
}
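Since Domain is stored as one comma-separated string, consumers may want the individual parameters back. The sketch below is one way to split it; `DomainParser` and the `Add` row are hypothetical, and the naive comma split assumes simple "type name" parameters (a generic type such as `Dictionary<string, int>` would need a real parser).

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

var fn = new MorphismRow("Add", "int a, int b", "int", "return a + b;");
foreach (var (type, name) in DomainParser.Parameters(fn.Domain))
    Console.WriteLine($"{name}: {type}");

public record MorphismRow(string Name, string Domain, string Codomain, string Body);

public static class DomainParser
{
    // Splits "int a, int b" into (Type, Name) pairs; the last space in each
    // entry separates the type from the parameter name.
    public static List<(string Type, string Name)> Parameters(string domain) =>
        domain.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
              .Select(p =>
              {
                  int cut = p.LastIndexOf(' ');
                  return cut < 0
                      ? (Type: "", Name: p)
                      : (Type: p[..cut].TrimEnd(), Name: p[(cut + 1)..]);
              })
              .ToList();
}
```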

Two-Cells (TwoCellRow)

A TwoCellRow represents a two-cell: a transformation from one morphism to another, in the way a natural transformation maps one functor to another. CDTk uses two-cells for operators, closures, and higher-order transformations:

Field     | Description
Name      | Transformation name or operator symbol
Source    | Source morphism name
Target    | Target morphism name
Transform | The transformation expression
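Because Source and Target refer to morphisms by name, a table is only coherent if those names resolve. The sketch below checks that; `TwoCellChecks.Dangling` is a hypothetical helper and the row values are invented, so treat it as an illustration of the field relationships rather than a CDTk feature.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var morphisms = new List<MorphismRow>
{
    new("Fib",     "int n", "int", "return n < 2 ? n : Fib(n - 1) + Fib(n - 2);"),
    new("FibMemo", "int n", "int", "return cache[n];"),
};
var twoCells = new List<TwoCellRow>
{
    new("memoize", "Fib", "FibMemo", "cache results by n"), // both names resolve
    new("broken",  "Fib", "Missing", "n/a"),                // Target does not exist
};

foreach (var cell in TwoCellChecks.Dangling(morphisms, twoCells))
    Console.WriteLine($"{cell.Name}: {cell.Source} -> {cell.Target}"); // prints: broken: Fib -> Missing

public record MorphismRow(string Name, string Domain, string Codomain, string Body);
public record TwoCellRow(string Name, string Source, string Target, string Transform);

public static class TwoCellChecks
{
    // Returns every two-cell whose Source or Target names no known morphism.
    public static List<TwoCellRow> Dangling(
        IEnumerable<MorphismRow> morphisms, IEnumerable<TwoCellRow> twoCells)
    {
        var known = new HashSet<string>(morphisms.Select(m => m.Name));
        return twoCells
            .Where(c => !known.Contains(c.Source) || !known.Contains(c.Target))
            .ToList();
    }
}
```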

Building a SemanticTable

CDTk builds the SemanticTable automatically when you call Compiler.CompileText(). To build and inspect the table directly, call Compiler.BuildSemanticTable():

// Build table from C# source
SemanticTable table = Compiler.BuildSemanticTable(
    source:  csSource,
    grammar: new CSharpGrammar()
);

Console.WriteLine($"Objects:   {table.Objects.Count}");
Console.WriteLine($"Morphisms: {table.Morphisms.Count}");
Console.WriteLine($"TwoCells:  {table.TwoCells.Count}");

// Inspect individual functions
foreach (var fn in table.Morphisms)
    Console.WriteLine($"{fn.Codomain} {fn.Name}({fn.Domain}) => {fn.Body}");

CBOR Binary Encoding

The SemanticTable can be serialized to CBOR (Concise Binary Object Representation) for storage or transmission. This is used internally by the binary compilation pipeline:

// Encode to CBOR bytes
byte[] cbor = TableCbor.Encode(table);
File.WriteAllBytes("program.cbor", cbor);

// Decode from CBOR bytes
byte[] bytes = File.ReadAllBytes("program.cbor");
SemanticTable decoded = TableCbor.Decode(bytes);

// Also available via Compiler API
byte[] binary = Compiler.CompileToBinary(
    input:  new CSharpGrammar(),
    output: new X86AsmGrammar(),
    text:   csSource
);
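To give a feel for what CBOR bytes look like, here is a hand-rolled sketch that encodes one ObjectRow as a three-element CBOR array following RFC 8949. The array-of-fields layout and the `MiniCbor` helper are assumptions made for illustration; the actual wire layout used by TableCbor is not described here and may differ.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text;

byte[] bytes = MiniCbor.EncodeObjectRow(new ObjectRow("x", "int", "42"));
Console.WriteLine(Convert.ToHexString(bytes)); // prints: 83617863696E74623432

public record ObjectRow(string Name, string Type, string? Value);

public static class MiniCbor
{
    // Writes a CBOR head: 3-bit major type plus an unsigned argument.
    static void WriteHead(List<byte> output, int majorType, uint argument)
    {
        int major = majorType << 5;
        if (argument < 24)
        {
            output.Add((byte)(major | (int)argument));      // argument fits in the head byte
        }
        else if (argument <= 0xFF)
        {
            output.Add((byte)(major | 24));                  // 1-byte argument follows
            output.Add((byte)argument);
        }
        else if (argument <= 0xFFFF)
        {
            output.Add((byte)(major | 25));                  // 2-byte big-endian argument
            output.Add((byte)(argument >> 8));
            output.Add((byte)argument);
        }
        else
        {
            output.Add((byte)(major | 26));                  // 4-byte big-endian argument
            for (int shift = 24; shift >= 0; shift -= 8)
                output.Add((byte)(argument >> shift));
        }
    }

    static void WriteText(List<byte> output, string? text)
    {
        if (text is null) { output.Add(0xF6); return; }      // CBOR simple value null
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        WriteHead(output, 3, (uint)utf8.Length);             // major type 3 = text string
        output.AddRange(utf8);
    }

    // Assumed layout: [Name, Type, Value-or-null].
    public static byte[] EncodeObjectRow(ObjectRow row)
    {
        var output = new List<byte>();
        WriteHead(output, 4, 3);                             // major type 4 = array of 3 items
        WriteText(output, row.Name);
        WriteText(output, row.Type);
        WriteText(output, row.Value);
        return output.ToArray();
    }
}
```

In the printed bytes, 0x83 opens a three-item array, and 0x61, 0x63, and 0x62 introduce text strings of length 1, 3, and 2.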

SDXF Bundle Tags

When the entire compilation context (input grammar + output grammar + source files) is serialized as an SDXF bundle, the following tag layout is used. These tags must match between the C# SdfxEncoder and the F# UAB engine:

Tag (hex) | Level   | Meaning
0x01      | Bundle  | Root bundle node
0x10      | Bundle  | Input grammar definition
0x11      | Bundle  | Output grammar definition
0x20      | Bundle  | Source inputs container
0x21      | Bundle  | Single source input entry
0x30      | Grammar | Tokens container
0x31      | Grammar | Single token entry (name + kind + pattern)
0x40      | Grammar | Rules container
0x41      | Grammar | Single rule entry (kind + children)
0x50      | Grammar | Root rule ID
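One way to keep the two sides in agreement is to define the tags once as named constants. The sketch below mirrors the table as a byte-backed enum; `SdxfTag`, its member names, and `SdxfTags.LevelOf` are hypothetical and are not taken from the CDTk source.

```csharp
using System;

Console.WriteLine(SdxfTags.LevelOf(SdxfTag.TokenEntry));   // prints: Grammar
Console.WriteLine($"0x{(byte)SdxfTag.RootRuleId:X2}");     // prints: 0x50

// Tag values copied from the table above; member names are illustrative.
public enum SdxfTag : byte
{
    BundleRoot    = 0x01,
    InputGrammar  = 0x10,
    OutputGrammar = 0x11,
    SourceInputs  = 0x20,
    SourceInput   = 0x21,
    Tokens        = 0x30,
    TokenEntry    = 0x31,
    Rules         = 0x40,
    RuleEntry     = 0x41,
    RootRuleId    = 0x50,
}

public static class SdxfTags
{
    // Maps each tag to the level at which it may appear; unknown values are rejected.
    public static string LevelOf(SdxfTag tag) => tag switch
    {
        SdxfTag.BundleRoot or SdxfTag.InputGrammar or SdxfTag.OutputGrammar
            or SdxfTag.SourceInputs or SdxfTag.SourceInput => "Bundle",
        SdxfTag.Tokens or SdxfTag.TokenEntry or SdxfTag.Rules
            or SdxfTag.RuleEntry or SdxfTag.RootRuleId => "Grammar",
        _ => throw new ArgumentOutOfRangeException(nameof(tag), tag, "Unknown SDXF tag"),
    };
}
```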

Translation Pipeline

The full pipeline from source text to translated output:

// Step 1: Lex + parse source with input grammar
ParseTree tree = inputGrammar.Parse(sourceText);

// Step 2: Assign structural roles to tokens
inputGrammar.ApplyStaticMaps();

// Step 3: Build SemanticTable from parse tree
SemanticTable table = inputGrammar.BuildTable(tree);

// Step 4: Translate table from input to output semantics
SemanticTable translated = Translator.Translate(table, inputGrammar, outputGrammar);

// Step 5: Generate target-language text
string result = outputGrammar.Render(translated);

All in one call
Use Compiler.CompileText(inputGrammar, outputGrammar, sourceText) to run all five steps in a single call.