Semantic Analysis
After parsing, CDTk constructs a SemanticTable — a category-theoretic representation of the program as objects, morphisms, and two-cells. This table is the universal intermediate form that CDTk uses to translate between any two grammars.
SemanticTable Model
The SemanticTable class and its row types are defined in the CDTk namespace:
public record ObjectRow (string Name, string Type, string? Value);
public record MorphismRow (string Name, string Domain, string Codomain, string Body);
public record TwoCellRow (string Name, string Source, string Target, string Transform);
public class SemanticTable {
    public List<ObjectRow> Objects { get; init; } = [];
    public List<MorphismRow> Morphisms { get; init; } = [];
    public List<TwoCellRow> TwoCells { get; init; } = [];
}
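To make the shape concrete, here is a minimal sketch that fills a table by hand. The row types are repeated from the definitions above only so the snippet compiles on its own (with `new()` in place of `[]` so it also builds on older compilers). The `TableDemo` class and its `Sample` and `Describe` methods are hypothetical helpers, not part of CDTk, and the sample values are illustrative, not actual CDTk output.

```csharp
#nullable enable
using System.Collections.Generic;

// Row types repeated from the definitions above so the sketch is self-contained.
public record ObjectRow(string Name, string Type, string? Value);
public record MorphismRow(string Name, string Domain, string Codomain, string Body);
public record TwoCellRow(string Name, string Source, string Target, string Transform);

public class SemanticTable
{
    public List<ObjectRow> Objects { get; init; } = new();
    public List<MorphismRow> Morphisms { get; init; } = new();
    public List<TwoCellRow> TwoCells { get; init; } = new();
}

// Hypothetical helper for illustration; not part of the CDTk API.
public static class TableDemo
{
    // Builds a small table by hand: one object and one morphism, no two-cells.
    public static SemanticTable Sample()
    {
        var table = new SemanticTable();
        table.Objects.Add(new ObjectRow("x", "int", "42"));
        table.Morphisms.Add(new MorphismRow(
            "Factorial", "int n", "int", "return n * Factorial(n - 1);"));
        return table;
    }

    // Summarizes how many rows of each kind the table holds.
    public static string Describe(SemanticTable t) =>
        $"{t.Objects.Count} object(s), {t.Morphisms.Count} morphism(s), {t.TwoCells.Count} two-cell(s)";
}
```

Calling `TableDemo.Describe(TableDemo.Sample())` reports one object, one morphism, and no two-cells. Because the list properties are `init`-only, the lists themselves can be filled after construction but not replaced.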
Objects (ObjectRow)
An ObjectRow represents a named value in the program — a variable, constant, or literal:
| Field | Description | Example |
|---|---|---|
| Name | Identifier name or token text | "x", "42" |
| Type | Type annotation string | "int", "string" |
| Value | Initial/literal value (optional) | "42", null |
ObjectRow.Name contains the scanned token text (e.g., "if", "42"), not the token name (e.g., "KW_IF"). Pipeline.runWithText substitutes matched text into terminal nodes after parsing.
Morphisms (MorphismRow)
A MorphismRow represents a function or method. In category theory terms, a morphism maps from a domain object to a codomain object:
| Field | Description | C# Example |
|---|---|---|
| Name | Function name | "Factorial" |
| Domain | Parameter list (comma-separated) | "int n" |
| Codomain | Return type | "int" |
| Body | Function body as translated text | "return n * Factorial(n - 1);" |
// The C# function below (base case omitted for brevity) produces this MorphismRow:
// Name="Factorial", Domain="int n", Codomain="int", Body="return n * Factorial(n - 1);"
public static int Factorial(int n) {
    return n * Factorial(n - 1);
}
Two-Cells (TwoCellRow)
A TwoCellRow represents a natural transformation — a mapping between morphisms. This is used for operators, closures, and higher-order transformations:
| Field | Description |
|---|---|
| Name | Transformation name or operator symbol |
| Source | Source morphism name |
| Target | Target morphism name |
| Transform | The transformation expression |
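Since Source and Target hold morphism names, one natural consistency check is that both names resolve to rows in the Morphisms list. The sketch below shows such a check. Whether CDTk performs it is not stated here, and the `TwoCellCheck` class, its `Dangling` method, and any sample values passed to it are hypothetical. The two record types are repeated so the snippet compiles on its own.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;

// Row types repeated from the SemanticTable model so the sketch is self-contained.
public record MorphismRow(string Name, string Domain, string Codomain, string Body);
public record TwoCellRow(string Name, string Source, string Target, string Transform);

// Hypothetical helper for illustration; not part of the CDTk API.
public static class TwoCellCheck
{
    // Returns the two-cells whose Source or Target does not name a known morphism.
    public static List<TwoCellRow> Dangling(
        IEnumerable<MorphismRow> morphisms,
        IEnumerable<TwoCellRow> twoCells)
    {
        var names = new HashSet<string>(morphisms.Select(m => m.Name));
        return twoCells
            .Where(c => !names.Contains(c.Source) || !names.Contains(c.Target))
            .ToList();
    }
}
```

An empty result means every two-cell connects two morphisms that exist in the table.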
Building a SemanticTable
CDTk builds the SemanticTable automatically when you call Compiler.CompileText(). To build and inspect the table directly:
// Build table from C# source
SemanticTable table = Compiler.BuildSemanticTable(
    source: csSource,
    grammar: new CSharpGrammar()
);
Console.WriteLine($"Objects: {table.Objects.Count}");
Console.WriteLine($"Morphisms: {table.Morphisms.Count}");
Console.WriteLine($"TwoCells: {table.TwoCells.Count}");
// Inspect individual functions
foreach (var fn in table.Morphisms)
    Console.WriteLine($"{fn.Codomain} {fn.Name}({fn.Domain}) => {fn.Body}");
CBOR Binary Encoding
The SemanticTable can be serialized to CBOR (Concise Binary Object Representation) for storage or transmission. This is used internally by the binary compilation pipeline:
// Encode to CBOR bytes
byte[] cbor = TableCbor.Encode(table);
File.WriteAllBytes("program.cbor", cbor);
// Decode from CBOR bytes
byte[] bytes = File.ReadAllBytes("program.cbor");
SemanticTable decoded = TableCbor.Decode(bytes);
// Also available via Compiler API
byte[] binary = Compiler.CompileToBinary(
    input: new CSharpGrammar(),
    output: new X86AsmGrammar(),
    text: csSource
);
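For readers unfamiliar with CBOR, the sketch below hand-encodes a single ObjectRow to show what the bytes look like. It assumes a row is written as a three-element CBOR array of text strings, with a missing Value written as CBOR null. That layout is an assumption made for illustration: the wire format TableCbor actually uses is not documented here and may differ. `MiniCbor` is a hypothetical class, and the encoder only handles strings shorter than 256 bytes.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text;

// Row type repeated from the SemanticTable model so the sketch is self-contained.
public record ObjectRow(string Name, string Type, string? Value);

// Hypothetical illustration of CBOR framing; not the TableCbor implementation.
public static class MiniCbor
{
    // Writes a UTF-8 text string (CBOR major type 3), or CBOR null (0xF6) for a null reference.
    private static void WriteText(List<byte> buf, string? s)
    {
        if (s is null) { buf.Add(0xF6); return; }
        byte[] utf8 = Encoding.UTF8.GetBytes(s);
        if (utf8.Length < 24)
            buf.Add((byte)(0x60 | utf8.Length));          // length packed into the header byte
        else if (utf8.Length < 256)
        {
            buf.Add(0x78);                                // header: one-byte length follows
            buf.Add((byte)utf8.Length);
        }
        else
            throw new ArgumentException("This sketch only handles strings under 256 bytes.");
        buf.AddRange(utf8);
    }

    // Encodes one row as a CBOR array of three items: Name, Type, Value.
    public static byte[] EncodeRow(ObjectRow row)
    {
        var buf = new List<byte> { 0x83 };                // 0x83 = array of 3 items
        WriteText(buf, row.Name);
        WriteText(buf, row.Type);
        WriteText(buf, row.Value);
        return buf.ToArray();
    }
}
```

Under these assumptions, `new ObjectRow("x", "int", null)` encodes to the eight bytes `83 61 78 63 69 6E 74 F6`: the array header, the one-byte string "x", the three-byte string "int", and null.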
SDXF Bundle Tags
When the entire compilation context (input grammar + output grammar + source files) is serialized as an SDXF bundle, the following tag layout is used. These tags must match between the C# SdfxEncoder and the F# UAB engine:
| Tag (hex) | Level | Meaning |
|---|---|---|
| 0x01 | Bundle | Root bundle node |
| 0x10 | Bundle | Input grammar definition |
| 0x11 | Bundle | Output grammar definition |
| 0x20 | Bundle | Source inputs container |
| 0x21 | Bundle | Single source input entry |
| 0x30 | Grammar | Tokens container |
| 0x31 | Grammar | Single token entry (name + kind + pattern) |
| 0x40 | Grammar | Rules container |
| 0x41 | Grammar | Single rule entry (kind + children) |
| 0x50 | Grammar | Root rule ID |
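One way to keep two implementations in agreement is to hold the tag values in a single named enumeration rather than scattering hex literals. The sketch below mirrors the table above. The enum name `SdxfTag`, its member names, and the `IsGrammarLevel` helper are hypothetical and may not match the identifiers CDTk uses. Only the numeric values come from the table.

```csharp
// Hypothetical names; only the numeric values are taken from the tag table.
public enum SdxfTag : byte
{
    // Bundle level
    Root          = 0x01,
    InputGrammar  = 0x10,
    OutputGrammar = 0x11,
    SourceInputs  = 0x20,
    SourceInput   = 0x21,

    // Grammar level
    Tokens        = 0x30,
    Token         = 0x31,
    Rules         = 0x40,
    Rule          = 0x41,
    RootRuleId    = 0x50,
}

public static class SdxfTags
{
    // In the table, every grammar-level tag falls in the range 0x30..0x50
    // and every bundle-level tag falls below it.
    public static bool IsGrammarLevel(SdxfTag tag) =>
        (byte)tag >= 0x30 && (byte)tag <= 0x50;
}
```

The range test in `IsGrammarLevel` is an observation about the listed values, not a documented rule, so it would need revisiting if new tags are added.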
Translation Pipeline
The full pipeline from source text to translated output:
// Step 1: Lex + parse source with input grammar
ParseTree tree = inputGrammar.Parse(sourceText);
// Step 2: Assign structural roles to tokens
inputGrammar.ApplyStaticMaps();
// Step 3: Build SemanticTable from parse tree
SemanticTable table = inputGrammar.BuildTable(tree);
// Step 4: Translate table from input to output semantics
SemanticTable translated = Translator.Translate(table, inputGrammar, outputGrammar);
// Step 5: Generate target-language text
string result = outputGrammar.Render(translated);
Call Compiler.CompileText(inputGrammar, outputGrammar, sourceText) to run all five steps in one line.