Examples

Real-world CDTk examples demonstrating the full compilation pipeline across multiple target languages and binary formats.

C# ↔ Python Round-trip

Translate C# functions to Python and back. The round-trip verifies that no semantic information is lost:

using CDTk;

var csSource = """
public static int Factorial(int n) {
    if (n <= 1) { return 1; }
    return n * Factorial(n - 1);
}
""";

// C# → Python
string py = Compiler.CompileText(
    new CSharpGrammar(), new PythonGrammar(), csSource);

Console.WriteLine(py);
// def Factorial(n):
//     if n <= 1:
//         return 1
//     return n * Factorial(n - 1)

// Python → C# (round-trip)
string csBack = Compiler.CompileText(
    new PythonGrammar(), new CSharpGrammar(), py);

Console.WriteLine(csBack);
// public static int Factorial(int n) { ... }

C# → WebAssembly (WAT)

WasmGrammar.Render() converts Python/C# infix expressions to WAT stack instructions. The output is valid WebAssembly Text Format that can be assembled with wat2wasm:

// C# → WAT (via semantic table)
string wat = Compiler.CompileText(
    new CSharpGrammar(), new WasmGrammar(), csSource);

Console.WriteLine(wat);
// (module
//   (func $Factorial (param $n i32) (result i32)
//     local.get $n
//     i32.const 1
//     i32.le_s
//     if (result i32)
//       i32.const 1
//     else
//       local.get $n
//       local.get $n
//       i32.const 1
//       i32.sub
//       call $Factorial
//       i32.mul
//     end
//   )
//   (export "Factorial" (func $Factorial))
// )

// WAT → C# (decompilation)
string csDecompiled = Compiler.CompileText(
    new WasmGrammar(), new CSharpGrammar(), wat);
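
The infix-to-stack conversion mentioned above can be illustrated with a small standalone sketch. This is not the code behind WasmGrammar.Render(); `ToWatStack` is a hypothetical helper that applies the classic shunting-yard algorithm to a space-separated expression. It assumes every value is an i32 and every identifier is a local.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of CDTk: converts a space-separated infix
// expression over i32 values into WAT stack instructions (shunting-yard).
static List<string> ToWatStack(string infix)
{
    // Operator -> (precedence, WAT instruction); all are left-associative.
    var ops = new Dictionary<string, (int Prec, string Instr)>
    {
        ["<="] = (1, "i32.le_s"),
        ["+"]  = (2, "i32.add"),
        ["-"]  = (2, "i32.sub"),
        ["*"]  = (3, "i32.mul"),
    };

    var output = new List<string>();
    var pending = new Stack<string>();

    foreach (var tok in infix.Split(' ', StringSplitOptions.RemoveEmptyEntries))
    {
        if (ops.TryGetValue(tok, out var op))
        {
            // Emit pending operators that bind at least as tightly as this one.
            while (pending.Count > 0
                   && ops.TryGetValue(pending.Peek(), out var top)
                   && top.Prec >= op.Prec)
            {
                output.Add(top.Instr);
                pending.Pop();
            }
            pending.Push(tok);
        }
        else if (tok == "(")
        {
            pending.Push(tok);
        }
        else if (tok == ")")
        {
            while (pending.Count > 0 && pending.Peek() != "(")
                output.Add(ops[pending.Pop()].Instr);
            if (pending.Count == 0) throw new FormatException("Unbalanced ')'");
            pending.Pop(); // discard the matching "("
        }
        else if (int.TryParse(tok, out _))
        {
            output.Add($"i32.const {tok}");   // integer literal
        }
        else
        {
            output.Add($"local.get ${tok}");  // anything else is treated as a local
        }
    }

    // Flush whatever operators remain.
    while (pending.Count > 0)
    {
        var rest = pending.Pop();
        if (rest == "(") throw new FormatException("Unbalanced '('");
        output.Add(ops[rest].Instr);
    }
    return output;
}

foreach (var line in ToWatStack("n * ( n - 1 )"))
    Console.WriteLine(line);
```

Operands are emitted as soon as they are read and operators are emitted only after both of their operands, which is the ordering a stack machine needs. Function calls such as Factorial(n - 1) are outside the scope of this sketch.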

C# → LLVM IR

LlvmGrammar targets LLVM Intermediate Representation, which can then be optimized with opt and compiled to any platform with llc:

string llvm = Compiler.CompileText(
    new CSharpGrammar(), new LlvmGrammar(), csSource);

Console.WriteLine(llvm);
// define i32 @Factorial(i32 %n) {
// entry:
//   %cond = icmp sle i32 %n, 1
//   br i1 %cond, label %then, label %else
// then:
//   ret i32 1
// else:
//   %nm1 = sub i32 %n, 1
//   %rec = call i32 @Factorial(i32 %nm1)
//   %res = mul i32 %n, %rec
//   ret i32 %res
// }

// LLVM IR token count: 147

Multi-step Translation Chain

CDTk translations compose freely. Each output can become the input for the next step:

var cs   = new CSharpGrammar();
var py   = new PythonGrammar();
var wasm = new WasmGrammar();
var llvm = new LlvmGrammar();

// C# → Python → WASM → LLVM → back to C#
var step1 = Compiler.CompileText(cs,   py,   csSource);
var step2 = Compiler.CompileText(py,   wasm, step1);
var step3 = Compiler.CompileText(wasm, llvm, step2);
var step4 = Compiler.CompileText(llvm, cs,   step3);

// step4 is semantically equivalent to csSource
Console.WriteLine(step4);
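
Because every step has the same shape (text in, text out), a chain like this can also be written as a fold over a list of stages. The sketch below is only an illustration: `RunChain` is a hypothetical helper, and the stages are plain string functions standing in for Compiler.CompileText calls so that the example runs without CDTk.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: threads a source string through each stage in order.
// In real use a stage would wrap a translation call, for example
//   src => Compiler.CompileText(cs, py, src)
static string RunChain(
    string source,
    IEnumerable<(string Name, Func<string, string> Translate)> stages)
{
    return stages.Aggregate(source, (text, stage) =>
    {
        var next = stage.Translate(text);
        // Log each hop so a failing stage is easy to locate.
        Console.WriteLine($"{stage.Name}: {text.Length} -> {next.Length} chars");
        return next;
    });
}

// Stand-in stages (assumptions for illustration, not real grammars).
var stages = new (string Name, Func<string, string> Translate)[]
{
    ("upper", s => s.ToUpperInvariant()),
    ("trim",  s => s.Trim()),
};

Console.WriteLine(RunChain("  factorial  ", stages));
```

Keeping the stages in a list makes it straightforward to reorder them or to insert an equivalence check between hops.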

In-memory Binary Compilation

Use the binary compilation API to avoid temporary files entirely:

// CompileToBinary: text → CBOR binary (SemanticTable)
byte[] cbor = Compiler.CompileToBinary(
    input:  new CSharpGrammar(),
    output: new WasmGrammar(),
    text:   csSource);

// DecompileFromBinary: CBOR binary → text
string decompiled = Compiler.DecompileFromBinary(
    input:  new WasmGrammar(),
    output: new CSharpGrammar(),
    data:   cbor);

// CompileBinary: binary in → binary out (e.g., CBOR → PE EXE)
byte[] exe = Compiler.CompileBinary(
    input:  new CSharpGrammar(),
    output: new X86AsmGrammar(),
    data:   cbor);

CRAB: Native x86-64 PE EXE

The CRAB project compiles C# source directly to a native Windows PE32+ executable. The pipeline is CSharpGrammar → X86AsmGrammar → X86AsmParser → X86CodeGen → PeWriter:

// Run with: dotnet run --project CRAB/CRAB.csproj

var csSource = """
public static void Main() {
    Console.WriteLine("Hello, World!");
}
""";

// Full C# → x86-64 EXE pipeline
byte[] exe = CrabCompiler.Compile(csSource);
File.WriteAllBytes("hello.exe", exe);

// hello.exe is a valid 64-bit PE32+ executable
// Run with: ./hello.exe  (Windows x86-64)
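
One way to sanity-check the claim that the output is a PE32+ image is to inspect its headers directly. The sketch below is not part of CRAB; `IsPe32Plus` is a hypothetical helper. The offsets it uses come from the PE/COFF format: the "MZ" magic, the e_lfanew field at 0x3C, the "PE\0\0" signature, and the optional-header magic 0x20B that marks PE32+.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

// Hypothetical helper, not part of CRAB: checks the headers of a PE image.
static bool IsPe32Plus(byte[] image)
{
    // DOS header: "MZ" magic; e_lfanew (offset of the PE header) is at 0x3C.
    if (image.Length < 0x40 || image[0] != (byte)'M' || image[1] != (byte)'Z')
        return false;
    int peOffset = BinaryPrimitives.ReadInt32LittleEndian(image.AsSpan(0x3C, 4));

    // Need the 4-byte signature, the 20-byte COFF header,
    // and the 2-byte optional-header magic.
    if (peOffset < 0 || peOffset > image.Length - 26)
        return false;

    // PE signature: "PE\0\0".
    if (image[peOffset] != (byte)'P' || image[peOffset + 1] != (byte)'E'
        || image[peOffset + 2] != 0 || image[peOffset + 3] != 0)
        return false;

    // The optional header follows the COFF header; magic 0x20B means PE32+.
    ushort magic = BinaryPrimitives.ReadUInt16LittleEndian(image.AsSpan(peOffset + 24, 2));
    return magic == 0x20B;
}

// Only inspects the file if it exists in the working directory.
if (File.Exists("hello.exe"))
    Console.WriteLine(IsPe32Plus(File.ReadAllBytes("hello.exe")));
```

This check covers only the header magic values. It does not validate section tables, imports, or the entry point.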

CRAB uses three internal components, all in CRAB/X86BinaryBackend.cs:

Component      Role
X86AsmParser   Parses CDTk's x86 assembly text into an AST
X86CodeGen     Encodes AST nodes to x86-64 machine code bytes
PeWriter       Wraps machine code in a valid PE32+ executable

QUILL: ML Token Classification

QUILL uses ML.NET (SDCA) to classify tokens based on 128-dimensional feature vectors. It's used to improve CDTk's grammar disambiguation:

// Run with: dotnet run --project QUILL/QUILL.csproj

// Training.csv format: 128 float columns + 1 label column = 129 total
// Features are extracted by FSharp.CDTk/UAB.fs Features.extract
// Features.extract pads/truncates to exactly 128 dimensions

// QUILL ML.NET pipeline (QUILL/Program.cs):
// 1. Load Training.csv
// 2. MapValueToKey("Label")   ← required: SDCA needs key-type labels
// 3. Concatenate 128 features → "Features" vector
// 4. Train SDCA multi-class classifier
// 5. Evaluate on test set → report accuracy

Required: MapValueToKey

QUILL must call MapValueToKey("Label") before the SDCA trainer because Training.csv labels are strings. Without it, the pipeline throws SchemaShape mismatch: Label is not a key type.
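
The fixed 129-column row layout described above can be sketched in a few lines. This is a hypothetical C# analogue, not the extractor in FSharp.CDTk/UAB.fs. `FitToDimensions` and `ToCsvRow` are illustrative names, and padding with zeros is an assumption, since the source only says that vectors are padded or truncated.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical helper: pads with zeros (an assumption) or truncates so the
// vector has exactly `dimensions` entries.
static float[] FitToDimensions(float[] features, int dimensions = 128)
{
    var fitted = new float[dimensions];   // new arrays are zero-initialized
    Array.Copy(features, fitted, Math.Min(features.Length, dimensions));
    return fitted;
}

// Hypothetical helper: formats one Training.csv row as
// 128 float columns followed by 1 label column (129 total).
static string ToCsvRow(float[] features, string label)
{
    var cells = FitToDimensions(features)
        .Select(f => f.ToString(CultureInfo.InvariantCulture));
    return string.Join(",", cells) + "," + label;
}

var row = ToCsvRow(new[] { 0.5f, 1f, 0.25f }, "keyword");
Console.WriteLine(row.Split(',').Length);
```

Formatting with the invariant culture keeps the decimal separator a period, so a comma-delimited file stays parseable regardless of the machine's locale.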

9-Step Stress Test

The testing project runs a 9-step translation chain to verify the full pipeline end-to-end. It translates 74 C# functions through C#→Py→WASM→LLVM→Py→C#→WASM→Py→LLVM→C# and verifies semantic equivalence at each step:

# Run all tests
dotnet run --project Testing/Testing.csproj

# Expected output (all phases PASS):
# Phase 1: C#    → Python  [PASS]  74 functions
# Phase 2: Py    → C# (rt) [PASS]  round-trip verified
# Phase 3: Py    → C#      [PASS]  74 functions
# Phase 4: C#    → Py (rt) [PASS]  round-trip verified
# Phase 5: C#    → WASM    [PASS]  74 functions
# Phase 6: WASM  → C# (rt) [PASS]  round-trip verified
# Phase 7: C#    → LLVM    [PASS]  74 functions
# Phase 8: LLVM  → C# (rt) [PASS]  round-trip verified
# Phase 9: Py    → WASM    [PASS]  57 functions
# ✓ All 9 phases PASS

Grammar token counts used in the stress test:

Grammar      Token Count   File
C#           107           Testing/Grammars/CSharpGrammar.cs
Python       85            Testing/Grammars/PythonGrammar.cs
WASM (WAT)   107           Testing/Grammars/WasmGrammar.cs
LLVM IR      147           Testing/Grammars/LlvmGrammar.cs

The test uses const UserFuncCount = 9 to split the function list between stdlib functions and user example functions. Update this constant if you add or remove example functions.