Examples
Real-world CDTk examples demonstrating the full compilation pipeline across multiple target languages and binary formats.
C# ↔ Python Round-trip
Translate C# functions to Python and back. The round-trip verifies that no semantic information is lost:
using CDTk;
var csSource = """
public static int Factorial(int n) {
    if (n <= 1) { return 1; }
    return n * Factorial(n - 1);
}
""";
// C# → Python
string py = Compiler.CompileText(
new CSharpGrammar(), new PythonGrammar(), csSource);
Console.WriteLine(py);
// def Factorial(n):
//     if n <= 1:
//         return 1
//     return n * Factorial(n - 1)
// Python → C# (round-trip)
string csBack = Compiler.CompileText(
new PythonGrammar(), new CSharpGrammar(), py);
Console.WriteLine(csBack);
// public static int Factorial(int n) { ... }
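The round-trip claim above can be turned into an automated check. The following is a minimal sketch, not part of CDTk: `RoundTripCheck`, `Normalize`, and `Survives` are hypothetical names, the two translators are passed in as plain delegates, and whitespace-insensitive text equality is only a rough proxy for semantic equivalence.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not a CDTk API): checks that translating forward and
// back again reproduces the original source, ignoring whitespace differences.
public static class RoundTripCheck
{
    // Collapse every run of whitespace to one space so formatting is ignored.
    public static string Normalize(string source) =>
        Regex.Replace(source, @"\s+", " ").Trim();

    // True when backward(forward(source)) matches source after normalization.
    public static bool Survives(
        string source,
        Func<string, string> forward,
        Func<string, string> backward)
    {
        string roundTripped = backward(forward(source));
        return Normalize(roundTripped) == Normalize(source);
    }
}
```

With the grammars from the example, the delegates would presumably be `s => Compiler.CompileText(new CSharpGrammar(), new PythonGrammar(), s)` and its reverse. A stricter check would compare parsed structure rather than text.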
C# → WebAssembly (WAT)
WasmGrammar.Render() converts Python/C# infix expressions to WAT stack instructions. The output is valid WebAssembly Text Format that can be assembled with wat2wasm:
// C# → WAT (via semantic table)
string wat = Compiler.CompileText(
new CSharpGrammar(), new WasmGrammar(), csSource);
Console.WriteLine(wat);
// (module
//   (func $Factorial (param $n i32) (result i32)
//     local.get $n
//     i32.const 1
//     i32.le_s
//     if (result i32)
//       i32.const 1
//     else
//       local.get $n
//       local.get $n
//       i32.const 1
//       i32.sub
//       call $Factorial
//       i32.mul
//     end
//   )
//   (export "Factorial" (func $Factorial))
// )
// WAT → C# (decompilation)
string csDecompiled = Compiler.CompileText(
new WasmGrammar(), new CSharpGrammar(), wat);
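To illustrate the infix-to-stack conversion described above, here is a small post-order walk over an expression tree. It is a sketch of the general technique and not the code inside WasmGrammar.Render(); the `Expr` records and the `WatSketch` class are hypothetical, and only a handful of `i32` operators are covered.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical expression tree; CDTk's real representation may differ.
public abstract record Expr;
public sealed record Local(string Name) : Expr;
public sealed record Const(int Value) : Expr;
public sealed record Binary(string Op, Expr Left, Expr Right) : Expr;
public sealed record Call(string Function, Expr Argument) : Expr;

public static class WatSketch
{
    // Infix operator → WAT instruction (signed 32-bit integers only).
    private static readonly Dictionary<string, string> Instructions = new()
    {
        ["+"] = "i32.add",
        ["-"] = "i32.sub",
        ["*"] = "i32.mul",
        ["<="] = "i32.le_s",
    };

    public static List<string> Emit(Expr expr)
    {
        var output = new List<string>();
        Walk(expr, output);
        return output;
    }

    // Post-order traversal: operands are pushed first, then the operator.
    private static void Walk(Expr expr, List<string> output)
    {
        switch (expr)
        {
            case Local local:
                output.Add("local.get $" + local.Name);
                break;
            case Const constant:
                output.Add("i32.const " + constant.Value);
                break;
            case Binary binary:
                Walk(binary.Left, output);
                Walk(binary.Right, output);
                output.Add(Instructions[binary.Op]);
                break;
            case Call call:
                Walk(call.Argument, output);
                output.Add("call $" + call.Function);
                break;
            default:
                throw new ArgumentException("Unsupported node: " + expr);
        }
    }
}
```

Emitting the tree for `n * Factorial(n - 1)` yields the same six instructions that appear in the `else` branch of the WAT output above.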
C# → LLVM IR
LlvmGrammar targets LLVM Intermediate Representation, which can then be optimized with opt and compiled to any platform with llc:
string llvm = Compiler.CompileText(
new CSharpGrammar(), new LlvmGrammar(), csSource);
Console.WriteLine(llvm);
// define i32 @Factorial(i32 %n) {
// entry:
//   %cond = icmp sle i32 %n, 1
//   br i1 %cond, label %then, label %else
// then:
//   ret i32 1
// else:
//   %nm1 = sub i32 %n, 1
//   %rec = call i32 @Factorial(i32 %nm1)
//   %res = mul i32 %n, %rec
//   ret i32 %res
// }
// LLVM IR token count: 147
Multi-step Translation Chain
CDTk translations compose freely. Each output can become the input for the next step:
var cs = new CSharpGrammar();
var py = new PythonGrammar();
var wasm = new WasmGrammar();
var llvm = new LlvmGrammar();
// C# → Python → WASM → LLVM → back to C#
var step1 = Compiler.CompileText(cs, py, csSource);
var step2 = Compiler.CompileText(py, wasm, step1);
var step3 = Compiler.CompileText(wasm, llvm, step2);
var step4 = Compiler.CompileText(llvm, cs, step3);
// step4 is semantically equivalent to csSource
Console.WriteLine(step4);
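When a chain grows beyond a few steps, it can help to keep every intermediate output so a failure can be traced to the step that introduced it. The sketch below is one possible way to do that; `TranslationChain` is a hypothetical helper, and each step is assumed to be a delegate wrapping a `Compiler.CompileText` call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a CDTk API): feeds each step the previous output.
public static class TranslationChain
{
    // Returns the original source at index 0 followed by one entry per step.
    public static List<string> Run(
        string source,
        IEnumerable<Func<string, string>> steps)
    {
        var outputs = new List<string> { source };
        foreach (var step in steps)
        {
            // The latest output becomes the input of the next step.
            outputs.Add(step(outputs[outputs.Count - 1]));
        }
        return outputs;
    }
}
```

For the chain above, the first step would be `s => Compiler.CompileText(cs, py, s)`, the second `s => Compiler.CompileText(py, wasm, s)`, and so on; the last list entry then corresponds to `step4`.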
In-memory Binary Compilation
Use the binary compilation API to avoid temporary files entirely:
// CompileToBinary: text → CBOR binary (SemanticTable)
byte[] cbor = Compiler.CompileToBinary(
input: new CSharpGrammar(),
output: new WasmGrammar(),
text: csSource);
// DecompileFromBinary: CBOR binary → text
string decompiled = Compiler.DecompileFromBinary(
input: new WasmGrammar(),
output: new CSharpGrammar(),
data: cbor);
// CompileBinary: binary in → binary out (e.g., CBOR → PE EXE)
byte[] exe = Compiler.CompileBinary(
input: new CSharpGrammar(),
output: new X86AsmGrammar(),
data: cbor);
CRAB: Native x86-64 PE EXE
The CRAB project compiles C# source directly to a native Windows PE32+ executable. The pipeline runs CSharpGrammar → X86AsmGrammar → X86AsmParser → X86CodeGen → PeWriter:
// Run with: dotnet run --project CRAB/CRAB.csproj
var csSource = """
public static void Main() {
    Console.WriteLine("Hello, World!");
}
""";
// Full C# → x86-64 EXE pipeline
byte[] exe = CrabCompiler.Compile(csSource);
File.WriteAllBytes("hello.exe", exe);
// hello.exe is a valid 64-bit PE32+ executable
// Run with: ./hello.exe (Windows x86-64)
CRAB uses three internal components, all in CRAB/X86BinaryBackend.cs:
| Component | Role |
|---|---|
| X86AsmParser | Parses CDTk's x86 assembly text into an AST |
| X86CodeGen | Encodes AST nodes to x86-64 machine code bytes |
| PeWriter | Wraps machine code in a valid PE32+ executable |
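A quick way to sanity-check the bytes produced by a PE writer is to inspect the header fields that identify a 64-bit image. The sketch below is independent of CRAB (`PeCheck` is a hypothetical name) and relies only on the published PE/COFF layout: the `MZ` magic, the PE header offset stored at 0x3C, the `PE\0\0` signature, machine type 0x8664, and optional-header magic 0x20B for PE32+.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical checker (not part of CRAB): validates only the identifying
// header fields, not sections, imports, or the entry point.
public static class PeCheck
{
    public static bool LooksLikePe32PlusX64(byte[] image)
    {
        // DOS header: "MZ" magic, with the PE header offset stored at 0x3C.
        if (image.Length < 0x40 || image[0] != (byte)'M' || image[1] != (byte)'Z')
            return false;

        int peOffset = BinaryPrimitives.ReadInt32LittleEndian(image.AsSpan(0x3C, 4));
        // Signature (4 bytes) + COFF header (20 bytes) + 2-byte magic = 26 bytes.
        if (peOffset < 0 || peOffset > image.Length - 26)
            return false;

        ReadOnlySpan<byte> pe = image.AsSpan(peOffset);
        bool signature = pe[0] == (byte)'P' && pe[1] == (byte)'E' && pe[2] == 0 && pe[3] == 0;

        // Machine is the first COFF field; 0x8664 means x86-64.
        ushort machine = BinaryPrimitives.ReadUInt16LittleEndian(pe.Slice(4, 2));
        // The optional header follows the COFF header; 0x20B means PE32+.
        ushort magic = BinaryPrimitives.ReadUInt16LittleEndian(pe.Slice(24, 2));

        return signature && machine == 0x8664 && magic == 0x020B;
    }
}
```

Calling `PeCheck.LooksLikePe32PlusX64(exe)` before `File.WriteAllBytes` would catch a malformed header early; it says nothing about whether the code inside actually runs.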
QUILL: ML Token Classification
QUILL uses ML.NET (SDCA) to classify tokens based on 128-dimensional feature vectors. It's used to improve CDTk's grammar disambiguation:
// Run with: dotnet run --project QUILL/QUILL.csproj
// Training.csv format: 128 float columns + 1 label column = 129 total
// Features are extracted by FSharp.CDTk/UAB.fs Features.extract
// Extract pads/truncates to exactly 128 dimensions
// QUILL ML.NET pipeline (QUILL/Program.cs):
// 1. Load Training.csv
// 2. MapValueToKey("Label") ← required: SDCA needs key-type labels
// 3. Concatenate 128 features → "Features" vector
// 4. Train SDCA multi-class classifier
// 5. Evaluate on test set → report accuracy
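The pad/truncate step mentioned in the comments above fixes every feature vector at 128 dimensions so that each Training.csv row has the same width. The real extractor lives in F# (Features.extract); the C# below is only an illustrative sketch, `FeatureVector` is a hypothetical name, and padding with zeros is an assumption.

```csharp
using System;

// Hypothetical illustration of fixed-width feature vectors (not QUILL code).
public static class FeatureVector
{
    public const int Dimensions = 128;

    // Returns exactly 128 values: extra inputs are dropped,
    // missing ones are filled with 0 (assumed padding value).
    public static float[] PadOrTruncate(float[] raw)
    {
        var result = new float[Dimensions]; // new arrays are zero-initialized
        Array.Copy(raw, result, Math.Min(raw.Length, Dimensions));
        return result;
    }
}
```

Appending the label to such a vector gives the 129 columns that Training.csv expects.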
Call MapValueToKey("Label") before the SDCA trainer because Training.csv labels are strings. Without it, the pipeline throws SchemaShape mismatch: Label is not a key type.
9-Step Stress Test
The Testing project runs a 9-step translation chain to verify the full pipeline end to end. It translates 74 C# functions through C#→Py→WASM→LLVM→Py→C#→WASM→Py→LLVM→C# and verifies semantic equivalence at each step:
# Run all tests
dotnet run --project Testing/Testing.csproj
# Expected output (all phases PASS):
# Phase 1: C# → Python [PASS] 74 functions
# Phase 2: Py → C# (rt) [PASS] round-trip verified
# Phase 3: Py → C# [PASS] 74 functions
# Phase 4: C# → Py (rt) [PASS] round-trip verified
# Phase 5: C# → WASM [PASS] 74 functions
# Phase 6: WASM → C# (rt) [PASS] round-trip verified
# Phase 7: C# → LLVM [PASS] 74 functions
# Phase 8: LLVM → C# (rt) [PASS] round-trip verified
# Phase 9: Py → WASM [PASS] 57 functions
# ✓ All 9 phases PASS
Grammar token counts used in the stress test:
| Grammar | Token Count | File |
|---|---|---|
| C# | 107 | Testing/Grammars/CSharpGrammar.cs |
| Python | 85 | Testing/Grammars/PythonGrammar.cs |
| WASM (WAT) | 107 | Testing/Grammars/WasmGrammar.cs |
| LLVM IR | 147 | Testing/Grammars/LlvmGrammar.cs |
The test uses const UserFuncCount = 9 to split the function list between stdlib functions and user example functions. Update this constant if you add or remove example functions.
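The following sketch shows why a stale constant matters: the split is purely positional, so adding a function without updating the count moves the boundary. `FunctionSplit` is a hypothetical name, and the source does not say which end of the list holds the user examples; this sketch assumes they come last.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration of a count-based split (not the Testing project's code).
public static class FunctionSplit
{
    public const int UserFuncCount = 9;

    // Assumption: user example functions are the last UserFuncCount entries.
    public static (List<string> Stdlib, List<string> User) Split(
        IReadOnlyList<string> functions)
    {
        int stdlibCount = Math.Max(0, functions.Count - UserFuncCount);
        return (functions.Take(stdlibCount).ToList(),
                functions.Skip(stdlibCount).ToList());
    }
}
```

Under that assumption, a list of 74 names splits into 65 stdlib functions and 9 user functions; a tenth user function added without changing the constant would be counted as stdlib.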