Diagnostics
CDTk provides structured diagnostics for every phase of compilation. Each pipeline stage throws a specific exception type so you can pinpoint failures quickly.
Exception Types
| Exception | Phase | Thrown When |
|---|---|---|
| LexException | Lexer | Source text contains a character or sequence that matches no token pattern |
| ParseException | Parser | Token stream does not match any rule at the current position |
| SemanticException | Semantic | SemanticTable is malformed: missing morphism, invalid transform, or type mismatch |
| TranslationException | Translation | No matching structural role or name mapping found between grammars |
| RenderException | Code generation | Grammar.Render() throws an unhandled exception |
| BinaryException | Binary generation | Grammar.GenerateBinary() or PE writer fails |
Error Handling
Wrap Compiler.CompileText() calls in a structured catch block to report errors per phase:
try {
    string output = Compiler.CompileText(inputGrammar, outputGrammar, source);
    Console.WriteLine(output);
}
catch (LexException ex) {
    Console.Error.WriteLine(
        $"[LEX] Line {ex.Line}, col {ex.Column}: {ex.Message}");
    Console.Error.WriteLine(
        $"  Unrecognized token near: '{ex.Fragment}'");
}
catch (ParseException ex) {
    Console.Error.WriteLine(
        $"[PARSE] Line {ex.Line}: {ex.Message}");
    Console.Error.WriteLine(
        $"  Expected: {ex.Expected}, got: '{ex.Found}'");
}
catch (SemanticException ex) {
    Console.Error.WriteLine(
        $"[SEM] {ex.Message}");
}
catch (RenderException ex) {
    Console.Error.WriteLine(
        $"[GEN] {ex.Message}");
    Console.Error.WriteLine(ex.InnerException?.ToString());
}
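The catch chain above does not handle TranslationException or BinaryException, both of which appear in the exception table. One way to cover all six types in a single place is a small helper that maps an exception to a phase label. The following is a minimal sketch, not part of CDTk: the exception classes are hypothetical stand-ins declared only so the example compiles on its own, the `PhaseReport` helper is an invented name, and the `TRANS` and `BIN` labels are assumptions chosen to match the `[LEX]`/`[PARSE]`/`[SEM]`/`[GEN]` style used above.

```csharp
using System;

// Hypothetical stand-ins so this sketch compiles by itself. In a real project,
// delete these and use the CDTk exception types from the table above.
class LexException : Exception { public LexException(string m) : base(m) { } }
class ParseException : Exception { public ParseException(string m) : base(m) { } }
class SemanticException : Exception { public SemanticException(string m) : base(m) { } }
class TranslationException : Exception { public TranslationException(string m) : base(m) { } }
class RenderException : Exception { public RenderException(string m) : base(m) { } }
class BinaryException : Exception { public BinaryException(string m) : base(m) { } }

static class PhaseReport
{
    // Maps an exception to a short phase label. "TRANS" and "BIN" are
    // assumed labels; the document only shows LEX, PARSE, SEM and GEN.
    public static string PhaseOf(Exception ex) => ex switch
    {
        LexException _ => "LEX",
        ParseException _ => "PARSE",
        SemanticException _ => "SEM",
        TranslationException _ => "TRANS",
        RenderException _ => "GEN",
        BinaryException _ => "BIN",
        _ => "UNKNOWN",
    };

    // Formats a one-line report in the same bracketed style as the catch blocks.
    public static string Format(Exception ex) => $"[{PhaseOf(ex)}] {ex.Message}";
}

static class Program
{
    static void Main()
    {
        try
        {
            // Simulated failure; the message text is made up for illustration.
            throw new TranslationException("example failure");
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine(PhaseReport.Format(ex)); // [TRANS] example failure
        }
    }
}
```

A single `catch (Exception ex)` with this helper trades the per-type detail (line, column, fragment) for complete coverage, so it works best as a final fallback after the specific catch blocks.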
Lex Errors
A LexException fires when the source contains a character or sequence that no token pattern can match. Common causes:
- Missing token: you forgot to declare a token for a character used in the source (e.g., @ in a language where @ is a decorator prefix).
- Wrong order: a multi-char token declared after a single-char token that is its prefix lets the shorter token consume the prefix first.
- Encoding issue: BOM characters, non-ASCII quotes, or CRLF in Custom regex patterns.
// Fix: add the missing token
public static Token DECORATOR = Punct("@");
// Fix: declare ARROW before MINUS
public static Token ARROW = Op("->"); // must come before MINUS
public static Token MINUS = Op("-");
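To see why declaration order matters, it can help to model a lexer that always takes the first declared token matching at the current position. The sketch below is an illustrative model only and makes no claim about how CDTk's lexer is implemented; `FirstMatchDemo` and `Scan` are hypothetical names, and tokens are reduced to fixed strings.

```csharp
using System;
using System.Collections.Generic;

static class FirstMatchDemo
{
    // Scans left to right, taking the FIRST declared token whose text matches
    // at the current position (declaration order wins, not longest match).
    public static List<string> Scan(string source, (string Name, string Text)[] tokens)
    {
        var result = new List<string>();
        int pos = 0;
        while (pos < source.Length)
        {
            bool matched = false;
            foreach (var (name, text) in tokens)
            {
                if (source.Substring(pos).StartsWith(text, StringComparison.Ordinal))
                {
                    result.Add(name);
                    pos += text.Length;
                    matched = true;
                    break;
                }
            }
            if (!matched)
                throw new InvalidOperationException(
                    $"No token matches '{source[pos]}' at offset {pos}");
        }
        return result;
    }

    static void Main()
    {
        var arrowFirst = new[] { (Name: "ARROW", Text: "->"), (Name: "MINUS", Text: "-") };
        var minusFirst = new[] { (Name: "MINUS", Text: "-"), (Name: "ARROW", Text: "->") };

        // ARROW is tried first, so "->" becomes a single token.
        Console.WriteLine(string.Join(" ", Scan("->", arrowFirst))); // ARROW

        // MINUS is tried first and consumes "-", leaving ">" with no match.
        try { Scan("->", minusFirst); }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message); // No token matches '>' at offset 1
        }
    }
}
```

In this model the misordered declaration fails on the leftover `>`, which is the same symptom a lex error would report: an unmatched character right after the prefix was consumed.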
Parse Errors
A ParseException fires when the token stream doesn't match the grammar's rules at the current position. The exception includes line/column and the expected rule:
// Typical parse error output:
// [PARSE] Line 12: Expected 'RPAREN' but got 'SEMI'
// In rule: FnDecl > Params > Param
To debug parse errors, enable verbose mode to see rule matching progress:
Compiler.Verbose = true;
// [Parse] Trying FnDecl at token 0 "fn" — OK
// [Parse] Trying Params at token 3 "a" — OK
// [Parse] Trying RPAREN at token 5 ";" — FAIL
Semantic Errors
Semantic errors typically occur when the input parse tree cannot be fully mapped to a SemanticTable. Causes include:
- Function body is empty (no morphism body found).
- A referenced name is not in the table (missing stdlib mapping).
- Type annotation is ambiguous or missing for a TypeKeyword role.
catch (SemanticException ex) {
    // ex.Message often contains the function name and row type
    // e.g. "Missing body for morphism 'Factorial'"
    Console.Error.WriteLine($"Semantic error: {ex.Message}");
}
Render & Binary Errors
RenderException wraps any exception thrown inside Grammar.Render(). BinaryException wraps exceptions from GenerateBinary() or the PE writer. Both preserve the inner exception for debugging:
catch (BinaryException ex) {
    Console.Error.WriteLine($"Binary error: {ex.Message}");
    Console.Error.WriteLine(ex.InnerException?.StackTrace);
}
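Because both wrappers keep the original failure in InnerException, the root cause may sit one or more levels down. The following sketch walks the whole chain using only the standard `System.Exception` API; `ExceptionChain` is a hypothetical helper name, and the BCL exception types in `Main` stand in for RenderException or BinaryException.

```csharp
using System;
using System.Collections.Generic;

static class ExceptionChain
{
    // Yields the exception itself, then each InnerException in turn.
    public static IEnumerable<Exception> Walk(Exception ex)
    {
        for (var current = ex; current != null; current = current.InnerException)
            yield return current;
    }

    // Renders the chain outermost-first as "Type: message <- Type: message".
    public static string Describe(Exception ex)
    {
        var parts = new List<string>();
        foreach (var e in Walk(ex))
            parts.Add($"{e.GetType().Name}: {e.Message}");
        return string.Join(" <- ", parts);
    }

    static void Main()
    {
        // Stand-in for a wrapper exception that preserves its cause.
        var wrapped = new InvalidOperationException(
            "render failed", new FormatException("bad template"));

        Console.Error.WriteLine(Describe(wrapped));
        // InvalidOperationException: render failed <- FormatException: bad template
    }
}
```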
Verbose Logging
Set Compiler.Verbose = true to enable per-phase diagnostic output to stderr. This is most useful when debugging a new grammar, because each phase reports its own counts:
Compiler.Verbose = true;
string result = Compiler.CompileText(input, output, source);
// Example output:
// [Lex] 107 tokens scanned in 2ms
// [Parse] 42 rule matches, parse tree depth 8
// [Roles] 12 structural role assignments applied
// [Sem] 3 objects, 5 morphisms, 0 two-cells
// [Trans] 5/5 morphisms translated (100%)
// [Gen] 128 chars emitted in 1ms
QUILL Diagnostics
The QUILL ML.NET pipeline has its own diagnostic requirements. The most common issues:
| Issue | Cause | Fix |
|---|---|---|
| SchemaShape mismatch | Training.csv column count ≠ 129 | Ensure exactly 128 float columns + 1 label column |
| SDCA trainer error: "Label is not a key type" | Missing MapValueToKey("Label") step | Add MapValueToKey("Label") before SDCA in the pipeline |
| Feature vector dimension mismatch | VectorType(128) doesn't match CSV column count | UAB Features.extract pads/truncates to 128; check Training.csv |
| Poor accuracy | Training data too small or unbalanced | Add more balanced examples; use 10-fold cross-validation |
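The first and third rows both come down to the shape of Training.csv, which can be checked before the file ever reaches ML.NET. The sketch below is one possible pre-flight check under stated assumptions: rows are comma-separated with no quoted commas, the label is the last column, and any header row is skipped by the caller. `TrainingCsvCheck` and `TryValidate` are hypothetical names, and the label value `positive` is made up.

```csharp
using System;
using System.Globalization;
using System.Linq;

static class TrainingCsvCheck
{
    public const int FeatureCount = 128;

    // Returns true when the row has 128 parseable floats followed by a
    // non-empty label; otherwise sets 'problem' to the first issue found.
    // Assumes the label is the LAST column (not stated in the document).
    public static bool TryValidate(string line, out string problem)
    {
        problem = "";
        string[] cells = line.Split(',');
        if (cells.Length != FeatureCount + 1)
        {
            problem = $"expected {FeatureCount + 1} columns, found {cells.Length}";
            return false;
        }
        for (int i = 0; i < FeatureCount; i++)
        {
            if (!float.TryParse(cells[i], NumberStyles.Float,
                                CultureInfo.InvariantCulture, out _))
            {
                problem = $"column {i} is not a float: '{cells[i]}'";
                return false;
            }
        }
        if (string.IsNullOrWhiteSpace(cells[FeatureCount]))
        {
            problem = "label column is empty";
            return false;
        }
        return true;
    }

    static void Main()
    {
        string good = string.Join(",", Enumerable.Repeat("0.5", FeatureCount)) + ",positive";
        string bad = "0.1,0.2,positive";

        Console.WriteLine(TryValidate(good, out _)); // True

        TryValidate(bad, out string problem);
        Console.WriteLine(problem); // expected 129 columns, found 3
    }
}
```

Parsing with the invariant culture avoids false failures on machines whose locale uses a comma as the decimal separator.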
Known Issues & Fixes
| Issue | Location | Fix (v9.0) |
|---|---|---|
| CA2014: stackalloc in loop | QUILL/Helper.cs | Moved stackalloc outside foreach loop |
| SDCA trainer rejects string labels | QUILL/Program.cs | Added MapValueToKey("Label") before SDCA step |
| TypeKeyword tokens stripped | Step 3 in CDTk.fs | Added tgtIsTypeKw3 check; TypeKeyword now preserved |
| Nonterminal rules missing from SDXF | SdfxEncoder.Walk() | Walk now follows Ref() lambdas via reflection |