Diagnostics

CDTk provides structured diagnostics for every phase of compilation. Each pipeline stage throws a specific exception type so you can pinpoint failures quickly.

Exception Types

Exception | Phase | Thrown When
LexException | Lexer | Source text contains a character or sequence that matches no token pattern
ParseException | Parser | Token stream does not match any rule at the current position
SemanticException | Semantic | SemanticTable is malformed: missing morphism, invalid transform, or type mismatch
TranslationException | Translation | No matching structural role or name mapping found between grammars
RenderException | Code generation | Grammar.Render() throws an unhandled exception
BinaryException | Binary generation | Grammar.GenerateBinary() or PE writer fails

Error Handling

Wrap Compiler.CompileText() calls in a structured catch block to report errors per phase:

try {
    string output = Compiler.CompileText(inputGrammar, outputGrammar, source);
    Console.WriteLine(output);
}
catch (LexException ex) {
    Console.Error.WriteLine(
        $"[LEX]   Line {ex.Line}, col {ex.Column}: {ex.Message}");
    Console.Error.WriteLine(
        $"        Unrecognized token near: '{ex.Fragment}'");
}
catch (ParseException ex) {
    Console.Error.WriteLine(
        $"[PARSE] Line {ex.Line}: {ex.Message}");
    Console.Error.WriteLine(
        $"        Expected: {ex.Expected}, got: '{ex.Found}'");
}
catch (SemanticException ex) {
    Console.Error.WriteLine(
        $"[SEM]   {ex.Message}");
}
catch (TranslationException ex) {
    Console.Error.WriteLine(
        $"[TRANS] {ex.Message}");
}
catch (RenderException ex) {
    Console.Error.WriteLine(
        $"[GEN]   {ex.Message}");
    Console.Error.WriteLine(ex.InnerException?.ToString());
}

Lex Errors

A LexException fires when the source contains a character or sequence that no token pattern can match. Common causes:

  • Missing token: no token is declared for a character that appears in the source (e.g., @ in a language where @ is a decorator prefix).
  • Wrong order: a multi-character token is declared after a single-character token that is its prefix, so the shorter token consumes the prefix first.
  • Encoding issue: BOM characters, non-ASCII quotes, or CRLF in Custom regex patterns.

// Fix: add the missing token
public static Token DECORATOR = Punct("@");

// Fix: declare ARROW before MINUS
public static Token ARROW = Op("->");  // must come before MINUS
public static Token MINUS = Op("-");
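
To see why declaration order matters, here is a minimal sketch of a first-match tokenizer. It is a hypothetical stand-in, not CDTk's lexer: the FirstMatchLexer class, its Tokenize method, and the use of InvalidOperationException in place of LexException are all assumptions made for illustration. It only assumes that tokens are tried in declaration order and the first match wins.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a first-match lexer; this is not CDTk's implementation.
public static class FirstMatchLexer
{
    // Scans `source` left to right. At each position the first literal in
    // `tokens` (in declaration order) that matches is emitted.
    public static List<string> Tokenize(
        string source,
        IReadOnlyList<(string Name, string Text)> tokens)
    {
        var result = new List<string>();
        int pos = 0;

        while (pos < source.Length)
        {
            if (char.IsWhiteSpace(source[pos])) { pos++; continue; }

            bool matched = false;
            foreach (var (name, text) in tokens)
            {
                if (text.Length == 0) continue; // an empty literal would never advance

                // Ordinal comparison of `text` against the source at `pos`.
                if (string.CompareOrdinal(source, pos, text, 0, text.Length) == 0)
                {
                    result.Add(name);
                    pos += text.Length;
                    matched = true;
                    break; // first match wins, later declarations are never tried
                }
            }

            if (!matched)
                throw new InvalidOperationException(
                    $"No token matches '{source[pos]}' at index {pos}");
        }

        return result;
    }
}
```

With ARROW declared before MINUS, the input "->" yields a single ARROW. With the order reversed, MINUS consumes the "-" and the tokenizer then fails on ">", which is the kind of failure a LexException reports.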

Parse Errors

A ParseException fires when the token stream doesn't match the grammar's rules at the current position. The exception includes line/column and the expected rule:

// Typical parse error output:
// [PARSE] Line 12: Expected 'RPAREN' but got 'SEMI'
//         In rule: FnDecl > Params > Param

To debug parse errors, enable verbose mode to see rule matching progress:

Compiler.Verbose = true;
// [Parse] Trying FnDecl at token 0 "fn"  — OK
// [Parse] Trying Params at token 3 "a"   — OK
// [Parse] Trying RPAREN at token 5 ";"   — FAIL

Semantic Errors

Semantic errors typically occur when the input parse tree cannot be fully mapped to a SemanticTable. Causes include:

  • Function body is empty (no morphism body found).
  • A referenced name is not in the table (missing stdlib mapping).
  • Type annotation is ambiguous or missing for a TypeKeyword role.

catch (SemanticException ex) {
    // ex.Message often contains the function name and row type
    // e.g. "Missing body for morphism 'Factorial'"
    Console.Error.WriteLine($"Semantic error: {ex.Message}");
}

Render & Binary Errors

RenderException wraps any exception thrown inside Grammar.Render(). BinaryException wraps exceptions from GenerateBinary() or the PE writer. Both preserve the inner exception for debugging:

catch (BinaryException ex) {
    Console.Error.WriteLine($"Binary error: {ex.Message}");
    Console.Error.WriteLine(ex.InnerException?.StackTrace);
}

Verbose Logging

Set Compiler.Verbose = true to enable per-phase diagnostic output to stderr. This is invaluable for debugging new grammars:

Compiler.Verbose = true;
string result = Compiler.CompileText(input, output, source);

// Example output:
// [Lex]   107 tokens scanned in 2ms
// [Parse] 42 rule matches, parse tree depth 8
// [Roles] 12 structural role assignments applied
// [Sem]   3 objects, 5 morphisms, 0 two-cells
// [Trans] 5/5 morphisms translated (100%)
// [Gen]   128 chars emitted in 1ms

QUILL Diagnostics

The QUILL ML.NET pipeline has its own diagnostic requirements. The most common issues:

Issue | Cause | Fix
SchemaShape mismatch | Training.csv column count ≠ 129 | Ensure exactly 128 float columns + 1 label column
SDCA trainer error: "Label is not a key type" | Missing MapValueToKey("Label") step | Add MapValueToKey("Label") before SDCA in the pipeline
Feature vector dimension mismatch | VectorType(128) doesn't match CSV column count | UAB Features.extract pads/truncates to 128; check Training.csv
Poor accuracy | Training data too small or unbalanced | Add more balanced examples; use 10-fold cross-validation
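
The column-count problems in the table can be caught before training with a small pre-flight check. The sketch below is not part of QUILL or ML.NET: the TrainingCsvCheck class is hypothetical, and it assumes a comma-separated file with no quoted fields, an optional header row, and the label in the last column (the column order is an assumption).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

// Hypothetical pre-flight check; not part of QUILL or ML.NET.
public static class TrainingCsvCheck
{
    public const int FeatureCount = 128;                 // float feature columns
    public const int ExpectedColumns = FeatureCount + 1; // plus one label column

    // Returns one message per malformed row (the first problem found in that row).
    // An empty list means every data row has the expected shape.
    public static List<string> Validate(TextReader reader, bool hasHeader = true)
    {
        var problems = new List<string>();
        int lineNo = 0;

        while (reader.ReadLine() is { } line)
        {
            lineNo++;
            if (hasHeader && lineNo == 1) continue;        // skip the header row
            if (string.IsNullOrWhiteSpace(line)) continue; // ignore blank lines

            // Naive split: assumes no quoted fields containing commas.
            string[] cells = line.Split(',');
            if (cells.Length != ExpectedColumns)
            {
                problems.Add(
                    $"line {lineNo}: {cells.Length} columns, expected {ExpectedColumns}");
                continue;
            }

            // Assumption: the first 128 columns are features, the last is the label.
            int badIndex = -1;
            for (int i = 0; i < FeatureCount && badIndex < 0; i++)
            {
                if (!float.TryParse(cells[i], NumberStyles.Float,
                                    CultureInfo.InvariantCulture, out _))
                    badIndex = i;
            }

            if (badIndex >= 0)
                problems.Add(
                    $"line {lineNo}: column {badIndex + 1} is not a float: '{cells[badIndex]}'");
            else if (string.IsNullOrWhiteSpace(cells[FeatureCount]))
                problems.Add($"line {lineNo}: label column is empty");
        }

        return problems;
    }
}
```

Calling TrainingCsvCheck.Validate(File.OpenText("Training.csv")) and printing any returned messages before building the pipeline points to the offending line directly, which is usually easier to act on than a schema error raised at training time.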

Known Issues & Fixes

Issue | Location | Fix (v9.0)
CA2014: stackalloc in loop | QUILL/Helper.cs | Moved stackalloc outside foreach loop
SDCA trainer rejects string labels | QUILL/Program.cs | Added MapValueToKey("Label") before SDCA step
TypeKeyword tokens stripped | Step 3 in CDTk.fs | Added tgtIsTypeKw3 check; TypeKeyword now preserved
Nonterminal rules missing from SDXF | SdfxEncoder.Walk() | Walk now follows Ref() lambdas via reflection
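
For readers who hit CA2014 in their own code, the general shape of that fix is sketched below. This is illustrative only and is not the code in QUILL/Helper.cs: the StackallocSketch class, the UppercaseAll method, and the 64-character buffer bound are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only; this is not the code in QUILL/Helper.cs.
public static class StackallocSketch
{
    private const int BufferSize = 64; // hypothetical upper bound on item length

    public static List<string> UppercaseAll(IEnumerable<string> items)
    {
        // Allocated once per call. A stackalloc inside the foreach would take new
        // stack space on every iteration, and none of it is released until the
        // method returns, which is what CA2014 warns about.
        Span<char> buffer = stackalloc char[BufferSize];
        var result = new List<string>();

        foreach (string item in items)
        {
            if (item.Length > BufferSize)
                throw new ArgumentException(
                    $"Item longer than {BufferSize} chars", nameof(items));

            // Reuse the same buffer; only the first `written` chars are valid.
            int written = item.AsSpan().ToUpperInvariant(buffer);
            result.Add(buffer.Slice(0, written).ToString());
        }

        return result;
    }
}
```

Hoisting the buffer keeps stack usage constant regardless of how many items the loop processes.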