Parser and AST
The parser transforms .cham schema files into a validated Abstract Syntax Tree (AST) using LALRPOP, a Rust parser generator for LR(1) grammars.
Overview
The parsing pipeline consists of:
- Lexical Analysis - Tokenize input using LALRPOP’s built-in lexer
- Syntax Analysis - Build AST from tokens following grammar rules
- Error Enhancement - Add source context and helpful suggestions
- AST Construction - Create immutable schema representation
Architecture
.cham source
↓
LALRPOP Lexer (schema.lalrpop)
↓
Tokens: entity, Ident, "{", ":", etc.
↓
LALRPOP Parser (LR(1) grammar)
↓
AST (Schema → Entity → Field/Relation)
↓
Enhanced Error Reporting
↓
Validated Schema
Parser Entry Point
Location: chameleon-core/src/parser/mod.rs:13
pub fn parse_schema(input: &str) -> Result<Schema, ChameleonError> {
match schema::SchemaParser::new().parse(input) {
Ok(schema) => Ok(schema),
Err(e) => {
let err: ChameleonError = e.into();
Err(enhance_parse_error(err, input))
}
}
}
Key features:
- Zero-copy parsing where possible
- Enhanced error messages with source snippets
- Line/column precision for all syntax errors
LALRPOP Grammar
Location: chameleon-core/src/parser/schema.lalrpop
Entry Point
pub Schema: Schema = {
<entities:Entity*> => {
let mut schema = Schema::new();
for entity in entities {
schema.add_entity(entity);
}
schema
}
};
Entity Syntax
Entity: Entity = {
"entity" <name:Ident> "{" <items:EntityItem*> "}" => {
let mut entity = Entity::new(name);
for item in items {
match item {
EntityItem::Field(f) => entity.add_field(f),
EntityItem::Relation(r) => entity.add_relation(r),
}
}
entity
}
};
Field Syntax
Field: Field = {
<name:Ident> ":" <ft:FieldType> <mods:FieldModifier*> <backend:BackendAnnotation?> "," => {
let mut field = Field {
name,
field_type: ft,
nullable: false,
unique: false,
primary_key: false,
default: None,
backend,
};
for modifier in mods {
match modifier {
FieldModifier::Primary => field.primary_key = true,
FieldModifier::Unique => field.unique = true,
FieldModifier::Nullable => field.nullable = true,
FieldModifier::Default(v) => field.default = Some(v),
}
}
field
}
};
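The action code above starts every field as non-nullable, non-unique, and non-primary, then folds the parsed modifiers into it. A minimal standalone sketch of that fold follows. It assumes simplified stand-ins for the AST types: `FieldFlags`, the `String` default value, and the `apply_modifiers` helper are hypothetical and not part of the crate.

```rust
// Simplified stand-in for the flag portion of `Field` (hypothetical type).
#[derive(Debug, Default, PartialEq)]
pub struct FieldFlags {
    pub nullable: bool,
    pub unique: bool,
    pub primary_key: bool,
    pub default: Option<String>, // assumption: the real type is DefaultValue
}

// Mirrors the four modifier variants matched in the grammar action.
#[derive(Debug, Clone)]
pub enum FieldModifier {
    Primary,
    Unique,
    Nullable,
    Default(String),
}

/// Hypothetical helper mirroring the loop in the grammar action:
/// every flag starts off, and each modifier switches one on.
pub fn apply_modifiers(mods: &[FieldModifier]) -> FieldFlags {
    let mut flags = FieldFlags::default();
    for modifier in mods {
        match modifier {
            FieldModifier::Primary => flags.primary_key = true,
            FieldModifier::Unique => flags.unique = true,
            FieldModifier::Nullable => flags.nullable = true,
            FieldModifier::Default(v) => flags.default = Some(v.clone()),
        }
    }
    flags
}
```

In this sketch a repeated modifier is harmless, and a later `Default` overwrites an earlier one, because the modifiers are applied in order.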
Supported Types
FieldType: FieldType = {
"uuid" => FieldType::UUID,
"string" => FieldType::String,
"int" => FieldType::Int,
"decimal" => FieldType::Decimal,
"bool" => FieldType::Bool,
"timestamp" => FieldType::Timestamp,
"float" => FieldType::Float,
"vector" "(" <n:NumericLit> ")" => FieldType::Vector(n),
"[" <inner:FieldType> "]" => FieldType::Array(Box::new(inner)),
};
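// Custom lexer rules, declared alongside the grammar
match {
r"\s*" => { }, // Skip whitespace
r"//[^\n\r]*[\n\r]*" => { }, // Skip line comments
_ // Every other terminal is lexed as written in the grammar
}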
AST Structures
Location: chameleon-core/src/ast/mod.rs
Schema
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct Schema {
pub entities: Vec<Entity>,
}
impl Schema {
pub fn new() -> Self
pub fn add_entity(&mut self, entity: Entity)
pub fn get_entity(&self, name: &str) -> Option<&Entity>
pub fn get_entity_mut(&mut self, name: &str) -> Option<&mut Entity>
}
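The listing above gives signatures only. Here is a minimal sketch of how they might be implemented, assuming a linear scan over `entities` and a trimmed-down `Entity` that carries only a name. The real bodies may differ.

```rust
// Trimmed-down stand-in; the real Entity also holds fields and relations.
#[derive(Debug, Clone, PartialEq)]
pub struct Entity {
    pub name: String,
}

#[derive(Debug, Clone, PartialEq, Default)]
pub struct Schema {
    pub entities: Vec<Entity>,
}

impl Schema {
    pub fn new() -> Self {
        Schema { entities: Vec::new() }
    }

    /// Appends the entity, preserving declaration order.
    pub fn add_entity(&mut self, entity: Entity) {
        self.entities.push(entity);
    }

    /// Assumption: lookup is a case-sensitive linear scan by name.
    pub fn get_entity(&self, name: &str) -> Option<&Entity> {
        self.entities.iter().find(|e| e.name == name)
    }

    pub fn get_entity_mut(&mut self, name: &str) -> Option<&mut Entity> {
        self.entities.iter_mut().find(|e| e.name == name)
    }
}
```

A linear scan keeps entities in a `Vec` in declaration order. The cost of each lookup grows with the number of entities, which is usually small for a schema.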
Entity
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct Entity {
pub name: String,
pub fields: HashMap<String, Field>,
pub relations: HashMap<String, Relation>,
}
impl Entity {
pub fn new(name: String) -> Self
pub fn add_field(&mut self, field: Field)
pub fn add_relation(&mut self, relation: Relation)
}
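Because `fields` is a `HashMap`, iteration order is unspecified and does not follow declaration order in the .cham file. The sketch below assumes fields are keyed by their own name, and adds a hypothetical `sorted_field_names` helper for callers that need stable output. `Field` is reduced to a name for brevity.

```rust
use std::collections::HashMap;

// Reduced stand-in for the real Field struct.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Entity {
    pub name: String,
    pub fields: HashMap<String, Field>,
}

impl Entity {
    pub fn new(name: String) -> Self {
        Entity { name, fields: HashMap::new() }
    }

    /// Assumption: the map key is the field's own name, so adding a second
    /// field with the same name replaces the first.
    pub fn add_field(&mut self, field: Field) {
        self.fields.insert(field.name.clone(), field);
    }

    /// Hypothetical helper: field names in sorted order, since HashMap
    /// iteration order is unspecified.
    pub fn sorted_field_names(&self) -> Vec<&str> {
        let mut names: Vec<&str> = self.fields.keys().map(String::as_str).collect();
        names.sort_unstable();
        names
    }
}
```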
Field
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct Field {
pub name: String,
pub field_type: FieldType,
pub nullable: bool,
pub unique: bool,
pub primary_key: bool,
pub default: Option<DefaultValue>,
pub backend: Option<BackendAnnotation>,
}
Field constraints:
- primary_key - Entity identifier (required, exactly one per entity)
- unique - Unique constraint (enforced at DB level)
- nullable - NULL allowed (defaults to NOT NULL)
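The grammar shown earlier accepts any number of `primary` modifiers, so the "exactly one per entity" rule has to be checked after parsing. One way such a check could look is sketched below. `primary_key_name` and its error strings are hypothetical, and `Field` is reduced to the two members the check needs.

```rust
// Reduced stand-in for the real Field struct.
#[derive(Debug, Clone)]
pub struct Field {
    pub name: String,
    pub primary_key: bool,
}

/// Hypothetical check: returns the primary key's name when exactly one
/// field is marked primary, and an error message otherwise.
pub fn primary_key_name(fields: &[Field]) -> Result<&str, String> {
    let mut keys = fields.iter().filter(|f| f.primary_key);
    // Pull at most two matches: enough to tell zero, one, and many apart.
    match (keys.next(), keys.next()) {
        (Some(pk), None) => Ok(pk.name.as_str()),
        (None, _) => Err("entity has no primary key".to_string()),
        (Some(_), Some(_)) => Err("entity has more than one primary key".to_string()),
    }
}
```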
FieldType
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub enum FieldType {
UUID,
String,
Int,
Decimal,
Bool,
Timestamp,
Float,
Vector(usize), // Vector embeddings with dimension
Array(Box<FieldType>), // Arrays of any supported type
}
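`Array` boxes its element type, so types nest to any depth and code that walks a `FieldType` is naturally recursive. As an illustration, the sketch below renders a type back into the surface syntax accepted by the `FieldType` grammar rule. `to_cham` is a hypothetical helper, not an API of the crate.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum FieldType {
    UUID,
    String,
    Int,
    Decimal,
    Bool,
    Timestamp,
    Float,
    Vector(usize),
    Array(Box<FieldType>),
}

/// Hypothetical helper: render a type using the keywords from the grammar.
pub fn to_cham(ft: &FieldType) -> String {
    match ft {
        FieldType::UUID => "uuid".to_string(),
        FieldType::String => "string".to_string(),
        FieldType::Int => "int".to_string(),
        FieldType::Decimal => "decimal".to_string(),
        FieldType::Bool => "bool".to_string(),
        FieldType::Timestamp => "timestamp".to_string(),
        FieldType::Float => "float".to_string(),
        FieldType::Vector(n) => format!("vector({})", n),
        // Recurse into the boxed element type.
        FieldType::Array(inner) => format!("[{}]", to_cham(inner)),
    }
}
```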
Relation
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct Relation {
pub name: String,
pub kind: RelationKind,
pub target_entity: String,
pub foreign_key: Option<String>,
pub through: Option<String>, // For many-to-many
}
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub enum RelationKind {
HasOne,
HasMany,
BelongsTo,
ManyToMany,
}
Backend Annotations
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub enum BackendAnnotation {
OLTP, // Default (PostgreSQL)
Cache, // @cache (planned: Redis)
OLAP, // @olap (planned: DuckDB)
Vector, // @vector (planned: pgvector/Milvus)
ML, // @ml (planned: feature store)
}
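The comments above pair each variant with an annotation token, and `Field::backend` is an `Option`. The sketch below shows that mapping under two assumptions: that the tokens are spelled exactly as in the comments, and that a field without an annotation falls back to `OLTP`. Both helper functions are hypothetical, and the grammar rule for annotations is not shown on this page.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BackendAnnotation {
    OLTP,
    Cache,
    OLAP,
    Vector,
    ML,
}

/// Hypothetical helper: map an annotation token to its variant.
/// There is no token for OLTP because it is the default.
pub fn backend_from_annotation(token: &str) -> Option<BackendAnnotation> {
    match token {
        "@cache" => Some(BackendAnnotation::Cache),
        "@olap" => Some(BackendAnnotation::OLAP),
        "@vector" => Some(BackendAnnotation::Vector),
        "@ml" => Some(BackendAnnotation::ML),
        _ => None,
    }
}

/// Assumption: a field with no annotation is stored in the OLTP backend.
pub fn effective_backend(annotation: Option<BackendAnnotation>) -> BackendAnnotation {
    annotation.unwrap_or(BackendAnnotation::OLTP)
}
```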
Error Enhancement
Location: chameleon-core/src/parser/mod.rs:80
The parser enhances errors with:
1. Source Context
fn extract_snippet(source: &str, line: usize, column: usize) -> String {
let lines: Vec<&str> = source.lines().collect();
let target_line = lines[line - 1];
// Format: line number │ source code
// │ ^^^^ error pointer
snippet.push_str(&format!("{:>width$} │ {}\n", line, target_line, ...));
snippet.push_str(&format!("{:>width$} │ ", "", ...));
// Add ^ characters to underline the error
}
Example output:
12 │ email: strnig unique,
│ ^^^^^^
Did you mean 'string'?
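The excerpt of `extract_snippet` above elides several details. A self-contained sketch that produces the same two-line layout follows. `render_snippet` and its `len` parameter are hypothetical, and the function returns `None` for an out-of-range line instead of indexing directly.

```rust
/// Hypothetical standalone snippet renderer.
/// `line` and `column` are 1-based; `len` is how many characters to underline.
pub fn render_snippet(source: &str, line: usize, column: usize, len: usize) -> Option<String> {
    // checked_sub rejects line 0; nth rejects lines past the end of the source.
    let target_line = source.lines().nth(line.checked_sub(1)?)?;
    // Gutter width equals the number of digits in the line number.
    let width = line.to_string().len();

    let mut snippet = String::new();
    // First row: line number │ source code
    snippet.push_str(&format!("{:>width$} │ {}\n", line, target_line, width = width));
    // Second row: empty gutter │ padding, then the ^ pointer
    snippet.push_str(&format!("{:>width$} │ ", "", width = width));
    snippet.push_str(&" ".repeat(column.saturating_sub(1)));
    snippet.push_str(&"^".repeat(len.max(1)));
    Some(snippet)
}
```

The pointer is padded with one space per column, so it lines up only when the source line is indented with spaces and contains no wide characters. A line containing tabs would need extra handling.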
2. Position Calculation
fn offset_to_position(source: &str, offset: usize) -> (usize, usize) {
let mut line = 1;
let mut column = 1;
// LALRPOP reports byte offsets, so compare against byte indices, not char counts
for (i, ch) in source.char_indices() {
if i >= offset { break; }
if ch == '\n' {
line += 1;
column = 1;
} else {
column += 1;
}
}
(line, column)
}
3. Smart Suggestions
fn add_suggestions(mut detail: ParseErrorDetail) -> ParseErrorDetail {
if let Some(token) = &detail.token {
let token_lower = token.to_lowercase();
// Detect common typos
if token_lower.contains("entiy") {
detail.suggestion = Some("Did you mean 'entity'?");
}
else if token_lower.contains("primry") {
detail.suggestion = Some("Did you mean 'primary'?");
}
// ... more heuristics
}
detail
}
Common suggestions:
- Keyword typos: entiy → entity, primry → primary
- Missing colons: “Fields must have a type after the colon”
- Unclosed braces: “Missing closing brace”
- EOF errors: “You may be missing a closing brace }”
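The keyword-typo heuristics can also be written as a lookup table, which avoids a growing `else if` chain. This is a sketch of that variation, not the crate's implementation. The `TYPOS` table and the `suggest` function are hypothetical, and the table holds only the three misspellings that appear on this page.

```rust
/// Assumed typo table: (misspelling, intended keyword).
const TYPOS: &[(&str, &str)] = &[
    ("entiy", "entity"),
    ("primry", "primary"),
    ("strnig", "string"),
];

/// Hypothetical helper: return a suggestion when the offending token
/// contains a known misspelling (case-insensitive).
pub fn suggest(token: &str) -> Option<String> {
    let token_lower = token.to_lowercase();
    TYPOS
        .iter()
        .find(|(typo, _)| token_lower.contains(*typo))
        .map(|(_, fix)| format!("Did you mean '{}'?", fix))
}
```

Adding a new suggestion then means adding one row to the table.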
Build Process
Location: chameleon-core/build.rs
fn main() {
// Generate parser from LALRPOP grammar
lalrpop::Configuration::new()
.set_out_dir(out_dir)
.process_dir(parser_dir)
.expect("Failed to process LALRPOP files");
// Generated parser: OUT_DIR/parser/schema.rs
// Included via: include!(concat!(env!("OUT_DIR"), "/parser/schema.rs"));
}
Generated artifacts:
- $OUT_DIR/parser/schema.rs - LR(1) parser state machine
- Compile-time only - not included in library distribution
Performance
| Operation | Time | Notes |
|---|---|---|
| Parse schema (cold) | ~10ms | One-time cost per schema load |
| Parse schema (warm) | ~2ms | With OS page cache |
| AST construction | ~1ms | Minimal allocation overhead |
| Error enhancement | ~0.5ms | Only on parse failures |
Memory usage:
- Schema AST: ~50 bytes per field + ~80 bytes per relation
- Parser state machine: ~200KB (generated at compile time)
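The per-item figures above turn into a quick back-of-the-envelope estimate. The sketch below only encodes that arithmetic. `estimate_ast_bytes` is a hypothetical helper, and the result is as approximate as the figures it is built from.

```rust
// Approximate per-item costs quoted above.
const BYTES_PER_FIELD: usize = 50;
const BYTES_PER_RELATION: usize = 80;

/// Hypothetical helper: rough AST size in bytes for a schema with the
/// given total number of fields and relations.
pub fn estimate_ast_bytes(fields: usize, relations: usize) -> usize {
    fields * BYTES_PER_FIELD + relations * BYTES_PER_RELATION
}
```

For example, a schema of 20 entities with 10 fields and 2 relations each has 200 fields and 40 relations, which comes to 200 × 50 + 40 × 80 = 13,200 bytes. That is well below the ~200KB quoted for the parser state machine.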
Example Usage
Basic Parsing
use chameleon::parser::parse_schema;
let input = r#"
entity User {
id: uuid primary,
email: string unique,
created_at: timestamp default now(),
}
"#;
let schema = parse_schema(input)?;
assert_eq!(schema.entities.len(), 1);
Accessing AST
let user = schema.get_entity("User").unwrap();
assert_eq!(user.fields.len(), 3);
let id_field = user.fields.get("id").unwrap();
assert!(id_field.primary_key);
assert_eq!(id_field.field_type, FieldType::UUID);
Error Handling
match parse_schema("invalid { syntax") {
Ok(_) => unreachable!(),
Err(ChameleonError::ParseError(detail)) => {
println!("Line {}, Column {}", detail.line, detail.column);
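// Snippet and suggestion are optional, so avoid unwrap() here
if let Some(snippet) = &detail.snippet {
println!("{}", snippet);
}
if let Some(suggestion) = &detail.suggestion {
println!("Suggestion: {}", suggestion);
}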
}
Err(e) => panic!("Unexpected error: {}", e),
}
Testing
Location: chameleon-core/src/parser/mod.rs:170
Test coverage:
- Simple entities with fields
- Relations (HasMany, BelongsTo, etc.)
- Backend annotations (@cache, @olap, @vector)
- Vector types with dimensions
- Array types ([string], [decimal])
- Complex multi-backend schemas
Example test:
#[test]
fn test_backend_annotations() {
let input = r#"
entity Product {
id: uuid primary,
views_today: int @cache,
monthly_sales: decimal @olap,
embedding: vector(384) @vector,
}
"#;
let schema = parse_schema(input).unwrap();
let product = schema.get_entity("Product").unwrap();
assert_eq!(product.fields.get("views_today").unwrap().backend,
Some(BackendAnnotation::Cache));
assert_eq!(product.fields.get("embedding").unwrap().field_type,
FieldType::Vector(384));
}
See Also