# Terrace Language Specification

Terrace is a minimal meta-language that is readable and writable by both humans and machines. It does little of its own accord beyond offering a set of basic conventions that languages built on top of Terrace can use to their own advantage.

**Version 0.2.0** - Now with modern, idiomatic APIs across JavaScript, Python, C, and Rust!

## Semantics

At its core, Terrace has only three semantic character classes:

1. Leading whitespace - Leading space (configurable) characters indicate the nesting level of a given section of the document.

   - No characters other than whitespace may exist at the start of a line unless the line is at the root of the nesting hierarchy.

2. Newlines (`\n`) - A newline ends the current line and starts matching of the next one. Only the `\n` character is matched. Carriage returns are treated as literal content and are not used for parsing.
3. Every other character - The first character encountered after a newline and an optional sequence of indent spaces is considered the start of a line's contents. Terrace processes the line verbatim until it reaches a newline character.
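
The three character classes above can be illustrated with a short sketch. This is not the Terrace library's parser: `parse_line` and `parse_lines` are hypothetical helper names, and the indent is assumed to be a single configurable character (a space by default).

```python
def parse_line(line: str, indent: str = " ") -> tuple[int, str]:
    """Split one line (without its trailing newline) into (level, content)."""
    level = 0
    # Class 1: each leading indent character adds one nesting level.
    while level < len(line) and line[level] == indent:
        level += 1
    # Class 3: everything after the indent is kept verbatim, including "\r".
    return level, line[level:]


def parse_lines(text: str, indent: str = " ") -> list[tuple[int, str]]:
    # Class 2: only "\n" separates lines. str.splitlines() is avoided
    # because it would also split on "\r" and other separators.
    return [parse_line(line, indent) for line in text.split("\n")]


print(parse_line("  host localhost"))  # (2, 'host localhost')
```

Note that a Windows-style line ending leaves the `\r` in the line's content, which matches the rule that carriage returns are treated literally.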

### Exception: Blank Lines

Blank lines are treated as if they were indented to the same level as the previous line. This allows blocks with embedded text and documents to retain whitespace relevant to their own internal semantics, even if it is stripped out by well-meaning code formatters or unintentionally ignored.

Example:

<table>
<thead>
<tr>
<th>Source</th>
<th>Interpreted As</th>
</tr>
</thead>
<tbody>
<tr>
<td>

```tce
markdown
 # Title


 Dolore do do sit velit ullamco labore nisi laborum ut.

markdown
 # Title 2

 Incididunt qui nulla est enim officia ad sunt excepteur consequat sunt.

```

</td>

<td>

```tce
markdown
-># Title
->
->
->Dolore do do sit velit ullamco labore nisi laborum ut.
->
markdown
-># Title 2
->
->Incididunt qui nulla est enim officia ad sunt excepteur consequat sunt.
->
```

</td>
</tr>
</tbody>
</table>
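
The blank-line rule can be sketched as follows. This is an illustration rather than the library's implementation: `levels_with_blank_lines` is a hypothetical name, a "blank line" is assumed to mean a completely empty line, and the indent is assumed to be a single space.

```python
def levels_with_blank_lines(text: str, indent: str = " ") -> list[tuple[int, str]]:
    """Return (level, content) per line; blank lines inherit the previous level."""
    result = []
    previous_level = 0
    for line in text.split("\n"):
        if line == "":
            # Blank line: treat it as indented like the line before it.
            level, content = previous_level, ""
        else:
            content = line.lstrip(indent)
            level = len(line) - len(content)
        result.append((level, content))
        previous_level = level
    return result


source = (
    "markdown\n"
    " # Title\n"
    "\n"
    "\n"
    " Dolore do do sit velit.\n"
    "\n"
    "markdown\n"
    " # Title 2\n"
)
for level, content in levels_with_blank_lines(source):
    print("->" * level + content)
```

The loop prints each line with one `->` per nesting level, in the same notation as the "Interpreted As" column above.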

## Structure

A Terrace document consists of lines of arbitrary characters, with leading whitespace indicating their nesting level relative to each other.

The following document contains two root-level elements: a line containing "hello" with two "world" lines nested under it, and a line containing "hello again" with "terrace" nested under it.

```tce
hello
 world
 world

hello again
 terrace
```
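
One way to turn such a document into a tree is to keep a stack of open ancestors, sketched below. The `Node` class and `build_tree` function are hypothetical and not part of any Terrace package. For brevity the sketch skips blank lines, which the specification itself does not do, and it assumes a single space as the indent character.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    level: int
    content: str
    children: list["Node"] = field(default_factory=list)


def build_tree(text: str, indent: str = " ") -> list[Node]:
    """Return the root-level nodes of a document, with nested children attached."""
    root = Node(level=-1, content="")  # synthetic parent of all root lines
    stack = [root]
    for line in text.split("\n"):
        if line == "":
            continue  # simplification: blank lines are ignored in this sketch
        content = line.lstrip(indent)
        level = len(line) - len(content)
        node = Node(level, content)
        # Close every open node that is not shallower than the new line.
        while stack[-1].level >= level:
            stack.pop()
        stack[-1].children.append(node)
        stack.append(node)
    return root.children


roots = build_tree("hello\n world\n world\n\nhello again\n terrace\n")
for node in roots:
    print(node.content, [child.content for child in node.children])
```

Because a line attaches to the nearest open line that is shallower than it, the same loop also handles lines nested several levels deeper than their parent.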

## Language Support

Terrace provides idiomatic APIs for multiple programming languages:

### JavaScript/TypeScript (Node.js)

```javascript
import { useDocument, create_string_reader } from "@terrace-lang/js";

const doc = useDocument(
  create_string_reader(`
config
 database
  host localhost
  port 5432
 server
  port 3000
  host 0.0.0.0
`)
);

// Modern iterator-based API
for await (const node of doc) {
  if (node.is("config")) {
    console.log("Found config section");
    for await (const child of node.children()) {
      console.log(` ${child.head}: ${child.tail}`);
      for await (const setting of child.children()) {
        console.log(` ${setting.head} = ${setting.tail}`);
      }
    }
  }
}
```

### Python

```python
from terrace import use_document, create_string_reader

data = """
config
 database
  host localhost
  port 5432
 server
  port 3000
  host 0.0.0.0
"""

doc = use_document(create_string_reader(data))

# Generator-based API with natural Python iteration
for node in doc:
    if node.is_('config'):
        print('Found config section')
        for child in node.children():
            print(f" {child.head}: {child.tail}")
            for setting in child.children():
                print(f" {setting.head} = {setting.tail}")
```

### C

```c
#include <stdio.h>
#include "terrace/document.h"

// Modern node-based API with string views.
// `doc` is assumed to be a document that has already been initialized.
TERRACE_FOR_EACH_NODE(&doc, node) {
  if (TERRACE_NODE_MATCHES(node, "config")) {
    printf("Found config section\n");

    unsigned int config_level = terrace_node_level(&node);
    TERRACE_FOR_CHILD_NODES(&doc, config_level, child) {
      terrace_string_view_t head = terrace_node_head(&child);
      terrace_string_view_t tail = terrace_node_tail(&child);
      printf(" %.*s: %.*s\n", (int)head.len, head.str, (int)tail.len, tail.str);

      unsigned int child_level = terrace_node_level(&child);
      TERRACE_FOR_CHILD_NODES(&doc, child_level, setting) {
        terrace_string_view_t setting_head = terrace_node_head(&setting);
        terrace_string_view_t setting_tail = terrace_node_tail(&setting);
        printf(" %.*s = %.*s\n",
          (int)setting_head.len, setting_head.str,
          (int)setting_tail.len, setting_tail.str);
      }
    }
  }
}
```

### Go

```go
package main

import (
	"fmt"
	"io"
	"strings"

	"terrace.go"
)

func main() {
	data := `
config
 database
  host localhost
  port 5432
 server
  port 3000
  host 0.0.0.0
`
	doc := terrace.NewTerraceDocument(&StringReader{reader: strings.NewReader(data)}, ' ')

	for {
		node, err := doc.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			panic(err)
		}

		if node.Head() == "config" {
			fmt.Println("Found config section")
			for child := range node.Children() {
				fmt.Printf(" %s: %s\n", child.Head(), child.Tail())
				for grandchild := range child.Children() {
					fmt.Printf(" %s = %s\n", grandchild.Head(), grandchild.Tail())
				}
			}
		}
	}
}

// A simple string reader for the example
type StringReader struct {
	reader *strings.Reader
}

func (r *StringReader) Read() (string, error) {
	line, err := r.reader.ReadString('\n')
	if err != nil {
		return "", err
	}
	return strings.TrimRight(line, "\n"), nil
}
```

### Rust

```rust
use terrace::{TerraceDocument, StringReader};

#[tokio::main]
async fn main() {
    let data = r#"
config
 database
  host localhost
  port 5432
 server
  port 3000
  host 0.0.0.0
"#;

    let reader = StringReader::new(data);
    let mut doc = TerraceDocument::with_reader(reader);

    while let Some(node) = doc.next().await {
        if node.is("config") {
            println!("Found config section");
            // In a real implementation, you'd handle children here
            // For now, just print all nodes
            println!(" {}: '{}'", node.head(), node.tail());
        } else {
            println!(" {}: '{}'", node.head(), node.tail());
        }
    }
}
```

## Key Features

- **Zero Memory Allocation**: All implementations avoid allocating memory for string operations, using views/slices instead
- **Streaming Capable**: Process documents of any size without loading everything into memory
- **Idiomatic APIs**: Each language follows its own conventions (iterators in JS, generators in Python, macros in C)
- **Type Safe**: Full type safety in TypeScript, type hints in Python
- **Cross-Platform**: Works on all major operating systems and architectures

## Nesting Rules

**Core Parser Requirement**: Terrace parsers MUST accept arbitrarily deep nesting. Each line may be nested any number of levels deeper than its parent, with no upper limit on nesting depth.

**Parser Implementation**: When parsing indentation, calculate the nesting level by counting the number of indent units (spaces or tabs) at the start of each line. The level determines the hierarchical relationship between lines.

**Acceptable:**

```tce
level 1
 level 2
  level 3
 level 2
  level 3
   level 4
   level 4
    level 5
 level 2

level 1
 level 2
```

**Also Acceptable:** (lines may be nested arbitrarily deeper than their parent)

```tce
level 1
 level 2
    level 5
  level 3
     level 6
 level 2

level 1
 level 2
```

**Navigation API**: The `children()` method should return all descendant nodes that are deeper than the parent, in document order, regardless of their nesting level. This ensures that arbitrarily nested structures are properly traversable.
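
One reading of the navigation rule is sketched below over a pre-parsed list of `(level, content)` pairs. The `descendants` function is a hypothetical helper, not the `children()` method of any Terrace package, and it works on an in-memory list rather than a stream. Levels here count indent units, so the line labelled "level 1" has a level of 0.

```python
def descendants(lines: list[tuple[int, str]], parent_index: int) -> list[tuple[int, str]]:
    """Collect every following line deeper than the parent, in document order."""
    parent_level = lines[parent_index][0]
    result = []
    for level, content in lines[parent_index + 1:]:
        if level <= parent_level:
            break  # a sibling or shallower line closes the parent's block
        result.append((level, content))
    return result


# The "Also Acceptable" document, expressed as (indent units, content).
lines = [
    (0, "level 1"),
    (1, "level 2"),
    (4, "level 5"),
    (2, "level 3"),
    (5, "level 6"),
    (1, "level 2"),
    (0, "level 1"),
    (1, "level 2"),
]
print(descendants(lines, 1))  # [(4, 'level 5'), (2, 'level 3'), (5, 'level 6')]
```

The comparison is only against the parent's level, so a line that skips several levels is still collected as part of the parent's block.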

**Language-Specific Restrictions**: Terrace-based languages may introduce additional restrictions on nesting (e.g., requiring consistent indentation), but the core parser accepts arbitrarily deep nesting to maintain maximum flexibility.

## Quick Start

### Installation

**JavaScript/Node.js:**

```bash
npm install @terrace-lang/js
```

**Python:**

```bash
pip install terrace-lang
```

**C:**
Include the header files from `packages/c/` in your project.

**Go:**

```bash
go get terrace.go
```

### Basic Usage

All language implementations provide the same core functionality with language-appropriate APIs:

1. **Document Iteration**: Process documents line by line
2. **Hierarchical Navigation**: Easily access child and sibling nodes
3. **Content Access**: Get head/tail/content of each line with zero allocations
4. **Pattern Matching**: Built-in helpers for common parsing patterns
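
The specification does not define "head" and "tail" precisely. The sketch below assumes, based only on the examples above (where `host localhost` prints as `host` and `localhost`), that the head is the text before the first space and the tail is everything after it. `split_head_tail` is a hypothetical helper, not a function from any Terrace package.

```python
def split_head_tail(content: str) -> tuple[str, str]:
    """Split a line's content at the first space (assumed head/tail rule)."""
    head, _, tail = content.partition(" ")
    return head, tail


print(split_head_tail("host localhost"))  # ('host', 'localhost')
print(split_head_tail("config"))          # ('config', '')
```

Unlike the view-based implementations described above, this sketch allocates new strings; it is meant only to show the split, not the zero-allocation technique.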

See the language-specific examples above for detailed usage patterns.