Metadata System
This document explains the metadata system in ReX, which provides structured descriptive information about content.
Overview
The metadata system provides a flexible way to associate descriptive information with content beyond its primary data. This enables content to carry context, relationships, and additional properties that enhance its usability and richness.
Metadata Fundamentals
Metadata in the system is represented by the ContentMetadata interface:
interface ContentMetadata {
// Common fields
title?: string
description?: string
createdAt?: Date
updatedAt?: Date
// References to other content
references?: ContentReference[]
// Extension point for custom metadata
[key: string]: any
}
Key characteristics of the metadata system:
- Flexible Structure: The index signature allows for arbitrary additional properties
- Optional Fields: All standard fields are optional
- Strong Typing: Core fields have specific types for better type safety
- Content Relationships: The references field supports linking to other content
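The characteristics above can be seen in a small sketch. The interfaces are copied from this document so the snippet compiles on its own; `reviewedBy` is a made-up custom field used only for illustration.

```typescript
// Copies of the interfaces described in this document.
interface ContentReference {
  uri: string
  type: 'embed' | 'link' | 'dependency'
  title?: string
  description?: string
}

interface ContentMetadata {
  title?: string
  description?: string
  createdAt?: Date
  updatedAt?: Date
  references?: ContentReference[]
  [key: string]: any
}

// Optional fields: an empty object is valid metadata.
const empty: ContentMetadata = {}

// Flexible structure: `reviewedBy` (hypothetical) is accepted by the index signature.
const custom: ContentMetadata = {
  title: 'Release Notes',
  reviewedBy: 'editorial-team',
}

// Strong typing: core fields keep their declared types, so this would not compile:
// const bad: ContentMetadata = { title: 42 }

// Custom fields are typed `any`, so narrow them before use.
const reviewer: string | undefined =
  typeof custom.reviewedBy === 'string' ? custom.reviewedBy : undefined
```

Because the index signature is `any`, the compiler does not check custom fields; a runtime check like the one on `reviewer` is one way to regain some safety.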
Core Metadata Fields
Identity Metadata
// Identity-related fields
title?: string; // Human-readable title
description?: string; // Brief description
slug?: string; // URL-friendly identifier
id?: string; // Unique identifier
Temporal Metadata
// Time-related fields
createdAt?: Date; // Creation timestamp
updatedAt?: Date; // Last update timestamp
publishedAt?: Date; // Publication timestamp
expiresAt?: Date; // Expiration timestamp
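As an illustration of how the temporal fields might be combined, the sketch below checks whether content is currently "live". The `isLive` helper and its rules (unpublished content is never live; content stops being live at `expiresAt`) are assumptions for this example, not ReX behavior.

```typescript
interface TemporalMetadata {
  createdAt?: Date
  updatedAt?: Date
  publishedAt?: Date
  expiresAt?: Date
}

// Hypothetical helper: live means publishedAt has passed and expiresAt has not.
function isLive(meta: TemporalMetadata, now: Date = new Date()): boolean {
  if (!meta.publishedAt || meta.publishedAt.getTime() > now.getTime()) return false
  if (meta.expiresAt && meta.expiresAt.getTime() <= now.getTime()) return false
  return true
}

const promo: TemporalMetadata = {
  publishedAt: new Date('2025-03-01T00:00:00Z'),
  expiresAt: new Date('2025-04-01T00:00:00Z'),
}

const duringWindow = isLive(promo, new Date('2025-03-15T00:00:00Z')) // true
const afterWindow = isLive(promo, new Date('2025-04-02T00:00:00Z')) // false
```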
Categorization Metadata
// Categorization fields
tags?: string[]; // Simple tag list
categories?: string[]; // Category classifications
language?: string; // Content language (ISO code)
region?: string; // Geographic region
Authorship Metadata
// Authorship fields
author?: string | { // Simple string or object
name: string;
email?: string;
url?: string;
};
contributors?: Array<string | { // List of contributors
name: string;
email?: string;
url?: string;
role?: string;
}>;
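Since author and contributors accept either a plain string or an object, consumers may find it convenient to normalize both shapes first. A minimal sketch, where `Person`, `PersonInput`, and `normalizePerson` are hypothetical names introduced here:

```typescript
interface Person {
  name: string
  email?: string
  url?: string
  role?: string
}

type PersonInput = string | Person

// Hypothetical helper: turn either accepted shape into the object form.
function normalizePerson(input: PersonInput): Person {
  return typeof input === 'string' ? { name: input } : input
}

const contributors: PersonInput[] = [
  'John Doe',
  { name: 'Jane Smith', role: 'editor' },
]

const names = contributors.map((c) => normalizePerson(c).name)
// names: ['John Doe', 'Jane Smith']
```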
State Metadata
// State-related fields
status?: 'draft' | 'published' | 'archived'; // Content status
visibility?: 'public' | 'private' | 'protected'; // Content visibility
version?: string; // Content version identifier
draft?: boolean; // Whether content is in draft state
References System
The metadata system includes a references system for expressing relationships between content:
interface ContentReference {
uri: string // URI of referenced content
type: 'embed' | 'link' | 'dependency' // Reference type
title?: string // Optional title
description?: string // Optional description
}
Reference types:
- Embed: Content directly embedded within the parent (e.g., an image in a Markdown document)
- Link: Content linked to but not embedded (e.g., a link to another article)
- Dependency: Content required for functioning but not visually present (e.g., a CSS file for HTML)
Example:
const articleWithReferences: Content<string> = {
data: '# Article with References\n\n![Featured Image](image.jpg)\n\n[Link to related](related.md)',
contentType: 'text/markdown',
metadata: {
title: 'Article with References',
references: [
{ uri: 'image.jpg', type: 'embed', title: 'Featured Image' },
{ uri: 'related.md', type: 'link', title: 'Related Article' },
],
},
}
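Code that consumes references often needs to treat each type differently, for example preloading embeds while only validating links. The sketch below groups references by type; `groupReferences` is a hypothetical helper, not part of ReX.

```typescript
type ReferenceType = 'embed' | 'link' | 'dependency'

interface ContentReference {
  uri: string
  type: ReferenceType
  title?: string
  description?: string
}

// Hypothetical helper: bucket references by their declared type.
function groupReferences(
  refs: ContentReference[] = [],
): Record<ReferenceType, ContentReference[]> {
  const groups: Record<ReferenceType, ContentReference[]> = {
    embed: [],
    link: [],
    dependency: [],
  }
  for (const ref of refs) groups[ref.type].push(ref)
  return groups
}

const grouped = groupReferences([
  { uri: 'image.jpg', type: 'embed', title: 'Featured Image' },
  { uri: 'related.md', type: 'link', title: 'Related Article' },
])

const embeddedUris = grouped.embed.map((r) => r.uri) // ['image.jpg']
```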
Domain-Specific Metadata
The metadata system supports domain-specific extensions through specialized interfaces:
Document Metadata
interface DocumentMetadata extends ContentMetadata {
author?: string
publishedAt?: Date
tags?: string[]
language?: string
wordCount?: number
readingTime?: number
toc?: Array<{ title: string; level: number; id: string }>
}
Media Metadata
interface ImageMetadata extends ContentMetadata {
width?: number
height?: number
format?: string
alt?: string
caption?: string
}
interface VideoMetadata extends ContentMetadata {
width?: number
height?: number
duration?: number
format?: string
thumbnail?: string
}
Data Metadata
interface DataMetadata extends ContentMetadata {
schema?: string
schemaVersion?: string
validatedAt?: Date
isValid?: boolean
}
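To show how a domain-specific interface might be populated, the sketch below derives wordCount and readingTime from document text. The `withReadingStats` helper and the 200 words-per-minute reading speed are assumptions for this example.

```typescript
interface ContentMetadata {
  title?: string
  [key: string]: any
}

interface DocumentMetadata extends ContentMetadata {
  wordCount?: number
  readingTime?: number // minutes
}

// Assumed reading speed; not a value defined by ReX.
const WORDS_PER_MINUTE = 200

// Hypothetical helper: return a copy of the metadata with derived statistics.
function withReadingStats(meta: DocumentMetadata, text: string): DocumentMetadata {
  const trimmed = text.trim()
  const wordCount = trimmed === '' ? 0 : trimmed.split(/\s+/).length
  const readingTime = Math.ceil(wordCount / WORDS_PER_MINUTE)
  return { ...meta, wordCount, readingTime }
}

const stats = withReadingStats({ title: 'Hello World' }, 'This is a sample document.')
// stats.wordCount: 5, stats.readingTime: 1
```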
Metadata Extraction
The system includes utilities for extracting metadata from content:
Frontmatter Extraction
// Extract metadata from Markdown frontmatter
const markdownWithFrontmatter = `---
title: Hello World
author: John Doe
tags:
- example
- markdown
createdAt: 2025-03-15
---
# Hello World

This is an example.`
const { data, metadata } = extractFrontmatter(markdownWithFrontmatter)
// data: "# Hello World\n\nThis is an example."
// metadata: {
// title: "Hello World",
// author: "John Doe",
// tags: ["example", "markdown"],
// createdAt: Date("2025-03-15")
// }
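To give a rough idea of what frontmatter extraction involves, here is a deliberately simplified stand-in. It is not the ReX implementation of extractFrontmatter: `extractSimpleFrontmatter` is a hypothetical function that handles only `key: value` scalars and `- item` lists, and keeps every value as a string (no Date conversion). A real implementation would more likely delegate to a YAML parser.

```typescript
interface Extracted {
  data: string
  metadata: Record<string, unknown>
}

// Simplified sketch: supports only `key: value` scalars and `- item` lists.
function extractSimpleFrontmatter(source: string): Extracted {
  // Frontmatter is the block between a leading `---` line and the next `---` line.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(source)
  if (!match) return { data: source, metadata: {} }

  const metadata: Record<string, unknown> = {}
  let currentList: string[] | null = null

  for (const line of match[1].split(/\r?\n/)) {
    // A `- item` line belongs to the most recent key that had no inline value.
    const item = /^\s*-\s+(.*)$/.exec(line)
    if (item && currentList) {
      currentList.push(item[1].trim())
      continue
    }
    const pair = /^([A-Za-z_][\w-]*):\s*(.*)$/.exec(line)
    if (!pair) continue
    const [, key, value] = pair
    if (value === '') {
      currentList = []
      metadata[key] = currentList
    } else {
      currentList = null
      metadata[key] = value
    }
  }

  // Drop the frontmatter block and any blank lines that follow it.
  const data = source.slice(match[0].length).replace(/^(?:\r?\n)+/, '')
  return { data, metadata }
}

const parsed = extractSimpleFrontmatter(
  '---\ntitle: Hello World\ntags:\n- example\n- markdown\n---\n\n# Hello World',
)
// parsed.data: '# Hello World'
// parsed.metadata: { title: 'Hello World', tags: ['example', 'markdown'] }
```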
Media Metadata Extraction
// Extract metadata from an image
const imageMetadata = await extractImageMetadata(imageData)
// {
// width: 1200,
// height: 800,
// format: "jpeg",
// exif: { ... }
// }
Metadata Validation
The system supports metadata validation through schema validation:
// Define a schema for blog post metadata
const blogPostMetadataSchema = {
type: 'object',
required: ['title', 'author', 'createdAt'],
properties: {
title: { type: 'string', minLength: 1 },
author: { type: 'string', minLength: 1 },
createdAt: { type: 'string', format: 'date-time' }, // matches the serialized ISO 8601 string, not a Date object
tags: {
type: 'array',
items: { type: 'string' },
},
draft: { type: 'boolean' },
},
}
// Validate metadata
const isValid = validateMetadata(metadata, blogPostMetadataSchema)
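The document does not show how validateMetadata works internally. As a rough illustration of schema-driven checking, the sketch below handles only `required`, `type`, and `minLength`. `checkMetadata` and `SimpleSchema` are hypothetical names; a production setup would more likely use a full JSON Schema validator.

```typescript
type FieldType = 'string' | 'boolean' | 'number' | 'array' | 'object'

interface SimpleSchema {
  required?: string[]
  properties: Record<string, { type: FieldType; minLength?: number }>
}

// Hypothetical checker: returns a list of problems (empty when the metadata passes).
function checkMetadata(metadata: Record<string, unknown>, schema: SimpleSchema): string[] {
  const errors: string[] = []
  for (const key of schema.required ?? []) {
    if (metadata[key] === undefined) errors.push(`missing required field: ${key}`)
  }
  for (const [key, rule] of Object.entries(schema.properties)) {
    const value = metadata[key]
    if (value === undefined) continue
    const actual = Array.isArray(value) ? 'array' : typeof value
    if (actual !== rule.type) {
      errors.push(`${key}: expected ${rule.type}, got ${actual}`)
    } else if (
      typeof value === 'string' &&
      rule.minLength !== undefined &&
      value.length < rule.minLength
    ) {
      errors.push(`${key}: shorter than ${rule.minLength} characters`)
    }
  }
  return errors
}

const schema: SimpleSchema = {
  required: ['title', 'author'],
  properties: {
    title: { type: 'string', minLength: 1 },
    author: { type: 'string', minLength: 1 },
    tags: { type: 'array' },
  },
}

const errors = checkMetadata({ title: '', tags: 'oops' }, schema)
// errors lists three problems: missing author, empty title, tags not an array
```

Returning a list of messages instead of a boolean is a design choice of this sketch; it makes failures easier to report to content authors.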
Metadata Serialization
Metadata can be serialized for storage or transfer:
// Serialize metadata to JSON
const serialized = JSON.stringify(metadata)
// Deserialize from JSON with date handling
const deserialized = JSON.parse(serialized, (key, value) => {
// Convert date strings back to Date objects
if (key === 'createdAt' || key === 'updatedAt' || key === 'publishedAt') {
return new Date(value)
}
return value
})
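The round trip can be wrapped in a pair of small functions. This sketch follows the same key-based approach as above, extends the key list to expiresAt, and guards against non-string values; `serializeMetadata`, `deserializeMetadata`, and `DATE_KEYS` are hypothetical names.

```typescript
// Keys whose values are revived as Date objects (assumed list based on the temporal fields).
const DATE_KEYS = new Set(['createdAt', 'updatedAt', 'publishedAt', 'expiresAt'])

function serializeMetadata(metadata: Record<string, unknown>): string {
  // Date objects are written as ISO 8601 strings by JSON.stringify.
  return JSON.stringify(metadata)
}

function deserializeMetadata(json: string): Record<string, unknown> {
  return JSON.parse(json, (key: string, value: unknown) =>
    DATE_KEYS.has(key) && typeof value === 'string' ? new Date(value) : value,
  )
}

const original = { title: 'Hello World', createdAt: new Date('2025-03-01T12:00:00Z') }
const restored = deserializeMetadata(serializeMetadata(original))

const sameInstant =
  restored.createdAt instanceof Date &&
  restored.createdAt.getTime() === original.createdAt.getTime() // true
```

Note that the reviver matches on key name at any nesting depth, so a nested custom field that happens to be called createdAt would also be converted.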
Usage Patterns
Basic Metadata
// Creating content with basic metadata
const content: Content<string> = {
data: '# Hello World\n\nThis is a sample document.',
contentType: 'text/markdown',
metadata: {
title: 'Hello World',
description: 'A sample Markdown document',
createdAt: new Date(),
updatedAt: new Date(),
},
}
Rich Metadata
// Creating content with rich metadata
const blogPost: Content<string> = {
data: '# Advanced Techniques\n\nThis post explores advanced techniques...',
contentType: 'text/markdown',
metadata: {
title: 'Advanced Techniques',
description: 'Exploring advanced content techniques',
author: {
name: 'Jane Smith',
email: 'jane.smith@example.com', // placeholder address
},
tags: ['advanced', 'tutorial', 'content'],
createdAt: new Date('2025-03-01T12:00:00Z'),
updatedAt: new Date('2025-03-15T09:30:00Z'),
publishedAt: new Date('2025-03-16T10:00:00Z'),
status: 'published',
readingTime: 8, // minutes
wordCount: 1500,
},
}
Metadata with References
// Creating content with references in metadata
const articleWithImages: Content<string> = {
data: '# Article with Images\n\n![First Image](images/first.jpg)\n\n![Second Image](images/second.jpg)',
contentType: 'text/markdown',
metadata: {
title: 'Article with Images',
description: 'An article demonstrating image references',
references: [
{
uri: 'images/first.jpg',
type: 'embed',
title: 'First Image',
},
{
uri: 'images/second.jpg',
type: 'embed',
title: 'Second Image',
},
],
},
}
Querying by Metadata
// Hypothetical query based on metadata
const recentPosts = await store.query({
contentType: 'text/markdown',
metadata: {
tags: { $contains: 'tutorial' },
publishedAt: { $gte: new Date('2025-01-01') },
status: 'published',
},
})
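To make the operators in the hypothetical query concrete, the sketch below evaluates the same query shape against an in-memory list. The semantics assumed here (`$contains` tests array membership, `$gte` compares dates or numbers, plain values must be strictly equal) are illustrative only, as are the names `matchesCondition` and `matchesQuery`.

```typescript
type Condition =
  | { $contains: unknown }
  | { $gte: Date | number }
  | string
  | number
  | boolean

type MetadataQuery = Record<string, Condition>

// Evaluate one condition against one metadata value.
function matchesCondition(value: unknown, condition: Condition): boolean {
  if (typeof condition === 'object' && condition !== null) {
    if ('$contains' in condition) {
      return Array.isArray(value) && value.includes(condition.$contains)
    }
    if ('$gte' in condition) {
      if (value instanceof Date || typeof value === 'number') {
        // Number(date) yields the timestamp in milliseconds.
        return Number(value) >= Number(condition.$gte)
      }
      return false
    }
  }
  return value === condition
}

// Metadata matches when every condition in the query holds.
function matchesQuery(metadata: Record<string, unknown>, query: MetadataQuery): boolean {
  return Object.entries(query).every(([key, cond]) => matchesCondition(metadata[key], cond))
}

const posts: Array<Record<string, unknown>> = [
  { title: 'A', tags: ['tutorial'], publishedAt: new Date('2025-02-01'), status: 'published' },
  { title: 'B', tags: ['news'], publishedAt: new Date('2025-02-01'), status: 'published' },
  { title: 'C', tags: ['tutorial'], publishedAt: new Date('2024-12-01'), status: 'published' },
]

const query: MetadataQuery = {
  tags: { $contains: 'tutorial' },
  publishedAt: { $gte: new Date('2025-01-01') },
  status: 'published',
}

const matchedTitles = posts.filter((p) => matchesQuery(p, query)).map((p) => p.title)
// matchedTitles: ['A']
```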
Updating Metadata
// Reading existing content
const content = await store.read('posts/hello.md')
// Updating only the metadata
await store.write('posts/hello.md', {
...content,
metadata: {
...content.metadata,
updatedAt: new Date(),
tags: [...(content.metadata.tags || []), 'updated'],
},
})
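The update step itself can be factored into a pure function that is easy to test without a store. In this sketch, the `Content<T>` shape is inferred from the examples in this document, and `addTag` is a hypothetical helper that also avoids adding the same tag twice.

```typescript
interface ContentMetadata {
  updatedAt?: Date
  tags?: string[]
  [key: string]: any
}

// Assumed shape of Content<T>, inferred from the examples in this document.
interface Content<T> {
  data: T
  contentType: string
  metadata?: ContentMetadata
}

// Hypothetical helper: return a copy with `tag` added (once) and updatedAt refreshed.
function addTag<T>(content: Content<T>, tag: string, now: Date = new Date()): Content<T> {
  const tags = content.metadata?.tags ?? []
  return {
    ...content,
    metadata: {
      ...content.metadata,
      tags: tags.includes(tag) ? tags : [...tags, tag],
      updatedAt: now,
    },
  }
}

const before: Content<string> = {
  data: '# Hello',
  contentType: 'text/markdown',
  metadata: { title: 'Hello', tags: ['intro'] },
}

const after = addTag(before, 'updated', new Date('2025-03-15T09:30:00Z'))
// after.metadata.tags: ['intro', 'updated']; before.metadata.tags is still ['intro']
```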
Metadata and Content Lifecycle
Metadata changes throughout the content lifecycle:
- Creation: Basic metadata is added (title, createdAt)
- Editing: Metadata is updated (updatedAt, version)
- Publishing: Publication metadata is added (publishedAt, status)
- Categorization: Organizational metadata is added (tags, categories)
- Linking: References to other content are added
- Archiving: Archival metadata is added (archivedAt, status)
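The lifecycle stages above can be sketched as small transition functions that each return updated metadata. The function names and the exact fields each stage touches are assumptions for illustration; archivedAt in particular is not one of the core fields listed earlier and would live in the extension space.

```typescript
type Status = 'draft' | 'published' | 'archived'

interface LifecycleMetadata {
  title?: string
  createdAt?: Date
  updatedAt?: Date
  publishedAt?: Date
  archivedAt?: Date // custom field, per the Archiving stage
  status?: Status
}

// Creation: basic metadata is added.
function createMetadata(title: string, now: Date): LifecycleMetadata {
  return { title, createdAt: now, updatedAt: now, status: 'draft' }
}

// Publishing: publication metadata is added.
function publish(meta: LifecycleMetadata, now: Date): LifecycleMetadata {
  return { ...meta, publishedAt: now, updatedAt: now, status: 'published' }
}

// Archiving: archival metadata is added.
function archive(meta: LifecycleMetadata, now: Date): LifecycleMetadata {
  return { ...meta, archivedAt: now, updatedAt: now, status: 'archived' }
}

const created = createMetadata('Hello World', new Date('2025-03-01T12:00:00Z'))
const published = publish(created, new Date('2025-03-16T10:00:00Z'))
const archived = archive(published, new Date('2026-01-01T00:00:00Z'))
// archived.status: 'archived'; createdAt and publishedAt are carried forward unchanged
```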
Best Practices
- Consistent Fields: Use common field names consistently across content
- Standard Types: Use standard types (Date for timestamps, arrays for lists)
- Minimal Required Fields: Keep required metadata minimal to simplify content creation
- Explicit Extension: Create explicit interfaces for domain-specific metadata
- Validation: Validate metadata against schemas for consistency
Related Concepts
- Content Model: The overall content model
- Content Structure: How content is structured
- References: How content references work