Content Structure

This document describes how content is structured in the ReX system and how that structure keeps content handling flexible and consistent across environments.

Overview

Content structure refers to how content is organized, addressed, and represented in the system. The structure is designed to be simple yet flexible, allowing for a wide range of content types while maintaining a consistent programming interface.

Core Content Structure

At its most basic level, content in the system consists of three components:

Data: The actual content (text, binary data, structured data)
Content Type: The MIME type of the content (e.g., text/markdown, image/png)
Metadata: Descriptive information about the content

This structure is expressed in the Content<T> interface:

typescript

interface Content<T = string> {
  data: T
  contentType: string
  metadata: ContentMetadata
}

The generic type parameter T allows for different data representations:

Content<string>: Text-based content (Markdown, HTML, etc.)
Content<Uint8Array> or Content<Buffer>: Binary content (images, PDFs, etc.)
Content<object>: Structured content (JSON, etc.)
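
As a minimal sketch of how the type parameter plays out, the snippet below builds one value of each kind and narrows an unknown payload with a type guard. The ContentMetadata shape and the isTextContent helper are assumptions made for illustration; this document does not define them.

```typescript
// Hypothetical metadata shape: this document does not define ContentMetadata.
interface ContentMetadata {
  title?: string
  [key: string]: unknown
}

interface Content<T = string> {
  data: T
  contentType: string
  metadata: ContentMetadata
}

// Text content: T defaults to string.
const post: Content = {
  data: '# Hello',
  contentType: 'text/markdown',
  metadata: { title: 'Hello' },
}

// Binary content: raw bytes (placeholder values, not a complete image).
const logo: Content<Uint8Array> = {
  data: new Uint8Array([0x89, 0x50, 0x4e, 0x47]),
  contentType: 'image/png',
  metadata: {},
}

// Structured content: a specific object type instead of a bare `object`.
const settings: Content<{ theme: string }> = {
  data: { theme: 'dark' },
  contentType: 'application/json',
  metadata: {},
}

// Hypothetical type guard: narrows Content<unknown> before the data is used as text.
function isTextContent(content: Content<unknown>): content is Content<string> {
  return typeof content.data === 'string'
}
```

Giving structured content a concrete type such as Content<{ theme: string }> keeps property access type-checked, which Content<object> alone would not.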

Content Addressing

Content is addressed using a URI-based system that is:

Storage-Agnostic: The same URI format works across different storage backends
Path-Like: Uses familiar path notation for intuitive navigation
Hierarchical: Supports nested content organization

typescript

// Example URIs
const textUri = 'blog/posts/hello-world.md'
const imageUri = 'assets/images/logo.png'
const configUri = 'config/settings.json'

The URI system is implemented with the following components:

URI Parsing

typescript

interface ParsedContentURI {
  original: string // The original URI string
  segments: string[] // Path segments (the last one without its extension)
  extension?: string // File extension (if any)
  contentType?: string // Content type inferred from extension
}

// Example
const parsed = parseContentUri('blog/posts/hello-world.md')
// {
//   original: 'blog/posts/hello-world.md',
//   segments: ['blog', 'posts', 'hello-world'],
//   extension: 'md',
//   contentType: 'text/markdown'
// }
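
One way parseContentUri could produce the result above is sketched here. The extension table and the handling of dotfiles and empty segments are assumptions, not the documented implementation.

```typescript
interface ParsedContentURI {
  original: string
  segments: string[]
  extension?: string
  contentType?: string
}

// Hypothetical extension-to-MIME table; a real system would likely use a fuller registry.
const CONTENT_TYPES: Record<string, string> = {
  md: 'text/markdown',
  json: 'application/json',
  png: 'image/png',
  jpg: 'image/jpeg',
}

function parseContentUri(uri: string): ParsedContentURI {
  // Drop empty segments produced by leading, trailing, or doubled slashes.
  const segments = uri.split('/').filter((s) => s.length > 0)
  const parsed: ParsedContentURI = { original: uri, segments }

  const last = segments[segments.length - 1]
  const dot = last === undefined ? -1 : last.lastIndexOf('.')
  // Require dot > 0 so a dotfile such as ".gitignore" is not read as an extension.
  if (last !== undefined && dot > 0 && dot < last.length - 1) {
    const extension = last.slice(dot + 1).toLowerCase()
    segments[segments.length - 1] = last.slice(0, dot)
    parsed.extension = extension
    const contentType = CONTENT_TYPES[extension]
    if (contentType !== undefined) parsed.contentType = contentType
  }
  return parsed
}
```

In this sketch, an extension missing from the table still populates extension but leaves contentType undefined, so callers can fall back to a default.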

URI Resolution

A URI can be absolute within the content store or relative to a base URI:

typescript

// Resolving a relative URI against a base URI
const base = 'blog/posts/'
const relative = '../images/photo.jpg'
const resolved = resolveContentUri(relative, base)
// 'blog/images/photo.jpg'

// Normalizing a URI (removing . and .. segments)
const normalized = normalizeContentUri('blog/posts/../images/./photo.jpg')
// 'blog/images/photo.jpg'
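
The two functions above could be implemented roughly as follows. This is a sketch under stated assumptions: a ".." that would climb above the store root is dropped rather than rejected, and a base that does not end in "/" is treated as a document whose last segment is discarded before resolving.

```typescript
// Collapse "." and ".." segments. Assumption: ".." at the root is silently dropped.
// Empty segments (doubled or trailing slashes) are removed as well.
function normalizeContentUri(uri: string): string {
  const out: string[] = []
  for (const segment of uri.split('/')) {
    if (segment === '' || segment === '.') continue
    if (segment === '..') {
      out.pop()
      continue
    }
    out.push(segment)
  }
  return out.join('/')
}

// Resolve `relative` against `base`. Assumption: a base ending in "/" names a
// directory; otherwise everything after its last "/" is a document name and is dropped.
function resolveContentUri(relative: string, base: string): string {
  const baseDir = base.endsWith('/') ? base : base.slice(0, base.lastIndexOf('/') + 1)
  return normalizeContentUri(baseDir + relative)
}
```

Under these assumptions, resolving against 'blog/posts/' and against 'blog/posts/hello-world.md' gives the same result, which mirrors how relative links behave in browsers.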

URI Mapping

Different storage adapters map URIs to their native path formats:

typescript

// FileSystem adapter
const fsPath = mapUriToPath('blog/posts/hello.md', { root: '/content' })
// '/content/blog/posts/hello.md'

// HTTP adapter
const url = mapUriToPath('blog/posts/hello.md', {
  root: 'https://example.com/api',
})
// 'https://example.com/api/blog/posts/hello.md'
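
Both results above are consistent with a plain join of root and URI. The sketch below assumes that behaviour; the MapOptions name is hypothetical, and a real adapter may do more, such as URL encoding or normalizing the URI first.

```typescript
// Hypothetical option shape, inferred from the { root } argument used above.
interface MapOptions {
  root: string // Filesystem directory or base URL
}

function mapUriToPath(uri: string, options: MapOptions): string {
  // Trim slashes at the seam so exactly one "/" joins root and URI.
  const root = options.root.replace(/\/+$/, '')
  const relative = uri.replace(/^\/+/, '')
  return `${root}/${relative}`
}
```

Because this sketch does not remove ".." segments, an adapter built this way would probably want to normalize the URI beforehand so that a path cannot escape the configured root.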

Content Collections

Content is often grouped into collections, which map each URI to its content:

typescript

interface ContentCollection<T = string> {
  [uri: string]: Content<T>
}

Collections can be:

Homogeneous: All content has the same type

typescript

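const blogPosts: ContentCollection<string> = {
  'post-1.md': {
    data: '# Post 1', // Markdown content
    contentType: 'text/markdown',
    metadata: {},
  },
  'post-2.md': {
    data: '# Post 2', // Markdown content
    contentType: 'text/markdown',
    metadata: {},
  },
}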

Heterogeneous: Content can have different types

typescript

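// Placeholder bytes standing in for real image data
const binaryData = new Uint8Array([0x89, 0x50, 0x4e, 0x47])

const mixedCollection: ContentCollection<unknown> = {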
  'post.md': { data: '# Post', contentType: 'text/markdown', metadata: {} },
  'image.png': { data: binaryData, contentType: 'image/png', metadata: {} },
  'config.json': {
    data: { key: 'value' },
    contentType: 'application/json',
    metadata: {},
  },
}
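
With a heterogeneous collection, each entry's data is typed as unknown, so it has to be narrowed before use. The sketch below shows one way to pull the text entries out into a homogeneous collection. The selectTextContent helper is hypothetical, and Record<string, unknown> stands in for ContentMetadata, which this document does not define.

```typescript
interface Content<T = string> {
  data: T
  contentType: string
  metadata: Record<string, unknown> // Stand-in for ContentMetadata (assumption)
}

interface ContentCollection<T = string> {
  [uri: string]: Content<T>
}

// Hypothetical helper: keep entries whose content type is text/* and whose
// data really is a string, producing a ContentCollection<string>.
function selectTextContent(collection: ContentCollection<unknown>): ContentCollection<string> {
  const result: ContentCollection<string> = {}
  for (const uri of Object.keys(collection)) {
    const content = collection[uri]
    if (content === undefined) continue
    const data = content.data
    if (content.contentType.startsWith('text/') && typeof data === 'string') {
      result[uri] = { data, contentType: content.contentType, metadata: content.metadata }
    }
  }
  return result
}
```

Checking both the declared contentType and the runtime type of data guards against entries whose label and payload disagree.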

Content Hierarchies

Content is organized hierarchically, mimicking a filesystem structure:

blog/
  posts/
    hello-world.md
    getting-started.md
  images/
    header.jpg
    profile.png
config/
  settings.json
  users.json

This hierarchical structure enables:

Logical Organization: Content is arranged in a meaningful way
Pattern-Based Access: Content can be accessed using glob patterns
Namespace Isolation: Different content areas are separated
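
To illustrate how flat URIs imply the hierarchy shown above, the sketch below folds a list of URIs into a nested tree. The ContentTree type and buildTree helper are hypothetical, and the sketch assumes a name is either a directory or a content item, never both.

```typescript
// Hypothetical tree shape: a directory maps names to subtrees; null marks a content item.
interface ContentTree {
  [name: string]: ContentTree | null
}

function buildTree(uris: string[]): ContentTree {
  const root: ContentTree = {}
  for (const uri of uris) {
    const segments = uri.split('/').filter((s) => s.length > 0)
    let node = root
    segments.forEach((segment, index) => {
      if (index === segments.length - 1) {
        node[segment] = null // Last segment: the content item itself
      } else {
        // Intermediate segment: reuse the directory if it exists, else create it.
        const existing = node[segment]
        const child: ContentTree = existing ? existing : {}
        node[segment] = child
        node = child
      }
    })
  }
  return root
}
```

Each top-level key of the resulting tree (for example blog or config) then corresponds to one isolated content area.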

Pattern Matching

The system supports glob-style pattern matching for content discovery:

typescript

// Find all Markdown files in the blog/posts directory
const posts = await store.list('blog/posts/*.md')

// Find all JSON files recursively
const jsonFiles = await store.list('**/*.json')

// Find PNG or JPG files in blog/images whose names start with "logo" or "banner"
const imageFiles = await store.list('blog/images/{logo,banner}*.{png,jpg}')
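
To make the pattern semantics concrete, here is a minimal sketch that translates the glob features used above ("*", "**", and "{a,b}" alternation, plus "?") into a regular expression. It illustrates the matching rules only and is not how store.list is implemented. globToRegExp and listMatching are hypothetical helpers, and malformed patterns such as an unclosed brace are not handled.

```typescript
function globToRegExp(pattern: string): RegExp {
  let source = ''
  let braceDepth = 0
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i)
    if (ch === '*') {
      if (pattern.charAt(i + 1) === '*') {
        i++
        if (pattern.charAt(i + 1) === '/') {
          i++
          source += '(?:.*/)?' // "**/" matches zero or more whole directories
        } else {
          source += '.*' // A trailing "**" matches everything below
        }
      } else {
        source += '[^/]*' // "*" stays within a single path segment
      }
    } else if (ch === '?') {
      source += '[^/]' // "?" matches one character within a segment
    } else if (ch === '{') {
      braceDepth++
      source += '(?:'
    } else if (ch === '}' && braceDepth > 0) {
      braceDepth--
      source += ')'
    } else if (ch === ',' && braceDepth > 0) {
      source += '|' // A comma inside braces separates alternatives
    } else {
      source += ch.replace(/[.+^$()|[\]\\{}]/g, '\\$&') // Escape regex metacharacters
    }
  }
  return new RegExp(`^${source}$`)
}

// Hypothetical in-memory stand-in for a store listing.
function listMatching(uris: string[], pattern: string): string[] {
  const matcher = globToRegExp(pattern)
  return uris.filter((uri) => matcher.test(uri))
}
```

Keeping "*" from crossing "/" is what separates 'blog/posts/*.md' (direct children only) from '**/*.json' (any depth).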

Content References

Content can reference other content by URI, with each reference tagged by its type:

typescript

interface ContentReference {
  uri: string
  type: 'embed' | 'link' | 'dependency'
  title?: string
  description?: string
}

// Example: Blog post with image references
const blogPost: Content<string> = {
  data: '# Post with Images\n\n![Logo](../images/logo.png)\n\n![Banner](../images/banner.jpg)',
  contentType: 'text/markdown',
  metadata: {
    title: 'Post with Images',
    references: [
      { uri: 'blog/images/logo.png', type: 'embed', title: 'Logo' },
      { uri: 'blog/images/banner.jpg', type: 'embed', title: 'Banner' },
    ],
  },
}

Reference types include:

Embed: Content directly embedded in the parent content
Link: Content referenced by the parent but not embedded
Dependency: Content required by the parent for functionality
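
As an illustration of how embed references like those in the blog post example might be derived, the sketch below scans Markdown for image syntax and resolves each target against the document's own URI. The extractImageReferences and normalizeUri helpers are hypothetical. The sketch handles only store-relative targets without Markdown image titles, and it drops a ".." that would climb above the root.

```typescript
interface ContentReference {
  uri: string
  type: 'embed' | 'link' | 'dependency'
  title?: string
  description?: string
}

// Collapse "." and ".." segments (assumption: ".." at the root is dropped).
function normalizeUri(uri: string): string {
  const out: string[] = []
  for (const segment of uri.split('/')) {
    if (segment === '' || segment === '.') continue
    if (segment === '..') {
      out.pop()
      continue
    }
    out.push(segment)
  }
  return out.join('/')
}

// Hypothetical helper: turn each ![alt](target) in the Markdown into an embed reference.
function extractImageReferences(markdown: string, documentUri: string): ContentReference[] {
  // Directory part of the document URI, including the trailing slash.
  const baseDir = documentUri.slice(0, documentUri.lastIndexOf('/') + 1)
  const references: ContentReference[] = []
  const imagePattern = /!\[([^\]]*)\]\(([^)\s]+)\)/g
  let match: RegExpExecArray | null
  while ((match = imagePattern.exec(markdown)) !== null) {
    const altText = match[1]
    const target = match[2]
    if (target === undefined) continue
    const reference: ContentReference = { uri: normalizeUri(baseDir + target), type: 'embed' }
    if (altText) reference.title = altText
    references.push(reference)
  }
  return references
}
```

Deriving references from the data rather than maintaining them by hand is one way to keep metadata.references in sync with the content it describes.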