Concept: the Steem Ledger Index Protocol (SLIP)


Introduction

For today's amusement, we have yet another project that I can't possibly find time to complete at any time in the foreseeable future. Anyone want to lend me a 10-person programming team for a year or two? 😏

Recently, I've been playing around with using AI in the way suggested here. So far, it's a lot of words that boil down to "use these prompts".

Prompt 1

You're a senior software engineer. We're going to build the PRD of a project together.

VERY IMPORTANT:

  • Ask one question at a time
  • Each question should be based on previous answers
  • Go deeper on every important detail required

IDEA:
< paste here your idea >

Prompt 2

Compile those findings into a PRD. Use markdown format. It should contain the following sections:

  • Project overview
  • Core requirements
  • Core features
  • Core components
  • App/user flow
  • Techstack
  • Implementation plan

That's as far as I've gotten. Next up (in theory), I have to create a task list and start building. So, what was my idea? Here it is, straight from the prompt:

We're going to create a decentralized hierarchical storage protocol on the Steem blockchain. Sequential data will be stored in custom_json transactions. Indexes will be stored on decentralized clients (that we're going to create). The protocol should be flexible enough to work with any kind of data (that can be stored in JSON).

Gemini and I came up with the catchy name "SLIP" for this protocol (Steem Ledger Index Protocol).

Key points of the protocol:

  • Sequential data are stored on the Steem blockchain using custom_json transactions.
  • Database indexes are maintained in app-specific decentralized clients.
  • The protocol is intended to work with any data that can be described in JSON.
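To make the on-chain side concrete, here's a minimal sketch of what a SLIP write might look like as a raw custom_json operation. The id format and payload fields come from the PRD below; the app name, account, and sensor fields are made-up examples.

```python
import json

def build_slip_op(app_name, app_data, posting_account, protocol_version=1):
    """Build a SLIP custom_json operation (sketch; id/payload format from the PRD).

    `app_name`, `posting_account`, and the app_data fields here are hypothetical.
    """
    payload = {"protocol_version": protocol_version, "app_data": app_data}
    return [
        "custom_json",
        {
            "required_auths": [],
            "required_posting_auths": [posting_account],
            "id": f"slip_v1_{app_name}",
            # Steem stores the custom_json payload as a serialized JSON string.
            "json": json.dumps(payload),
        },
    ]

op = build_slip_op("my_sensor_app", {"sensorId": "s-1", "temp_c": 21.5}, "alice")
```

The resulting operation would then be broadcast with whatever Steem library the application already uses.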

The AI and I also managed to complete a PRD (Product Requirements Document). I'll paste it as an appendix below, but I haven't read through it carefully or proved it out in the real world yet, so take it with a BIG grain of salt.

Potential Use Case: Human Verification via Captcha

After I had (in theory) the beginning of a protocol design, I immediately started thinking about creating a reference implementation. What useful information can I store to demonstrate this protocol in a proof of concept?

What occurred to me is a sort of captcha/verification service. I have no interest in getting into ID verification or KYC-level authentication, but at least we can verify that there's a human associated with an account. Yes, one person can operate multiple accounts, but the use of captchas could make it impractical for one person to impersonate numerous others. Anyway, I'm not trying to solve identity; that's someone else's challenge. I'm just proving out the protocol.

So, how would it work? Well, I'm still brainstorming, so I don't know exactly, but let's start with this:

  1. A (JavaScript) browser extension gives the account owner a captcha to solve, and uses the SLIP protocol to store the account name, the verifier, and captcha completion time on the blockchain (this could be accompanied by some sort of burning/payment mechanism).
    • This would probably need to be stored by a verifying account, not by the account owner themselves. (There is some hand-waving here.)
  2. Python indexing programs scan the blockchain for new verifications and add them to the indexers' databases when found.
  3. A curator or voting service could use the SLIP indexes from step 2 in order to implement voting or delegation decisions.
    • A curator might only vote for accounts who have solved a captcha from a trusted verifier within the last X days
    • A delegator might withdraw delegations from accounts who haven't completed a verified captcha within Y days.
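As a sketch of how a curator (step 3) might consume the index, here's a hypothetical freshness check. The record shape ({account, verifier, verified_at}) is my assumption, not part of any spec:

```python
from datetime import datetime, timedelta, timezone

def is_verified(records, account, trusted_verifiers, max_age_days=30, now=None):
    """Return True if `account` has a captcha verification from a trusted
    verifier within the last `max_age_days`.

    `records` is a list of dicts like
    {"account": ..., "verifier": ..., "verified_at": datetime}
    as a curator might read them back from a SLIP index (hypothetical shape).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return any(
        r["account"] == account
        and r["verifier"] in trusted_verifiers
        and r["verified_at"] >= cutoff
        for r in records
    )
```

A voting bot could run this check against its local index before casting a vote, with "X days" set by its own policy.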

Other possible use cases

I think there are any number of possible scientific and business datasets that might also benefit from a protocol like SLIP. For example:

  • Climate datasets with worldwide temperature and precipitation observations from Steemizens
  • Astronomical observations from Steemizens in the style of globeatnight
  • Ecological datasets where Steemizens could record animal sightings and locations
  • Historical Steem blockchain statistics
  • Economic/price/inflation data (e.g., the Billion Prices Project)
  • Document/web page archiving in the vein of archive.org (subject to size constraints from Steem block size limits)
  • Partnerships with The ResearchHub Foundation for any imaginable DeSci datasets
  • Apps like a Twitter threadreader

The nice thing about any of these projects is that the on-chain data could be incorporated into Steem dapps and used to support analyses that could benefit from the blockchain's reward mechanism.

Conclusion

I think that's enough for now. I don't know how to prioritize this with my other projects, so I have no idea if/when it'll move forward, but it seems like a potentially useful concept, so I wanted to write it down before I forget.

Appendix - Product Requirements Document (PRD) - created in an interview session with Google Gemini


For the PRD from Google Gemini (with my input), read on.
For readability purposes I'm not marking the following text as a block quote, but everything after here was written by AI.

Product Requirements Document (PRD): SLIP - Steem Ledger Index Protocol

Version: 1.0 (Baseline MVP)
Date: May 20, 2025
Author: AI Senior Software Engineer


1. Project Overview

The SLIP (Steem Ledger Index Protocol) aims to establish a decentralized, hierarchical storage solution by leveraging the unique characteristics of the Steem blockchain. This protocol provides a cost-effective and highly flexible method for storing sequential data on-chain while enabling efficient retrieval through client-side indexing.

Motivation:

  • For Developers: Offers free transactions and APIs on Steem, significantly reducing operational costs for data storage. Provides a pathway to integrate data reporting with Steem posts and rewards, fostering new incentivized applications. Developers primarily bear the cost of local compute and storage for indexes.
  • For Steem Investors: Introduces a novel and valuable use case for the Steem blockchain, increasing its utility, transaction volume, and overall ecosystem robustness.

2. Core Requirements

2.1 Functional Requirements

  • On-Chain Data Storage: Enable storage of arbitrary JSON data on the Steem blockchain via custom_json transactions.
  • Protocol Identification: custom_json transactions must be clearly identifiable as belonging to the SLIP protocol and a specific application.
  • Flexible Data Structure: Support application-defined JSON data structures within the protocol's payload.
  • Decentralized Indexing: Allow decentralized clients to locally build and maintain indexes of on-chain data.
  • Schema-Driven Indexing: Enable application developers to define which fields within their data payload should be indexed.
  • Efficient Data Sync: Clients must be able to efficiently synchronize with the Steem blockchain, handling both historical data and real-time updates.
  • Robust Error Recovery: Clients should gracefully handle network interruptions and API errors, resuming synchronization from the last known state.
  • Local Querying: Provide a mechanism for applications to query the local indexes based on various criteria.
  • Lightweight Client: The decentralized client should be a pure indexer, minimizing its local storage footprint by not storing the full on-chain data payload.
  • Platform Agnostic Client: The protocol's design should permit implementation of decentralized clients in any programming language.

2.2 Non-Functional Requirements

  • Cost-Effectiveness: Leverage Steem's free transaction model to ensure low operational costs for data storage.
  • Scalability (Indexing): The indexing mechanism must be capable of handling potentially large volumes of on-chain data.
  • Performance (Querying): Local queries against indexes should be fast and efficient.
  • Decentralization: The indexing and querying components must operate in a decentralized manner without reliance on central servers.
  • Open-Source: The protocol's specifications and reference client implementation should be open-source.
  • Usability (Developer): The process for defining indexing schemas and interacting with the client library should be straightforward for developers.

3. Core Features

3.1 On-Chain Data Storage

  • custom_json id Field: All SLIP-related custom_json operations will use the id format: slip_v1_your_app_name.
    • slip_v1: Protocol identifier and version.
    • your_app_name: Unique identifier for the specific application (e.g., my_sensor_app, my_photo_album).
  • custom_json json Payload Structure:
    • protocol_version (optional, integer): Denotes the version of the SLIP protocol's internal data structure.
    • app_data (mandatory, JSON object): A flexible container for the application's unique JSON data.
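For example, a payload following this structure might look like the following (all values invented), posted under the id slip_v1_my_sensor_app:

```json
{
  "protocol_version": 1,
  "app_data": {
    "sensorId": "s-1",
    "timestamp": "2025-05-20T12:00:00Z",
    "temp_c": 21.5
  }
}
```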

3.2 Decentralized Client-Side Indexing

  • Pure Indexer: Clients will store only block_num, trx_in_block, and optionally steem_trx_id referencing the on-chain data, not the full app_data.
  • index_schema.json: A JSON file bundled with the client, defining indexable fields for a given app_name.
    • app_name (string): Matches the application identifier in the custom_json id.
    • version (string): Semantic version of the schema itself.
    • index_fields (array of objects):
      • name (string): Human-readable name for the index field.
      • path (string): JSONPath expression to extract the value from app_data.
      • type (string): Expected data type (string, number, boolean) for local storage optimization.
      • unique (boolean): true if the index should enforce uniqueness of the value.
  • SQLite Local Database:
    • data_entries table: Stores id (PK), block_num, trx_in_block, and steem_trx_id (optional).
    • Dynamic Index Tables: One table per index_field (e.g., index_timestamp, index_category), storing id (PK), value, and data_entry_id (FK to data_entries). UNIQUE constraints applied for fields marked unique: true.
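A minimal sketch of the dynamic table creation described above, assuming the schema shape from index_schema.json (layout simplified; a real implementation would need to sanitize field names before interpolating them into SQL):

```python
import sqlite3

# Map schema types to SQLite column types (assumption, not from the PRD).
TYPE_MAP = {"string": "TEXT", "number": "REAL", "boolean": "INTEGER"}

def create_tables(conn, schema):
    """Create data_entries plus one index table per index_field (PRD 3.2)."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS data_entries ("
        "id INTEGER PRIMARY KEY, block_num INTEGER NOT NULL, "
        "trx_in_block INTEGER NOT NULL, steem_trx_id TEXT)"
    )
    for field in schema["index_fields"]:
        col_type = TYPE_MAP.get(field.get("type", "string"), "TEXT")
        unique = " UNIQUE" if field.get("unique") else ""
        # NOTE: field names are interpolated directly here for brevity;
        # production code must validate them against injection.
        cur.execute(
            f"CREATE TABLE IF NOT EXISTS index_{field['name']} ("
            f"id INTEGER PRIMARY KEY, value {col_type}{unique}, "
            f"data_entry_id INTEGER REFERENCES data_entries(id))"
        )
    conn.commit()

# Example schema following the spec above (values invented).
schema = {
    "app_name": "my_sensor_app",
    "version": "1.0.0",
    "index_fields": [
        {"name": "timestamp", "path": "$.timestamp", "type": "string"},
        {"name": "sensorId", "path": "$.sensorId", "type": "string"},
    ],
}
conn = sqlite3.connect(":memory:")
create_tables(conn, schema)
```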

3.3 Data Synchronization

  • Initial Synchronization: Clients fetch historical blocks sequentially from a configurable start_block_number up to the current head, processing in batches.
  • Real-time Streaming: Clients subscribe to Steem API WebSocket/RPC streams for new block notifications and process new custom_json transactions.
  • State Persistence: The last_synced_block_number is persistently stored in the local SQLite database to enable fault tolerance and seamless resumption.
  • Error Handling: Clients implement retry mechanisms (e.g., exponential backoff) for API errors and network disconnections, pausing and resuming operations as needed.
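The retry requirement could be sketched as a small wrapper with exponential backoff (the delays and blanket exception handling are illustrative, not prescribed by the PRD):

```python
import time

def with_retries(fn, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on any exception.

    A sketch of the PRD's error-recovery requirement; `sleep` is injectable
    so the backoff can be tested without actually waiting.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

The sync loop would wrap each block-fetch call in with_retries, and since last_synced_block_number is persisted, even a crash mid-batch just resumes from the last saved block.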

3.4 Client Interface & Querying

  • Direct Library Interface: The client functionality is exposed as a software library, to be embedded and invoked directly by other applications (e.g., desktop apps, local web servers).
  • query_data Method: A general-purpose method for retrieving indexed data:
    query_data(
        app_name: string,
        query_filters: list of Filter objects, // { field: string, operator: string, value: any }
        order_by: string,                      // Field name for sorting
        order_direction: "asc" | "desc",       // Sort order
        page_size: integer,                    // Results per page
        page_number: integer                   // 1-indexed page number
    ) -> list of { block_num: integer, trx_in_block: integer, [steem_trx_id: string] }
    
  • Query Capabilities: Supports exact match, range queries (e.g., gt, lt), and multiple criteria (ANDed implicitly).
  • Output: Returns references to the on-chain data (block_num, trx_in_block), requiring the consuming application to fetch the full custom_json from a Steem API node.
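A rough sketch of how a query engine might translate these filters into parameterized SQL over the per-field index tables (the operator names and join layout are my assumptions, not part of the PRD):

```python
# Hypothetical filter operators mapped to SQL comparisons.
OPS = {"eq": "=", "gt": ">", "lt": "<", "gte": ">=", "lte": "<="}

def build_query(filters, order_by=None, order_direction="asc",
                page_size=20, page_number=1):
    """Translate query_data-style filters into one parameterized SQL
    statement: one JOIN per filtered field, all criteria ANDed."""
    joins, where, params = [], [], []
    for i, f in enumerate(filters):
        a = f"f{i}"
        joins.append(f"JOIN index_{f['field']} {a} ON {a}.data_entry_id = d.id")
        where.append(f"{a}.value {OPS[f['operator']]} ?")
        params.append(f["value"])
    if order_by:
        # Sorting by an indexed field needs one more join.
        joins.append(f"JOIN index_{order_by} o ON o.data_entry_id = d.id")
    sql = ("SELECT d.block_num, d.trx_in_block FROM data_entries d "
           + " ".join(joins))
    if where:
        sql += " WHERE " + " AND ".join(where)
    if order_by:
        sql += " ORDER BY o.value " + ("ASC" if order_direction == "asc" else "DESC")
    sql += " LIMIT ? OFFSET ?"
    params += [page_size, (page_number - 1) * page_size]
    return sql, params
```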

4. Core Components

  • Steem Blockchain: The immutable ledger for storing sequential data.
  • Steem Public APIs: Used by clients for block streaming and transaction retrieval.
  • SLIP custom_json Transactions: The on-chain data format.
  • Decentralized Client Application:
    • Block Stream Processor: Component responsible for fetching and parsing Steem blocks.
    • Schema Parser: Component that reads and interprets index_schema.json.
    • JSONPath Extractor: Utility for extracting values from app_data based on JSONPath.
    • SQLite Database Module: Handles database creation, indexing, insertion, and querying.
    • Synchronization State Manager: Manages last_synced_block_number and recovery logic.
    • Query Engine: Implements the query_data method and translates queries into SQL.
  • index_schema.json File: The definition of indexable fields.
  • Application-Specific app_data: The actual data payload created by developers.

5. App/User Flow

This flow describes the interaction between an application developer, the SLIP client, and the Steem blockchain.

  1. Application Developer Action:

    • Designs their application's app_data JSON structure.
    • Defines the index_schema.json specifying which fields in app_data should be indexed and their JSONPath.
    • Implements their application logic to post custom_json transactions to Steem with the slip_v1_your_app_name ID and their app_data payload.
    • Bundles (or configures) the SLIP decentralized client with their application, providing the index_schema.json.
  2. SLIP Decentralized Client (Initial Setup/First Run):

    • Starts up on the user's local machine.
    • Loads and parses the provided index_schema.json.
    • Initializes or opens its local SQLite database.
    • Based on the index_schema.json, dynamically creates the necessary data_entries table and specific index tables (e.g., index_timestamp, index_sensorId) in SQLite.
  3. SLIP Decentralized Client (Synchronization - Initial/Catch-up):

    • Reads last_synced_block_number from its local database (defaults to a configured start block if not present).
    • Connects to a public Steem API node.
    • Starts fetching historical blocks from last_synced_block_number up to the current head block, in batches.
    • For each fetched block:
      • Iterates through transactions.
      • Identifies custom_json operations with id matching slip_v1_your_app_name.
      • Extracts the app_data JSON.
      • Uses JSONPath from index_schema.json to extract values for all defined index fields.
      • Inserts a new entry into data_entries table (block_num, trx_in_block, etc.).
      • Inserts the extracted index values into the corresponding index tables, linking them to the data_entries record.
    • Periodically persists the current last_synced_block_number to SQLite.
  4. SLIP Decentralized Client (Synchronization - Real-time):

    • Once caught up, the client switches to listening for new blocks/transactions via Steem API subscriptions (e.g., WebSockets).
    • Processes new custom_json transactions with slip_v1_your_app_name in real-time, extracting indexable data and updating the local SQLite database as above.
    • Continues to persist last_synced_block_number.
  5. Application User (Interacting with the Application):

    • Launches the application (e.g., desktop app, local web server).
    • The application's code directly invokes the SLIP client library's query_data method.
    • The query_data method executes the query against the local SQLite indexes.
    • The SLIP client library returns a list of block_num and trx_in_block for matching entries.
  6. Application (Data Retrieval):

    • Receives the list of block_num and trx_in_block from the SLIP client.
    • Makes subsequent API calls to a Steem API node (e.g., get_block, get_transaction) using these coordinates to retrieve the full custom_json transaction.
    • Parses the custom_json to extract the app_data payload.
    • Uses the retrieved app_data to display or process the full data entry.
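Steps 3 through 6 all hinge on pulling SLIP operations out of fetched blocks. A sketch of that extraction, assuming the condenser_api.get_block response shape in which each operation is a [name, payload] pair and the json field is a serialized string:

```python
import json

def slip_entries(block, block_num, app_name):
    """Yield (block_num, trx_in_block, app_data) for matching SLIP operations.

    `block` is assumed to be a condenser_api.get_block-style dict; malformed
    payloads are skipped rather than aborting the sync.
    """
    target = f"slip_v1_{app_name}"
    for trx_in_block, trx in enumerate(block.get("transactions", [])):
        for op_name, op in trx.get("operations", []):
            if op_name != "custom_json" or op.get("id") != target:
                continue
            try:
                payload = json.loads(op["json"])
            except (ValueError, KeyError):
                continue  # skip malformed or missing payloads
            if isinstance(payload, dict) and "app_data" in payload:
                yield block_num, trx_in_block, payload["app_data"]
```

The sync loop (steps 3-4) would feed each entry to the indexing engine, while an application (step 6) would use the same extraction on the single block it re-fetches by coordinate.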

6. Tech Stack

  • Blockchain: Steem
  • On-Chain Data Format: custom_json
  • Local Database: SQLite (Public Domain)
  • Indexing Schema Format: JSON (index_schema.json)
  • Data Extraction Language: JSONPath
  • Reference Client Implementation Language: Python (for MVP, but extensible to any language)
  • Steem API Interaction: Standard Steem RPC endpoints (e.g., condenser_api, block_api), WebSockets for streaming.

7. Implementation Plan (High-Level)

  1. Phase 1: Core Protocol Definition (Completed)

    • Define custom_json id and payload structure.
    • Define index_schema.json format.
    • Outline client-side indexing logic and SQLite schema.
    • Define client API (conceptual query_data).
  2. Phase 2: Reference Client MVP (Python)

    • Module 1: Steem Sync & Data Ingestor
      • Implement Steem API interaction (block fetching, streaming).
      • Implement custom_json filtering by slip_v1_your_app_name.
      • Implement app_data extraction.
      • Implement last_synced_block_number persistence and recovery.
    • Module 2: Indexing Engine
      • Implement index_schema.json parser.
      • Implement dynamic SQLite table creation based on schema.
      • Implement JSONPath value extraction.
      • Implement data insertion into SQLite index tables.
    • Module 3: Query Interface
      • Implement the query_data method.
      • Translate query_filters into SQLite SQL queries.
      • Implement pagination and ordering logic in SQL.
      • Return block_num and trx_in_block results.
    • Basic CLI Tool: A simple command-line interface to run the client, specify app_name and index_schema.json, and perform basic queries.
  3. Phase 3: Documentation & Examples

    • Comprehensive documentation for the protocol, index_schema.json format, and client library usage.
    • Example index_schema.json files for different use cases.
    • Simple example applications demonstrating how to use the client library to store and retrieve data.
  4. Phase 4: Community Engagement & Feedback

    • Release the reference client and documentation.
    • Gather feedback from developers.
    • Iterate on the protocol and client based on real-world usage.


Thank you for your time and attention.

As a general rule, I up-vote comments that demonstrate "proof of reading".




Steve Palmer is an IT professional with three decades of professional experience in data communications and information systems. He holds a bachelor's degree in mathematics, a master's degree in computer science, and a master's degree in information systems and technology management. He has been awarded 3 US patents.


Image: Pixabay license, source

Reminder


Visit the /promoted page and #burnsteem25 to support the inflation-fighters who are helping to enable decentralized regulation of Steem token supply growth.


> Database indexes are maintained in app-specific decentralized clients.

You could probably use a post on the chain to store something like a JSON, XML, or other string-friendly representation of a data structure, and edit it whenever there's an update. That way consumers of the data could just use chain-level APIs rather than having to interface with the decentralized clients.