Programming Diary #41: Make it exist, then make it beautiful

in Steem Dev · 5 days ago (edited)

Overview

It has apparently been more than a month since my previous programming diary post, so an update is overdue. As with previous posts, my recent activity has been almost entirely focused on bolstering the Thoth ecosystem. This post describes my progress in three areas: enhancements to Thoth itself; tools that make operational decisions easier; and reporting/transparency.

Background

To refresh your memory, Thoth is an open source software product that lets anyone set up an AI curation bot on the Steem blockchain. The only prerequisites are: 1. The ability to operate the steem-python library; 2. A free API key from Google Gemini or ArliAI; and 3. A Steem posting key.

The program searches the blockchain for posts, screens them based on filtering rules that are defined by the Thoth operator, passes them through an LLM curation process and posts about them on the Steem blockchain. Posting rewards are shared with the authors of the posts that it finds and with delegators to the account. Optionally, rewards can also be burned.

Two of Thoth's key features are: (1.) Rewards to authors after a post has already paid out - #lifetime-rewards; and (2.) Passive rewards to delegators at competitive rates (with no middleman) - #passive-rewards.

In addition to its work in support of authors and delegators, Thoth also helps readers by giving a single-language summary of posts that may have been written in multiple languages. It's a Reader's Digest sort of thing that transcends language differences. This makes it easier for the reader to decide which source posts to visit. When I started the project, I had no appreciation at all for how useful this single-language summary would eventually become for me, as a reader.

I can't find it again, but a couple of weeks ago I saw a meme somewhere that was directed at entrepreneurs and creators. It had a picture of an ugly-looking Rube Goldberg machine with a caption that said something like,

First make it exist, then make it beautiful

This reminded me of my strategy with Thoth. Under the hood, the code is not very clean, elegant, or mature (to put it mildly). At the functional layer, however, it's doing something useful that has never been done before - and at the risk of immodesty, I think it's doing it fairly well. I've definitely been postponing "beautiful" until later, though.

Since I couldn't find the original meme, the image to the right is from Google Gemini.

My first goal after creating the project was for Thoth to be able to reward delegators at a similar rate to previous generations of delegation bots. With delegators now receiving blockchain rewards at a rate of 15-70%, at a small scale, that goal has been achieved. (to be clear, "at a small scale" is very important, there...)

At this point, with several months of fairly smooth operation and decent reward rates for small delegators, I would estimate that maybe Thoth has reached something approaching "minimum viable product (MVP)" status.

So, let's dig into recent activities.

Activity Descriptions

Thoth updates

Randomness

Reviewing my GitHub commits beginning on August 9, the first work I see was an effort to improve the random selection of delegators. I had mentioned in my previous post that the randomness seemed a little too predictable. After working with my AI assistants, I think we have now solved that problem, and the random beneficiary settings are now fair for all delegators. That took me into the beginning of September.

Engagement scoring

Next up was to add consideration based on payout value, numbers of votes, numbers of resteems, and numbers of comments. To do this, I implemented a simple engagement score with configurable weights:

engagement_score = A * (vote score) + B * (comment score) + C * (value score) + D * (resteem score)

Where the scoring and weights for each term can be adjusted by the operator, and a minimum engagement_score can be used for post screening.
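The formula above can be sketched in a few lines of Python. This is only an illustration, not Thoth's actual code: the `EngagementWeights` class, the `capped_score` helper (which scales each count to the range 0 to 1 and saturates at a cap), and the cap values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class EngagementWeights:
    """Operator-configurable weights (A, B, C, D in the formula above)."""
    votes: float = 1.0      # A
    comments: float = 1.0   # B
    value: float = 1.0      # C
    resteems: float = 1.0   # D


def capped_score(amount: float, cap: float) -> float:
    """Hypothetical per-term scoring: scale to [0, 1], saturating at `cap`."""
    if cap <= 0:
        return 0.0
    return min(max(amount, 0.0), cap) / cap


def engagement_score(votes, comments, payout_value, resteems, weights, caps):
    """Weighted sum of the four per-term scores."""
    return (
        weights.votes * capped_score(votes, caps["votes"])
        + weights.comments * capped_score(comments, caps["comments"])
        + weights.value * capped_score(payout_value, caps["value"])
        + weights.resteems * capped_score(resteems, caps["resteems"])
    )


def passes_screening(score: float, minimum_score: float) -> bool:
    """A post is kept only if it meets the operator's minimum score."""
    return score >= minimum_score


# Illustrative caps only; real values would come from the operator's config.
EXAMPLE_CAPS = {"votes": 100, "comments": 20, "value": 10.0, "resteems": 10}
```

With the default weights and the example caps, a post with 50 votes, 10 comments, a payout value of 5.0, and 5 resteems scores 0.5 on each term, for a total of 2.0. Capping each term keeps a single runaway metric (for example, a heavily voted post with no comments) from dominating the score, though other scoring shapes would work just as well.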

Time-weighted random block selection

Thoth's starting block could previously be chosen in one of three ways: 1. Random selection; 2. Historical selection (starting at a predetermined block); 3. Active posts (starting at a block 6 days ago, or the most recently processed block). Last week, I added a new capability that selects the starting block at random, with a bias towards more recent blocks. This is the method that is currently in use, and it was prompted by a comment from @steemcurator02.

It was already possible to screen authors based upon activity time, but I don't want to do that in my test account. Time-weighted random block selection makes it more likely that currently active authors will be selected, without excluding inactive accounts. The weight of the bias can also be set by the Thoth operator.
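One way to picture a recency bias of this kind is as a mixture: most draws are uniform over the block range, and a fraction of them are pushed towards the recent end. The sketch below is an assumption-laden illustration, not Thoth's actual algorithm. The function name `pick_start_block`, the `bias` parameter, and the "max of two uniform draws" trick are all hypothetical, and the sketch is not expected to reproduce the harness figures reported in this post.

```python
import random


def pick_start_block(first_block: int, last_block: int, bias: float,
                     rng: random.Random) -> int:
    """Pick a block in [first_block, last_block], favoring recent blocks.

    bias=0.0 is plain uniform sampling. For larger values, a fraction
    `bias` of the draws takes the larger of two uniform samples, which
    skews those draws towards the recent end of the range.
    """
    if not 0.0 <= bias <= 1.0:
        raise ValueError("bias must be between 0.0 and 1.0")
    if first_block > last_block:
        raise ValueError("first_block must not exceed last_block")

    u = rng.random()                 # uniform position in [0, 1)
    if rng.random() < bias:          # apply the recency skew to this draw
        u = max(u, rng.random())

    span = last_block - first_block
    offset = min(int(u * (span + 1)), span)  # guard against float rounding
    return first_block + offset


if __name__ == "__main__":
    rng = random.Random(42)
    # Block range borrowed from the harness output later in this post.
    print(pick_start_block(3250000, 99470641, 0.25, rng))
```

Because the skew is only applied to some of the draws, older blocks always keep a nonzero chance of being selected, which matches the goal of favoring active authors without excluding inactive accounts.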

Evaluation Tools

checkValidation.py

Because of the screening complexity, even as Thoth's creator and operator, it's hard for me to anticipate which accounts will make it past screening to the AI evaluation phase. The checkValidation.py tool lets me check individual accounts to see how the screening rules apply. This is, hopefully, useful when adjusting screening parameters.

time_weighted_sampling_harness.py

This lets me see the impact of adjusting the bias multiplier for time-weighted random block selection. The value is currently set to 0.25. The output below compares that setting against pure randomness (weight 0.0).

Sampling range: a=3250000, b=99470641 (b-a=96220641)
samples per weight: 200000, seed: 42

weight=0.0: mean_norm=0.5004, top_10%_prop=0.1006
    Quantiles: 10%=12857699, 25%=27269414, 50%=51451844, 75%=75566398, 90%=89922356
    Decile counts (old->recent): 20034,20162,19802,19838,19968,19956,20137,19937,20043,20123

weight=0.25: mean_norm=0.5559, top_10%_prop=0.1593
    Quantiles: 10%=15137016, 25%=32359812, 50%=59085211, 75%=82417998, 90%=94086095
    Decile counts (old->recent): 16118,16601,16854,17476,17920,18774,19717,21500,23175,31865

In short, with a weight of 0.25, a starting block is roughly twice as likely to fall in the most recent 10% of blocks as in the oldest 10%.
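The "roughly twice as likely" figure can be checked directly from the decile counts in the harness output. The short script below just redoes that arithmetic; the helper names are made up for this example, and the counts are copied from the output above.

```python
# Decile counts copied from the harness output (oldest decile first).
deciles_uniform = [20034, 20162, 19802, 19838, 19968,
                   19956, 20137, 19937, 20043, 20123]
deciles_biased = [16118, 16601, 16854, 17476, 17920,
                  18774, 19717, 21500, 23175, 31865]


def recent_to_oldest_ratio(deciles):
    """How many times more samples landed in the newest decile than the oldest."""
    return deciles[-1] / deciles[0]


def top_decile_share(deciles):
    """Fraction of all samples that landed in the newest decile."""
    return deciles[-1] / sum(deciles)


print(f"{recent_to_oldest_ratio(deciles_uniform):.2f}")  # prints 1.00
print(f"{recent_to_oldest_ratio(deciles_biased):.2f}")   # prints 1.98
print(f"{top_decile_share(deciles_biased):.4f}")         # prints 0.1593
```

The last value matches the top_10%_prop figure that the harness reports for weight=0.25.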

Testing with Replit

I tried an experiment to see how a Replit AI agent would do with Thoth's code base with a free account. On one hand, it was absolutely amazing.

Replit imported the code, figured out the environmental prereqs, set up the environment (including steem-python), and got Thoth running with almost no assistance from me. I just had to give it environment variables for the ArliAI API key and the posting key for the @social account. (I have no idea who created the @social account, but its private posting key is public, which is convenient for testing.)

On the other hand, I ran out of free quota before I could really do anything useful; and I didn't realize that setting up the API key environment variable would expose the key to the world - so I had to go set a new one.

So, overall, mixed results. I created a GitHub issue to enable copy/paste of the keys, in case I want to experiment again in the future without exposing the keys to the world, but given the free quota limitation, I'm not sure if I'll follow up on that.

PowerBI reporting

Last week, I created a set of PowerBI visuals so I can visualize the blockchain's beneficiary distributions from Thoth posting. For reasons of space, I'm not going to update them here, but with the screening that has been in place, Thoth's top-10 favorite authors are apparently these accounts:

Rank  Account
1.    @frafiomatale
2.    @rmm31
3.    @shohana1
4.    @ezzy
5.    @denmarkguy
6.    @solperez
7.    @inspiracion
8.    @kouba01
9.    @genomil
10.   @jaynie

All but 1 are currently active, despite the absence of filtering for recent activity. If you're looking for someone to follow, maybe give them a look!

Operations

Google Gemini provides free access to multiple LLM models. The best one is gemini-2.5-pro and the second-best one is gemini-2.5-flash. A while back, Thoth started having stability problems with gemini-2.5-pro so I reconfigured it to use gemini-2.5-flash, where it was able to run without API errors. A little over a week ago, I decided to try gemini-2.5-pro again, and see if Google had resolved the API issues. It has been stable in that configuration, so for now, that is the model that's in use. It remains comfortably below the 100 daily queries that are available for the free tier.

Next Up

Going forward, I see the following potential growth paths for Thoth.

Functionality

  • Continue adding features, known and unknown
  • Improve efficiency and error checking/handling
  • Improve reporting & transparency

Operations

  • Move it out of "experimental" status
  • Create multiple Thoth accounts to cover things like the following (not any time soon):
    • multiple languages (Single-language Thoth reports in German, Spanish, etc...)
    • dedicated topics (STEM, history, sports, politics, etc...)

Community

  • Have more than 1 contributor in the GitHub repo.
  • I've been thinking about creating a Steem community for a "Thoth steering committee", in order to give delegators a way to influence the direction of the project.

Reflections

All the way back in 2016, I imagined that Steem would be on the leading edge of AI capabilities because of its unique ability to direct rewards for reinforcement learning. I have also imagined, for several years, the emergence of a new generation of voting service that would make use of beneficiary rewards to better align the interests of delegators in a fashion that resembles a bitcoin mining pool. I theorized that this new generation would outcompete the others by eliminating the Tragedy of the Commons and thereby protecting the core investment value while still delivering rewards to delegators.

Now, with the framework that Thoth uses, I can see a more concrete path towards both of those expectations.

Will it succeed? I have no idea, but when I watch Thoth in operation, it's clear to me that it provides a much better alignment of incentives than previous generations of voting services.

Scale is still a big unknown, but in principle, I don't see any reason why a squad of Thoth accounts shouldn't be able to replace a substantial portion of the existing vacuous spam posting with a new form of content that readers will actually view and benefit from.

Conclusion

That's a wrap. Progress feels slow to me because this is only a hobby project, and it can only happen in my very limited spare time, but looking back over the last 6 months, I'm fairly pleased with the results so far.

The progress described in this post covers the last six weeks, and it includes coding changes, development of simple new tools for Thoth operators, and the beginning of some transparency reporting.

I invite you to follow the @thoth.test account in order to help me with quality control, to find some interesting posts to read that you might have missed the first time around, and even to find authors to follow. I also invite others to give the program a test-run and to join me in developing it.

@happycapital posted about Thoth on Twitter today:

If the SP leasing volume of this project increases sharply, it would be fair to say that an innovation is happening on Steem.

It's an uncharted path, but we have to try everything. I hope it creates a small success 🙏

I hope so, too.

Thank you for your attention!


You can view an overview of my active projects, here

I tried it briefly with Replit, and everything was set up:

The system is running perfectly—it connects successfully to the Steem blockchain and is ready for operation. The only missing components are the API keys, which you as the user must provide:

LLMAPIKEY: For AI content analysis
UNLOCK or POSTING_KEY: For blockchain posts

But that's it:
image.png

I noticed that replit replaced the steem-python library with beem and apparently corrected all .py files accordingly. I'll take a closer look at that next week.

Was there a reason why you chose steem-python and not beem?

 4 days ago 

Was there a reason why you chose steem-python and not beem?

Not really. I had decided not to use it back in 2020, because the creator was hostile to Steem during the hardfork, and I never really thought about it again. I suppose I should probably reconsider that.

I noticed that replit replaced the steem-python library with beem and apparently corrected all .py files accordingly. I'll take a closer look at that next week.

I was pretty amazed by the way that replit manages to understand the context. Wish I had a paid account, there, but I can't justify the cost to myself (yet?).

Let me know how you make out with it. I'm not particularly proud of the code quality, but hopefully it won't be too terrible to navigate with the help of an AI assistant.

aww I'm at no. 3, can't believe my eyes! Thank you so much @remlaps for such kind mention and appreciation! I'm honored and my heartfelt gratitude to you, my friend! ❤️😇🥳

 4 days ago 

You are welcome, and thank you for all of the posts you have contributed to the blockchain. I'm glad that Thoth has been able to bring some of them back to the surface for us.

Thank you very much for this mention, I consider it one of the most beautiful ‘medals’ I have received during my time on Steemit.
Congratulations on the great work you are doing and thank you for highlighting authors who write with passion, who, in my view, are the true wealth of Steemit.
Best regards!

 4 days ago 

You're welcome, and thank you for the contributions and the feedback! I agree that the authors are what makes Steem unique, and I hope that Thoth can help to bring additional audience and rewards for your contributions.

My first goal after creating the project was for Thoth to be able to reward delegators at a similar rate to previous generations of delegation bots. With delegators now receiving blockchain rewards at a rate of 15-70%, at a small scale, that goal has been achieved. (to be clear, "at a small scale" is very important, there...)

You've been doing well!! 🫡

 4 days ago 

Thanks! I appreciate the feedback and your support for the project.

From my point of view, @thoth.test is doing more valuable work than the "account-based curators" do, because the tool you created values genuinely creative and authentic writing, without tipping the scales towards friends, buddies, boyfriends, and lovers.

Honestly, it is disappointing to see how, as a rule, the same people are chosen to receive large votes.

My discouragement grows when I see so much shamelessness ("caradurismo") on the platform. From my point of view, that strengthens neither the growth of Steemit nor that of the cryptocurrency, only the pockets of a few users, who, incidentally, don't even write well.

Thank you, Thoth, for being so transparent.

 4 days ago 

Thanks for your reply, and for your continued contributions!

I think that manual and automated curation are both important, and I also agree with you that there's a real benefit from the fact that Thoth may be able to look at the task with more dispassionate objectivity than some human curators.

You have understood perfectly what I meant.