Steem Rewards Formulas Deep-dive

in #steem, 4 months ago

Stock image from Pixabay

One of the advantages of blockchain technology is that there’s a level of transparency because it’s all based on open-source code. But in practice it’s not always easy to extract that information. I was inspired by @remlaps’ recent post to dig into the blockchain code and figure out exactly how the reward mechanisms work. It frustrates me that we tend to be so reliant on what people in the past have said about how things are supposed to work, or on running experiments and extrapolating. Now that I understand how things work and have the actual numbers and formulas that the blockchain is using, I thought I would present them in a more human-readable format.

Vests, Rshares, and the Reward Pool

We think of our votes as being based on the amount of Steem Power we have on our accounts, but the blockchain actually represents that internally with a different value, called vests. The relationship between vests and SP changes over time as people power up and down and interest gets paid to SP holders. There are various tools that can tell you the current relationship between SP and vests, but as of this writing the exchange rate is:

0.0005874228138647424 SP = 1 vest
or
1702.3513156066424 vests = 1 SP
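The two rates above are reciprocals of each other, so a conversion in either direction is a single multiplication or division. Here is a minimal sketch using the snapshot rate quoted above; the function names are made up for illustration, and the real rate drifts over time.

```python
# Snapshot rate quoted in the post; the live value changes over time.
VESTS_PER_SP = 1702.3513156066424


def sp_to_vests(sp: float) -> float:
    """Convert Steem Power to vests at the snapshot rate."""
    return sp * VESTS_PER_SP


def vests_to_sp(vests: float) -> float:
    """Convert vests to Steem Power at the snapshot rate."""
    return vests / VESTS_PER_SP


# A 10,000,000-vest account, rounded to whole SP.
print(round(vests_to_sp(10_000_000)))
```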

As you can see, those numbers get pretty big, because the chain wants to use integer arithmetic to do the calculations but with a similar degree of precision to the human-readable floating point numbers we use for things like the amount of SP an account has, or the dollar-amount of the pending rewards on a post. The chain does most of its reward calculations in terms of “rshares”, or shares of the reward pool. These tend to be pretty large because they involve multiplying multiple quantities together, each of which can be pretty big, like the number of vests a voter has. If you do a 100% upvote when you’re at 100% voting power, that will increase the rewards on a post by about:

rshares = vests * 20,000

The number of rshares on a post is related to the amount of SBD, Steem, and SP that it will be rewarded with based on the rewards pool and a “reward curve” (more about the curve in the next section). The number of rshares and the amount of Steem in the reward pool are always changing based on the activity of the chain, but at the time that I’m writing this there are 647816613875122600 rshares and 902655.935 Steem in the pool, which gives an exchange rate of:

1 rshare = 0.00000000000139338188565507 Steem
or
1 Steem = 717678340945.183 rshares

And the relationship between the amount of Steem reward and the dollar amounts you see on Steemit is just the “median price feed” that’s set by the witnesses according to the current Steem price, which is 0.2605 USD as I’m writing this.

1 Steem = 0.2605 USD
or
1 USD = 3.83780798037642 Steem

So to put that all together, if you were a Dolphin account with 10,000,000 vests (which is 5874 SP), a 100%/100% vote from you would increase the rewards on a post by about:

10,000,000 vests * 20,000 rshares/vest (constant) = 200,000,000,000 rshares
200,000,000,000 rshares * 0.000000000001393 Steem/rshare (reward pool) = 0.278676377131014 Steem
0.278676377131014 Steem * 0.2605 USD/Steem = 0.0725951962426292 USD = $0.073
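The three steps above can be chained into one small function. This is only a sketch of the arithmetic using the snapshot numbers quoted in this post (the pool size, price feed, and the approximate 20,000 factor all change or are simplifications), and the function name is hypothetical rather than anything from the chain code.

```python
# Snapshot values quoted in the post; all of them drift over time.
RSHARES_PER_VEST = 20_000                 # approx. factor for a 100% vote at 100% voting power
POOL_RSHARES = 647_816_613_875_122_600    # rshares in the reward pool
POOL_STEEM = 902_655.935                  # Steem in the reward pool
USD_PER_STEEM = 0.2605                    # median price feed


def linear_vote_value_usd(vests: float) -> float:
    """Estimate the pre-curve value of a full-strength vote in USD."""
    rshares = vests * RSHARES_PER_VEST
    steem = rshares * POOL_STEEM / POOL_RSHARES
    return steem * USD_PER_STEEM


# The Dolphin example: 10,000,000 vests.
print(f"${linear_vote_value_usd(10_000_000):.3f}")
```

This ignores the reward curve, so it corresponds to the "linear" value only.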

Convergent Linear Reward Curve

Before Hardfork 21, the calculations I presented in the previous section were basically accurate for the rewards on a post, but Hardfork 21 introduced a “Convergent Linear” reward curve. There’s some debate about whether this was a good idea or what it was trying to achieve, but the effect is that the chain performs a calculation on the rshares that reduces “small” amounts of rshares before translating them into real rewards, while leaving “large” amounts roughly the same (so the curve “converges” to a linear y=x curve as the rshares get bigger). The calculation is like this:

result = ( (rshares + s) * (rshares + s) - s*s ) / ( rshares + 4*s )

Where “s” is a constant set in the code: 2,000,000,000,000.
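To get a feel for the formula, here is a small sketch that evaluates it for single full-strength votes of different sizes and prints how much of the linear value survives. The use of integer division is an assumption based on the chain's preference for integer arithmetic, and the function name is illustrative rather than the name used in the chain code.

```python
S = 2_000_000_000_000  # the constant "s" quoted above


def convergent_linear(rshares: int, s: int = S) -> int:
    """Apply the curve formula above (integer division is an assumption)."""
    return ((rshares + s) * (rshares + s) - s * s) // (rshares + 4 * s)


# Fraction of the linear value kept by the curve for single 100% votes
# from accounts holding 1M, 10M, 100M, and 1B vests.
for vests in (10**6, 10**7, 10**8, 10**9):
    rshares = vests * 20_000
    kept = convergent_linear(rshares) / rshares
    print(f"{vests:>13,} vests -> {kept:.3f} of linear value")
```

For small inputs the result is close to half of the linear value, and the fraction climbs toward 1 as the rshares grow.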

The shape of that curve and the size of that constant means you have to have a really large amount of rewards on a post before it stops being considered small and “converges” to linear. To help illustrate, here is what would happen if a post had only a single vote from different kinds of accounts:

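| Name | Vests | SP | Mana % | Linear vote | Curved vote |
| --- | --- | --- | --- | --- | --- |
| Redfish | 0 | 0 | 100.00% | $0.000 | $0.000 |
| Minnow | 1000000 | 587 | 100.00% | $0.007 | $0.004 |
| Dolphin | 10000000 | 5874 | 100.00% | $0.073 | $0.037 |
| Orca | 100000000 | 58742 | 100.00% | $0.726 | $0.436 |
| Whale | 1000000000 | 587423 | 100.00% | $7.261 | $6.224 |
| danmaruschak | 1526187 | 896 | 100.00% | $0.011 | $0.006 |
| steemcurator01 | 22114737646 | 12990719 | 95.13% | $160.583 | $159.156 |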

The “curve” calculation happens on the combined rshares from all the votes on a post, so it’s still good to get lots of small votes. I’ve been talking about single votes because I think that simplifies the explanation by giving a sense of scale, but the curve isn’t applied to each vote as it adds to the rewards on a post. Each vote contributes rshares linearly, and the total rshares are put through the curve function when the post pays out. However, given the range of payouts you see on many posts, this curve is a pretty significant factor.
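The difference between curving the total and curving each vote is easy to see with numbers. The sketch below uses ten hypothetical Dolphin-sized votes; the curve function restates the formula from this section with integer division as an assumption.

```python
S = 2_000_000_000_000  # curve constant "s" quoted in the post


def convergent_linear(rshares: int, s: int = S) -> int:
    """Curve formula from the post, using integer division (an assumption)."""
    return ((rshares + s) ** 2 - s * s) // (rshares + 4 * s)


# Ten hypothetical votes of 200 billion rshares each.
votes = [200_000_000_000] * 10

# What the post describes: sum the rshares first, then apply the curve once.
curved_total = convergent_linear(sum(votes))

# For comparison only: applying the curve per vote (not what the chain does).
curved_each = sum(convergent_linear(v) for v in votes)

print(curved_total, curved_each)
```

Because the curve is gentler on larger totals, the combined votes keep more of their linear value than they would if each were curved separately.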

Curation Rewards

The way that curation rewards are paid out has always been a little confusing to me. There are arguments about the best timing of a vote, how advantageous it is to vote early vs. late, etc., so figuring out how it actually works was a big part of my motivation for investigating the code. When distributing rewards, the chain first does the “convergent linear” calculation I explained in the previous section, and then it splits that amount into 50% for the author (and any beneficiaries) and 50% for the curators, i.e. the accounts that voted on the post.

Each voter’s share of the curation rewards is based on a “weight” that gets recorded at the time of the vote, which is based on how many rshares are being added and how many were already on the post. The weight given to each vote is:

convergentSquareRoot(newRshares) - convergentSquareRoot(oldRshares)

where oldRshares is the total on the post before the vote, newRshares is the total after it, and convergentSquareRoot is another formula:

result = rShares / approxSquareRoot(rShares + 2 * s)

(The approxSquareRoot function is something that seems to give a result that’s similar to a square root, but probably less computationally intensive).

So the size of your “weight” is based on the rshares you are contributing, and the number of rshares already on the post when you place the vote. And the same will be true of each vote that is made. When the post pays out after 7 days each curator's share is their weight divided by the sum of all the weights.
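The weight bookkeeping described above can be sketched as follows. This is an illustration under assumptions, not the chain's implementation: Python's `math.isqrt` stands in for approxSquareRoot (whose exact behavior is not reproduced here), the timing adjustments are ignored, and the function names are hypothetical.

```python
from math import isqrt

S = 2_000_000_000_000  # the constant "s" from the reward curve section


def convergent_square_root(rshares: int, s: int = S) -> int:
    """rshares / sqrt(rshares + 2s); isqrt is a stand-in for approxSquareRoot."""
    return rshares // isqrt(rshares + 2 * s)


def curation_shares(votes: list[int]) -> list[float]:
    """Given vote rshares in voting order, return each voter's fraction."""
    weights = []
    total = 0
    for rshares in votes:
        old_total = total
        total += rshares
        weights.append(
            convergent_square_root(total) - convergent_square_root(old_total)
        )
    weight_sum = sum(weights)
    if weight_sum == 0:
        return [0.0 for _ in weights]
    return [w / weight_sum for w in weights]


# Two equal Dolphin-sized votes followed by a much larger vote.
for share in curation_shares([2 * 10**11, 2 * 10**11, 2 * 10**13]):
    print(f"{share:.4f}")
```

In this sketch, two equal votes do not get equal weights: the one placed first gets slightly more, because fewer rshares were on the post at the time.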

Timing

Also, before a weight is recorded, the chain checks the time of the vote relative to when the post was created. If the post is less than 5 minutes old, the weight is linearly scaled down before being recorded. (There’s also a time-based adjustment when posts are between 6 ½ and 7 days old.)
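A linear scale-down in the first 5 minutes could look like the sketch below. The exact form is an assumption here: it scales the weight in proportion to the post's age, so a vote at 2 ½ minutes keeps half its weight. The function and constant names are hypothetical, and the adjustment between 6 ½ and 7 days is not modeled.

```python
EARLY_WINDOW_SECONDS = 5 * 60  # the 5-minute window described above


def time_scaled_weight(weight: int, post_age_seconds: int) -> int:
    """Scale a curation weight down linearly for votes inside the window.

    Assumes the scaling is proportional to post age; votes at or after
    the 5-minute mark keep their full weight.
    """
    if post_age_seconds >= EARLY_WINDOW_SECONDS:
        return weight
    age = max(post_age_seconds, 0)
    return weight * age // EARLY_WINDOW_SECONDS


for age in (0, 60, 150, 300, 3600):
    print(age, time_scaled_weight(1000, age))
```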

So when is the best time to vote to get the best curation rewards?

I’m not sure. It’s not really obvious to me that it’s worth much to try to game-theory your way to maximizing curation rewards. I suspect for most people it will make the most sense to just vote for what you like (though I would recommend waiting until after 5 minutes have passed, since voting earlier is guaranteed to reduce your share). My guess is that people are too invested in thinking they need to get in before big votes. It’s probably not a big deal, and it’s probably fine to vote on a post even if it already has large rewards (but I haven’t done any math to back that up). Feel free to discuss or debate in the comments, but hopefully this post at least has some more solid formulas and numbers based on current blockchain code so we don’t have to rely on hearsay or half-remembered explanations from years gone by.



Note: for simplicity I avoided talking about the “dust threshold” in this post, but that factors into rewards as well. When an account votes on a post, the rshares that get recorded are reduced by the dust threshold of 50,000,000 rshares; that happens before any of the formulas I talk about above. The reward payout mechanism also reduces rewards below the dust threshold to zero.
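The vote-side part of that adjustment can be sketched as a simple subtraction. Clamping the result at zero for very small votes is an assumption of this sketch, the function name is hypothetical, and the payout-side check is not modeled.

```python
DUST_THRESHOLD = 50_000_000  # rshares, as quoted in the note above


def recorded_rshares(raw_rshares: int) -> int:
    """Subtract the dust threshold from a vote's rshares.

    Clamping at zero for votes below the threshold is an assumption.
    """
    return max(raw_rshares - DUST_THRESHOLD, 0)


# A full-strength vote from a 1,526,187-vest account (vests * 20,000).
raw = 1_526_187 * 20_000
print(raw, recorded_rshares(raw))
```

For an account of that size the threshold removes well under 1% of the vote, so it mostly matters for very small votes.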

Header image from Pixabay


Thanks for offering a different perspective of the mechanics operating below the surface! Whereas I am not all that technically inclined, I'm always curious and eager to learn.

Further to some of this... one of the things that often confuses people (I think) is that the SP needed to be a minnow, dolphin, etc. is actually a moving target over time. When I started here, it took just a little over 5,000 SP to be in the Dolphin rank; now it's closing in on 6,000 SP... owing to the natural inflation, as I understand it.

I’m not sure. It’s not really obvious to me that it’s worth much to try to game-theory your way to maximizing curation rewards, I suspect for most people it will make the most sense to just vote for what you like

I agree with you. Curator fees are affected by several factors and time is only one of them. I've noticed that when I vote on posts with 100% power, I usually get 3-5 SP of curation rewards. But there are occasional exceptions where I get 16-20 SP of curation rewards from one vote. Because of such a significant difference, I became interested in finding out what the reason is. In all of the times I've received large curation rewards, I've voted on posts that have been upvoted by relatively few people (20-40), but SC01 has voted on that post some time after me. I usually voted within a few hours of creating a post.

Thanks for this post where the rewards are explained in a technical way. I can't understand anything that is written, but now I definitely know something more

Great topic!



For me, curation rewards just come along without any calculation. How much thought do you have to put into leaving a good post without a vote, or with a smaller one, just because large votes were cast on it earlier?
VgA

This deep-dive into Steem's reward system is incredibly enlightening, especially regarding the complex rshare calculations and the Convergent Linear reward curve. The breakdown of curation rewards and timing factors clarifies long-standing misconceptions. It's great to see blockchain code being made accessible for a clearer understanding of how rewards work.