Steem Data Analysis: Who Withdraws Rewards vs Builds SP?

Hey everyone,
I'm working on a project to analyze how Steem users handle their rewards—specifically, how many power up (build SP) versus how many withdraw them. I believe this kind of insight can help us understand user behavior and long-term platform engagement better.
First Step: Testing with 100 Users
The video I’m sharing shows a small test run with just 100 users, because even that already takes time to process. Scaling this up is no small task.
Scaling to 3.6M+ Users
Right now, I’m analyzing all Steem users—which is more than 3,600,000 accounts. This process involves a massive number of RPC calls and can be very demanding on the servers.
❗ I Will Always Do This Monthly
I want to make something clear:
🔁 I will run this analysis every month
The results (charts + CSV data) will always be published right here on Steemit.
💡 Hosting the Tool – Only If RPC Access Is Granted
I will only host the tool (so others can test and explore the data themselves) if someone gives me a clear promise that I can use their RPC for this kind of high-volume request.
✅ If I get permission to use an RPC regularly, I will host the tool
✅ Anyone will be able to test it and explore user data (read-only, no private info)
✅ I will run the analysis once per month to avoid abusing any RPC service
✅ Full respect for privacy, server limits, and community trust
✅ I will always credit and thank the RPC provider for their support
If no one gives permission, that’s fine—I’ll continue running the process privately and just share the report results here as usual.
⚙️ RPC Load: Risks and Best Practices
The best RPC I've tested so far is:
🔗 https://steemd.steemworld.org by @steemchiller
Alongside that, the API from SteemWorld also works extremely well and is very responsive.
It's incredibly fast. During testing, I made a lot of successive requests. Hopefully that doesn’t get my IP blocked 😅—it was only to build a single report.
I understand RPC owners have systems to prevent abuse:
- Rate limiting – restricts too many fast requests
- Throttling – slows connections or limits them
- Blocking/Banning – temporary or permanent denial of access
- Logging & Monitoring – to detect repeated misuse
That’s why I try to balance requests across multiple nodes—but again, a full user-base analysis really adds up.
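To make the "balance and slow down" idea concrete, here is a minimal client-side sketch, not the actual tool's code. The `Throttle` class, the 0.2 second gap, and the node list are assumptions for illustration; only the two node URLs come from this thread.

```python
import itertools
import time

# Nodes mentioned in this thread; any real list would need the owners' consent.
NODES = ["https://steemd.steemworld.org", "https://api.steemit.com"]


class Throttle:
    """Enforce a minimum gap between consecutive requests (hypothetical helper)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Too soon after the previous request: pause for the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def node_rotation(nodes=NODES):
    """Yield node URLs round-robin so no single node takes every request."""
    return itertools.cycle(nodes)
```

In a request loop, you would call `throttle.wait()` and then `next(rotation)` before each RPC call, with `throttle = Throttle(0.2)` and `rotation = node_rotation()`.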
What I’ll Publish Every Month
Each month, whether hosted or not, I’ll share:
- A report with charts and behavior breakdown
- A downloadable CSV file with public user data (withdrawal vs SP growth)
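As a rough idea of what the CSV could look like, here is a small sketch. The column names (`account`, `power_ups`, `power_downs`) are hypothetical; the real report layout may differ.

```python
import csv
import io

# Hypothetical column names, chosen only for this illustration.
FIELDS = ["account", "power_ups", "power_downs"]


def write_report(rows, fileobj):
    """Write one CSV line per account to an already opened text file."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


def render_report(rows):
    """Return the CSV as a string, handy for previews and tests."""
    buffer = io.StringIO()
    write_report(rows, buffer)
    return buffer.getvalue()
```

When writing to disk, opening the file with `newline=""` lets the `csv` module control line endings itself.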
Final Thoughts
If you’re running a Steem RPC and are open to supporting this tool by allowing regular, large-volume access, I’d love to collaborate. That would let me host the tool publicly so others can interact with the data in real time.
Otherwise, I’ll keep running it myself and continue sharing reports here on Steemit every month.
Thanks for reading, and stay tuned for the first full report, coming soon! Let's keep building tools and insights that help grow the Steem ecosystem.
So far, this is what I've got...
@steemchiller
@pennsif
@rme
@hungry-griffin
@blacks
@e-r-k-a-n
@moecki
@xpilar
You could just implement failover logic in your code to try a different RPC node - alternatively you can use the following load balancers:
Or, you can set up a node yourself: https://github.com/DoctorLai/steem-load-balancer
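The failover idea can be sketched roughly like this. It is an illustration under assumptions, not code from any of the tools mentioned: the helper names are made up, and the node list only reuses URLs that appear in this thread. The request body follows plain JSON-RPC 2.0.

```python
import json
import urllib.request

NODES = ["https://steemd.steemworld.org", "https://api.steemit.com"]


def build_payload(method, params, request_id=1):
    """Build a JSON-RPC 2.0 request body."""
    return {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}


def http_post(url, payload, timeout=10):
    """POST the payload as JSON and return the decoded reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def call_with_failover(method, params, nodes=NODES, post=http_post):
    """Try each node in order and return the first successful result."""
    last_error = None
    for url in nodes:
        try:
            reply = post(url, build_payload(method, params))
        except Exception as exc:  # network error, timeout, bad JSON, ...
            last_error = exc
            continue
        if "result" in reply:
            return reply["result"]
        # The node answered, but with a JSON-RPC error: move on to the next one.
        last_error = RuntimeError(str(reply.get("error")))
    raise RuntimeError(f"all nodes failed, last error: {last_error}")
```

For example, `call_with_failover("condenser_api.get_account_count", [])` would only reach the second node if the first one fails or returns an error.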
Hi, thank you for your response. I didn't know about this repo: https://github.com/DoctorLai/steem-load-balancer, so I really appreciate you sharing it with me.
I actually built my own solution for load balancing as well. I also make sure to pass to the next API call only if the current one doesn’t fail.
As for finding accounts with withdrawn Steem Power (SP) and checking power-up status, I'm using the API from SteemWorld. It provides direct results within the data range I need.
I could implement the same solution using RPC, but it would return more results and take much longer to process, since there’s no method to directly fetch powered-up accounts or accounts without SP. I would need to use get_account_history with pagination and scan the data within the desired range.
Regarding load balancing, if I were to balance across all the RPCs, it would be quite a lot. There are 3,657,270 users, so calculating requests would look like this:
3,657,270 * 2 + 365,727 = 7,680,267 requests, which is a huge number
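To put that formula in perspective, here is the same arithmetic plus a rough duration estimate. The request rates are purely assumptions for illustration; real throughput depends on the node and on any rate limits.

```python
ACCOUNTS = 3_657_270          # account count quoted in this thread
EXTRA_REQUESTS = 365_727      # additional requests from the formula above

total_requests = ACCOUNTS * 2 + EXTRA_REQUESTS  # 7680267


def days_needed(requests, per_second):
    """Days of continuous running at a steady request rate."""
    return requests / per_second / 86_400


if __name__ == "__main__":
    for rate in (10, 50):  # hypothetical sustained requests per second
        print(f"{rate} req/s -> {days_needed(total_requests, rate):.1f} days")
```

Even at a steady 10 requests per second (an assumed figure), a full pass would run for roughly nine days.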
Managing that many requests and handling the load balancing is a bit too much. Hosting this would likely result in my IP getting blocked eventually, which is why it feels frustrating.
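For the get_account_history route mentioned above, a paginated scan could look roughly like this. It is a sketch under assumptions: it uses `condenser_api.get_account_history` with `[account, start, limit]`, treats `start = -1` as "most recent", assumes each page comes back oldest first, and keeps `limit` no larger than `start`, which nodes generally require. The `call` argument is a hypothetical function that performs the RPC request and returns the result.

```python
from datetime import datetime

POWER_UP = "transfer_to_vesting"   # operation name for powering up
POWER_DOWN = "withdraw_vesting"    # operation name for starting a power down


def scan_history(account, since, call, page_size=1000):
    """Walk an account's history from newest to oldest, stopping at `since`."""
    counts = {POWER_UP: 0, POWER_DOWN: 0}
    start, limit = -1, page_size
    while True:
        page = call("condenser_api.get_account_history", [account, start, limit])
        if not page:
            break
        reached_cutoff = False
        for _seq, entry in reversed(page):  # newest entry of the page first
            when = datetime.strptime(entry["timestamp"], "%Y-%m-%dT%H:%M:%S")
            if when < since:
                reached_cutoff = True
                break
            op_name = entry["op"][0]
            if op_name in counts:
                counts[op_name] += 1
        oldest_seq = page[0][0]
        if reached_cutoff or oldest_seq == 0:
            break
        # Next page ends just before the oldest entry already seen.
        start = oldest_seq - 1
        limit = min(page_size, start)
    return counts
```

Stopping as soon as an entry is older than the cutoff keeps the number of requests per account low for accounts with long histories.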
How did you get the numbers 3.6M accounts?
I used the method condenser_api.get_account_count. It returns the number of accounts on Steem. Or are you asking how I fetch all users?
There seems to be a difference:
I will extract all accounts using the different RPC methods with database_api.list_accounts and compare the results. I remember that yesterday, when I fetched accounts with https://steemd.steemworld.org using database_api.list_accounts, I got 3.6 million accounts. However, when I used condenser_api.get_account_count with https://steemd.steemworld, I only got 1.93M accounts. XD I will repeat the process again to verify and let you know the results
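A comparison like that could be sketched as follows. The parameter shape for `database_api.list_accounts` (`start`, `limit`, `order`) and the `accounts` key in its result are my assumptions about the appbase API, so check them against the node you use. `call` is a hypothetical function that sends the RPC request and returns the result.

```python
def iter_account_names(call, page_size=1000):
    """Yield every account name by paging through list_accounts."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2")
    start = ""
    while True:
        result = call(
            "database_api.list_accounts",
            {"start": start, "limit": page_size, "order": "by_name"},
        )
        names = [account["name"] for account in result["accounts"]]
        # Assumption: `start` is inclusive, so later pages repeat the last name.
        if names and names[0] == start:
            names = names[1:]
        if not names:
            return
        yield from names
        start = names[-1]


def reported_vs_listed(call):
    """Return (count reported by the node, count found by listing)."""
    reported = call("condenser_api.get_account_count", [])
    listed = sum(1 for _ in iter_account_names(call))
    return reported, listed
```

If the two numbers differ on the same node, the problem is more likely in one of the methods than in the pagination.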
I believe the official nodes (api.steemitdev.com and api.steemit.com) return incorrect results - other community nodes return 1.93M.
And if you check here you will see 1.93M is more convincing.
That's interesting. I tried another RPC, not just the official node, and it also gave me a different result. However, yesterday I fetched the usernames of 3.6M accounts using database_api.list_accounts, and I didn't run into any issues, such as missing accounts, while analyzing the first 1M. After that, I stopped the process and decided to work with batches instead.
As the large majority of accounts are inactive it serves little purpose to analyze them all.
You need to build in some robust selection criteria to make the tool more useful.
For example select accounts by...
etc
Thank you for the thoughtful feedback, I really appreciate it.
I completely agree that filtering out inactive accounts would make the tool more efficient and insightful. Unfortunately, at the moment, there isn’t an available endpoint or API that allows for selecting accounts based on specific criteria like activity, age, reputation, SP size, or profile data.
The process I currently use involves first fetching a broad list of accounts, then analyzing each one individually to check if they’ve made any recent transfers. If an account has initiated a transfer, I consider it active. So while it’s not the most refined approach, it’s the only reliable indicator I can automate for now.
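As an illustration of that check, here is a minimal sketch. The 90-day window is an assumption, since no exact definition of "recent" is given here, and the entries are assumed to have the `[sequence, {"timestamp": ..., "op": [name, body]}]` shape that account history calls return.

```python
from datetime import datetime, timedelta


def is_active(account, history, now, window_days=90):
    """True if the account sent at least one transfer inside the window."""
    cutoff = now - timedelta(days=window_days)
    for _seq, entry in history:
        op_name, op_body = entry["op"]
        # Only transfers initiated by the account itself count as activity.
        if op_name != "transfer" or op_body.get("from") != account:
            continue
        when = datetime.strptime(entry["timestamp"], "%Y-%m-%dT%H:%M:%S")
        if when >= cutoff:
            return True
    return False
```

Incoming transfers are ignored on purpose, so an account that only receives funds is still treated as inactive.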
If someone is able to provide RPC access that supports custom endpoints or filtering, I’d be more than happy to integrate that. Otherwise, maybe when I create my own RPC, I can build an endpoint for this specific purpose. But for now, I can't do that, as it would require a powerful and expensive server.
Thanks again for the suggestion
Maybe talk with @steemchiller or @justyy to see if they can help...
Chatted with @justyy , super helpful!
I'm looking forward to the results. Fighting!