Steemit Blockchain Team Update: AppBase, RocksDB, Bandwidth, HF20, SMTs, and more!

in #steem, 7 years ago

Today’s @steemitblog post is brought to you by Steemit’s Blockchain team.
We have been extremely busy over the past several months, and wanted to share with you some of the things we have been up to. This post includes news of our plans for an improved bandwidth formula, updates on AppBase, RocksDB, and Hardfork 20 (HF20), as well as the latest developments for Smart Media Tokens (SMTs).

Scalability

As many of you already know from our previous scalability post, the Blockchain team has been very focused on scalability over the past year. We know that these types of changes are not as exciting as new features and platform enhancements, but ensuring that the blockchain is ready to scale to 100x or even 1,000x usage is something that is important to do before we actually scale to that degree. Neglecting scalability until it is actually needed is a recipe for disaster.

AppBase

AppBase provides a robust foundation for meeting all of our future scaling needs, and will allow us to grow the platform while at the same time managing the resource requirements for third-party application developers, witnesses, and exchanges to grow along with it.

It does this by enabling many components of the Steem blockchain to become modular by creating additional non-consensus blockchains as dedicated plugins. These plugins can be updated much more rapidly because they do not require replaying the entire blockchain.

The pre-release for AppBase was announced about three months ago and we appreciate all of the testing that the community has done since that announcement. We have also been testing extensively, and have been working on several changes to resolve some of the minor issues that were detected or reported. We are very close to having the official 0.19.4 AppBase release ready for witnesses and node operators to start safely using in production, and we will post as soon as it is ready.

RocksDB

As we mentioned in our Exploring Steem Scalability post, we have been spending a lot of time researching various ways to store the steemd data more efficiently. One of those approaches is using a technology called RocksDB. We are pleased to announce that we have decided to go with the RocksDB solution, and have already successfully converted the “account history” plugin to use RocksDB.

RocksDB is a fast on-disk data store with an advanced caching layer. Because it is optimized for fast, low-latency storage, it could further reduce latency when reading from and writing to disk. It is used in production systems at multiple web-scale enterprises (Facebook, Yahoo, LinkedIn). RocksDB is based on LevelDB, but offers increased performance thanks to its ability to exploit multiple CPU cores and SSD storage for input/output-bound workloads. Its use in MyRocks, for example, led to less SSD storage use, longer SSD endurance, and more available IO capacity for handling queries.

In comparison with the previous account history implementation:

  • An account history node can now be run efficiently by storing the state file on an NVMe SSD, instead of having to keep the entire state file in RAM. This has allowed us to start running account history nodes on 32-64 GB RAM servers, instead of the 488 GB instances we were required to use before.
  • An account history node can now re-index in about 10 hours, compared to the multiple days that it took before.
  • The state file is much smaller, as RocksDB has built-in compression.

We completed extensive testing of these changes in the development/staging environments, and have been running the changes in production for a little over three weeks. Any users who have been querying account history data from the api.steemit.com endpoint over the past three weeks have been getting their data from the RocksDB plugin.
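As a rough illustration of what such a query could look like from a client, here is a minimal Python sketch that builds a JSON-RPC 2.0 request for account history. The endpoint is the one named above; the method name `condenser_api.get_account_history` and the `(account, start, limit)` parameter order are assumptions that are not confirmed in this post, and the helper names are hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.steemit.com"  # endpoint named in the post


def build_account_history_request(account, start=-1, limit=10, request_id=1):
    """Build a JSON-RPC 2.0 body asking for an account's most recent operations.

    The method name and the (account, start, limit) parameter order are
    assumptions; check the API documentation before relying on them.
    """
    return {
        "jsonrpc": "2.0",
        "method": "condenser_api.get_account_history",  # assumed method name
        "params": [account, start, limit],
        "id": request_id,
    }


def fetch_account_history(account, limit=10):
    """POST the request and return the "result" field (requires network access)."""
    body = json.dumps(build_account_history_request(account, limit=limit)).encode("utf-8")
    request = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))["result"]


# Building the payload does not touch the network.
payload = build_account_history_request("steemitblog", limit=5)
print(json.dumps(payload))
```

Only `fetch_account_history` talks to the network, so the request shape can be inspected or tested offline.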

Account history was used as a test of the RocksDB technology to determine if it fits our needs. We are more than happy with the results of this test and are working on a drop-in replacement for Chainbase that relies on RocksDB instead of memory mapped files. This will dramatically improve the performance of steemd, and we are very excited to complete the transition.

Bandwidth

As we continue to scale the blockchain to more and more users, the bandwidth formula that we use to allocate resource usage across all of those users becomes more and more important. The bandwidth formula that we are currently using has been adequate for the level of usage we have had so far, but there is a lot of room for improvement. Our goal is to try to find the right balance between allowing new users to have an amazing experience using Steem-powered applications like steemit.com, while at the same time preventing them from using an unreasonable amount of the network’s resources or spamming the network. We would also like to simplify the mental model behind understanding how much Steem Power is required for certain levels of use.

Our current bandwidth formula makes a somewhat crude approximation of the cost of a transaction, based on the size of the transaction. While size is one important metric, the improved bandwidth formula should try to take into consideration all of the different resource constraints that a transaction may place on the network.

We have been researching ways to classify transactions based on their impact on several different factors. By taking all of these items into consideration, we hope to come up with a much better representation of a transaction’s true cost with respect to:

  • Blockchain history size
  • Reindex time
  • State file size
  • Memory usage
  • Disk IOPS
  • Network bandwidth
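To make the idea concrete, here is a minimal Python sketch of a multi-factor cost, computed as a weighted sum over the factors listed above, next to a size-only cost. This is not the formula we will ship (the design has not been settled): the weights, field names, and sample numbers are all hypothetical placeholders.

```python
from dataclasses import dataclass

# Hypothetical per-unit weights for each factor; placeholder values only.
WEIGHTS = {
    "history_bytes": 1.0,
    "reindex_ms": 5.0,
    "state_bytes": 10.0,
    "memory_bytes": 10.0,
    "disk_iops": 2.0,
    "network_bytes": 1.0,
}


@dataclass(frozen=True)
class ResourceUsage:
    """Estimated resources one transaction consumes (all fields hypothetical)."""
    history_bytes: int = 0
    reindex_ms: int = 0
    state_bytes: int = 0
    memory_bytes: int = 0
    disk_iops: int = 0
    network_bytes: int = 0


def size_only_cost(usage):
    """Approximate cost by transaction size alone."""
    return float(usage.history_bytes)


def multi_factor_cost(usage, weights=WEIGHTS):
    """Weighted sum over every resource factor."""
    return sum(weight * getattr(usage, name) for name, weight in weights.items())


# Two made-up transactions of identical size: one leaves no lasting state,
# the other adds 500 bytes to the state file and to memory.
light = ResourceUsage(history_bytes=200, network_bytes=200)
heavy = ResourceUsage(history_bytes=200, state_bytes=500,
                      memory_bytes=500, network_bytes=200)

print(size_only_cost(light), size_only_cost(heavy))
print(multi_factor_cost(light), multi_factor_cost(heavy))
```

The two sample transactions cost the same under the size-only measure, while the weighted sum separates them. That difference is what a size-based approximation cannot see.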

We are currently working to integrate StatsD, a tool for collecting statistics, into steemd, so that we can acquire better metrics. We are also researching different bandwidth implementations that can be used to allocate usage based on these metrics.
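For readers unfamiliar with StatsD, metrics are sent as short plain-text lines of the form `name:value|type`, usually over UDP. The Python sketch below only illustrates that line format. steemd itself is written in C++, and the metric names, host, and port used here are hypothetical (8125 is the conventional StatsD port).

```python
import socket


def format_metric(name, value, metric_type, sample_rate=None):
    """Format one metric as a StatsD line: name:value|type[|@rate].

    Only the three most common types are handled here:
    "c" (counter), "g" (gauge) and "ms" (timer).
    """
    if metric_type not in ("c", "g", "ms"):
        raise ValueError(f"unsupported metric type: {metric_type!r}")
    line = f"{name}:{value}|{metric_type}"
    if sample_rate is not None:
        line += f"|@{sample_rate}"
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Send one metric line to a StatsD daemon over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Hypothetical metric names; formatting needs no running daemon.
timer_line = format_metric("steemd.block.apply_time", 12, "ms")
counter_line = format_metric("steemd.transactions.pushed", 1, "c", sample_rate=0.1)
print(timer_line)
print(counter_line)
```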

Once we have gathered all of the necessary data and settled on the best bandwidth algorithm design, we will share all of the details with the community.

Miscellaneous changes

Security Changes

The team has worked on several security patches to improve the stability of the network, which were released under Steem 0.19.3. The majority of witnesses and node operators have already picked up these changes and are running them in production. If any node operators are still running version 0.19.2, it is recommended that they upgrade to 0.19.3.

cli_wallet testing

The cli_wallet is a tool that is used by many witnesses, exchanges, and application developers to interface with the Steem blockchain. As we continue to make changes to the blockchain architecture (such as with AppBase and RocksDB), we felt it was important to design a suite of tests that could be used to ensure that the new version of code remains backwards-compatible with the old version, and doesn’t break any functionality with the cli_wallet.

It should be noted that once SMTs are released, the cli_wallet tool will remain backwards-compatible with previously existing functionality, but it will not be upgraded to support the new SMT functionality. We will be providing a new tool before then (and instructions on how to use it) that will act as a full replacement for the cli_wallet, as well as support all of the new SMT functionality via the command line.

Code style guidelines

In GitHub issue 2366 we are working on a code style document, so that developers have a guideline to inform style decisions and the codebase has a unified set of style rules. These will be useful for any developer making contributions to steemd, and will help keep the source code clean and reliable.

Transaction confirmation API

On rare occasions an action made on Steemit.com appears to work, only to disappear a few seconds later. There are some edge cases where the current transaction submission logic breaks down and results in this undesirable situation due to the transaction not making it into an accepted block.

We are working on a new API to better determine the status of a transaction. It will not only be more efficient than the current transaction submission process, but will also allow steemit.com to detect these scenarios and ensure your actions make it onto the blockchain.
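Since the design is not finalized, the following Python sketch shows only one possible way a client could classify a transaction's status. The status names and inputs are hypothetical, and it assumes that a transaction can never be included in a block whose timestamp is later than the transaction's expiration.

```python
from enum import Enum
from typing import Optional


class TxStatus(Enum):
    IRREVERSIBLE = "included in an irreversible block"
    REVERSIBLE = "included in a block that could still be undone"
    PENDING = "not in a block yet, but could still be included"
    EXPIRED = "can no longer be included; safe to resubmit"


def classify(included_in_block: Optional[int],
             last_irreversible_block: int,
             last_irreversible_time: int,
             expiration: int) -> TxStatus:
    """Classify a transaction from chain state (times are Unix seconds).

    included_in_block is the block number that contains the transaction on the
    current chain, or None if no block on the current chain contains it.
    """
    if included_in_block is not None:
        if included_in_block <= last_irreversible_block:
            return TxStatus.IRREVERSIBLE
        return TxStatus.REVERSIBLE
    # Every later block has a later timestamp than the last irreversible block,
    # so once that timestamp reaches the expiration the transaction is dead.
    if last_irreversible_time >= expiration:
        return TxStatus.EXPIRED
    return TxStatus.PENDING


# A transaction that was broadcast but never made it into a block.
status = classify(included_in_block=None,
                  last_irreversible_block=1000,
                  last_irreversible_time=1_525_000_060,
                  expiration=1_525_000_000)
print(status.name, "-", status.value)
```

A front end could use a result like `EXPIRED` to resubmit an action, instead of showing it as successful and then having it disappear.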

The discussion for this has been ongoing here, and our goal is to have a design finalized soon, so we can start work on the implementation.

Hardfork 20

Hardfork 20 has been on the back burner for a while as we focused on scalability-related solutions, but it is time to put it back on the front burner. We don’t have an exact date for the hardfork yet, but we are targeting early Q3 of 2018. More details will be shared on HF20 as the development progresses.

SMTs

We have several full-time developers dedicated to working on SMTs, and a lot of progress is being made. While many of the changes being worked on so far are highly technical, and basically serve the purpose of updating much of the existing functionality in steemd (which was designed for a single token: STEEM) to work for multiple tokens, there is still a lot of interesting functionality that is beginning to take shape.

Here are 10 of the interesting changes that have been completed so far:

  • In 1508 the initial work was done to allow the creation of a new SMT.
  • In 1653 and 1729, test cases were created to ensure that all of the SMT creation logic works as expected.
  • In 1683 the foundation was laid for SMTs to integrate with the internal market, so that users can trade SMTs for STEEM and STEEM for SMTs.
  • In 2029 the structure that will be used for the SMT Market Makers to run on the internal market was defined.
  • In 1682, the transfer operation that is used to send tokens between accounts was updated to support SMTs.
  • In 1843 posts/comments were updated to allow users to specify up to two SMT tokens for which the post/comment will be eligible (in addition to STEEM).
  • In 1856 the vote operator was updated to support vote operations on posts/comments that include multiple voting assets (SMTs).
  • In 1896 an operation was created that lets users claim tokens after a post/comment that includes SMT payouts is paid out.
  • In 2056 support was added for the smt_refund_operation, which can be used by contributors of an ICO to (optionally) cancel their contribution to an ICO and receive a refund if the ICO launch date is postponed.
  • In 2021, 2160, and 2085, support was added for “vesting” SMTs (SMT Power).

More Technical Details

For those of you who are interested in the technical details, the team is also spending a lot of time on important design decisions in addition to coding. Many of these discussions are documented in GitHub.

One example can be found in this issue, where we discussed how to handle automatic actions such as SMT emissions and Market Maker transactions. In another, 2212, we discussed the corner cases of SMT vesting to prevent integer overflows and rounding errors.
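To give a flavor of the corner cases involved, here is a small Python sketch of pro-rata integer arithmetic with explicit overflow checks and floor rounding. It is an illustration only, not the steemd implementation: the function names are hypothetical, and the 64-bit limit is an assumption made for this example.

```python
INT64_MAX = 2**63 - 1  # assumed limit for token amounts in this sketch


def checked_mul_div(amount, numerator, denominator):
    """Return floor(amount * numerator / denominator) with range checks.

    Python integers never wrap, so the limit is enforced explicitly to mimic
    fixed-width arithmetic.
    """
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    for value in (amount, numerator, denominator):
        if not 0 <= value <= INT64_MAX:
            raise OverflowError("input outside the assumed 64-bit range")
    result = amount * numerator // denominator  # floor division, never rounds up
    if result > INT64_MAX:
        raise OverflowError("result outside the assumed 64-bit range")
    return result


def split_pro_rata(total, weights):
    """Split total by weight, rounding every share down.

    Returns (shares, remainder). Rounding down guarantees that the shares
    never add up to more than total; the leftover is returned explicitly
    instead of being silently created or destroyed.
    """
    total_weight = sum(weights)
    shares = [checked_mul_div(total, weight, total_weight) for weight in weights]
    return shares, total - sum(shares)


shares, remainder = split_pro_rata(10, [1, 1, 1])
print(shares, remainder)
```

Rounding each share down and tracking the remainder keeps the total conserved, so rounding can never mint extra tokens.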

Testnet

We know that everyone is eager and excited to have SMTs completed and launched into production as soon as possible. So are we! Our dedicated team is working around the clock to make this happen. The first major milestone we are aiming to achieve is to have an SMT testnet (called “Forerunner”) up and running, where developers can start to play around with some of the implemented features.

We will continue to keep you up to date on our progress and will let you know as soon as the Forerunner testnet is ready for use.

Steem on,

The Steemit Blockchain Team

This is a fantastic update...my head feels like it's exploding trying to process all of this and imagining all of the possibilities that will soon be available on the Steem platform!

Once AppBase and RocksDB are at a good point (which it sounds like they might already be at?) I would love to see a post or guide about the best way to set up a full RPC node/cluster using these new technologies and what type of hardware requirements are necessary.

I think it's a big problem right now that there are so few full RPC nodes available and it's getting prohibitively expensive and time consuming to set up new ones. Obviously these changes are aimed at changing that (which is awesome!), and I think it would also be good to have a clear guide on how to set a full RPC node up, what the various options are, etc.

Probably a good thing for the developer portal I guess (which I also love btw!)

Anyway, I could not be more excited about all of this and the future of the Steem blockchain. Keep up the great work!

Me too for new docs on building and selecting modules for a full or partial RPC node install, with the RocksDB build and lowered RAM utilization configs and optimized disk strategies and all that. Get us to where we can get nodes online affordably, at lower witness levels and we will, and that will take a lot of pressure off the plumbing here.

I agree that these changes bring great capabilities which will be useful as Steemit continues to grow and SMTs become more common.

Yes, I agree. It's one of their most successful summaries and it demonstrates a great effort to communicate the progress to us.

Agreed @yabapmatt @sircork, it would be helpful if clear documentation were available for setting up full RPC nodes, as well as ways to make them more affordable and how to manage them well. I've been chugging along trying to learn, but there's so much information (much of it outdated) that it's confusing and overwhelming.

Welcome to the game

✊😆 I swear I adore you @sircork hahaha

Aww shucks!

What I'd like to know is how to set up jussi to redirect the API calls to different nodes. That would help me load-balance between different machines 🙂

If I could understand more.... my head would explode too.

Whats up mothafucker @berniesanders 🖕

Yeah, honestly it's one of the things I don't understand. I like the idea of steem, but there seems to be only updates on how it will "perform better," not actually create changes that will fix the community, content, etc...
On the front page, you can just scroll, and every single post is $200+ in bots, or it's just accounts spamming 4 posts a day that have cult-like followings, regardless of post content.
Steem is 3 things. Steem Circlejerk, Crypto Circlejerk, and "fixes" that only create avenues for higher ups to gain more revenue easier....
Nothing for the average user. It's depressing.

That sounds like an opportunity. Why don't YOU create content for the average user?

You are confusing Steem with SteemIT. Steem is the underlying blockchain that transacts 1.8 million transactions at 0.15% capacity per day, more than all other blockchains combined.

SteemIT however is a platform on top of the STEEM blockchain. Don't like it? Build a better one.

I'm curious why so many nodes are offline, do you know? It looks a bit strange. To my logical brain, they should all be up and running. So would you be able to run your own Steemit server to keep the page up, or is it only witnesses that are allowed to do that?

Anyone can run their own Steem blockchain instances as well as condenser (the code that runs steemit.com).

There is a cost to running the servers though.

Long time no see @timcliff ! these are exciting times! 👍👍👍😀

What nodes are you talking about exactly? There is a list of steem full nodes here: http://geo.steem.pl/ and most of them are up.

Thanks for letting me know. Never seen that link before.

Because no one gives a shit.

Lol xD Well some care, and some don't. As it always has been on Earth 😂

I like the sound of reduced latency from the use of RocksDB :D. Will there be additional recovery options for our keys in HF20? And what is to be expected when this version is rolled out? Thank you for all the hard work of improving this platform. Cheers! @STEEMITBLOG

I'm learning but it all looks interesting

Good luck with your learning :)

I am feeling the same way. Amazing update guys, lots of great and reassuring items being presented. I am extremely excited about the future opportunities the Steem platform will provide, especially as it supports SMTs and scales out to support all us app devs!

I agree this was a fantastic update about many things we were waiting to hear more news about. It is very reassuring that Steemit continues to improve its infrastructure so transactions and other functions will proceed without delays or incidents. The steady march towards completion of SMTs is also great news, as this feature promises both greater exposure of Steemit and greater exposure of everyday businesses to the blockchain and how it can help them survive the move from brick & mortar shops to the internet. Bravo

same here! I want to see more languages support officially with the API; I'm a .NET type of person, and some of my goals include making Steem integrate with DNN CMS via a module.

I'm also a .NET developer. What I did was set up some Python stuff with Django using the library, and then I write my software in .NET calling the API 🙂

Hmmm. IronPython might work well for that. Thing just got an update as well, so I could be messing with that, or call a Python library from a C# application; C# or PowerShell, for that matter, actually. Now there's something cool. A Steemit client that has PowerShell script calls.

Basically I just send HTTP requests to Django. So far it's working great 😁 you could check out my GitHub if you'd like to see my approach https://github.com/moisesmcardona

That's pretty cool; I wonder whether the APIs that are accessible via CPython are also accessible via IronPython? I would think they are; I mean, I do plan to include IronPython in my ideas for languages to learn; love all of them.

Due to the fact that I am a complete novice when it comes to the tech involved, I typically avoid making comments on these types of posts. First of all I want to compliment you on how this was presented. Although I would be lying if I said I understood every detail, I understood quite a bit of this post (great job explaining it to a layman!).

It gives the impression that significant progress is being made in one of the (if not the) biggest areas of need... scalability. Thank you for including the names of huge recognizable companies using RocksDB. Even if I don't fully understand it, knowing that sites with a huge user base use it makes me excited that we are planning to have a similar (or even bigger) user base.

It also seems that there has been significant progress on SMTs. Including the details of the significant changes was a great idea.

My favorite piece is the revelation that HF 20 is being pushed to the forefront again. That signals to me that significant progress has been made in many other areas that will allow manpower and resources to be spent on it.

I can't wait for the launch of these so that we can use them to help the community to grow and thrive.

Thanks so much for that thoughtful response @hanshotfirst. It's really helpful to know which parts of the post you found most useful.

Did you write this @andrarchy? Man, you were a great pick for this position!

Thanks @bbrewer! I oversee the content production system of the organization, but this specific post was written by an amazing teammate of mine, @plink01001, who has a much better grasp of the technicalities of the blockchain. I'm happy to say that the Content Team is now getting to a stage of development and organization where we can produce far more content together than I ever could alone.

That is awesome man! I have been a fan of yours for a long time.
Kudos to @plink01001 as well. This was a great update and instills a lot of confidence that good progress is being made.

Aww shucks, thanks for the support!

The question remains... when will SMTs come out... seems like Q3 is not likely anymore... I heard those rumors that it would be beginning of July... after reading this my doubts increased... 33 tickets open for SMTs...

What can I say, you said it all, I agree, these are important points as many Steemians are not writers of code or proficient at translating technical terms to everyday language, but this article did a very good job and this comment sums that sentiment up nicely.
Bravo

Thanks for this great update. SMT is the update we are waiting for. The community is excited about this wonderful new ability to create a token for your own platform. This community is awesome because of the developers working on this platform.

Steem blockchain is a protocol like Ethereum, Waves, EOS etc.

Steem has 1.8 million feeless transactions per day, as much or more than ALL other blockchains combined, using only 0.15% of total capacity.

Now, Ethereum is transacting 800,000 transactions per day at 100% capacity - and is valued at $60 billion USD. It can't even handle a stupid cat game.

Steem is only at $800 million USD. If it were valued at half that of Ethereum, we would be talking about a $90-100 Steem. Do you see the price of Steem today? It is not falling like all other cryptos. It's up 3%. It is probably waiting to go through the roof once there is a more bullish sentiment in the market.

Steem has been overlooked and undervalued IMO.

I am more bullish on Steem now than ever.

Great to see all these projects being built and all the innovation on Steem. I'm very very happy to be investing long term in Steem!

Very interesting statistics, thank you, I will use that information in future posts.

Excellent update and on the steemitblog, no less! I'm glad to see the tech details get some love here and not just @steemitdev.

Thanks for continuing to communicate about the work you're doing. We really appreciate it.

As a new witness (@swisswitness) I really like the full node news... this will make it much more affordable for the smaller teams to set them up as well... This will really boost the blockchain's strength and stability.

That's really useful feedback @lukestokes. It is an ongoing discussion about where to publish updates like this, and whether steemitdev or steemitblog is more appropriate, so glad to know your thoughts on this.

Steemit, Inc. is a tech-heavy company after all, and I say embrace it instead of trying to explain to bloggers why Steemit (your reference implementation of what’s possible on STEEM) hasn’t become the next Facebook yet with free money for everyone. STEEM has some solid devs working on important stuff. The more we highlight that, the more we can educate people on what’s going on and prepare them for a future where Steemit is just one among many sites on this chain.

Keep it up, man! We appreciate your efforts and the team taking time to put these updates together.

Agree 100%

What language should I learn to develop something on Steem platform?

You should start by checking out the devportal: https://developers.steem.io/

thank you

Excellent and exciting news, looking forward for Hardfork 20 in early 2018.

early 2018 or 2019?

As mentioned by a lot of users here, I love how the thoughts were presented in a professional manner. It helps us (non-developers) understand in layman's terms :)

I am very glad to know that Steemit is developing, and your team always does it with success. Thanks steemitblog.

YES!!!! I have to confess I was a bit worried about the RAM requirements... Now I don't feel so bad when I say that I believe all top 20 Witnesses should run a full Node.

I know that it's not required, I get that, but to me it's about leading by example. I welcome disagreements, I do. But when the nodes go down and we have thousands of users not able to use the blockchain, it drives me nuts.

We are all shooting for mass adoption. Even though most users won't know the significance of this upgrade, we should all be celebrating.

Most of us witnesses don't disagree, but most of us can't afford it. If they can push HF20, we can, and then you'll see vast relief on resource utilization system-wide. Even if we just doubled the dozen or so public ones we have now, it would be huge.

I'm aware that a witness that is not in the top 20 would struggle to do this. Some witnesses can't cover costs of operation from month to month and I'm sensitive to that of course. I guess my request is exclusively for those at the top.

I think it's a rational idea. But also, developers lean too much on the steemit API as well. Some of the others have been around a long time and are stable enough to use as a primary, with the steemit API as a backup. That will also help with the issue of people not being able to access the blockchain. Most interfaces I see use the steemit API as a primary, and others only for backup. I feel it should be the other way around.