What Happens When I Press This Button?


Predicting Outages with Math

Iyer wanted to find the top common reasons data centers experience outages, and then create a way to mathematically predict and even prevent those events before they happen.

For software to predict what happens when a new route is injected into a network, a new spine is created, or a new switch is added, it would need a model of every possible state the network could take. Every single possible state.

Iyer and the Candid team worked exhaustively to count all the states a single packet could pass through and found the total was on the order of 2 to the 144th power, or roughly 2.2 × 10^43 possibilities.

That’s more states than there are stars in the known universe.
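
To put that comparison in numbers, here is a rough back-of-the-envelope check. The star count is an assumption, not a figure from the article: published estimates for the observable universe vary widely, and 10^24 is used here as a generous upper-end value. The names STATE_COUNT, ASSUMED_STAR_COUNT and states_per_star are made up for this sketch.

```python
# Size of the state space quoted in the article: 2 to the 144th power.
STATE_COUNT = 2 ** 144

# Assumption (not from the article): a generous upper-end estimate of the
# number of stars in the observable universe. Real estimates vary widely.
ASSUMED_STAR_COUNT = 10 ** 24


def states_per_star(states=STATE_COUNT, stars=ASSUMED_STAR_COUNT):
    """Return how many packet states there would be for every assumed star."""
    return states // stars


if __name__ == "__main__":
    print(f"2**144 is about {STATE_COUNT:.3e}")
    print(f"states per assumed star: about {states_per_star():.1e}")
```

Even with the generous star estimate, the state count is larger by many orders of magnitude, which is why enumerating states one by one is not an option.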

To do these calculations, Iyer and his team started by mining Cisco’s 30 years of networking history to find the most common issues in data centers, their causes, and their resolutions.

“We want to make a claim on every conceivable flow [your network] will ever see. That’s the only way to be proactive.”

In February 2015, Iyer and his team mapped out a proof-of-concept for Candid and pitched it to Cisco, which resulted in a round of funding.

From there, the team began writing algorithms that would calculate the massive number of models they would need. The idea was to use the team’s advanced math and programming skills to predict every possible outcome in a data center, much the way NASA would for a mission like landing the Mars rover.

“Let’s take a single problem like network security, and look at one particular aspect of it, like ‘Are your security policies currently programmed?’” Iyer explained. “And let’s look at one network switch and see if we can mathematically say something nice about that switch and its configuration.”

Iyer said his team originally attempted to do this modeling on a single switch using open-source formal mathematical tools. But when they ran a test on 60 security policies, verification took six hours.

“Sixty policies is nothing when you manage millions of groups,” he said. It just would not scale.
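
A toy sketch can show why checking by brute force runs out of road so quickly. This is not Candid's method, and it is not the open-source tool the team tried; it is a hypothetical illustration that assumes a policy is a list of (value, mask) match rules over a small packet header. The function names and the 8-bit header are invented for the example. The check compares the intended policy with the programmed one on every possible header, so each extra header bit doubles the work.

```python
def matches(header, value, mask):
    """True if header agrees with value on every bit that is set in mask."""
    return (header & mask) == (value & mask)


def allowed(header, rules):
    """A header is allowed if at least one (value, mask) rule matches it."""
    return any(matches(header, value, mask) for value, mask in rules)


def brute_force_equivalent(intended, programmed, bits):
    """Compare two rule sets on every one of the 2**bits possible headers.

    Returns (equivalent, headers_checked). The loop stops at the first
    header on which the two rule sets disagree.
    """
    checked = 0
    for header in range(2 ** bits):
        checked += 1
        if allowed(header, intended) != allowed(header, programmed):
            return False, checked
    return True, checked


# Hypothetical 8-bit header. The intended policy allows headers whose top
# four bits are 1010; the programmed rules express the same set in two halves.
intended = [(0b10100000, 0b11110000)]
programmed = [(0b10100000, 0b11111000), (0b10101000, 0b11111000)]

if __name__ == "__main__":
    same, checked = brute_force_equivalent(intended, programmed, bits=8)
    print(f"equivalent: {same}, headers checked: {checked}")
```

With 8 bits the loop visits 256 headers. At the 2 to the 144th scale quoted in the article, the same loop could never finish, which is the motivation for formal models that reason about whole sets of packets at once rather than one packet at a time.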

To bring verification times down to something more realistic, Iyer and his core Candid team partnered with academic groups and PhDs from the University of Pittsburgh, Stanford and Purdue to build formal mathematical models tailored to networking.

They also worked with Cisco Advanced Services and the Technical Assistance Center to pull historical outage data to identify the top data center issues along with their likely causes. Pairing with those internal Cisco groups gave Iyer and his team access to 30 years’ worth of data around data center outages, reported human errors, hardware problems, and software programming issues, as well as complex multi-vendor issues.



Posted from my blog with SteemPress : http://selfscroll.com/what-happens-when-i-press-this-button/