Choosing the appropriate solution vs falling for the hottest new trend.
Consider the following case study:
A. The database of our example company holds 12 GB of data (including the indexes and the "frozen"/aggregated data).
B. The company has a professional MSDN licence, and it currently uses SQL Server as its relational database.
C. The rate of incoming data to the database is fairly slow (fewer than a hundred sales per day).
D. The company has a statistics department that makes heavy use of cohort and triangle studies, trying to "predict" the financial future of the company.
E. The reports take a few days to complete each month, but the company wants to have access to these reports in real time.
F. Managers love charts.
G. The company has its own full-blown in-house data centre.
H. The team of in-house programmers is small: fewer than 10 people.
I. A representative of Cloudera visits the company to sell a Hadoop configuration.
What do you propose should happen?
A good first approach would be for an experienced IT person to attend the presentation and ask the following questions:
A. Why do I need a Big Data solution when I only have such a small database and its rate of growth is fairly slow?
B. Should we talk about cubes before we talk about NoSQL?
In this case study the company has full access to Microsoft Analysis Services, yet so far it does not use it at all. It still relies on the poor man's cube: "frozen" tables filled with historical results, maintained by a long list of stored procedures that run as jobs after business hours.
Since the reports do not need to be real-time, OLTP is not necessary, which also pretty much throws "Big Data" out of the window.
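Question A can be put in rough numbers. The following is a minimal back-of-the-envelope sketch in Python, not a sizing tool. The 12 GB figure and the 100 sales per day come from the case study; the 2 KB stored per sale and the 1 TB threshold are purely illustrative assumptions, and the function names are hypothetical.

```python
# Back-of-the-envelope growth estimate for the example company.
GIB = 1024 ** 3
TIB = 1024 ** 4

CURRENT_SIZE_BYTES = 12 * GIB   # from the case study (indexes included)
SALES_PER_DAY = 100             # upper bound from the case study
BYTES_PER_SALE = 2 * 1024       # ASSUMPTION: storage per sale, indexes included
THRESHOLD_BYTES = 1 * TIB       # ASSUMPTION: arbitrary "this is getting big" mark


def yearly_growth_bytes(sales_per_day, bytes_per_sale):
    """Bytes added per year at a constant daily sales rate."""
    return sales_per_day * 365 * bytes_per_sale


def years_until(current_bytes, target_bytes, growth_per_year):
    """Years of constant growth needed to go from current to target size."""
    if growth_per_year <= 0:
        raise ValueError("growth_per_year must be positive")
    remaining = max(target_bytes - current_bytes, 0)
    return remaining / growth_per_year


if __name__ == "__main__":
    growth = yearly_growth_bytes(SALES_PER_DAY, BYTES_PER_SALE)
    years = years_until(CURRENT_SIZE_BYTES, THRESHOLD_BYTES, growth)
    print(f"Growth per year: {growth / 1024 ** 2:.1f} MiB")
    print(f"Years until threshold: {years:,.0f}")
```

Even if the per-sale assumption is off by a factor of a hundred, the database stays far below the assumed threshold for many decades, which is the point of question A.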
The company already has its own data center in house, with fault tolerance and disaster recovery.
So the best bet would probably be an OLAP solution and, since it is geared towards managers and analysts rather than developers, most likely a Power BI solution.
Makes sense, right? The reality is that the company asked a person without any experience of large databases or real-time systems to rate Cloudera's solution on a "nay to yay" scale. They got an "Apache! Cool!" and so they chose Hadoop!
Inexperienced IT personnel should not play the role of the adviser.
Along with the good things a new technology brings, think about the bad ones it might also cause, such as slow adoption caused by the bumpy road of converting data from one system to the other, and the even bumpier road of persuading old developers to learn new tricks.
Think about your development team. How many are there? How old? How experienced? What is their workload? What is the quality of their work so far?
Just because Fortune 500 companies use a specific technology, that does not necessarily make it right for your company.
Great article!
Out of curiosity ... Was this a real scenario? Sounds like you know what you're talking about, so I'm wondering why you weren't chosen to be in that meeting?
#circleoffriends
They had been introduced to Cloudera through a personal friend of the CEO, so obviously they did not bother asking any of us.
I was eventually called to offer my opinion a couple of months after their decision to go live with Hadoop, when management picked me to be sent for Hadoop training at Cloudera.
What happened in the end was that we managed to explain to management that installing and using Hadoop was not the cost-efficient solution, and that systems ready to be used were already available. So far we have not heard anything from management, and will not hear anything until October for sure.