All posts by vchavakula

My R learning curve

As data scientists or analysts, we may receive data in a variety of formats. R offers a wide range of tools for importing data, and the widely supported formats are as follows:

[Figure: R data input formats]

  1. Keyboard
  2. Text Files
  3. Excel
  4. Structured Database – Oracle, MySQL, etc.
  5. Unstructured / Hybrid Database – e.g. XML, HDFS, etc.

The details are as follows:
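As a rough illustration of the first two formats in the list (keyboard and text files), here is a minimal sketch in base R. The sample values, the variable names (`typed`, `df_keyboard`, `df_file`) and the temporary file are assumptions made up for this example; they are not from the original post.

```r
# Keyboard-style input: parse text typed directly into the script.
# The values below are hypothetical sample data, for illustration only.
typed <- "name,score
alice,90
bob,85"
df_keyboard <- read.csv(text = typed, stringsAsFactors = FALSE)

# Text file input: write the data to a temporary CSV file, then read it back.
path <- tempfile(fileext = ".csv")
write.csv(df_keyboard, path, row.names = FALSE)
df_file <- read.csv(path, stringsAsFactors = FALSE)

# Inspect the structure of the imported data frame.
str(df_file)
```

Excel files and databases usually need add-on packages (for example readxl or DBI), so they are left out of this sketch.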


Data Scientist role

The role of the Data Scientist emerged with the Big Data field, but it is not an entirely new role in enterprise business. Previously we called them statisticians or subject matter experts. So what makes data scientists so different now, and what skills does a Data Scientist bring to the role? Drew Conway published a very interesting illustration called the “Data Science Venn Diagram”:

What is a data scientist

About data scientists

Rising alongside the relatively new technology of big data is the new job title data scientist. While not tied exclusively to big data projects, the data scientist role does complement them because of the increased breadth and depth of data being examined, as compared to traditional roles.

So what does a data scientist do?

A data scientist represents an evolution from the business or data analyst role. The formal training is similar, with a solid foundation typically in computer science and applications, modeling, statistics, analytics and math. What sets the data scientist apart is strong business acumen, coupled with the ability to communicate findings to both business and IT leaders in a way that can influence how an organization approaches a business challenge. Good data scientists will not just address business problems, they will pick the right problems that have the most value to the organization.

The data scientist role has been described as “part analyst, part artist.” Anjul Bhambhri, vice president of big data products at IBM, says, “A data scientist is somebody who is inquisitive, who can stare at data and spot trends. It’s almost like a Renaissance individual who really wants to learn and bring change to an organization.”

Whereas a traditional data analyst may look only at data from a single source – a CRM system, for example – a data scientist will most likely explore and examine data from multiple disparate sources. The data scientist will sift through all incoming data with the goal of discovering a previously hidden insight, which in turn can provide a competitive advantage or address a pressing business problem. A data scientist does not simply collect and report on data, but also looks at it from many angles, determines what it means, then recommends ways to apply the data.

Data scientists are inquisitive: exploring, asking questions, doing “what if” analysis, questioning existing assumptions and processes. Armed with data and analytical results, a top-tier data scientist will then communicate informed conclusions and recommendations across an organization’s leadership structure.

What does a data scientist do?

This is one of the better descriptions I have seen of what a data scientist does.

They must find interesting, novel, and useful insights about the real world in the data. And they must turn those insights into products and services, and deliver those products and services at a profit.

Notice that data scientists don’t just need to find insights in data. They also need to create profitable products from those insights. I often feel that data products are not seen as being as important as improving the machine learning algorithms, but the data products really are the end goal.

The Broken System.. Would You Hire Your Father?

In our New Economy, it is very common to encounter a 26-year-old Hiring Manager. Before 2008, a 26-year-old would never have held such a position; at best they would have been a Management trainee.
In all honesty, they do not have enough experience to make such life-changing decisions. It’s not their fault: it was not luck that gave them that position but the company’s bottom line. If a company can pay $30k a year to someone just out of school to be a Manager, why should it hire a seasoned professional for more than double that salary? Well, here is why: this situation has created a “Catch 22” paradox. On the one hand, we have highly experienced people who cannot get past the young recruiter because they do not fit the defined new-speak boxes of
“Best Fit” or “Culture” among the parameters the inexperienced decision maker has been given.
On the other hand, we have Employers saying they have openings they can’t fill.
It is time the hiring process went back to what it is supposed to be: a Manager has to earn the title and has to have enough experience to base decisions on the real value an employee can bring to the Company. The assertion that anyone over 40 is out of touch with new technology or is not a good fit for a Start-Up environment is incorrect.
Have we not learned anything from the crash early in the decade? Are we going back to the days of the 20-something CEO with no clue how the business world works? Do we not understand that the people we hold up as role models are over 50 and sometimes 60?
It’s time for the decision makers to open their eyes and allow people to earn their way to Management positions, instead of giving them the positions because their salary demands are low.