Smita Mishra is covering the topic of "Tester and Data Scientist". Software Testing and Data Science actually have a fair amount of overlap. Yes, there is a level of testing in big data, but that's not the same thing.
A data scientist, at the simplest level, is someone who looks through and tries to interpret information gathered to help someone make decisions.
Website data can tell us which features people engage with and which articles they enjoy reading and, by extension, can help us decide what to do next.
An example can be seen on Amazon. About 40% of purchases are made based on user recommendations. The data scientist would be involved in helping determine that statistic as well as its validity.
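To make "validity" a little more concrete, here is a minimal sketch of how one might put an uncertainty range around a percentage like that. The sample numbers are made up for illustration (they are not Amazon's data), and the Wilson score interval is just one common choice I am assuming here, not something from the talk.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Return the (low, high) Wilson score interval for a proportion.

    z=1.96 corresponds to roughly a 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half

# Hypothetical sample: 400 of 1,000 purchases followed a recommendation.
low, high = wilson_interval(400, 1000)
print(f"observed 40.0%, plausible range {low:.1%} to {high:.1%}")
```

The point is less the formula than the habit: a tester would ask how big the sample was and how far off the number could be before trusting it.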
It's important to consider the broad array of places that data comes from. Large parallel systems, databases of databases, distributed cloud system implementations, and aggregation tools all help us collect the data. The next step, of course, is to get this information into a format that can be analyzed, and then for us (as Data Scientist wannabes) to synthesize that data into a meaningful narrative. The synthesis is the part I find most interesting and most want to learn more about. Of course, there needs to be a way to gather information and pull it down in a reliable and repeatable manner. The tools and the tech are a good way to get to the "what" of data aggregation. Interacting with the "why" is the more interesting (to me) but more nebulous aspect.
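As a rough illustration of the "what" side, here is a small sketch of pulling records from two differently shaped sources into one common format before analysis. Both exports, and field names such as `user`, `userName`, `page` and `path`, are hypothetical and made up for this example.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical exports from two different systems (made-up data).
CSV_EXPORT = "user,page\nana,/pricing\nben,/blog\n"
JSON_EXPORT = '[{"userName": "cy", "path": "/blog"}, {"userName": "ana", "path": "/blog"}]'

def load_page_views(csv_text, json_text):
    """Return page views from both sources as (user, page) tuples."""
    views = [(row["user"], row["page"])
             for row in csv.DictReader(io.StringIO(csv_text))]
    views += [(item["userName"], item["path"])
              for item in json.loads(json_text)]
    return views

views = load_page_views(CSV_EXPORT, JSON_EXPORT)
# Once everything has the same shape, simple questions become easy to ask.
views_per_page = Counter(page for _, page in views)
print(views_per_page.most_common())
```

Because the loading step is a plain function, it can be rerun on fresh exports and give the same shape every time, which is the "reliable and repeatable" part.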
So what do I need to know to be a Data Scientist?
- Scientific Method is super helpful.
- Math. Definitely, know Math.
- Python and R both have large libraries specific to data science.
- A real understanding of statistics.
- Machine Learning and the techniques used in the process. Get ready for some Buzzword Bingo. Understanding the broad areas is most important to get started.
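To show how the scientific method, statistics and Python fit together, here is a minimal sketch of a permutation test using only the standard library. The load-time numbers are invented for illustration, and the permutation test is my own pick of technique, not one named in the talk.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, rounds=2000, seed=0):
    """Estimate how often a random relabelling of the data produces a
    difference in means at least as large as the one observed."""
    rng = random.Random(seed)  # fixed seed keeps the result repeatable
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / rounds

# Hypothesis: the new build changed page load time (made-up timings, in ms).
old_build = [310, 295, 322, 301, 315, 308]
new_build = [280, 291, 275, 288, 279, 284]
print(f"p-value estimate: {permutation_p_value(old_build, new_build):.3f}")
```

A small p-value suggests the difference is unlikely to be a fluke of which measurements landed in which group. It does not say why the difference exists, which is where the curiosity comes in.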
Recommended site: Information is Beautiful
The key takeaway is that, if you are a tester, you already have many of the core skills to be a Data Scientist. Stay Curious :).