@polo77j has a few good basic starter questions to ask yourself.
First, it’s important to understand that “data analytics” encompasses a bunch of different sub-specialties. Many basic principles carry across applications, but the specifics and nuances within each sub-specialty take some time to learn as well. The norms of how data analysis is performed by a company trying to make self-driving cars will be very different from those applied in finance, which will be very different from those applied in medicine (my field). So it’s hard to answer this question briefly, because the experience of an “analyst” or “statistician” can vary quite a bit from one place to the next.
I’ve worked in three different environments over the last eight years, so I’ll give you a taste of what that’s been like.
Job #1: as a graduate student, I worked as an analyst in a large data center that coordinates multicenter clinical trials around the country. The study that I was assigned to enrolled 2,368 patients at more than 60 clinical sites and randomly assigned them to a diabetes treatment strategy (insulin-sensitizing drugs versus insulin-providing therapy) and a cardiovascular treatment strategy (immediate revascularization versus “watchful waiting”). During the study, our center was responsible for creating the database infrastructure, training the individual sites on proper screening/enrollment and data entry, performing data-quality checks, and producing reports for the Data and Safety Monitoring Board (which convenes at pre-specified intervals to ensure that the study is going as planned, that patients are not being exposed to undue harm by participating, and that neither treatment is showing such a strong benefit/harm that it’s unethical to continue randomly assigning the treatments). Once the study had concluded, we performed a final data freeze and then started mining the data for all it was worth. There were the “primary analyses” (the original purpose for which the trial was designed) and then, since we had a very rich source of well-collected, well-scrubbed data on a large batch of patients, we wrote as many “secondary” papers as we could to glean everything possible from the data. It costs millions upon millions of dollars to conduct a study this large, so you really want to make use of all that well-collected “clean” data to answer all of the interesting secondary questions you might explore in that study population.
Day-to-day operations there were pretty chill (at least at the analyst level - higher up in the administration, there’s more pressure because you’re negotiating contracts for major studies, trying to put out fires when someone demands something very quickly, etc.). As a data analyst, you’ll basically get put on a couple of major projects by someone at the PhD level, spend most of the day sitting at your computer grinding through your major analytic tasks, periodically report back to your PhD boss/director with updates, and occasionally hop on conference calls with the main study investigators to discuss the findings thus far and see what else they want to explore. Working in this environment has the advantage that most of your clients are off-site, so they can’t really hassle you all that much, and you provide a service that they really have to have (most people/organizations do not have the in-house infrastructure needed to coordinate and analyze the data from large studies like the one I described above; it takes a lot more time, money, and effort than most people probably think). There will occasionally be issues, like clients who demand things on unreasonable timelines or outside the study protocol, but the director-level PhDs will deal with that (although you might have the occasional boss who tells you “Sorry, but we need this one done in a big hurry; drop everything else, this is the most urgent task until it’s done.”).
Job #2: core statistician at Magee-Womens Research Institute. I was the “stats guy” available for a couple dozen faculty members performing research in obstetrics, gynecology, and a couple of other similar fields. Here’s a dirty little secret that most people outside the medical research enterprise probably wouldn’t want you to know - a large percentage of experimental data is ultimately analyzed by people with little/no formal training in data analysis, and they typically range from “knows just enough to scrape by doing their own analysis with only a few mistakes” to “downright incompetent.” I was constantly in the position of trying to rescue poorly-designed studies and poorly-collected or poorly-organized data (in some cases, data that were all but useless for the purpose that the PI was trying to serve), and/or trying to explain to PIs that the analyses they wanted to perform were not suited to the question that they were trying to answer. I spent most of two years basically trying to get people to do things “less wrong” than they would have done them without me there, and it was somewhat of an eye-opener (I mean, some of these people were career researchers with millions of dollars in government grants to their name, no doubt very brilliant in their own specialty, but many really struggled with the most rudimentary aspects of data collection and study design).
Day-to-day life: in that position, I was a one-man band serving many masters, so I was constantly juggling a bunch of projects from a bunch of different people, with meetings every day to discuss progress / updates / planning, performing analysis whenever I could squeeze in the time, and helping write medical papers and grant applications.
Job #3 (current job): very similar position to Job #2, but now at our Heart & Vascular Institute, working with cardiologists and cardiothoracic surgeons. Same basic responsibilities, with a different crew of people. I took this job because I was recruited by one physician whom I really like and respect, and it was an advancement opportunity (higher pay, higher title, now with 2 staff members under me), but the day-to-day issues described in Job #2 are all basically the same. People often just send me their data (a spreadsheet with god knows what sort of color codes, typos, etc.) and say “here’s my data; I want you to do ____ with it; also, I need this done by tomorrow, sorry for the late notice!” or “can u do stats on this thx” (that’s an actual direct quote of a message I have received from a surgeon; missives like that are not uncommon with this crew). That is frustrating, because to provide quality service I need to understand what their main question is and how their data are structured, and to pry into some nuances of the analytic approach that can have a big influence on the results we obtain and the inferences that can be drawn from them.
A few final observations:
- For me, the hands-on analysis work is the best part of the job. I love getting a fresh dataset, defining the analytic question, figuring out the analytic strategy, and going to town with some code. It’s really satisfying to write code that produces the output you’re looking for, and it can also be fun to figure out a new modeling approach that best suits the data and question you’ve been tasked with answering.
- Depending on exactly where you end up, compensation is generally pretty good, and there is pretty high demand for people with statistical/analytic skills. There are large parts of the country with no statistical jobs, but in places where statistical/analytic jobs are concentrated, there are usually more jobs than there are qualified people to fill them. I’ve been recruited for (and turned down) a couple of job offers while I already had a job.
- While I am a decent “people person” and enjoy collaborating, the most challenging aspect is often steering your collaborators and/or supervisors in the right direction and working with them to understand the real question and potential confounding or complicating factors.
- Depending on the environment/organization you work in, it can be really challenging to figure out who you’re ultimately reporting to and what you’re ultimately accountable for. My early-career experience is that one should be wary of places that have never employed their own in-house statistical team before, places that are going to hire you as “the stats guy” for everyone - because even they don’t really know exactly what they’re expecting from you, and they don’t have a clearly defined strategy for who can ask you to do stuff, how they can approach you, and how quickly a turnaround can be expected on requests. They just kind of know “we need someone who can help us do stats” - and while they do need the help, my anecdotal experience suggests that the early stages are often pretty uneven as they figure out who you are, what you can do, and how to approach you. Furthermore, even if you try to outline specific “policies” for operations, they are completely toothless unless someone higher than you will stand behind them and enforce them. I came in with big ideas for a structured process for new projects, a request form that people would submit for my support, etc. Ha. That lasted a couple of weeks, until I realized that every single MD thought they were exempt from any rule or process that I tried to implement.
Anyways, those are some disjointed thoughts from my experience. Data analysts / statisticians in other fields like economics, finance, etc. will likely have substantially different experiences (both pro and con) from my own.
