What has been people's experience with "full-stack" data roles?
I started my career being a jack of all trades - hired as a data analyst but I had to extract, clean, and then analyze data and even sometimes train models for simple predictions and categorization.
That actually led me to become a data engineer but I've spent most of my career working closely with data scientists and trying my best to make their jobs easier by taking away all the preprocessing tasks away from them so they can focus on training, inference MLops, etc.
While I claim to have helped them, to be honest DE teams often become a bottleneck and an obstacle. Everything from not being able to provide the data needed to train on time, or how we processed the data was wrong and led to bad performance, or they went live with a model blindly because we couldn't get them the observation data on time for them to analyze accuracy.
I'm wondering how much of the data engineering tasks can be automated/vibed away by data scientists. My guess is that in larger companies this won't be the case but I think startups and SMBs want to move fast so they'd rather have data scientists own the whole pipeline.
What has been other's experience with this and where is it heading?
[link] [comments]
Want to read more?
Check out the full article on the original site