J A B B Y A I

Help a CS student. Need honest feedback on wrangling data for ML/MLOps

By jabbyai

I’m currently speaking with post-training/ML teams at LLM labs, folks who wrangle data for models or work in ML/MLOps.

Tell me your thoughts or anecdotes on:

Your biggest recurring bottleneck (collection, cleaning, labeling, drift, compliance, etc.).
Has RLHF/synthetic data actually cut your need for fresh domain data?
Hard-to-source domains (finance, healthcare, logs, multi-modal, whatever) and why.
Tasks you’d automate first if you could.

submitted by /u/kritnu
