Requirements
Industry experience in data science, machine learning, or a closely related quantitative field
Proficiency in Python and the core DS stack: Pandas, Scikit-Learn, XGBoost, and at least one deep learning framework (PyTorch or TensorFlow)
Solid grasp of statistical concepts underpinning model evaluation: bias–variance tradeoff, calibration, confidence intervals, A/B testing, and data drift
Experience with LLM evaluation frameworks (e.g. RAGAS, EleutherAI LM Evaluation Harness, or custom LLM eval pipelines)
Hands‑on experience designing custom evaluation metrics; you've gone beyond off‑the‑shelf metrics when the problem demanded it
Strong understanding of ML and LLM model architectures: you can reason about how a model is built and why it behaves the way it does
High proficiency in SQL for data exploration, feature validation, and debugging model inputs
Exceptional attention to detail: you treat model validation with the same rigour as software QA
Strong written and verbal ...