Turning complex datasets into clear, actionable decisions. Currently a Data Science Intern at Sports Media Inc., building analytics pipelines and dashboards. Published researcher with hands-on experience across the full data lifecycle.
I'm a Data Scientist and Analyst with an M.S. in Computer Science from Oregon State University, passionate about transforming messy, real-world data into decisions that matter.
My work spans the full data lifecycle, from SQL querying and exploratory data analysis to building and deploying machine learning models and interactive dashboards. I've published research on ML data preparation, built live Streamlit applications used by real users, and have experience in sports analytics, NLP, computer vision, and BI reporting.
I'm actively seeking full-time roles in Data Science, Data Analytics, and ML Engineering.
AI-powered toolkit for job seekers. Features resume-job match scoring, bullet point rewriting, cover letter generation, and ATS simulation — all powered by Llama 3.3 70B via Groq API. Built and deployed end-to-end.
Interactive BI dashboard analyzing retail sales trends, regional performance, product insights, and customer segments using the Superstore dataset. Features KPI scorecards and 6 interactive Plotly charts.
CNN image classifier achieving 95% accuracy and 0.92 ROC-AUC on chest X-ray data using VGG16 and ResNet transfer learning. Preprocessed 5,000+ images with augmentation and class balancing.
Normalized relational database managing 10,000+ race statistics. Applied star schema, 3NF normalization, and indexing — improving query performance by 30%. Built SQL aggregations for analytical reporting.
End-to-end EDA and regression modeling on 200,000+ global health records. Achieved R²=0.85 using XGBoost and Random Forest with feature engineering, cross-validation, and Tableau visualization.
Research on minimal-imputation strategies for machine learning data pipelines. Demonstrates how optimized preprocessing reduces data processing time by 15% while maintaining or improving model accuracy. Covers novel approaches to data completeness and model performance in large-scale datasets.
DOI: 10.5281/zenodo.15306002
"I had the pleasure of working with Saibalaji and can confidently say he is a highly analytical and dedicated data professional. He has a strong ability to work with complex datasets, identify meaningful patterns, and translate them into clear insights that support data-driven decision making. Saibalaji combines solid technical skills in SQL, Python, and data visualization tools like Tableau and Power BI with a thoughtful, problem-solving mindset. I highly recommend Saibalaji for any role involving data analysis or business intelligence."
"Saibalaji is exceptional when it comes to working with data and machine learning models. His ability to approach complex analytical problems methodically and extract meaningful insights is impressive. He conducted a great research study and demonstrated a strong grasp of both the technical and practical aspects of data science. A reliable and skilled collaborator I would recommend without hesitation."
I'm actively looking for full-time roles in Data Science, Data Analytics, and ML Engineering. If you have an opportunity or just want to connect, reach out. I'd love to chat.