Python
SQL
Machine Learning
TensorFlow
Data Visualization
Tableau
Power BI
PostgreSQL
Streamlit
Deep Learning
NLP
Pandas
XGBoost
Scikit-learn
Data Scientist & Analyst
SAIBALAJI NEELI
M.S. Computer Science · Oregon State University

Turning complex datasets into clear, actionable decisions. Currently a Data Science Intern at Sports Media Inc., building analytics pipelines and dashboards. Published researcher with hands-on experience across the full data lifecycle.

Saibalaji Neeli
Open to Work: Data Science · Analytics · ML Engineering
01 /

ABOUT ME

I'm a Data Scientist and Analyst with an M.S. in Computer Science from Oregon State University, passionate about transforming messy, real-world data into decisions that matter.

My work spans the full data lifecycle, from SQL querying and exploratory data analysis to building and deploying machine learning models and interactive dashboards. I've published research on ML data preparation, built live Streamlit applications used by real users, and have experience in sports analytics, NLP, computer vision, and BI reporting.

I'm actively seeking full-time roles in Data Science, Data Analytics, and ML Engineering.

Python · SQL · Machine Learning · Data Visualization · NLP · Deep Learning · Streamlit · Tableau · Power BI · Statistical Modeling · Feature Engineering
◆  Quick Info
🎓
Education
M.S. Computer Science
Oregon State University, 2025
💼
Current Role
Data Science Intern
Sports Media Inc.
📍
Location
United States · Open to Relocation
📬
Email
neelisaibalaji607@gmail.com
📞
Phone
541-286-2406
02 /

PROJECTS

01 · Featured Project
Job Application Assistant

AI-powered toolkit for job seekers. Features resume-job match scoring, bullet point rewriting, cover letter generation, and ATS simulation — all powered by Llama 3.3 70B via Groq API. Built and deployed end-to-end.

Python · Streamlit · Groq API · Llama 3.3 · NLP
02
Sales Performance Dashboard

Interactive BI dashboard analyzing retail sales trends, regional performance, product insights, and customer segments using the Superstore dataset. Features KPI scorecards and 6 interactive Plotly charts.

Python · Streamlit · Plotly · Pandas
03
Pneumonia Detection CNN

CNN image classifier achieving 95% accuracy and 0.92 ROC-AUC on chest X-ray data using VGG16 and ResNet transfer learning. Preprocessed 5,000+ images with augmentation and class balancing.

TensorFlow · CNN · VGG16 · ResNet
04
F1 Race Database System

Normalized relational database managing 10,000+ race statistics. Applied star schema, 3NF normalization, and indexing — improving query performance by 30%. Built SQL aggregations for analytical reporting.

SQL · PostgreSQL · Star Schema · Data Modeling
05
Life Expectancy Prediction

End-to-end EDA and regression modeling on 200,000+ global health records. Achieved R²=0.85 using XGBoost and Random Forest with feature engineering, cross-validation, and Tableau visualization.

Python · XGBoost · Scikit-learn · Tableau
03 /

SKILLS

Languages
Python · SQL · R · JavaScript · Java · C / C++
ML & AI
TensorFlow · PyTorch · Scikit-learn · XGBoost · Keras · NLP · Deep Learning
Data & Visualization
Pandas · NumPy · Tableau · Power BI · Plotly · Matplotlib · Seaborn · Excel
Databases
PostgreSQL · MySQL · MongoDB · Star Schema · 3NF · Data Modeling
Tools & Platforms
Streamlit · Git · Jupyter · Google Colab · Groq API · VS Code
Core Competencies
EDA · Feature Engineering · Model Evaluation · Data Wrangling · Statistical Modeling · Pipeline Design
04 /

EXPERIENCE

OCT 2025 — PRESENT
Data Science Intern
Sports Media Inc. · United States
  • Collected and cleaned sports statistics datasets using Python (Pandas, NumPy), standardizing data formats and resolving inconsistencies to build analysis-ready datasets.
  • Analyzed player and game performance metrics using SQL queries to surface trends and patterns across large-scale sports datasets.
  • Built data visualizations summarizing key sports performance insights to support reporting and decision-making workflows.
JUN 2024 — AUG 2025
Prompt Engineer / AI Model Evaluator
Outlier AI · Remote
  • Evaluated LLM outputs across hundreds of prompts, identifying patterns in model failures and documenting quality issues to inform improvement cycles.
  • Designed structured evaluation frameworks to assess response accuracy, reasoning, and instruction-following across model versions.
  • Synthesized analytical findings into structured reports that guided model retraining and quality benchmarking decisions.
JAN 2024 — MAY 2024
Graduate Research Assistant
Oregon State University · Corvallis, OR
  • Researched minimal-imputation strategies in ML data preparation, reducing processing time by 15% through optimized statistical pipelines.
  • Benchmarked 10+ ML models; identified preprocessing improvements that boosted predictive accuracy by 25%.
  • Co-authored a research paper published on Zenodo (DOI: 10.5281/zenodo.15306002).
APR 2022 — AUG 2023
Research Assistant
Chennai Institute of Technology · India
  • Built ML models to predict compression and tensile strength of 3D-printed polymer components, reducing experimental trial time.
  • Collected and structured datasets from 50+ experimental runs for statistical and predictive analysis.
  • Led a two-day Additive Manufacturing workshop, delivering technical presentations to 30+ participants.
05 /

EDUCATION

🎓
Master of Science — Computer Science
Oregon State University · Corvallis, OR
Sep 2023 – Jun 2025  ·  Relevant coursework: Machine Learning, ML Challenges, Topological Data Analysis, Human-Computer Interaction, Algorithms
GPA: 3.52
🏛️
Bachelor of Engineering — Computer Science
Anna University · Chennai, India
Jan 2019 – Mar 2023  ·  Relevant coursework: Artificial Intelligence, Databases, Data Warehousing, Computer Networks, Software Engineering
GPA: 3.60
06 /

RESEARCH

📄
◆ Zenodo Preprint · 2025
CertainPrep: Showcasing Minimal Imputations in ML Data Preparation

Research on minimal-imputation strategies for machine learning data pipelines. Demonstrates how optimized preprocessing reduces data processing time by 15% while maintaining or improving model accuracy. Covers novel approaches to data completeness and model performance in large-scale datasets.

DOI: 10.5281/zenodo.15306002
07 /

CERTIFICATIONS

🎓
Machine Learning for All
Coursera
🗄️
The Complete SQL Bootcamp
Udemy
📊
Tableau Data Visualization
Udemy
08 /

TESTIMONIALS

"I had the pleasure of working with Saibalaji and can confidently say he is a highly analytical and dedicated data professional. He has a strong ability to work with complex datasets, identify meaningful patterns, and translate them into clear insights that support data-driven decision making. Saibalaji combines solid technical skills in SQL, Python, and data visualization tools like Tableau and Power BI with a thoughtful, problem-solving mindset. I highly recommend Saibalaji for any role involving data analysis or business intelligence."

Sai Shruthi Nandigam
HR Partner · Ex-Oracle Health
"Saibalaji is exceptional when it comes to working with data and machine learning models. His ability to approach complex analytical problems methodically and extract meaningful insights is impressive. He conducted a great research study and demonstrated a strong grasp of both the technical and practical aspects of data science. A reliable and skilled collaborator I would recommend without hesitation."

Madhumidha Ramesh
Lab Research Assistant · University of Louisville
09 /

CONTACT

I'm actively looking for full-time roles in Data Science, Data Analytics, and ML Engineering. If you have an opportunity or just want to connect, reach out. I'd love to chat.