Kiefer Sulijanto
Final Year · CS Big Data @ UOW

Data that drives
decisions.

Data Analyst & Engineer building production-grade systems, grounded in AI/ML engineering to solve real-world problems.

I build AI systems that are production-ready — not just demo-ready.

35%
Monitoring Reduced
3rd
UOL CSSC Hackathon
~1M
Records Trained
3+
Shipped Projects
01

About

From Jakarta to Singapore — building production systems along the way.

I'm a Computer Science (Big Data) student at University of Wollongong in Singapore, focused on building AI systems that are production-ready — not just demo-ready.

As Team Lead, I built SecureAdvisor for Certis Group: an AI-powered security incident management platform that fuses live CCTV, access logs, and manual triggers. It uses YOLOv8 detection and GPT-4o advisory to coordinate real-time ground officer dispatch across a 3-app system.

In an industry collaboration with Aires Applied Technology, I built MakanMap — a real-time crowd forecasting system powered by a Gradient Boosting Regressor (93.2% accuracy), served through a FastAPI backend and a React dashboard with live What-If scenario analysis.

At SIM Data Analytics Club, I developed a Real Estate Valuation Analyzer trained on ~1M residential records, covering end-to-end ML inference, real-time over/undervaluation analysis, and a recommendation engine that cut property search time by 15%.

What drives me is turning messy real-world problems into systems that actually work — clean pipelines, reliable APIs, and interfaces that make decisions easier for the people using them.

What drives me

Data systems that scale. From raw pipeline to production interface — reliable, observable, and engineered to turn big data into real business impact.

Current obsession

AI/ML Pipeline Design — the infrastructure layer that makes data systems actually work in production. Observable, scalable, and debuggable.

Data Pipelines · Feature Engineering · ML Deployment · Real-Time Analytics · API Integration · Containerisation · GenAI Integration
University of Wollongong

Bachelor of Computer Science (Big Data)

Sep 2024 – Dec 2026 · Singapore

Singapore Institute of Management

Diploma in Information Technology

Oct 2023 – Sep 2024

02

Experience

Internships, Industry Projects, and Clubs.

Aires Applied Quantum Technology

Frontend Engineer

Contract

Jan 2026 – Mar 2026 · Singapore · Remote

Built MakanMap — a real-time crowd level monitoring dashboard in React and Vite, surfacing ML predictions at 93.2% classification accuracy across 30-minute forecast bins up to 3 hours ahead. Built a What-If scenario analysis module for instant side-by-side comparison of crowdedness predictions. Containerised a 4-service stack with Docker Compose and automated CI/CD via GitHub Actions.

React · Vite · FastAPI · Python · MLflow · Docker · Recharts · GitHub Actions
SIM Data Analytics Club

Data Analyst

Contract

Sep 2025 – Mar 2026 · Singapore · On-site

Developed a Real Estate Valuation Analyzer, a GenAI property valuation system trained on ~1M residential records. It covers end-to-end ML inference, real-time over/undervaluation analysis, and a recommendation engine that cut property search time by 15%.

Python · SQL · GenAI · XGBoost · Scikit-learn · FastAPI · React
Astrindo Senayasa

Software and Web Developer Intern

Internship

Apr 2025 – Jun 2025 · Jakarta, Indonesia · On-site

Developed an AI chatbot for natural language product search and multi-product comparisons across 100+ IT products. Implemented automated report generation, reducing database load by 45%. Cut internal query resolution time by 35% through contextual memory tracking.

PHP · MySQL · Laravel · HTML · CSS · JavaScript · LLM APIs · Apache
PPI Singapura

Data and IT Committee

Contract

Jan 2025 – Nov 2025 · Singapore · On-site

Led a cross-functional team of 5+ to deliver a university comparison platform for 10+ institutions. Drove a 20% increase in page engagement through UI restructuring. Managed 50+ digital assets as part of PPI Singapura's digital transformation initiative.

HTML · CSS · Figma · Wix · Project Management
03

Selected Work

TEAM LEAD | CERTIS GROUP | FULL STACK · PRODUCTION

SecureAdvisor

Built for Certis Group, this platform coordinates real-time security incident response across a 3-app system. I led the cross-functional team that delivered it.

On the detection side, CCTV frames are run through YOLOv8n at a 0.7 confidence threshold with per-camera polygon zone validation, while access control door events and manual panic triggers enter through their own endpoints.
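The zone validation step can be illustrated with a minimal Python sketch. It assumes detections arrive as label, confidence, and bounding box, and that a person counts as "in the zone" when the bottom-centre of the box falls inside the polygon. The `Detection` class, the anchor point, and the function names are illustrative assumptions, not the project's actual code.

```python
from dataclasses import dataclass

CONF_THRESHOLD = 0.7  # confidence threshold quoted in the write-up


@dataclass
class Detection:
    label: str
    confidence: float
    box: tuple  # (x1, y1, x2, y2) in pixels; hypothetical layout


def point_in_polygon(x, y, polygon):
    """Ray-casting test: toggle on each polygon edge crossed by a ray going right."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's y level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def in_restricted_zone(det, zone):
    """True when a confident person detection stands inside the zone polygon."""
    if det.label != "person" or det.confidence < CONF_THRESHOLD:
        return False
    x1, _, x2, y2 = det.box
    # Assumed anchor: bottom-centre of the box, roughly where the feet are.
    return point_in_polygon((x1 + x2) / 2, y2, zone)
```

Using the feet rather than the box centre is one common choice for floor-level zones; a different anchor would only change the last two lines.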

A sliding event window de-duplicates and fuses signals across all three streams, classifying them into 7 incident types before escalating to GPT-4o — which returns a threat severity flag, a structured response plan, and a named dispatch unit with ETA.
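As a rough sketch of how a sliding window with duplicate suppression might work, the Python below keeps accepted events for 120 seconds and drops repeats of the same source and event kind that arrive within 30 seconds. The class name, the `(source, kind)` duplicate key, and the timestamps-in-seconds convention are assumptions made for illustration.

```python
from collections import deque

WINDOW_S = 120   # sliding window length from the write-up
COOLDOWN_S = 30  # duplicate suppression period from the write-up


class EventWindow:
    def __init__(self):
        self.events = deque()  # accepted events as (timestamp, source, kind)
        self.last_seen = {}    # (source, kind) -> timestamp of last accepted event

    def add(self, ts, source, kind):
        """Return True if the event is accepted, False if suppressed as a duplicate."""
        key = (source, kind)
        last = self.last_seen.get(key)
        if last is not None and ts - last < COOLDOWN_S:
            return False
        self.last_seen[key] = ts
        self.events.append((ts, source, kind))
        # Evict events that have fallen out of the window.
        while self.events and ts - self.events[0][0] > WINDOW_S:
            self.events.popleft()
        return True

    def sources_in_window(self):
        """Distinct sources currently in the window, a simple input for fusion rules."""
        return {source for _, source, _ in self.events}
```

A correlation rule could then, for example, escalate when both "cctv" and "access" appear in `sources_in_window()` at the same time.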

What makes this distinct is the end-to-end pipeline latency under 2 seconds — from raw CCTV frame to actionable officer dispatch recommendation — running across a FastAPI backend and 3 React + Vite frontends.

System Architecture

INPUT SIGNALS
CCTV Upload (base64 · frame-by-frame) → POST /cctv/frame
Live Camera Stream (4 cams · DroidCam · webcam) → POST /cctv/frame
Access Control Logs (door events · keycard) → POST /access
Manual Triggers (panic · fire · custom) → POST /manual

YOLOv8n Detection Engine
OpenCV frame decode · 0.7 conf threshold · person detection · restricted zone check · yolov8n.pt
Stages: Frame Decode (OpenCV) → YOLOv8n Inference → Zone Check + Adapter → Event → Pipeline

Pipeline Service: Event Stream Processor
120s sliding window · 30s duplicate cooldown · multi-source signal fusion · rule-based correlation
Incident types: Intrusion · Unauthorized · After-Hours · Loitering · Tailgating · Panic Button · Fire

OpenAI GPT-4o Advisory Engine
incident analysis · flag classification · recommended actions · dispatch unit · expected response time
Outputs: Flag (Green / Yellow / Red) · Response Actions + Plan · Dispatch Unit + ETA

FastAPI Backend: Port 8000 (Uvicorn), exposed as a REST API
in-memory store · 10 officers · incident / dispatch / report endpoints · CORS · demo reset
Modules: Incident Router · Officer Management · Dispatch Engine · Field Reports

Frontends (HTTP poll)
Supervisor Dashboard: React + Vite · Port 5173
Ground Officer App: React + Vite · Port 5174 · mobile
Demo Trigger App: React + Vite · Port 5175 · scenarios

< 2s

End-to-end pipeline — from raw CCTV frame to YOLOv8n detection, rule correlation, and GPT-4o advisory with officer recommendation

0.7

YOLOv8n confidence threshold for real-time person detection with per-camera restricted zone polygon checks

7+

Incident types detected by the rule-based engine — intrusion, loitering, tailgating, panic, fire, unauthorized access, after-hours presence

120s

Sliding event window for multi-source signal fusion — correlating CCTV, access logs, and manual triggers with 30s duplicate suppression

FastAPI · Uvicorn · React · Vite · YOLOv8n · OpenCV · NumPy · OpenAI API · Python · Pydantic
AIRES APPLIED TECH · FRONTEND ENGINEER | 93.2% ACCURACY | FULL STACK · DOCKERISED

MakanMap

An industry collaboration with Aires Applied Technology, a Singapore-based deep-tech startup — built to give venues and users a reliable way to plan around crowd density before it happens.

The forecasting engine is a Gradient Boosting Regressor trained to 93.2% accuracy on contextual signals: temperature, humidity, weather condition, time-of-day, public holiday flags, and historical location frequency. Data flows through an Apache Airflow pipeline — two sequential DAG tasks handle raw event cleaning and feature engineering, writing directly into a Supabase PostgreSQL feature store before predictions are served via FastAPI.

The dashboard goes beyond a chart — a What-If scenario analysis module lets users override input conditions and compare alternate crowd outcomes side-by-side against the baseline. The entire stack runs across 4 Docker services with GitHub Actions CI/CD, making it deployable and reproducible out of the box.
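To make the What-If idea concrete, here is a small Python sketch that forecasts six 30-minute bins, labels each one Low, Medium, or High, and compares a baseline against an overridden scenario. The `toy_model` function is a hypothetical stand-in for the trained regressor, and the score cut-offs and condition names are illustrative assumptions.

```python
from datetime import datetime, timedelta

BIN_MINUTES = 30
HORIZON_BINS = 6  # 3 hours of 30-minute bins


def classify(score):
    """Map a 0-100 crowd score to a level; the cut-offs are illustrative only."""
    if score < 40:
        return "Low"
    if score < 70:
        return "Medium"
    return "High"


def toy_model(hour, temperature_c, is_raining):
    """Hypothetical stand-in for the trained regressor, not the real model."""
    lunch_peak = 40 if 11 <= hour <= 13 else 0
    rain_penalty = 15 if is_raining else 0
    score = 30 + lunch_peak + 0.5 * (temperature_c - 28) - rain_penalty
    return max(0.0, min(100.0, score))


def forecast(start, conditions):
    """Return (time label, score, level) for each bin in the horizon."""
    rows = []
    for i in range(HORIZON_BINS):
        t = start + timedelta(minutes=BIN_MINUTES * i)
        score = toy_model(t.hour, **conditions)
        rows.append((t.strftime("%H:%M"), score, classify(score)))
    return rows


def what_if(start, baseline, overrides):
    """Pair baseline and scenario levels bin by bin for side-by-side display."""
    scenario = {**baseline, **overrides}
    return [
        (b[0], b[2], s[2])
        for b, s in zip(forecast(start, baseline), forecast(start, scenario))
    ]
```

Calling `what_if(start, baseline, {"is_raining": True})` yields one row per bin with the baseline level next to the scenario level, which is the shape a side-by-side chart would consume.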

System Architecture

INPUT DATA
Venue Check-ins: visitor logs · timestamps
Historical Crowd Records: 30-min bin averages · location IDs
Calendar & Weather: public holidays · time signals

Apache Airflow DAG: Scheduled Pipeline
crowd_level_predictor_pipeline · Python operators · 2 sequential tasks · triggers on new raw data
Task 1: clean_data.py (validation + normalisation)
Task 2: build_features.py → write to features table

Supabase PostgreSQL: Persistent Feature Store (receives written features)
hosted cloud DB · raw_data table + features table · real-time feature reads for inference
raw_data table: ingested records
features table: engineered signals for GBR

Gradient Boosting Regressor (reads features)
gbr_model.pkl · 93.2% accuracy · 30-min bin prediction · score post-processing
MLflow Tracking: Port 5000 · experiment runs
Post-processing: Score Scaler (0–100 normalisation) → Low / Medium / High Classification

FastAPI Backend: Port 8000 (Uvicorn), returns a JSON response
POST /predict · GET /forecast · CORS · Docker service · loads gbr_model.pkl on startup
Modules: Prediction Router · 3-hr Forecast Horizon · What-If Scenario Engine · Score Normalisation

Real-time Monitor: React + Vite · Port 5173 · Recharts
Forecast Chart: 6 × 30-min bins · up to 3 hrs ahead
What-If Analysis: side-by-side scenario comparison

93.2%

Gradient Boosting Regressor accuracy across 30-minute crowd level bins — trained on weather, time, location frequency, and public holiday signals

3 hrs

Max forecast horizon in the dashboard — up to 6 × 30-minute bins with Low, Medium, and High crowd classification

4

Docker services via Compose — FastAPI backend, React dashboard, MLflow experiment tracker, and Airflow pipeline scheduler

2

Airflow DAG tasks on schedule — raw data cleaning then feature engineering writing directly to Supabase's features table

Python · Scikit-learn · FastAPI · Uvicorn · React · Vite · Recharts · Airflow · MLflow · Supabase · Docker · GitHub Actions
SIM DAC · DATA ANALYST | ~1M RECORDS · 50 US STATES | ML + GPT-4O PIPELINE

Real Estate Valuation AI

A property valuation tool built for the SIM Data Analytics Club, trained on ~1M Zillow residential records spanning all 50 US states. The system takes a listing URL and outputs a valuation verdict.

The ML pipeline tackles a tricky data problem — city and state are high-cardinality categoricals that break naive encoding. The solution uses K-Fold cross-validated Target Encoding, replacing each label with a target-mean value to preserve strong location price signal without data leakage. XGBoost Regressor and Gradient Boosting Regressor were benchmarked via GridSearchCV, with the best model serialised as a full Scikit-learn Pipeline.
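The leakage-avoidance idea behind K-Fold target encoding can be shown in a few lines of library-free Python. This is a sketch under stated assumptions: folds are assigned by row index modulo `n_splits` (so rows should be shuffled beforehand), and a category unseen in the training folds falls back to the training-fold mean. The function name and fallback rule are illustrative, not the project's implementation.

```python
from collections import defaultdict


def kfold_target_encode(categories, targets, n_splits=5):
    """Out-of-fold target-mean encoding.

    Each row is encoded with the mean target of its category computed from the
    other folds only, so a row's own target never feeds its own feature.
    """
    n = len(categories)
    encoded = [0.0] * n
    for fold in range(n_splits):
        sums, counts = defaultdict(float), defaultdict(int)
        total, seen = 0.0, 0
        # Accumulate per-category statistics from the training folds.
        for i in range(n):
            if i % n_splits != fold:
                sums[categories[i]] += targets[i]
                counts[categories[i]] += 1
                total += targets[i]
                seen += 1
        fallback = total / seen if seen else 0.0
        # Encode the held-out fold using those statistics only.
        for i in range(fold, n, n_splits):
            c = categories[i]
            encoded[i] = sums[c] / counts[c] if counts[c] else fallback
    return encoded
```

At inference time a listing would instead be encoded with category means fitted on the full training set, since there is no target to leak.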

The output isn't just a number. A Bullet Chart compares predicted vs actual price to surface the over/undervaluation percentage, while a GPT-4o advisory layer generates conversational valuation context and 4% annual forward price projections — all accessible through a Gradio web interface.
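The two numbers behind that output can be sketched in Python as follows. The sign convention (positive means priced above the model's estimate), the use of the predicted price as the denominator, and compound annual growth are assumptions made for illustration; the function names are hypothetical.

```python
def valuation_delta_pct(actual, predicted):
    """Percentage by which the listed price differs from the model's estimate.

    Positive values suggest overvaluation, negative values undervaluation.
    """
    return (actual - predicted) * 100 / predicted


def project_price(price, years, annual_rate=0.04):
    """Compound a price forward at a fixed annual growth rate (4% by default)."""
    return price * (1 + annual_rate) ** years
```

For instance, a listing at $550,000 against a predicted $500,000 would read as about 10% overvalued under this convention.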

System Architecture

INPUT
Zillow Property URL: user-provided listing link
usa_real_estate.csv: ~1M records · training data · 50 US states

Web Scraper: Feature Extraction
Python · extracts the actual price plus 6 features per listing: City, State, House Size, Lot Size, Beds, Baths
Price (Actual) · City · State · House Size · Lot Size · Beds · Baths

Feature Engineering Pipeline (encode + scale)
K-Fold Target Encoding for City/State · StandardScaler · prevents data leakage · fitted on training data only
Target Encoder (City) · Target Encoder (State) · StandardScaler → Feature Vector

XGBoost Pipeline: best_model_pipeline.pkl (predict)
GridSearchCV hyperparameter tuning · XGBoost Regressor vs GBR benchmark · Pipeline: Scaler + Model → P_predicted
XGBoost Regressor (primary) · Gradient Boosting Regressor (benchmark) · P_predicted ($)

Valuation Engine + Bullet Chart
P_actual vs P_predicted · over/under % · 4%/yr projection

GenAI Advisory: Gradio Interface
OpenAI API · natural language valuation insights

~1M

Zillow residential records across all 50 US states — covering price, city, state, lot size, house size, beds, and baths for training

15%

Property search time reduced via AI recommendation engine — surfacing over/undervalued listings ranked by valuation delta

2

Ensemble models benchmarked with GridSearchCV hyperparameter tuning — XGBoost Regressor vs Gradient Boosting Regressor

4%

Annual growth rate applied for forward price projections from 2025 — embedded in the GenAI valuation advisory output

Python · XGBoost · Scikit-learn · Pandas · OpenAI API · Gradio · Matplotlib · Plotly · Joblib · Target Encoding
kiefer-sulijanto/real-estate-price-prediction ↗
INTERNSHIP · ASTRINDO SENAYASA | ENTERPRISE INTERNAL | NLU · PHP + OPENAI API

Astrindo Chatbot

Built during my internship at Astrindo Senayasa (Jakarta) — an internal enterprise chatbot that lets non-technical employees query live business data across Marketing, HR, Finance, Purchasing, and Service departments using plain language.

The system runs a two-stage NLU pipeline: GPT-4o-mini classifies the user's intent and extracts structured entities (year, month, specialist name, city) at temperature=0, returning strict JSON. The PHP backend then dispatches to a domain-specific feature handler that executes deterministic MySQL queries — the LLM never touches the database, making hallucinated figures structurally impossible.
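The separation between the language model and the database can be sketched in Python, with the caveat that the real backend is PHP and MySQL. The example below uses the standard-library `sqlite3` module as a stand-in, takes the NLU result as a JSON string, and routes it to a handler that runs a parameterised query. The `marketing_costs` table, its columns, and the reply wording are hypothetical.

```python
import json
import sqlite3


def handle_marketing_total_cost(conn, entities):
    """Deterministic handler: the figure comes from SQL, never from the model."""
    row = conn.execute(
        "SELECT COALESCE(SUM(cost), 0) FROM marketing_costs "
        "WHERE year = ? AND month = ?",
        (entities["year"], entities["month"]),
    ).fetchone()
    return f"Total marketing cost: Rp {row[0]:,.0f}"


# Intent name taken from the write-up; a real table would list every intent.
HANDLERS = {"marketing_total_cost": handle_marketing_total_cost}


def dispatch(conn, nlu_json):
    """Route a strict-JSON NLU result to its handler, with a plain fallback."""
    parsed = json.loads(nlu_json)
    handler = HANDLERS.get(parsed.get("intent"))
    if handler is None:
        return "Sorry, I can't answer that yet."
    return handler(conn, parsed.get("entities", {}))
```

Because the model only supplies an intent name and entity values that are bound as query parameters, it has no path to inject SQL or to invent the numbers in the reply.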

The NLU call also generates a ChatGPT-style sidebar title per session, with a hard-banned list of generic labels and a regex-based Indonesian fallback to keep titles specific and natural across both English and Indonesian input.

System Architecture

USER INPUT
Plain Language Query (Indonesian or English)
"Berapa total biaya marketing bulan ini?" (What is the total marketing cost this month?) · "Who requested the most ATK in 2024?" · "Show me service summary"
Request: POST /chat · JSON {message}

NLU Engine: GPT-4o-mini (temperature = 0)
classifies intent · extracts entities (year, month, name, city) · generates chat title · returns strict JSON
Steps: Intent Classification · Entity Extraction (year, month, name…) · ChatGPT-style Title Generation
Output: { "intent": "...", "entities": {...}, "title": "..." }

PHP Dispatcher: chat.php
routes by intent field · loads features/<intent>.php · calls handleX($conn, $entities) · fallback via GPT-4o-mini
Intents: marketing_total_cost · marketing_specialist_cost · marketing_inventory · hr_top_requester_atk · hr_top_item_atk · finance_top_item · finance_top_sender · purchasing_top_requester · purchasing_total_request · purchasing_vendor_city · service_summary · smalltalk → fallback

MySQL: Astrindo Digital Approval Database (parameterised SQL query)
live internal ERP data · department tables · no LLM writes to DB · all figures come from deterministic SQL
Tables: marketing_activity_headers · transaksi_h_hrd (HR) · finance / purchasing tables · service_summary table

OUTPUT
Formatted Chat Response: HTML · structured data · Rp currency formatting
ChatGPT-style Sidebar Title: banned-list enforced · Indonesian regex fallback · max 48 chars

12

Intents across 5 business domains — Marketing, HR, Finance, Purchasing, and Service — each with a dedicated PHP feature handler

0

Hallucinated numbers — LLM returns only intent + entities; all figures come from deterministic MySQL queries against live business data

2

Languages supported — bilingual input (Indonesian and English) with intent detection stable across both via the same NLU prompt

5

Business departments integrated — Marketing, HR, Finance, Purchasing, Service — covering the full Digital Approval reporting scope

PHP 8.0 · OpenAI API · MySQL · JavaScript · Apache · GPT-4o-mini · NLU Pipeline · REST API
kiefer-sulijanto/astrindo-chatbot ↗
The Full Archive

Every project,
one level deeper.

Full source, architecture breakdowns, and technical write-ups for every shipped project.

SecureAdvisor

1 repo · Certis Group

MakanMap

1 repo · Aires

Real Estate Valuation AI

1 repo · SIM DAC

Astrindo Chatbot

1 repo · Astrindo Senayasa
