03 All Projects

The full catalog.

Flagship projects and industry collaborations — each broken out by architecture so the system tells the story instead of a pitch.

Flagship Projects
Industry collaborations & competitions, shipped

TEAM LEAD · NEURAL SYNDICATE | 1ST RUNNER-UP · NAISC 2026 | CERTIS GROUP | FULL STACK · PRODUCTION

SecureAdvisor — AI-Powered Security Incident Response

1st Runner-Up at NAISC 2026, the National AI Student Challenge by Certis Group, presented at Marina Bay Sands. As Team Lead of Team Neural Syndicate, I worked across the full stack: collecting and labelling training data, building the Command Centre Dashboard and Ground Officer App, and improving the FastAPI backend.

The detection layer runs two parallel models: YOLOv8n at 0.7 confidence handles person detection and per-camera polygon zone validation across live CCTV streams and video uploads, while a VideoMAE microservice — a separate FastAPI service with a fine-tuned HuggingFace model — classifies physical altercations from 16-frame clips at 0.75 confidence.
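The polygon zone validation step can be sketched in plain Python. This is a minimal illustration, not the project's code: it assumes the zone is a list of pixel coordinates, that the bottom-centre of the bounding box stands in for the person's position, and the names `point_in_polygon` and `person_in_zone` are hypothetical. Only the 0.7 confidence threshold comes from the description above.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: toggle on every edge a rightward horizontal ray crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # The edge straddles the horizontal line through the point.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def person_in_zone(
    box: Tuple[float, float, float, float],
    polygon: List[Point],
    confidence: float,
    min_confidence: float = 0.7,  # threshold quoted in the text
) -> bool:
    """box is (x1, y1, x2, y2) in pixels; the foot point is an assumption."""
    if confidence < min_confidence:
        return False
    x1, _, x2, y2 = box
    foot = ((x1 + x2) / 2.0, y2)
    return point_in_polygon(foot, polygon)
```

Using the foot point rather than the box centre is a common choice for floor-plane zones, since the centre of a tall box can fall outside a zone the person is standing in.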

A 120-second sliding event window fuses signals from all three sources (CCTV, access control, and manual triggers) into 9 incident types: intrusion, unauthorized access, loitering, tailgating, after-hours presence, physical altercation, emergency distress, fire alert, and unattended bag. Fused incidents are then escalated to GPT-4o for severity flags, structured response plans, and named dispatch units with ETA.
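A sliding window of this kind could look roughly like the sketch below. It assumes events arrive in timestamp order and that a duplicate means the same source, kind, and zone seen again within the suppression interval; the 30-second value is the one listed under Production Outcomes. `Event`, `EventWindow`, and the field names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since some epoch
    source: str       # e.g. "cctv", "access_control", "manual"
    kind: str         # e.g. "person_in_zone", "door_forced"
    zone: str


class EventWindow:
    """Keeps recent events and drops near-identical repeats."""

    def __init__(self, window_s: float = 120.0, dedupe_s: float = 30.0):
        self.window_s = window_s
        self.dedupe_s = dedupe_s
        self._events: Deque[Event] = deque()
        self._last_kept: Dict[Tuple[str, str, str], float] = {}

    def add(self, event: Event) -> bool:
        """Return True if the event was kept, False if suppressed."""
        self._evict(event.timestamp)
        key = (event.source, event.kind, event.zone)
        last = self._last_kept.get(key)
        if last is not None and event.timestamp - last < self.dedupe_s:
            return False
        self._last_kept[key] = event.timestamp
        self._events.append(event)
        return True

    def _evict(self, now: float) -> None:
        # Drop events that have slid out of the window.
        while self._events and now - self._events[0].timestamp > self.window_s:
            self._events.popleft()

    def active(self, zone: str) -> List[Event]:
        """Events still in the window for one zone, oldest first."""
        return [e for e in self._events if e.zone == zone]
```

A rule engine can then inspect `active(zone)` to see whether signals from more than one source line up before raising an incident.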

Supervisors manage response through a Command Centre Dashboard with a live location map across 3 floors and 14 named zones, tracking real-time officer positions and active incident priorities per location. Ground officers receive task assignments instantly, update status in the field, and submit incident reports on resolution. The end-to-end pipeline, from raw frame to dashboard alert, runs in under 2 seconds.

Three Architectural Commitments

Dual-model detection layer

YOLOv8n handles person detection and polygon zone validation. A separate VideoMAE microservice classifies fight clips at 0.75 confidence. The two models run independently, and VideoMAE's fight_detected events feed into the same pipeline as the YOLOv8n detections.

Strict LLM boundary

GPT-4o only generates the advisory text. All routing, deduplication, zone validation, and incident classification are deterministic Python — no LLM in the critical path.
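One way to express this boundary is to make classification a plain lookup and pass the advisory generator in as a callable, so the alert exists whether or not the LLM responds. The rule table, function names, and the choice to swallow advisory errors are all assumptions made for this sketch.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical rule table: (event kind, after_hours) -> incident type.
RULES: Dict[Tuple[str, bool], str] = {
    ("person_in_restricted_zone", False): "intrusion",
    ("person_in_restricted_zone", True): "after-hours presence",
    ("fight_detected", False): "physical altercation",
    ("fight_detected", True): "physical altercation",
}


def classify(kind: str, after_hours: bool) -> Optional[str]:
    """Deterministic classification: no model call involved."""
    return RULES.get((kind, after_hours))


def build_alert(
    kind: str,
    zone: str,
    after_hours: bool,
    advise: Callable[[str, str], str],
) -> Optional[dict]:
    """Build the alert first; the advisory text is attached only if available."""
    incident = classify(kind, after_hours)
    if incident is None:
        return None
    alert = {"incident_type": incident, "zone": zone, "advisory": None}
    try:
        # In production this callable would wrap the LLM request.
        alert["advisory"] = advise(incident, zone)
    except Exception:
        # Assumption: an advisory failure must not block the alert itself.
        pass
    return alert
```

Because `advise` is injected, the routing logic can be unit-tested with a stub and never depends on a network call.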

Location-aware dispatch

The Command Centre Dashboard shows a live floor-plan map across 3 floors and 14 named zones. Officer positions and active incident priorities are tracked per location — supervisors dispatch directly from the map.

Production Outcomes

  • < 2s end-to-end: from raw CCTV frame → YOLOv8n / VideoMAE detection → rule engine → GPT-4o advisory → Command Centre Dashboard alert.
  • 9 incident types classified deterministically: intrusion, unauthorized access, loitering, tailgating, after-hours presence, physical altercation, emergency distress, fire alert, and unattended bag.
  • 120s sliding multi-source fusion window with 30s duplicate suppression — prevents alert flooding across CCTV, access control, and manual trigger streams.
  • 1st Runner-Up at NAISC 2026: Team Neural Syndicate's result at the National AI Student Challenge by Certis Group, presented at Marina Bay Sands.

security-response-advisor | Full Stack · Python + React

3-app system — Command Centre Dashboard (location map · 3 floors · 14 zones), Ground Officer App, Demo Trigger. FastAPI backend, YOLOv8n + VideoMAE microservice, GPT-4o advisory.

AIRES APPLIED TECH · FRONTEND ENGINEER | 93.2% ACCURACY | FULL STACK · DOCKERISED

MakanMap — Real-Time Crowd Level Forecasting

Built for Aires Applied Technology — a real-time crowd level forecasting system for food court locations. Operators view predicted crowd density for any location at any future time and run live What-If scenario analysis.

The core model is a Gradient Boosting Regressor trained on ~50K rows of historical visitor count data per location, achieving an R² of 0.932 (93.2%) on the holdout set. The feature set includes hour-of-day, day-of-week, week-of-year, is_public_holiday, rolling 7-day average visitor count, location_type, and weather category.
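The calendar and rolling features listed above can be derived with the standard library alone. The sketch below is illustrative: the function name, the dictionary of daily totals, and the decision to average only the seven days before the target date (so the current day never leaks into its own feature) are assumptions. `location_type` and weather category are omitted because they come from external lookups.

```python
from datetime import date, datetime, timedelta
from typing import Dict, Set


def build_features(
    ts: datetime,
    daily_counts: Dict[date, int],
    holidays: Set[date],
) -> Dict[str, float]:
    """Derive calendar and rolling features for one prediction timestamp."""
    # Seven days strictly before the target date.
    previous_days = [ts.date() - timedelta(days=d) for d in range(1, 8)]
    known = [daily_counts[d] for d in previous_days if d in daily_counts]
    rolling_avg = sum(known) / len(known) if known else 0.0
    return {
        "hour_of_day": ts.hour,
        "day_of_week": ts.weekday(),           # Monday = 0
        "week_of_year": ts.isocalendar()[1],   # ISO week number
        "is_public_holiday": int(ts.date() in holidays),
        "rolling_7d_avg_visitors": rolling_avg,
    }
```

Falling back to `0.0` when no history exists is a placeholder; a real pipeline would more likely use a location-level mean.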

The system runs on two Apache Airflow DAGs: one daily retraining + validation pipeline and one hourly inference pipeline that writes predictions to Supabase for the React dashboard to consume in real time.

Three Architectural Commitments

In-memory What-If engine

Model is loaded once at FastAPI startup and lives in process memory. Each What-If query is a single inference call on a modified feature vector — no DB call, no retraining. Response time < 100ms.
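The What-If pattern reduces to holding one model object and re-running `predict` on a copy of the baseline features with a few values overridden. In this sketch `StubModel` is a stand-in so the example runs without a trained artifact, and `FEATURE_ORDER` is shortened to two hypothetical columns; the real service would load the trained regressor at startup instead.

```python
from typing import Dict, List


class StubModel:
    """Stand-in for the trained regressor (assumption for this sketch)."""

    def predict(self, rows: List[List[float]]) -> List[float]:
        return [50.0 + 10.0 * r[0] + 200.0 * r[1] for r in rows]


# Column order the model expects; shortened for illustration.
FEATURE_ORDER = ["hour_of_day", "is_public_holiday"]


class WhatIfEngine:
    def __init__(self, model) -> None:
        # Loaded once and kept in process memory for every request.
        self._model = model

    def predict(self, features: Dict[str, float]) -> float:
        row = [features[name] for name in FEATURE_ORDER]
        return float(self._model.predict([row])[0])

    def what_if(self, baseline: Dict[str, float], **overrides: float) -> float:
        """Single inference call on a modified copy of the baseline."""
        unknown = set(overrides) - set(FEATURE_ORDER)
        if unknown:
            raise KeyError(f"unknown features: {sorted(unknown)}")
        scenario = {**baseline, **overrides}  # baseline is left untouched
        return self.predict(scenario)
```

For example, `engine.what_if(baseline, is_public_holiday=1)` answers "what if this were a holiday?" without touching storage or retraining.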

Two-DAG Airflow pipeline

DAG 1 (daily): pull sensor data → clean → feature engineer → retrain → validate → write artifact to S3. DAG 2 (hourly): fetch latest data → inference → write predictions to Supabase.

Tabular-first model selection

Gradient Boosting was chosen over XGBoost and LightGBM after cross-validation, based on holdout R². Neural nets were excluded because the ~50K-row dataset is too small for deep learning to generalise reliably.

Production Outcomes

  • R² of 0.932 (93.2%) on the holdout set across all test locations; Gradient Boosting outperformed XGBoost and LightGBM after cross-validation.
  • Rolling 7-day average visitor count was the highest-importance feature, capturing recent location-specific trends better than calendar signals alone.
  • What-If scenario recompute runs in-memory in < 100ms, so operators can sweep across times, days, and holiday flags without waiting on a database query or retraining.
  • Deployed with Docker + GitHub Actions CI/CD; model artifacts versioned to S3 with each daily DAG run.

crowd-level-predictor | ML Backend · Python + FastAPI

Gradient Boosting model, Airflow orchestration, FastAPI prediction service, Supabase storage, MLflow tracking.

SIM DAC · DATA ANALYST | ~1M RECORDS · 50 US STATES | ML + GPT-4O PIPELINE

Real Estate Valuation AI — USA Property Predictor

Built for the SIM Data Analytics Club — an end-to-end ML system for USA residential property price prediction, trained on ~1M Zillow records across all 50 US states, with a GPT-4o advisory layer that turns a model output into a natural language valuation report.

XGBoost handles the regression task: it natively manages the high null rate in Zillow data (older listings frequently omit features), trains fast on 1M rows, and produces feature importance out of the box. It outperformed Random Forest and LightGBM on RMSE after cross-validation.

Location encoding was the central challenge: 50 states × hundreds of cities × thousands of zip codes creates extreme cardinality. Target Encoding captures location price signal in a single numeric feature per column, with cross-validation folds to prevent target leakage.

Three Architectural Commitments

Target Encoding for location

One-hot on 50 states × cities × zip codes would produce tens of thousands of sparse columns. Target Encoding collapses each to one numeric feature (mean price per category) with CV folds to prevent leakage.
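Out-of-fold target encoding can be sketched without any library. Each row is encoded with the mean target of its category computed from the other folds only, which is what keeps a row's own price out of its feature. The fold assignment by index modulo (which assumes rows are already shuffled) and the global-mean fallback for unseen categories are simplifications chosen for this example.

```python
from collections import defaultdict
from typing import List, Sequence


def out_of_fold_target_encode(
    categories: Sequence[str],
    targets: Sequence[float],
    n_folds: int = 5,
) -> List[float]:
    """Encode each row with its category's mean target from the other folds."""
    n = len(categories)
    encoded = [0.0] * n
    for fold in range(n_folds):
        sums = defaultdict(float)
        counts = defaultdict(int)
        total, seen = 0.0, 0
        # Fit category means on every row outside the current fold.
        for i in range(n):
            if i % n_folds != fold:
                sums[categories[i]] += targets[i]
                counts[categories[i]] += 1
                total += targets[i]
                seen += 1
        fallback = total / seen if seen else 0.0
        # Apply those means to the held-out rows only.
        for i in range(n):
            if i % n_folds == fold:
                c = categories[i]
                encoded[i] = sums[c] / counts[c] if counts[c] else fallback
    return encoded
```

Running this once per location column (state, city, zip) yields the three numeric features described above. Production encoders usually add smoothing toward the global mean for rare categories, which this sketch leaves out.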

XGBoost for tabular scale

Handles nulls natively — critical for Zillow data where older listings frequently omit features. Trains fast on 1M rows and provides feature importance rankings without post-hoc SHAP computation.

GPT-4o as decision layer

After the model outputs a price, GPT-4o receives: predicted price, listing price, and median zip-code price. It generates a 3-paragraph advisory: valuation verdict, key drivers, buyer/seller guidance.
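A grounded prompt of this shape might be assembled as below. The wording of the instructions and the function name are assumptions; only the three injected figures and the three-paragraph structure come from the description above.

```python
def build_advisory_prompt(
    predicted_price: float,
    listing_price: float,
    median_zip_price: float,
) -> str:
    """Inject the three reference figures so the reply is anchored to them."""
    return (
        "You are a property valuation advisor. Use only the figures below.\n"
        f"Predicted price: ${predicted_price:,.0f}\n"
        f"Listing price: ${listing_price:,.0f}\n"
        f"Median zip-code price: ${median_zip_price:,.0f}\n"
        "Write exactly three paragraphs: "
        "(1) valuation verdict, (2) key drivers, (3) buyer/seller guidance."
    )
```

Formatting the numbers in code, rather than asking the model to compute or recall them, keeps every figure in the report traceable to the regression output or the dataset.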

Production Outcomes

  • Trained on ~1M Zillow residential records spanning all 50 US states — XGBoost outperformed Random Forest and LightGBM on RMSE after cross-validation.
  • Target Encoding reduced location feature dimensionality from tens of thousands of one-hot columns to 3 numeric features (state, city, zip) while retaining the location price signal.
  • GPT-4o advisory prompt is grounded: predicted price, listing price, and median zip-code price are injected — the model cannot fabricate a valuation without a reference anchor.
  • Gradio interface exposes property input form, predicted price band, and GPT-4o advisory text — deployable as a standalone web app without any frontend framework.

real-estate-price-prediction | ML + GenAI · Python + Gradio

XGBoost regression, Target Encoding, GPT-4o advisory layer, ~1M Zillow records, Gradio web interface.

INTERNSHIP · ASTRINDO SENAYASA | ENTERPRISE INTERNAL | NLU · PHP + OPENAI API

Astrindo Digital Approval Chatbot

Built during my internship at Astrindo Senayasa (Jakarta, Apr–Jun 2025) — an internal enterprise chatbot that lets non-technical employees query live business data across Marketing, HR, Finance, Purchasing, and Service departments using plain language.

The system runs a two-stage NLU pipeline: GPT-4o-mini first classifies the user's intent and extracts structured entities (year, month, specialist name, city, brand) at temperature=0, returning strict JSON. The PHP backend then dispatches to a domain-specific feature handler that executes deterministic MySQL queries and formats the response.

The LLM also generates a ChatGPT-style sidebar title per conversation. A hard-banned list of generic titles ("General Chat", "Greeting", "Quick Question") forces the model to produce specific, intent-driven labels — with a regex-based Indonesian fallback if the output is invalid.
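The title guard can be illustrated in Python, although the production backend is PHP. The banned titles are the ones quoted above; the Indonesian keyword patterns, the fallback titles, and the default are hypothetical examples chosen for this sketch.

```python
import re
from typing import Optional

BANNED_TITLES = {"general chat", "greeting", "quick question"}

# Hypothetical keyword -> title rules for the regex fallback.
FALLBACK_PATTERNS = [
    (re.compile(r"\bpenjualan\b", re.IGNORECASE), "Laporan Penjualan"),
    (re.compile(r"\bcuti\b", re.IGNORECASE), "Pengajuan Cuti"),
]
DEFAULT_TITLE = "Percakapan Baru"


def choose_title(llm_title: Optional[str], user_message: str) -> str:
    """Accept the LLM title unless it is empty or banned, else fall back."""
    candidate = (llm_title or "").strip()
    if candidate and candidate.lower() not in BANNED_TITLES:
        return candidate
    for pattern, title in FALLBACK_PATTERNS:
        if pattern.search(user_message):
            return title
    return DEFAULT_TITLE
```

Checking the banned list in code as well as in the prompt means a generic title never reaches the sidebar even when the model ignores the instruction.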

Three Architectural Commitments

Two-stage NLU pipeline

GPT-4o-mini classifies intent + extracts entities at temperature=0 and returns strict JSON. The PHP dispatcher routes to the correct feature handler based on the intent field — never on free text.

Strict LLM boundary

The LLM never touches the database. It returns { intent, title, entities }. All SQL queries, aggregations, and number formatting are in deterministic PHP handlers — hallucinated data is structurally impossible.
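The dispatch side of this boundary is sketched below in Python for illustration; the real handlers are PHP. The intent names, table and column names, and the choice to return a parameterised query instead of executing it are all assumptions. The point is that the LLM output is treated as untrusted JSON, validated against an allow-list, and only ever selects which deterministic handler runs.

```python
import json
from typing import Callable, Dict, Tuple

Query = Tuple[str, tuple]
FALLBACK = {"intent": "smalltalk", "title": "", "entities": {}}


def handle_sales_summary(entities: dict) -> Query:
    # Hypothetical table and columns; values are bound as parameters.
    sql = "SELECT SUM(amount) FROM sales WHERE year = %s AND month = %s"
    return sql, (entities.get("year"), entities.get("month"))


def handle_smalltalk(entities: dict) -> Query:
    return "", ()  # no database access for general conversation


HANDLERS: Dict[str, Callable[[dict], Query]] = {
    "sales_summary": handle_sales_summary,
    "smalltalk": handle_smalltalk,
}


def parse_nlu(raw: str) -> dict:
    """Validate the LLM's JSON; anything malformed becomes smalltalk."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return dict(FALLBACK)
    if not isinstance(data, dict):
        return dict(FALLBACK)
    intent = data.get("intent")
    if not isinstance(intent, str) or intent not in HANDLERS:
        return dict(FALLBACK)
    entities = data.get("entities")
    return {
        "intent": intent,
        "title": str(data.get("title", "")),
        "entities": entities if isinstance(entities, dict) else {},
    }


def dispatch(raw: str) -> Query:
    """Route on the validated intent field only, never on free text."""
    nlu = parse_nlu(raw)
    return HANDLERS[nlu["intent"]](nlu["entities"])
```

Since the handler builds the SQL and the entities are passed only as bound parameters, nothing the model writes can become query text.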

Domain-scoped intent list

12 intents across 5 business domains, explicitly enumerated in the NLU prompt. Anything outside the domain is routed to smalltalk → GPT-4o-mini fallback chat, keeping business data queries separate from general conversation.

Production Outcomes

  • 12 intents across 5 departments — Marketing, HR, Finance, Purchasing, Service — each with a dedicated PHP feature handler and parameterised MySQL queries.
  • Zero hallucinated numbers: the LLM returns only intent + entities; all figures come from live MySQL queries against Astrindo's Digital Approval database.
  • ChatGPT-style conversation titles generated per session by the NLU call, with a hard-banned generic-title list and regex-based Indonesian language fallback.
  • Deployed internally on Apache/XAMPP during internship. Supports bilingual input (Indonesian and English) with intent detection stable across both.

astrindo-chatbot | Enterprise Chatbot · PHP + OpenAI API

Two-stage NLU pipeline, 12 domain intents, MySQL feature handlers, ChatGPT-style title generation, bilingual support.

More on GitHub.

Tooling, prototypes, and in-progress work live at one address — the curated story is above, the full archive is a click away.

github.com/kiefer-sulijanto ↗