What a Decade of ML Infrastructure Taught Me About LLMs
After close to a decade working on ML infrastructure, including GPU clusters, autoscaling pipelines, and model serving systems, I found the transition into LLM-based production systems to be less of a clean break than the hype suggests. The problems do not change so much as evolve, and they get harder in specific ways. This post works through the areas where classical ML intuitions transfer directly into LLM operations, where they break down and need updating, and where the failure surfaces are genuinely new. It covers latency, reproducibility, data lineage, cost modeling, observability, and the unique challenges of agent systems, and is written for engineers who have operated traditional ML infrastructure and want an honest map of what carries over.