1. 01 Latest 14 min read

    The ML and Infrastructure Architecture Behind striff.io

    A walkthrough of the async Kafka-staged pipeline, Triton-based inference serving, and degradation hierarchy that powers striff.io’s architectural review system. Covers why the pipeline moved from synchronous to event-driven, how three independent Kafka worker tiers decouple graph construction, GNN scoring, and LLM annotation, the distributed systems problems that Triton separation introduces, and the three-tier degradation strategy that ensures every failure mode still produces a useful review.

    striff-gnn striff-lib clarpse mlops-blueprint
    Read article ↗
  2. 02 15 min read

    Detecting Architectural Anomalies in Code with Graph Neural Networks

    How striff.io uses a neurosymbolic pipeline with typed dependency graphs, Chidamber-Kemerer features, and a distilled GCN to flag the parts of a pull request that actually carry architectural risk. The post covers the graph construction pipeline, the 404-dimensional feature vector design, why spectral GCN was chosen over attention-based architectures, and how symbolic facts are fused with learned anomaly scores to produce grounded LLM review annotations rendered directly on architecture diagrams.

    striff-gnn striff-lib clarpse mlops-blueprint
    Read article ↗
  3. 03 19 min read

    What a Decade of ML Infrastructure Taught Me About LLMs

    After close to a decade working on ML infrastructure, including GPU clusters, autoscaling pipelines, and model serving systems, the transition into LLM-based production systems turned out to be less of a clean break than the hype suggests. The problems do not change so much as evolve, and they get harder in specific ways. This post works through the areas where classical ML intuitions transfer directly into LLM operations, where they break down and need updating, and where the failure surfaces are genuinely new. Covering latency, reproducibility, data lineage, cost modeling, observability, and the unique challenges of agent systems, written for engineers who have operated traditional ML infrastructure and want an honest map of what carries over.

    vllm-mlops mlops-blueprint
    Read article ↗
  4. 04 9 min read

    Self-Hosting LLMs in Production: The vLLM + KubeAI Stack

    Deploying a large language model is not the hard part — deploying one that is safe to operate, cost-effective to scale, and straightforward to reason about under load is where most teams run into trouble. This post walks through an architecture developed at HADI Technology for running self-hosted LLM inference in production, using vLLM as the inference engine and KubeAI for model lifecycle management. Rather than a step-by-step tutorial, it explains the tradeoffs that led to this architecture and where it fits compared to alternatives like managed API endpoints or simpler single-instance deployments. The reference implementation is open-source and available on GitHub.

    vllm-mlops
    Read article ↗
  5. 05 10 min read

    Designing a Production MLOps Pipeline: The Decisions That Actually Determine Reliability

    Teams that invest heavily in model development often ship the surrounding infrastructure as an afterthought — and pay for it later in operational failures that are difficult to diagnose and expensive to fix. This post distills a reference MLOps architecture built from repeated production engagements, using a GitHub pull request categorization pipeline with a PyTorch autoencoder as the concrete example. It covers the decisions that determine whether a production ML system holds up over time: anomaly detection before classification, experiment tracking, model registry integration, containerized training, and CI/CD for model deployment. The full implementation is available on GitHub as a reusable blueprint.

    mlops-blueprint
    Read article ↗
  6. 06 1 min read

    Deploy A Production-Ready E-Commerce Solution on AWS with CloudFormation

    Shopify works until it doesn’t — vendor lock-in and transaction fees compound at scale, and the platform gives you limited control over infrastructure when it matters. This post walks through a CloudFormation-based reference architecture for running PrestaShop on AWS with the operational properties of a production system: auto-scaling application tier, managed RDS database, ElastiCache for session and object caching, and a CDN layer for static assets. Every resource is defined as code, so the entire stack can be reproduced across environments with a single command. The architecture was developed following a real client engagement where the cost and control tradeoffs of hosted e-commerce platforms became untenable at scale.

    Read article ↗
  7. 07 11 min read

    3 Common Misunderstandings of Inter-Service Communication in MicroServices

    REST and message queues are the two dominant approaches to inter-service communication in distributed systems, and teams frequently choose between them based on assumptions that do not hold under scrutiny. Synchronous HTTP calls are not always simpler or more reliable than async messaging, and message queues are not always the right choice when decoupling is the goal. This post examines three specific misconceptions that lead teams to make this decision poorly — around coupling, reliability, and operational complexity — and replaces them with a more grounded analysis of what each approach actually trades off. The goal is not to recommend one over the other but to help teams make the call with clear eyes.

    Read article ↗
  8. 08 8 min read

    Why Object Oriented Code Accelerates Microservices Adoption

    Migrating a monolith to microservices is a difficult undertaking regardless of technical quality, but the difficulty scales dramatically with how coupled and procedural the source code is. When a codebase lacks clear object boundaries, the decomposition process becomes a guessing game about which pieces can be extracted without breaking everything else. This post demonstrates how four core OOP principles — single responsibility, encapsulation, dependency inversion, and composition — directly reduce the mechanical effort of splitting a legacy system into services. The argument is not that OOP is required for microservices, but that investing in it before a migration begins pays back measurably during the decomposition.

    Read article ↗
  9. 09 10 min read

    4 Elements of A Great Serverless Application Deployment Strategy

    Serverless apps depend on many managed services — storage, caches, load balancers, execution environments — which makes deployment automation non-trivial compared to a single application binary. Without structure, provisioning infrastructure and deploying code across dev, staging, and production environments becomes a manual, error-prone process. This post covers four practices for keeping that process automated and low-risk: separating environments properly, using infrastructure as code to provision resources, packaging application code independently of infrastructure, and automating deployment pipelines end to end. If any part of your release still requires you to look at the cloud console, this post is for you.

    Read article ↗
  10. 10 16 min read

    Dissecting GitHub Code Reviews: A Text Classification Experiment

    Code review comments on GitHub contain a wealth of signal about what engineers care about — naming, logic, performance, test coverage — but that signal is buried in unstructured free text. This post builds an SVM classifier to categorize over 30,000 GitHub pull request review comments by the main technical topic each addresses. The dataset, feature engineering approach, and model evaluation are walked through in a Jupyter notebook available on GitHub. The results reveal which topics dominate code review discussions and how that distribution shifts across different types of repositories.

    As part of the code review process on GitHub, developers can leave comments on portions of the unified diff of a GitHub pull request. These comments are extremely valuable in facilitating technical discussion amongst developers, and in allowing developers to get feedback on their code submissions.

    But what do code reviewers usually discuss in these comments?

    In an effort to better understand code reviewing discussions, we’re going to create an SVM classifier to classify over 30,000 GitHub review comments based on the main code-related topic addressed by each comment (e.g. naming, readability, etc.).

    Grab the Jupyter Notebook for this experiment on GitHub.

    [Image: a sample GitHub pull request review comment]

    Review Comment Classifications

    The list of classifications we’re going to incorporate into our classifier is summarized in the table below. It was developed from a manual survey I performed of approximately 2,000 GitHub review comments drawn from randomly selected, highly forked Java repositories.

    The selected categories reflect the most frequently occurring topics in the surveyed review comments. The majority of the categories relate to code-level concepts (e.g. variable naming, exception handling); review comments that did not naturally fall into any existing category, or that were unrelated to the overall goal of code reviewing, were placed in the “other” category.

    Where a review comment discussed more than one subject, I classified it according to the topic it spent the most words discussing.

    | Category | Label | Further Explanation | Sample Comment |
    | --- | --- | --- | --- |
    | Readability | 1 | Comments related to readability, style, and general project conventions. | “This code looks very convoluted to me” |
    | Naming | 2 | | “I think foo would be a more appropriate name” |
    | Documentation | 3 | Comments related to licenses, package info, module documentation, commenting. | “Please add a comment here explaining this logic” |
    | Error/Resource Handling | 4 | Comments related to exception/resource handling, program failure, and termination analysis. | “Forgot to catch a possible exception here” |
    | Control Structures/Program Flow | 5 | Comments related to usage of loops, if-statements, and placement of individual lines of code. | “This if-statement should be moved after the while loop” |
    | Visibility/Access | 6 | Comments related to access levels for classes, fields, methods, and local variables. | “Make this final” |
    | Efficiency/Optimization | 7 | | “Many unnecessary calls to foo() here” |
    | Code Organization/Refactoring | 8 | Comments related to extracting code from methods and classes, and moving large chunks of code around. | “Please extract this logic into a separate method” |
    | Concurrency | 9 | Comments related to threads, synchronization, parallelism. | “This class does not look thread safe” |
    | High Level Method Semantics & Design | 10 | Comments relating to method design and semantics. | “This method should return a String” |
    | High Level Class Semantics & Design | 11 | Comments relating to class design and semantics. | “This should extend Foo” |
    | Testing | 12 | | “Is there a test for this?” |
    | Other | 13 | Comments not relating to categories 1–12. | “Looks good”, “done”, “thanks” |

    Loading The Data Set

    Now we’ll discuss our SVM text classifier implementation. This experiment represents a typical supervised learning classification exercise.

    We’ll start by loading our training data: two files representing 2,000 manually labeled comment–classification pairs. The first file contains one review comment per line, while the second contains the manually determined classification for the comment on the corresponding line.

    # Load the training comments and their labels, stripping trailing
    # newlines so the values compare cleanly later on.
    with open('review_comments.txt') as f:
        review_comments = [line.strip() for line in f]

    with open('review_comments_labels.txt') as g:
        classifications = [line.strip() for line in g]
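    With the data loaded, the next step is vectorizing the comments and training the SVM. As a hedged sketch of how such a pipeline can be assembled (assuming scikit-learn with TF-IDF features, which the original notebook may implement differently), using a tiny inline sample so it runs standalone:

```python
# Illustrative sketch, not the notebook's exact code: TF-IDF features
# feeding a linear SVM, trained on toy stand-ins for the 2,000 labeled
# comment-classification pairs.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

review_comments = [
    "I think foo would be a more appropriate name",
    "Please add a comment here explaining this logic",
    "Forgot to catch a possible exception here",
    "is there a test for this?",
]
classifications = [2, 3, 4, 12]  # Naming, Documentation, Error Handling, Testing

# TfidfVectorizer turns each comment into a sparse term-weight vector;
# LinearSVC fits one-vs-rest linear SVMs over those vectors.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(review_comments, classifications)

predicted = model.predict(["could we add a unit test covering this?"])
print(predicted[0])
```

    On the real 2,000-pair training set the same pipeline applies unchanged; only the data loading differs.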
        
    
    Read article ↗
  11. 11 4 min read

    API Documentation Using Tables

    Swagger UI communicates endpoint details well but fails at conveying the shape of a complex API at a glance. When an API exposes dozens of resources and hundreds of operations, developers need a high-level map before they can navigate the detail. This post proposes a table-based documentation format that presents resources, operations, and their relationships in a compact, scannable structure. The approach is complementary to existing spec-driven tooling and can be generated directly from an OpenAPI definition — a live demo built against the GitHub API is included.

    Read article ↗
  12. 12 7 min read

    The Roots of Object Oriented Programming

    Most OOP languages claim the label but miss what Alan Kay actually meant when he coined the term. Polymorphism, encapsulation, and inheritance are frequently cited as its pillars — but these exist in functional languages too. Kay’s actual vision was rooted in biology: autonomous objects communicating exclusively through message passing, with no direct access to each other’s internal state. This post traces that original conception through Kay’s early work and asks why almost no modern software actually practices it, and what we lose as a result.

    Read article ↗
  13. 13 3 min read

    Monitor Disk Usage Levels on Slack

    Disk space exhaustion is a quiet failure mode — systems degrade gradually until they stop working entirely, often at the worst possible moment. This post shares a bash script that monitors local disk storage levels and reports them to a Slack channel at a configurable interval, color-coded by usage severity. The script uses standard Unix tooling with no additional dependencies and can be dropped onto any instance in minutes. It is designed to be deployed across multiple machines simultaneously and supports configurable alert thresholds so teams can act before things become critical.

    Integrations are what take Slack from a normal online instant messaging and collaboration system to a solution that lets you centralize all your notifications, from sales to tech support to social media, into one searchable place where your team can discuss and take action on each. In this article, I’ll share a simple bash script that reports local disk storage levels to Slack at a regular time interval. It is easily deployable to multiple instances, highly configurable, and can help teams take proactive measures in maintaining the operational well-being of their systems.

    Download the script on GitHub

    [Image: sample disk usage report posted to Slack]

    The script is available on GitHub and can be dropped anywhere on the instance you want to monitor. At a specified interval, it posts disk storage information to Slack, as illustrated above. The drive information is retrieved using the df -h command on Unix systems, and the listed drives are color-coded based on how much storage capacity they have left. Two quick steps are required to get the integration up and running.
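    The original script is bash, but the reporting logic it implements is simple enough to sketch in a few lines. Purely as an illustration (this is not the actual script; the webhook URL is a placeholder, and the helper names are my own), here is the same idea in Python using only the standard library:

```python
# Hypothetical sketch of the script's reporting step: measure disk usage,
# pick a severity color, and build a Slack incoming-webhook payload.
import json
import shutil
import urllib.request

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # placeholder

def disk_payload(path="/", warn_pct=75, crit_pct=90):
    """Build a Slack webhook payload describing disk usage at `path`."""
    usage = shutil.disk_usage(path)
    used_pct = round(100 * usage.used / usage.total)
    # Mirror the script's color coding: green/yellow/red by severity.
    if used_pct >= crit_pct:
        color = "danger"
    elif used_pct >= warn_pct:
        color = "warning"
    else:
        color = "good"
    return {
        "attachments": [{
            "color": color,
            "text": f"{path}: {used_pct}% used "
                    f"({usage.free // 2**30} GiB free of {usage.total // 2**30} GiB)",
        }]
    }

def post_to_slack(payload):
    """POST the payload to the incoming webhook as JSON."""
    req = urllib.request.Request(
        SLACK_WEBHOOK_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

print(disk_payload()["attachments"][0]["color"])
```

    The bash version achieves the same with df -h and curl; either way, the webhook URL from step 1 below is the only required configuration.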

    1 - Create a Slack Webhook Notification:

    This will allow the script to post as a bot/integration instead of as yourself (which would require your personal credentials). First, ensure the Incoming WebHooks app is installed in your Slack workspace. Next, click Add Configuration and follow the instructions to configure the integration settings as desired. Copy the value of the Webhook URL field; it will be needed in the next step.

    [Image: Incoming WebHooks configuration in Slack]

    2 - Use a time-based job scheduler to run the script:

    The job scheduler will execute the script at a regular interval based on how often we want to view the reports. On a Linux environment, the crontab command, which is used to schedule commands to be executed periodically, is the perfect tool for the job. To create a new cron job, simply run crontab -e in a terminal. New jobs can be installed by adding a new entry to the file with the following syntax:

    # fields: minute (0-59)  hour (0-23)  day-of-month (1-31)  month (1-12)  day-of-week (0-6)
    1 2 3 4 5 /path/to/command arg1 arg2
    
    Read article ↗
  14. 14 5 min read

    Easily Migrate Postgres/MySQL Records to InfluxDB

    Relational databases were not designed for time series data — as write volumes grow, table cardinality climbs and query performance degrades in ways that are hard to tune around. Purpose-built time series databases like InfluxDB handle this workload efficiently by design, with compression, downsampling, and retention policies built in from the start. This post explains when that tradeoff is worth making and walks through the practical steps of migrating existing Postgres or MySQL records into InfluxDB using Python. It covers schema mapping, batching strategies, and the key differences in querying that will affect any application sitting on top of the new store.

    Read article ↗
  15. 15 3 min read

    Clarpse - The Way Source Code Was Meant To Be Analyzed

    Clarpse is a multi-language source code analysis tool designed for extracting deep structural relationships between entities in a codebase — classes, methods, fields, imports, and the connections between them. It exposes these relationships through a clean, language-agnostic API that decouples downstream tooling from any particular compiler or parser. Features like jump-to-definition, find-usages, type inference, and documentation generation can be built on top of Clarpse without re-implementing the underlying language analysis for each supported language. The library currently supports Java and Go, with a design that makes adding additional language backends straightforward.

    Maintained by Hadi Technology Maven Central Java CI codecov License MIT PRs welcome
    Read article ↗

© 2026 Muntazir Fadhel. All rights reserved.