Binary Classification with ML.NET

In many real-world situations, we need to make decisions based on patterns in data. For instance, a bank may want to automatically decide if a customer is eligible for a loan. This decision depends on various factors such as the customer's age, income, and criminal history. Traditionally, this decision might be made manually by an agent reviewing each application. However, with the help of machine learning, we can build a model that learns from historical data and automates the decision-making process, with accuracy we can measure before relying on it.

In this example, we will use ML.NET, Microsoft's machine learning framework for .NET developers, to build a binary classification model. A binary classification task is a type of machine learning problem where the goal is to predict one of two possible outcomes (e.g., Yes or No, Eligible or Not Eligible).

Here, I’ll guide you through each step of the process: preparing the data, building and training the model, using it to make predictions, and evaluating it.

Step 1: Install ML.NET NuGet Package

Before we begin, ensure you have a .NET project set up (Console App is fine). Then, install the ML.NET NuGet package:

dotnet add package Microsoft.ML

This package provides all the tools you need to load data, transform it, train machine learning models, and make predictions.

Step 2: Prepare Data for Training

To train a machine learning model, we need data. In a real-world scenario, this would be a dataset of past loan applications with known outcomes. For demonstration purposes, we will generate synthetic data in code.

Define Data Structures

using Microsoft.ML.Data;

public class LoanApplication
{
    public float Age { get; set; }
    public float Income { get; set; }
    public bool HasCriminalRecord { get; set; }
}

public class LoanEligibility : LoanApplication
{
    public bool Label { get; set; } // true = Eligible, false = Not Eligible
}

public class Prediction
{
    [ColumnName("PredictedLabel")]
    public bool IsEligible { get; set; }
    public float Probability { get; set; }
    public float Score { get; set; }
}

Generate Training Data

public static List<LoanEligibility> GenerateSampleData()
{
    var data = new List<LoanEligibility>();
    var rnd = new Random();

    for (int i = 0; i < 1000; i++)
    {
        float age = rnd.Next(18, 70);
        float income = rnd.Next(20000, 150000);
        bool hasCriminalRecord = rnd.NextDouble() > 0.9;

        bool label = (age > 25 && income > 40000 && !hasCriminalRecord);

        data.Add(new LoanEligibility
        {
            Age = age,
            Income = income,
            HasCriminalRecord = hasCriminalRecord,
            Label = label
        });
    }

    return data;
}

We simulate a basic rule: an applicant is labeled eligible only if they are older than 25, earn more than $40,000, and have no criminal record.
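
Before training, it can help to check how balanced the labels are under this rule. The sketch below is a self-contained illustration rather than part of the walkthrough: LabelBalance and EstimatePositiveRate are hypothetical names, and the fixed seed is an assumption added so that runs are repeatable. It applies the same rule to random applicants and reports the share labeled eligible. Working through the three conditions by hand gives roughly 64%.

```csharp
using System;

// Hypothetical helper: estimates how often the synthetic rule
// labels an applicant as eligible.
public static class LabelBalance
{
    public static double EstimatePositiveRate(int sampleCount, int seed)
    {
        var rnd = new Random(seed); // fixed seed (an assumption) for repeatable runs
        int eligible = 0;

        for (int i = 0; i < sampleCount; i++)
        {
            // Same ranges and rule as GenerateSampleData.
            int age = rnd.Next(18, 70);
            int income = rnd.Next(20000, 150000);
            bool hasCriminalRecord = rnd.NextDouble() > 0.9;

            if (age > 25 && income > 40000 && !hasCriminalRecord)
            {
                eligible++;
            }
        }

        return (double)eligible / sampleCount;
    }
}
```

Calling LabelBalance.EstimatePositiveRate(100000, 42) returns the fraction of eligible applicants. With about two-thirds of the labels positive, accuracy alone can look flattering, because always answering "Eligible" would already be right about that often.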

Step 3: Build and Train the Machine Learning Model

1. Initialize the ML Context

The MLContext is the starting point for all ML.NET operations.

var context = new MLContext();

2. Load the Data

We load our in-memory list into an ML.NET data structure:

var data = GenerateSampleData();
var trainData = context.Data.LoadFromEnumerable(data);
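
It is worth checking which column types ML.NET inferred from the class, because later transforms such as Concatenate expect their input columns to share one type. The following is a minimal sketch: SchemaDemoRow and SchemaInspection are hypothetical names, and the row type simply mirrors the properties of LoanEligibility.

```csharp
using System.Collections.Generic;
using Microsoft.ML;

// Hypothetical row type for this sketch; it mirrors LoanEligibility.
public class SchemaDemoRow
{
    public float Age { get; set; }
    public float Income { get; set; }
    public bool HasCriminalRecord { get; set; }
    public bool Label { get; set; }
}

public static class SchemaInspection
{
    // Returns one "Name: Type" line per column of the loaded data view.
    public static List<string> DescribeColumns()
    {
        var context = new MLContext();
        var rows = new[]
        {
            new SchemaDemoRow { Age = 30, Income = 50000, HasCriminalRecord = false, Label = true }
        };
        IDataView view = context.Data.LoadFromEnumerable(rows);

        var lines = new List<string>();
        foreach (var column in view.Schema)
        {
            lines.Add($"{column.Name}: {column.Type}");
        }
        return lines;
    }
}
```

Printing the returned lines should show the two float properties with a numeric type and the two bool properties with a Boolean type.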

3. Define the Training Pipeline

This pipeline tells ML.NET how to process the data and train the model:

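// DataKind lives in Microsoft.ML.Data.
var pipeline = context.Transforms.Conversion
    // Concatenate needs inputs of one type, so convert the bool column to float first.
    .ConvertType("HasCriminalRecordNumeric", nameof(LoanApplication.HasCriminalRecord), DataKind.Single)
    .Append(context.Transforms.Concatenate("Features", nameof(LoanApplication.Age), nameof(LoanApplication.Income), "HasCriminalRecordNumeric"))
    .Append(context.Transforms.NormalizeMinMax("Features"))
    // The trainer reads the Boolean "Label" column directly; no key mapping is needed.
    .Append(context.BinaryClassification.Trainers.SdcaLogisticRegression());

ConvertType turns the Boolean HasCriminalRecord column into a float, because Concatenate requires all of its input columns to have the same type.
Concatenate merges multiple input columns into a single feature vector.
NormalizeMinMax scales each feature into the range 0 to 1, so that Income does not dominate Age simply because its values are larger.
SdcaLogisticRegression is the algorithm used to train a binary classifier. It expects a Boolean Label column, which LoanEligibility already provides, so the label needs no conversion (MapValueToKey is meant for multiclass labels and would break this trainer).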
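
To make the normalization step concrete, here is a rough sketch of textbook min-max scaling on a single column in plain C#. MinMaxSketch is a hypothetical name, and this is not ML.NET's implementation: NormalizeMinMax has its own options (for example fixZero), so its exact output can differ from this formula.

```csharp
using System;
using System.Linq;

// Hypothetical illustration of textbook min-max scaling: (v - min) / (max - min).
public static class MinMaxSketch
{
    public static float[] Scale(float[] values)
    {
        if (values.Length == 0)
        {
            return Array.Empty<float>();
        }

        float min = values.Min();
        float max = values.Max();
        float range = max - min;

        // A constant column has no spread; map every value to 0
        // instead of dividing by zero.
        if (range == 0f)
        {
            return new float[values.Length];
        }

        return values.Select(v => (v - min) / range).ToArray();
    }
}
```

For example, MinMaxSketch.Scale(new[] { 20000f, 85000f, 150000f }) maps the three incomes to 0, 0.5, and 1.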

4. Train the Model

var model = pipeline.Fit(trainData);

This step builds the model using our pipeline and training data.

Step 4: Make Predictions

Once trained, we can use the model to make predictions on new data:

var predictor = context.Model.CreatePredictionEngine<LoanApplication, Prediction>(model);

var newApplicant = new LoanApplication
{
    Age = 30,
    Income = 50000,
    HasCriminalRecord = false
};

var result = predictor.Predict(newApplicant);

Console.WriteLine($"Prediction: {(result.IsEligible ? "Eligible" : "Not Eligible")}");
Console.WriteLine($"Probability: {result.Probability:P2}");

This shows how your trained model can automate decisions on new applications.
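
The Prediction class exposes three outputs, and it helps to see how they relate. As a rough sketch, assume a logistic regression model whose probability is the logistic (sigmoid) function of the raw score, with a decision threshold of 0.5. ScoreInterpretation and its methods are hypothetical names, and ML.NET's internal calibration may differ in detail.

```csharp
using System;

// Hypothetical illustration of how Score, Probability, and the predicted
// label relate for a logistic regression model.
public static class ScoreInterpretation
{
    // Logistic function: maps any real-valued score into (0, 1).
    public static double ToProbability(double score)
        => 1.0 / (1.0 + Math.Exp(-score));

    // Assumed threshold of 0.5 on the probability, equivalent to score > 0.
    public static bool ToLabel(double score)
        => ToProbability(score) > 0.5;
}
```

A score of 0 corresponds to a probability of exactly 0.5. Positive scores lean toward "Eligible" and negative scores toward "Not Eligible", which is why Probability is the more readable value to show to users.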

Step 5: Evaluate Model Accuracy

To check that our model makes good predictions, we evaluate it on data it was not trained on. Here that is a fresh batch of synthetic applications:

var testData = context.Data.LoadFromEnumerable(GenerateSampleData());
var predictions = model.Transform(testData);
var metrics = context.BinaryClassification.Evaluate(predictions);

Console.WriteLine($"Accuracy: {metrics.Accuracy:P2}");
Console.WriteLine($"AUC: {metrics.AreaUnderRocCurve:P2}");
Console.WriteLine($"F1 Score: {metrics.F1Score:P2}");

Accuracy shows the percentage of correct predictions.
AUC (area under the ROC curve) shows how well the model distinguishes between the two classes across all decision thresholds.
F1 Score is the harmonic mean of precision and recall.
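
To see where accuracy and F1 come from, the sketch below computes them by hand from the four confusion-matrix counts. MetricSketch is a hypothetical helper in plain C#, shown only to illustrate the formulas. ML.NET computes these values for you, and AUC is left out because it depends on the ranking of scores rather than on these four counts.

```csharp
// Hypothetical helper: metric formulas from confusion-matrix counts.
// tp/tn = true positives/negatives, fp/fn = false positives/negatives.
// Assumes the denominators are non-zero.
public static class MetricSketch
{
    public static double Accuracy(int tp, int tn, int fp, int fn)
        => (double)(tp + tn) / (tp + tn + fp + fn);

    // Of all applicants predicted eligible, how many really were?
    public static double Precision(int tp, int fp)
        => (double)tp / (tp + fp);

    // Of all truly eligible applicants, how many did the model find?
    public static double Recall(int tp, int fn)
        => (double)tp / (tp + fn);

    // Harmonic mean of precision and recall.
    public static double F1(int tp, int fp, int fn)
    {
        double precision = Precision(tp, fp);
        double recall = Recall(tp, fn);
        return 2 * precision * recall / (precision + recall);
    }
}
```

For example, with 60 true positives, 25 true negatives, 10 false positives, and 5 false negatives, Accuracy returns 0.85 and F1 returns 120/135, about 0.889.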

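High accuracy and AUC on data the model has not seen suggest that it generalizes well. Because this test set comes from the same synthetic rule as the training set, the numbers say little about how the model would perform on real applications.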
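
With a real dataset there is usually only one pool of labeled rows, so a common approach is to hold part of it back for testing. The following is a minimal sketch of that idea using TrainTestSplit. HoldOutRow and HoldOutSketch are hypothetical names, and the seeds and the 20% test fraction are assumptions. The pipeline converts the Boolean column to a float before concatenating, since Concatenate needs input columns of one type.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical row type for this sketch; it mirrors LoanEligibility.
public class HoldOutRow
{
    public float Age { get; set; }
    public float Income { get; set; }
    public bool HasCriminalRecord { get; set; }
    public bool Label { get; set; }
}

public static class HoldOutSketch
{
    // Same synthetic rule as GenerateSampleData, with a fixed seed (an assumption).
    public static List<HoldOutRow> Generate(int count, int seed)
    {
        var rnd = new Random(seed);
        var rows = new List<HoldOutRow>();
        for (int i = 0; i < count; i++)
        {
            float age = rnd.Next(18, 70);
            float income = rnd.Next(20000, 150000);
            bool record = rnd.NextDouble() > 0.9;
            rows.Add(new HoldOutRow
            {
                Age = age,
                Income = income,
                HasCriminalRecord = record,
                Label = age > 25 && income > 40000 && !record
            });
        }
        return rows;
    }

    public static CalibratedBinaryClassificationMetrics TrainAndEvaluate(IEnumerable<HoldOutRow> rows)
    {
        var context = new MLContext(seed: 1);
        IDataView all = context.Data.LoadFromEnumerable(rows);

        // Keep 20% of the rows aside; the model never sees them during training.
        var split = context.Data.TrainTestSplit(all, testFraction: 0.2);

        var pipeline = context.Transforms.Conversion
            .ConvertType("HasCriminalRecordNumeric", nameof(HoldOutRow.HasCriminalRecord), DataKind.Single)
            .Append(context.Transforms.Concatenate("Features",
                nameof(HoldOutRow.Age), nameof(HoldOutRow.Income), "HasCriminalRecordNumeric"))
            .Append(context.Transforms.NormalizeMinMax("Features"))
            .Append(context.BinaryClassification.Trainers.SdcaLogisticRegression());

        ITransformer model = pipeline.Fit(split.TrainSet);

        // Score only the held-out rows and compare against their true labels.
        return context.BinaryClassification.Evaluate(model.Transform(split.TestSet));
    }
}
```

Calling HoldOutSketch.TrainAndEvaluate(HoldOutSketch.Generate(1000, 7)) returns the same kind of metrics object used above, computed only on the held-out rows.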

Real-World Impact

The same approach applies to many real-world business problems:

Banks can automate loan approvals
HR teams can classify job applicants
Healthcare providers can predict whether a patient is at risk

By building a binary classification model, you can scale up decision-making, reduce human error, and act faster on data-driven insights.

ML.NET allows .NET developers to harness the power of machine learning without needing to switch to Python or R. It’s fast, flexible, and integrates naturally into your existing .NET applications.

Whether you're looking to build a new AI-powered application or integrate machine learning capabilities into existing systems, I can bring the expertise to turn your vision into reality.

Feel free to contact me today to explore your ideas.