Regression with ML.NET

Exploring a regression task: estimating house prices with ML.NET

Many business problems revolve around estimating a numeric value based on past data. Examples include predicting the price of a product, forecasting sales, or estimating the number of items a customer might purchase. These problems fall under a machine learning category known as regression.

Imagine you're in real estate or property investment. Estimating house prices accurately helps:

  • Sellers set competitive asking prices

  • Buyers make informed offers

  • Investors evaluate return on investment

  • Agencies streamline valuations and reduce human error

By automating this process, businesses save time, scale operations, and make data-backed decisions.

In this example, we'll use ML.NET, Microsoft's machine learning library for .NET developers, to build a regression model that predicts a house price based on features like size, number of rooms, and location score.

 


Step 1: Install ML.NET NuGet Package

If you're using a .NET console project, open your terminal and run:

dotnet add package Microsoft.ML

This brings in the ML.NET library, which enables data processing, model training, and prediction.

 


Step 2: Define and Generate Sample Data

We’ll simulate a dataset representing houses with the following features:

  • Size (square feet)

  • NumRooms

  • LocationScore (1–10 scale)

  • Price (target variable)

The quality and size of your dataset play a crucial role in how well your machine learning model performs. A clean dataset with accurate, consistent, and relevant features allows the model to learn meaningful patterns. Conversely, missing values, outliers, or inconsistently recorded data can introduce bias or noise.

For a regression model like this, a few hundred examples might be sufficient for a basic demonstration, but real-world applications often require thousands or even millions of examples to achieve high accuracy and reliability. The more diverse and representative the dataset, the better the model can generalize to new data.

 

Define Data Structures

using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public class HouseData
{
    public float Size { get; set; }
    public float NumRooms { get; set; }
    public float LocationScore { get; set; }
    [ColumnName("Label")] // Trainers and evaluators look for a column named "Label" by default
    public float Price { get; set; } // This is the label we want to predict
}

public class HousePrediction
{
    [ColumnName("Score")]
    public float PredictedPrice { get; set; }
}

Generate Synthetic Data

public static List<HouseData> GenerateSampleData()
{
    var rnd = new Random();
    var list = new List<HouseData>();

    for (int i = 0; i < 1000; i++)
    {
        float size = rnd.Next(600, 4000);
        float rooms = rnd.Next(1, 8);
        float location = rnd.Next(1, 11);

        // Simulate a price formula
        float price = (size * 200) + (rooms * 10000) + (location * 15000) + rnd.Next(-10000, 10000);

        list.Add(new HouseData { Size = size, NumRooms = rooms, LocationScore = location, Price = price });
    }

    return list;
}

This data mimics how size, number of rooms, and location impact housing prices.
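Because the data is synthetic, the noise-free part of the price formula can be checked by hand. The sketch below repeats that formula as a standalone function; PriceFormulaSketch and ExpectedPrice are hypothetical names used only for illustration, not part of ML.NET.

```csharp
public static class PriceFormulaSketch
{
    // Noise-free part of the simulated price formula used in GenerateSampleData.
    // The generator then adds a random offset between -10,000 and 9,999.
    public static float ExpectedPrice(float size, float rooms, float location)
    {
        return (size * 200) + (rooms * 10000) + (location * 15000);
    }
}
```

For an 1,800 sq ft house with 4 rooms and a location score of 7, ExpectedPrice returns 360,000 + 40,000 + 105,000 = 505,000, so a well-trained model should predict a value in that neighborhood.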

 


Step 3: Build and Train the Machine Learning Model

1. Create MLContext

var context = new MLContext();

MLContext is the entry point for ML.NET operations, similar to setting up a workspace.

 

2. Load the Data

var data = GenerateSampleData();
var trainData = context.Data.LoadFromEnumerable(data);
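With real data you usually have a single dataset rather than a generator you can call twice. In that case one common approach is to hold back part of the data for testing. A minimal sketch, assuming an MLContext and a loaded IDataView like the ones above; HoldOutSketch and Split are hypothetical names, while TrainTestSplit is the ML.NET call doing the work.

```csharp
using Microsoft.ML;

public static class HoldOutSketch
{
    // Hypothetical helper: splits one IDataView into training and test portions.
    public static (IDataView Train, IDataView Test) Split(
        MLContext context, IDataView allData, double testFraction = 0.2)
    {
        // Rows are assigned at random, so the resulting sizes are approximate.
        var split = context.Data.TrainTestSplit(allData, testFraction: testFraction);
        return (split.TrainSet, split.TestSet);
    }
}
```

The Train portion would then go to pipeline.Fit and the Test portion to model.Transform, so the model is scored only on rows it never saw.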

3. Define the Pipeline

var pipeline = context.Transforms.Concatenate("Features", nameof(HouseData.Size), nameof(HouseData.NumRooms), nameof(HouseData.LocationScore))
    .Append(context.Transforms.NormalizeMinMax("Features"))
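    .Append(context.Regression.Trainers.Sdca());

  • Concatenate joins the feature columns into a single input vector.

  • NormalizeMinMax scales feature values to a common range, so that a large-valued feature such as Size does not dominate the others.

  • Sdca (Stochastic Dual Coordinate Ascent) trains a linear regression model, which suits structured numeric data like this.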
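To make the scaling step concrete, here is the textbook form of min-max scaling with a worked value, assuming the observed sizes run from 600 to 4,000 square feet. This shows the general idea only: NormalizeMinMax also has a fixZero option that keeps zero mapped to zero instead of shifting by the minimum, so the exact numbers ML.NET produces may differ.

```latex
x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}
\qquad\text{e.g.}\qquad
\frac{1800 - 600}{4000 - 600} = \frac{1200}{3400} \approx 0.353
```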

 

4. Train the Model

var model = pipeline.Fit(trainData);

During training, the model learns the relationship between the input features and the target price. Sdca fits a linear model: each feature gets a weight, and the predicted price is the weighted sum of the normalized features plus a bias term. The trainer makes repeated passes over the dataset, adjusting the weights to shrink the squared difference between predicted and actual prices, and stops once the improvement levels off or an iteration limit is reached.
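For a linear trainer such as Sdca, the learned relationship can be written as a weighted sum of the three features. The symbols below are illustrative names rather than ML.NET identifiers, and the regularization term that the trainer also applies is left out for brevity.

```latex
\hat{y} = w_{1}\,x_{\text{size}} + w_{2}\,x_{\text{rooms}} + w_{3}\,x_{\text{location}} + b,
\qquad
\text{squared error} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}
```

Training amounts to choosing the weights and the bias so that the squared error over the training rows is as small as possible.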

 


Step 4: Make Predictions

After training, we can use the model to estimate prices of new houses:

var predictor = context.Model.CreatePredictionEngine<HouseData, HousePrediction>(model);

var newHouse = new HouseData
{
    Size = 1800,
    NumRooms = 4,
    LocationScore = 7
};

var prediction = predictor.Predict(newHouse);

Console.WriteLine($"Predicted price: ${prediction.PredictedPrice:N2}");

This predicts the price of an 1,800 sq ft home with four rooms and a location score of 7.

 


Step 5: Evaluate the Model

To check the model’s accuracy, we score it on data it did not see during training. Here that is a freshly generated synthetic set:

var testData = context.Data.LoadFromEnumerable(GenerateSampleData());
var predictions = model.Transform(testData);
var metrics = context.Regression.Evaluate(predictions);

Console.WriteLine($"R-Squared: {metrics.RSquared:0.00}");
Console.WriteLine($"Root Mean Squared Error: {metrics.RootMeanSquaredError:N2}");

  • R-Squared indicates the share of the variance in price that the model explains (closer to 1 is better).

  • RMSE measures the typical prediction error in the same units as the price (lower is better).
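To see what these two numbers mean, they can be computed by hand from lists of actual and predicted prices. A minimal sketch using the standard definitions; RegressionMetricsSketch and its methods are hypothetical names, and ML.NET's own Evaluate remains the call to use in practice.

```csharp
using System;
using System.Linq;

public static class RegressionMetricsSketch
{
    // Root mean squared error: square root of the average squared difference.
    public static double Rmse(double[] actual, double[] predicted)
    {
        double meanSquaredError = actual
            .Zip(predicted, (a, p) => (a - p) * (a - p))
            .Average();
        return Math.Sqrt(meanSquaredError);
    }

    // R-squared: 1 minus (squared error of the model / squared error of
    // always predicting the mean of the actual values).
    public static double RSquared(double[] actual, double[] predicted)
    {
        double mean = actual.Average();
        double residualSum = actual
            .Zip(predicted, (a, p) => (a - p) * (a - p))
            .Sum();
        double totalSum = actual.Sum(a => (a - mean) * (a - mean));
        return 1.0 - residualSum / totalSum;
    }
}
```

With actual values 100, 200, 300 and predictions 110, 190, 300, the squared errors are 100, 100, and 0, which gives an RMSE of about 8.16 and an R-squared of 0.99.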

 


Why Businesses Should Care

Businesses benefit greatly from predictive models:

  • Real estate: Property valuation

  • Retail: Sales forecasting

  • Manufacturing: Inventory and demand estimation

  • Finance: Customer lifetime value prediction

Regression models reduce guesswork, optimize pricing, and improve planning. 

With ML.NET, .NET developers can build robust machine learning tools using C#, without leaving their existing ecosystem. This empowers businesses to turn data into actionable insights and unlock new value.

Feel free to contact me today if this is something you would like to explore further for your business.