Bulk Data Insertion with Entity Framework in C#

In this blog post, we will look at how to use Entity Framework to insert bulk data into a database from a C# application. Bulk insertion is a common requirement when working with large datasets, and it’s important to handle errors gracefully. We’ll go over the steps in detail, including how to set up the Entity Framework context, implement retry logic for failed insertions, and handle failures efficiently.

Step 1: Setting Up the Entity Framework Context
First, we’ll establish an Entity Framework context to interface with our database. Assume we have a simple DbContext class called AppDbContext that has a DbSet for the entity type MyEntity.

public class AppDbContext : DbContext
{
    public DbSet<MyEntity> MyEntities { get; set; }

    // Constructor to configure database connection
    public AppDbContext(DbContextOptions<AppDbContext> options) : base(options)
    {
    }
}
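
For reference, the context might be registered like this in an ASP.NET Core application. The provider, connection string name, and Program.cs style shown here are illustrative assumptions, not part of the original setup:

// Program.cs — registering the context with dependency injection.
// "DefaultConnection" and the SQL Server provider (from the
// Microsoft.EntityFrameworkCore.SqlServer package) are example choices.
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddDbContext<AppDbContext>(options =>
    options.UseSqlServer(
        builder.Configuration.GetConnectionString("DefaultConnection")));

var app = builder.Build();

Any relational provider (SQLite, PostgreSQL, etc.) works the same way; only the options call changes.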

Step 2: Implementing Bulk Data Insertion

Next, let’s implement the method to perform bulk data insertion using Entity Framework. We’ll use the AddRange method to add multiple entities to the context and then call SaveChanges to persist the changes to the database.

public class BulkDataProcessor
{
    private readonly AppDbContext _dbContext;

    public BulkDataProcessor(AppDbContext dbContext)
    {
        _dbContext = dbContext;
    }

    public void InsertBulkData(List<MyEntity> entities)
    {
        _dbContext.MyEntities.AddRange(entities);
        _dbContext.SaveChanges();
    }
}
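
A minimal call site might look like the following. The processor is assumed to receive a configured AppDbContext (for example, via dependency injection), and the MyEntity initializer is left as a placeholder since its properties are not defined in this post:

// Build a list of entities and insert them in one call.
var entities = Enumerable.Range(1, 5000)
    .Select(i => new MyEntity { /* set properties here */ })
    .ToList();

var processor = new BulkDataProcessor(dbContext);
processor.InsertBulkData(entities);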

Step 3: Implementing Retry Logic for Failed Insertions

To handle scenarios where bulk insertion fails due to transient errors (e.g., database connection issues), we’ll implement retry logic. We’ll retry the insertion operation a configurable number of times with a delay between retries.

public class BulkDataProcessor
{
    // Previous code remains unchanged

    public void InsertBulkDataWithRetry(List<MyEntity> entities, int maxRetries = 3, TimeSpan delayBetweenRetries = default)
    {
        int retries = 0;
        bool success = false;

        while (!success && retries < maxRetries)
        {
            try
            {
                InsertBulkData(entities);
                success = true; // Mark insertion as successful
            }
            catch (DbUpdateException ex) when (IsTransientError(ex))
            {
                // Transient error occurred, retry after delay
                retries++;
                if (delayBetweenRetries != default)
                    Thread.Sleep(delayBetweenRetries);
            }
        }

        if (!success)
        {
            // Log or handle failed insertion after retries
            Console.WriteLine($"Bulk data insertion failed after {maxRetries} retries.");
        }
    }

    private bool IsTransientError(DbUpdateException ex)
    {
        // Check if the exception is due to a transient database error
        // Implement logic to identify transient errors based on the exception type or message
        return true; // Placeholder implementation
    }
}
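
The IsTransientError method above is only a placeholder. One possible implementation, assuming the SQL Server provider (Microsoft.Data.SqlClient), is to inspect the SqlException error numbers commonly treated as transient, such as deadlocks, timeouts, and Azure SQL throttling codes:

// A sketch of transient-error detection for SQL Server.
// Requires: using System.Linq; using Microsoft.Data.SqlClient;
private static readonly int[] TransientErrorNumbers =
    { -2, 1205, 4060, 40197, 40501, 40613 }; // timeout, deadlock, Azure throttling

private bool IsTransientError(DbUpdateException ex)
{
    if (ex.InnerException is SqlException sqlEx)
        return sqlEx.Errors.Cast<SqlError>()
            .Any(e => TransientErrorNumbers.Contains(e.Number));
    return false;
}

Note that EF Core's SQL Server provider can also handle this for you via EnableRetryOnFailure, which installs a built-in execution strategy for transient faults.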

Let’s integrate batch size handling into the bulk data insertion process using Entity Framework.

public class BulkDataProcessor
{
    private readonly AppDbContext _dbContext;
    private const int DefaultBatchSize = 1000; // Default batch size

    public BulkDataProcessor(AppDbContext dbContext)
    {
        _dbContext = dbContext;
    }

    public void InsertBulkData(List<MyEntity> entities, int batchSize = DefaultBatchSize)
    {
        for (int i = 0; i < entities.Count; i += batchSize)
        {
            IEnumerable<MyEntity> batch = entities.Skip(i).Take(batchSize);
            _dbContext.MyEntities.AddRange(batch);
            _dbContext.SaveChanges();
            // Note: inserted entities stay tracked after SaveChanges. For very
            // large datasets, consider detaching them between batches (e.g.
            // _dbContext.ChangeTracker.Clear() in EF Core 5+) to limit memory use.
        }
    }

    public void InsertBulkDataWithRetry(List<MyEntity> entities, int maxRetries = 3, TimeSpan delayBetweenRetries = default, int batchSize = DefaultBatchSize)
    {
        int retries = 0;
        bool success = false;

        while (!success && retries < maxRetries)
        {
            try
            {
                InsertBulkData(entities, batchSize);
                success = true; // Mark insertion as successful
            }
            catch (DbUpdateException ex) when (IsTransientError(ex))
            {
                // Transient error occurred, retry after delay
                retries++;
                if (delayBetweenRetries != default)
                    Thread.Sleep(delayBetweenRetries);
            }
        }

        if (!success)
        {
            // Log or handle failed insertion after retries
            Console.WriteLine($"Bulk data insertion failed after {maxRetries} retries.");
        }
    }

    private bool IsTransientError(DbUpdateException ex)
    {
        // Check if the exception is due to a transient database error
        // Implement logic to identify transient errors based on the exception type or message
        return true; // Placeholder implementation
    }
}

In this updated code:

  • We’ve added a batchSize parameter to the InsertBulkData and InsertBulkDataWithRetry methods, allowing the caller to specify the size of each batch.
  • Inside the InsertBulkData method, we iterate over the list of entities in batches of the specified size and perform bulk insertion for each batch.
  • The InsertBulkDataWithRetry method now accepts an additional batchSize parameter and passes it to the InsertBulkData method for batched insertion.
  • By utilizing batched insertion, we can efficiently process large datasets while minimizing the impact on memory consumption and database performance.
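
The batched methods above can be exercised like this. The entity list and batch size are illustrative, and the processor is assumed to be constructed as shown earlier:

// Insert 100,000 entities in batches of 500 instead of the default 1000.
processor.InsertBulkData(largeEntityList, batchSize: 500);

// Or combine batching with retries and a 2-second delay between attempts.
processor.InsertBulkDataWithRetry(largeEntityList,
    maxRetries: 5,
    delayBetweenRetries: TimeSpan.FromSeconds(2),
    batchSize: 500);

Smaller batches reduce memory pressure and lock duration at the cost of more round trips, so the right size depends on your workload.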

To support rolling back failed batches and retrying the insertion process, we need to enhance our code with transactional error handling and retry logic. Below is an updated version of the BulkDataProcessor class incorporating these features.

public class BulkDataProcessor
{
    private readonly AppDbContext _dbContext;
    private const int DefaultBatchSize = 1000; // Default batch size
    private const int DefaultMaxRetries = 3;   // Default max retry attempts
    private const int DefaultDelayMs = 1000;   // Default delay between retry attempts in milliseconds

    public BulkDataProcessor(AppDbContext dbContext)
    {
        _dbContext = dbContext;
    }

    public void InsertBulkData(List<MyEntity> entities, int batchSize = DefaultBatchSize)
    {
        using (var transaction = _dbContext.Database.BeginTransaction())
        {
            try
            {
                InsertBatchedData(entities, batchSize);
                transaction.Commit(); // Commit transaction if all batches are successful
            }
            catch (Exception)
            {
                transaction.Rollback(); // Roll back the transaction on error
                // Detach the entities that were already added, so a retry does
                // not queue them a second time and insert duplicates
                // (ChangeTracker.Clear() requires EF Core 5 or later).
                _dbContext.ChangeTracker.Clear();
                throw; // Rethrow without resetting the stack trace
            }
        }
    }

    private void InsertBatchedData(List<MyEntity> entities, int batchSize)
    {
        for (int i = 0; i < entities.Count; i += batchSize)
        {
            IEnumerable<MyEntity> batch = entities.Skip(i).Take(batchSize);
            _dbContext.MyEntities.AddRange(batch);
            _dbContext.SaveChanges();
        }
    }

    public void InsertBulkDataWithRetry(List<MyEntity> entities, int maxRetries = DefaultMaxRetries, int batchSize = DefaultBatchSize, int delayMs = DefaultDelayMs)
    {
        int retries = 0;

        while (retries < maxRetries)
        {
            try
            {
                InsertBulkData(entities, batchSize);
                return; // Exit method if successful
            }
            catch (Exception ex)
            {
                // Log error or perform any necessary handling
                Console.WriteLine($"Error occurred: {ex.Message}");
                retries++;
                Thread.Sleep(delayMs); // Delay before retry
            }
        }

        // If max retries are exceeded, log or handle accordingly
        Console.WriteLine("Max retries exceeded. Unable to insert bulk data.");
    }
}

In this updated implementation:

  • We define default values for maximum retry attempts (DefaultMaxRetries), delay between retry attempts in milliseconds (DefaultDelayMs), and batch size (DefaultBatchSize).
  • The InsertBulkData method now encapsulates the entire batched insertion process within a database transaction. If an error occurs during any batch, the entire transaction is rolled back to maintain data consistency.
  • We introduce the InsertBulkDataWithRetry method, which attempts to insert bulk data with retry logic. If an exception occurs during insertion, the method retries the operation up to the specified maximum number of times (maxRetries). It also includes a delay between retry attempts (delayMs) to prevent excessive load on the system.
  • The retry loop continues until the insertion is successful or the maximum number of retries is reached. If the maximum number of retries is exceeded, an appropriate message is logged or handled accordingly.

Conclusion

In this blog post, we’ve learned how to perform bulk data insertion using Entity Framework in a C# application. We’ve covered setting up the Entity Framework context, implementing bulk insertion logic, and handling errors with retry logic. By following these steps, you can efficiently handle large-scale data insertion tasks in your applications.

You can use InsertBulkDataWithRetry in your application code to perform bulk data insertion with retry logic, providing the list of entities to be inserted and optionally specifying the maximum number of retries, batch size, and delay between retries.
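
For example, using the final version of the class (the parameter values here are illustrative):

// Insert with up to 5 attempts, 1000-entity batches, and a 2-second
// delay between attempts, all inside a rollback-capable transaction.
var processor = new BulkDataProcessor(dbContext);
processor.InsertBulkDataWithRetry(entities,
    maxRetries: 5,
    batchSize: 1000,
    delayMs: 2000);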
