Process big DB table with chunk() method

Tutorial last revisioned on August 18, 2022 with Laravel 9
Let's imagine the situation: you have a big database table (like 10 000 rows or bigger) and you need to run an update to one column. But you cannot run just SQL query - there is some PHP logic behind it. So foreach loop could potentially take forever or bump into a default 30-second script limit. Luckily, Laravel has a neat solution for it. Let's look at a simple example of a problem I'm talking about:
$users = User::all();
foreach ($users as $user) {
  $some_value = ($user->some_field > 0) ? 1 : 0;
  // might be more logic here
  $user->update(['some_other_field' => $some_value]);
}
Do you see the problem? It's fine for a DB table with 100 or 500 users. But what about 10000? 100000? Million? It's not only about the speed, but also about storing the data - you are potentially storing the whole database table into variables, you might run out of memory. So here's the thing: there is a function called chunk() which splits all data into separate selects, like pagination. Let's look at an example:
User::chunk(100, function ($users) {
  foreach ($users as $user) {
    $some_value = ($user->some_field > 0) ? 1 : 0;
    // might be more logic here
    $user->update(['some_other_field' => $some_value]);
  }
});
What it actually does is running a loop of selecting 100 entries, then doing updates with them, and then another 100 entries, another update and so on. Which means that at no point there is a huge amount of data taken from the database - you are working with a chunk of entries, not the whole table.
Want more articles like this every week? Subscribe!
Still not sure? Want to check out past newsletter issues? Here they are - just click this link!
Now, be careful with filtering initial results - don't run anything like this:
User::where('approved', 0)->chunk(100, function ($users) {
  foreach ($users as $user) {
    $user->update(['approved' => 1]);
  }
});
Technically, it would work and wouldn't throw any errors, but the problem here is that you're filtering not approved users, and then approving them, and on the next chunk, when Laravel is doing another database query to get another "page", the data is already changed by then, and you will miss a page of data. Which means you will process only half of the entries. So don't update same fields that would affect initial filter. Hope that helps dealing with big or huge amount of data.

No comments or questions yet...

Like our articles?

Become a Premium Member for $129/year or $29/month
What else you will get:
  • 22 courses (477 lessons, total 38 h 20 min)
  • 2 long-form tutorials (one new every week)
  • access to project repositories
  • access to private Discord