Data migration: Importing images

On a recent project, the client wished to migrate all the avatars currently in use by their community. Upon investigation we found that the avatar folder contained over 4GB of data, and it was not segmented in a way that let us identify the images we needed from the file structure alone. We only needed approximately 100MB of these images, and the database schema told us which ones. As a result, we wrote some code that sent requests to the existing web server, retrieved the image related to a specific user and then stored it in an imagefield on the user's profile. (Note that we were using content_profile rather than core profiles.) The code below illustrates how we achieved this:

<?php
// Do we have an image to process?
if ($image_path) {
  // Build the URL that references the image we want to pull back.
  $url = $server_name .'/'. $image_path;

  // Send the request to the remote server.
  $binary_image = drupal_http_request($url);

  // If the image is retrieved successfully from the remote server, process it.
  if ($binary_image->code == 200) {
    // Find the file name that we are going to use for storage.
    $filename = substr($url, strrpos($url, '/') + 1);

    // Build the path for the temporary file and save off the data we have retrieved.
    $dst = file_create_path(file_directory_temp()) .'/'. $filename;
    $temp_file = file_save_data($binary_image->data, $dst);

    // Check that we successfully saved the file.
    if ($temp_file) {
      // Create the final file and assign it to our imagefield.
      $path = file_create_path() .'/'. $filename;
      $user_node->field_profile_image[0] = field_file_save_file($temp_file, array(), $path, FILE_EXISTS_RENAME);
    }
  }
}
?>
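
One fragile spot in the snippet above is how the URL and the file name are derived: plain concatenation can produce a doubled slash, and taking everything after the last '/' would also pick up a query string if one were present. The following is a minimal sketch, in plain PHP with no Drupal dependencies, of two helpers that could be swapped in. The function names example_build_image_url() and example_filename_from_url() and the old.example.com host are hypothetical and were not part of the original migration code.

```php
<?php
/**
 * Hypothetical helper: join a server base URL and a relative image path
 * without producing doubled or missing slashes.
 */
function example_build_image_url($server_name, $image_path) {
  return rtrim($server_name, '/') . '/' . ltrim($image_path, '/');
}

/**
 * Hypothetical helper: derive a storage file name from an image URL,
 * ignoring any query string or fragment.
 *
 * Returns an empty string when the URL has no usable path component.
 */
function example_filename_from_url($url) {
  $path = parse_url($url, PHP_URL_PATH);
  if (!is_string($path) || $path === '') {
    return '';
  }
  // Decode first (so %20 becomes a space), then keep only the last segment.
  return basename(rawurldecode($path));
}

// Example usage with an assumed host and path.
$url = example_build_image_url('http://old.example.com/', '/avatars/42/me.png');
$filename = example_filename_from_url($url . '?size=large');
```

In the migration code, $url and $filename could then be built with these helpers, and an empty file name could be treated the same way as a missing $image_path.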

Using this method meant that we were able to limit the data transferred to the absolute minimum, and only the images we actually required were moved to the new server.