Contact
Site: US UK AU |
Nexcess Blog

Speed up Magento DataFlow (Import / Export)

September 24, 2010 1 Comment RSS Feed

For as long as we can remember, Magento has had issues with Import and Export profiles, especially regarding performance. We have tried many different solutions for speeding up DataFlow and dealing with other import / export related issues. We’ve found a solution that seems to help in the majority of cases. First, I’d like to mention that the Magento import / export status page occasionally just shows a white screen (page doesn’t load completely or loads blank), a 500 error, or some other random error when really, the import / export job is still running in the background and will complete successfully.

Magento DataFlow has 4 main pieces

  • Adapter (Read the external data source and allows the parser to access it)
  • Parser (Goes through the external data and translates it into something Magento can understand)
  • Mapper (Takes external data fields and associates them with the correct Magento data fields)
  • Validator (Ensures that data is correct before / after committing it)

A standard workflow looks something like this

Create an import / export profile (either via Profile Manager or Advanced Profile which allows you to tweak the actual XML profile)

You probably want to select “Local File” for “File Information -> Type” when creating the profile. “Local File” means that the file will be saved to [Magento basedir]/var/export if you’re exporting data. It is critical that you ensure the file does not exist or that you manually specify a new filename for each export; sometimes Magento has trouble overwriting an existing file and this will cause cryptic errors / export failure. Your best bet is to use a new filename for each attempt.

Run the profile either via a custom script (not recommended or necessary in most cases, unless running via cron)

When DataFlow receives an XML request for an import or export, it will connect to either the database in the case of an export or the external data source in the case of an import and (after parsing, mapping, and possibly validating) start building up the results (to be later written to a file or imported into to the database) in the following tables.

dataflow_batch_import<br />
dataflow_batch_export

There is an issue with Magento where it does not truncate (empty the data from) the tables before starting an import / export. Also, having extra data in those tables or the logs will slow DataFlow down quite a bit.

Here is what we recommend for speeding up DataFlow

  1. Log into your Magento Dashboard (admin) and go to System > Configuration
  2. Go to Advanced > System -> Log Cleaning on the side menu
  3. Change ‚Äúsave log, days”. We recommend 14.
  4. Select “Enable log cleaning”
  5. Ensure your crontab is properly configured. Contact support@nexcess.net if you’re not sure or follow this guide: ( https://docs.nexcess.net/article/setting-up-a-cron-job.html )
    To make sure the logs get cleaned, run cron.php manually just before the import by loading http://yoursite.tld/cron.php or wherever cron.php is located (it’s in your Magento base directory) in a web browser. There won’t be any output displayed in your browser, but you should get a HTTP/200 response code if everything ran OK.
  6. You can check the status of the export by doing a “SELECT COUNT(batch_export_id) FROM dataflow_batch_export” in MySQL. Alternatively, you can download the PHP script below and rename it “mage-dataflow.php” and upload it to your Magento base directory, then load it in your web browser. It will show you the number of rows in the import and export tables. When the rows go up when you reload the page, you can see that more items are being processed. When the count drops back to zero, the dataflow operation has started writing out the file for the export or importing the records for an import and should be done in less than 5 minutes. Additionally, the page lets you truncate the tables at the press of a button (WARNING: Only use if your dataflow process is stuck! It will abort the dataflow process).
<?php

function emptyTables() {

// Parse magento's local.xml to get db info, if local.xml is found

	if (file_exists('app/etc/local.xml')) {

		$xml = simplexml_load_file('app/etc/local.xml');

		$tblprefix = $xml->global->resources->db->table_prefix;
		$dbhost = $xml->global->resources->default_setup->connection->host;
		$dbuser = $xml->global->resources->default_setup->connection->username;
		$dbpass = $xml->global->resources->default_setup->connection->password;
		$dbname = $xml->global->resources->default_setup->connection->dbname;

	} 
	
	else {
	    exit('Failed to open app/etc/local.xml');
	}

	$conn = mysql_connect($dbhost, $dbuser, $dbpass) or die ('Error connecting to mysql');
	mysql_select_db($dbname);

	mysql_query("TRUNCATE " . $tblprefix . "dataflow_batch_export") or die (mysql_error());
	mysql_query("TRUNCATE " . $tblprefix . "dataflow_batch_import") or die (mysql_error());
}

// Get the name of this script
$myname = $_SERVER["SCRIPT_NAME"];

// Check to see if we're truncating tables
$p = $_REQUEST['clear'];
if ($p) {
	emptyTables();
	echo "<h1>Tables truncated!</h1><br /><br />";
}


// DB Interaction
if (file_exists('app/etc/local.xml')) {

	$xml = simplexml_load_file('app/etc/local.xml');

	$tblprefix = $xml->global->resources->db->table_prefix;
	$dbhost = $xml->global->resources->default_setup->connection->host;
	$dbuser = $xml->global->resources->default_setup->connection->username;
	$dbpass = $xml->global->resources->default_setup->connection->password;
	$dbname = $xml->global->resources->default_setup->connection->dbname;

	} 
	
	else {
	    exit('Failed to open app/etc/local.xml');
	}

	$conn = mysql_connect($dbhost, $dbuser, $dbpass) or die ('Error connecting to mysql');
	mysql_select_db($dbname);


mysql_select_db($dbname);

$exportresult = mysql_query("SELECT COUNT(batch_id) FROM " . $tblprefix . "dataflow_batch_export") or die (mysql_error());
$importresult = mysql_query("SELECT COUNT(batch_id) FROM " . $tblprefix . "dataflow_batch_import") or die (mysql_error());

$numexportrows = mysql_fetch_array($exportresult);
$numimportrows = mysql_fetch_array($importresult);

$numexport = $numexportrows[0];
$numimport = $numimportrows[0];



// CSS for NexStyle
echo '
<html>
<head>
<title=Magento DataFlow Status>
<style type="text/css">
html {
    width: 100%;
    font-family: Helvetica, Arial, sans-serif;
}
body {
    background-color:#00AEEF;
    color:#FFFFFF;
    line-height:1.0em;
    font-size: 125%;
}
b {
    color: #FFFFFF;
}
table{
    border-spacing: 1px;
    border-collapse: collapse;
    width: 300px;
}
th {
    text-align: center;
    font-size: 125%;
    font-weight: bold;
    padding: 5px;
    border: 2px solid #FFFFFF;
    background: #00AEEF;
    color: #FFFFFF;
}
td {
    text-align: left;
    padding: 4px;
    border: 2px solid #FFFFFF;
    color: #FFFFFF;
    background: #666;
}
</style>
</head>';

// DB info for user to see
echo '
<body>
<a href="http://nexcess.net">
<img src="https://smhttp-nex.nexcesscdn.net/803313/static/images/logoMainR2.gif" width="217" height="38" alt="Nexcess Beyond Hosting"></a>
<br />
<br />
<br />
<b>Table Prefix:</b> ' . $tblprefix . '<br />'
. '<b>DB Host:</b> ' . $dbhost . '<br />'
. '<b>DB User:</b> ' . $dbuser . '<br />'
. '<b>DB Name</b>: ' . $dbname . '<br /><br /></p>';

// Set up the Export table
echo "
	<h1>Export</h1>
	<h2>$numexport rows</h2>
	<br />
	<h1>Import</h1>
	<h2>$numimport rows</h2>";

echo '<INPUT type="button" value="Truncate import and export tables (runs in a new window)" onClick="window.open(\'' . $myname . '?clear=1\',\'mywindow\',\'width=400,height=510\')">';

mysql_close($conn);
?>
Posted in: Magento