Creating a Data Importer
This article describes how to build your own data importer for a specific type. All steps are based on a real-life example: importing product images.
Currently, we only support CSV as a format for file imports out of the box. However, you can create your own file reader if you want to use a different format.
Prerequisites
Before you start creating a data importer, you need to know what data it should include. We recommend you start by checking out the respective database tables you want to fill with data. The image below shows the table relation for product images.
From this schema, you can easily identify the data columns you need for your import file. So the relevant fields to fill are:
- name (spy_product_image_set)
- external_url_large (spy_product_image)
- external_url_small (spy_product_image)
- fk_locale (spy_product_image_set)
- fk_product (spy_product_image_set)
- fk_product_abstract (spy_product_image_set)
All fk_* fields are foreign keys to other database tables. We cannot know the IDs of the related entities, so we need to fill in those fields with a unique identifier.
For the database field fk_locale, we will use the name of the locale for which we need the ID (for example, de_DE). This value will then be used later to fetch the ID for the given locale name.
We will use the same technique for the fk_product and fk_product_abstract fields.
To identify the data for your import file, you can also check out the .csv files of individual data importers. You can find the import file details in Packaged Business Capabilities.
Now that you know what data your import file should include, you can proceed with the first step of creating a data importer: creating an import file.
Create an import file
Since only CSV format is supported for import out of the box, we will start with creating a .csv file in any preferred editor.
Your CSV file for the product images import will contain the following header columns:
- image_set_name
- external_url_large
- external_url_small
- locale
- concrete_sku
- abstract_sku
Now, you can start to fill in some data into the new file. We recommend adding only a couple of entries to check after the first import run if all needed data is imported.
Once you populate all columns, your CSV file should be similar to this one:
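As an illustration only, a hypothetical file with made-up SKUs and example.com URLs (not taken from any demo data) could look like this. It assumes that each row sets either concrete_sku or abstract_sku and leaves the other one empty:

```csv
image_set_name,external_url_large,external_url_small,locale,concrete_sku,abstract_sku
default,https://example.com/images/large/001.jpg,https://example.com/images/small/001.jpg,de_DE,,001
default,https://example.com/images/large/001_1.jpg,https://example.com/images/small/001_1.jpg,de_DE,001_1,
```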
Save the new file under data/import/*.
That’s it - your import file is ready. Now you have to configure the data importer.
Configure the data importer
Every import type needs its own DataImporterConfiguration. To add it, open the DataImportConfig class and add a constant for the import type.
The constant is used to identify an import type. More information about it will follow later in this article. In our case, we will use const IMPORT_TYPE_PRODUCT_IMAGE = 'product-image';.
You also need to define the new data importer in the configuration YML file. Add the following lines to the /data/import/config/full_import_config.yml configuration file:
    actions:
      ...
      - data_entity: product-image
        source: data/import/icecat_biz_data/product_image.csv

where:
- data_entity represents the name of your data importer;
- source indicates the path to your .csv file with data to import.
Create a writer step
Each importer needs at least one step to write the data from the file to a database. You can add as many steps as you need to your DataSetStepBroker.
First, we will create a new class called ProductImageWriterStep in */Zed/DataImport/Business/Model/ProductImage/ with this content:
ProductImageWriterStep

    <?php

    namespace Pyz\Zed\DataImport\Business\Model\ProductImage;

    use Spryker\Zed\DataImport\Business\Model\DataImportStep\DataImportStepInterface;
    use Spryker\Zed\DataImport\Business\Model\DataSet\DataSetInterface;

    class ProductImageWriterStep implements DataImportStepInterface
    {
        const BULK_SIZE = 100;

        /**
         * @param \Spryker\Zed\DataImport\Business\Model\DataSet\DataSetInterface $dataSet
         *
         * @return void
         */
        public function execute(DataSetInterface $dataSet)
        {
            // Temporary debug output: dump the data set, print the location, then stop.
            var_dump($dataSet);
            echo 'Line: ' . __LINE__ . PHP_EOL . 'File: ' . __FILE__ . PHP_EOL;
            die();
        }
    }

Now that the writer has been created, you can wire up your data importer.
Wire up the DataImporter
We now have everything at hand to wire up the new DataImport and run it. To do so:
- Add the following method in DataImportBusinessFactory:

    /**
     * @return \Spryker\Zed\DataImport\Business\Model\DataImporterInterface|\Spryker\Zed\DataImport\Business\Model\DataSet\DataSetStepBrokerAwareInterface
     */
    public function createProductImageImporter(DataImportConfigurationActionTransfer $dataImportConfigurationActionTransfer)
    {
        $dataImporter = $this->getCsvDataImporterFromConfig(
            $this->getConfig()->buildImporterConfigurationByDataImportConfigAction($dataImportConfigurationActionTransfer)
        );

        $dataSetStepBroker = $this->createTransactionAwareDataSetStepBroker(ProductImageWriterStep::BULK_SIZE);
        $dataSetStepBroker
            ->addStep(new ProductImageWriterStep());

        $dataImporter->addDataSetStepBroker($dataSetStepBroker);

        return $dataImporter;
    }

- Add the new DataImporter in DataImportBusinessFactory::getDataImporterByType():

    public function getDataImporterByType(DataImportConfigurationActionTransfer $dataImportConfigurationActionTransfer): ?DataImporterInterface
    {
        switch ($dataImportConfigurationActionTransfer->getDataEntity()) {
            ...
            case DataImportConfig::IMPORT_TYPE_PRODUCT_IMAGE:
                return $this->createProductImageImporter($dataImportConfigurationActionTransfer);
        }
    }

In the DataImportBusinessFactory::getImporter() we now add the new DataImporter with $dataImporterCollection->addDataImporter($this->createProductImageImporter()).
- Register a new console command in ConsoleDependencyProvider to allow execution of the import command from the console. The DataImport module brings a generic console command that can be registered several times, once for each data import type, so you only need to register it as follows:

    /**
     * @param \Spryker\Zed\Kernel\Container $container
     *
     * @return \Symfony\Component\Console\Command\Command[]
     */
    public function getConsoleCommands(Container $container)
    {
        $commands = [
            new DataImportConsole(),
            ...
            new DataImportConsole(DataImportConsole::DEFAULT_NAME . ':' . DataImportConfig::IMPORT_TYPE_PRODUCT_IMAGE),
            ...
        ];

        return $commands;
    }

Now you have a new console command, data:import:product-image, available. When you run vendor/bin/console, you will see a new entry: data:import:product-image. This command executes your "product-image" importer.
Execute the data importer
As mentioned in the previous step, the DataImport module brings a generic command which can be used several times. You need to register it at least once without a constructor argument.
This command will execute all the data importers registered in your full_import_config.yml one by one.
As you already did in the example above, you can add the command with a new name over and over again. This lets you execute only a specific data importer.
When you now execute vendor/bin/console data:import:product-image, you will see the debug output from the ProductImageWriterStep.
There are a lot of options that you can set for an import. If you need to debug one specific line of an import file, you can use offset and limit like this: vendor/bin/console data:import:product-image -o 43 -l 1.
This will then only execute the data set at position 43 of your import file.
The DataImporter catches exceptions by default and continues to import data. In development mode, you can use the throw-exception option to throw the occurred exception instead of catching it.
There are a couple more options. To see them, execute vendor/bin/console data:import:product-image -h; the console command will then print the help page for the given command.
Finalize the data importer
We have made sure that the data importer can be executed, but we only print a debug message right now. We need to do some additional things to really save some data. Follow the steps below to finalize your data importer.
1. Convert logical identifier to foreign keys
As mentioned in the Prerequisites, we cannot use foreign keys in our import file. Instead, the file contains a logical identifier that we now use to get the foreign key of a related entity.
There are several ways to resolve the logical identifier. For example, we could add a new Step, e.g. LocaleNameToIdLocaleStep. However, in our case, it’s better to use a Repository, which provides us with a getter to retrieve the id_locale by its name. We will take this approach and do the following:
- Add LocaleRepository to get the foreign key of a locale by its name:

    <?php

    namespace Pyz\Zed\DataImport\Business\Model\Locale\Repository;

    use Orm\Zed\Locale\Persistence\Map\SpyLocaleTableMap;
    use Orm\Zed\Locale\Persistence\SpyLocaleQuery;

    class LocaleRepository implements LocaleRepositoryInterface
    {
        /**
         * @var array
         */
        protected static $localeMap;

        /**
         * @param string $locale
         *
         * @return int
         */
        public function getIdLocaleByLocale($locale)
        {
            // Load all locales once and serve later lookups from the static map.
            if (!static::$localeMap) {
                $this->loadLocaleMap();
            }

            return static::$localeMap[$locale];
        }

        /**
         * @return void
         */
        private function loadLocaleMap()
        {
            $localeCollection = SpyLocaleQuery::create()
                ->select([SpyLocaleTableMap::COL_ID_LOCALE, SpyLocaleTableMap::COL_LOCALE_NAME])
                ->find();

            foreach ($localeCollection as $locale) {
                static::$localeMap[$locale[SpyLocaleTableMap::COL_LOCALE_NAME]] = $locale[SpyLocaleTableMap::COL_ID_LOCALE];
            }
        }
    }

This Repository is very simple but does what we need right now.
- Change ProductImageWriterStep to use the repository by adding the following code:

    /**
     * @var \Pyz\Zed\DataImport\Business\Model\Locale\Repository\LocaleRepositoryInterface
     */
    protected $localeRepository;

    /**
     * @param \Pyz\Zed\DataImport\Business\Model\Locale\Repository\LocaleRepositoryInterface $localeRepository
     */
    public function __construct(LocaleRepositoryInterface $localeRepository)
    {
        $this->localeRepository = $localeRepository;
    }

Now, we need to inject this new Repository into our ProductImageWriterStep inside of the business factory.
When this is done we can use it like this: $idLocale = $this->localeRepository->getIdLocaleByLocale($dataSet[static::KEY_LOCALE]);
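The KEY_LOCALE constant used above is not defined in the listings so far. Below is a minimal sketch, assuming one constant per CSV header column whose value equals the column name. The stand-alone class name ProductImageDataSetKeys is hypothetical and exists only so the snippet runs on its own. In this article's code, the constants would sit on ProductImageWriterStep and be accessed via static::.

```php
<?php

/**
 * Hypothetical holder for the data set keys used in this article.
 * Assumption: each value equals the matching header column of the CSV file.
 */
class ProductImageDataSetKeys
{
    public const KEY_IMAGE_SET_NAME = 'image_set_name';
    public const KEY_EXTERNAL_URL_LARGE = 'external_url_large';
    public const KEY_EXTERNAL_URL_SMALL = 'external_url_small';
    public const KEY_LOCALE = 'locale';
    public const KEY_CONCRETE_SKU = 'concrete_sku';
    public const KEY_ABSTRACT_SKU = 'abstract_sku';
}

// A plain array stands in for one imported row, keyed by header column.
$row = [
    'image_set_name' => 'default',
    'locale' => 'de_DE',
    'abstract_sku' => '001',
];

// Reading a value through the constant avoids repeating the column name as a string.
$localeName = $row[ProductImageDataSetKeys::KEY_LOCALE];
echo $localeName . PHP_EOL;
```

Keeping the column names in constants means a renamed CSV header only has to be changed in one place.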
We need to add a similar Repository to retrieve the ID of an abstract or concrete product by its SKU. This is then also added to our ProductImageWriterStep as already done with LocaleRepository.
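The lookup-and-cache idea behind such a repository can be sketched without the framework. In the sketch below, the class name CachedSkuResolver, the loader callable, and the in-memory table are all hypothetical stand-ins. A real repository would run a Propel query at the point where the callable is invoked.

```php
<?php

/**
 * Framework-free sketch: resolves a SKU to an ID and remembers the result,
 * so each SKU is looked up at most once per run.
 */
class CachedSkuResolver
{
    /** @var array<string, int> */
    private $resolved = [];

    /** @var callable */
    private $loader;

    /**
     * @param callable $loader Hypothetical stand-in for a database query; returns int or null.
     */
    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function getIdBySku(string $sku): int
    {
        if (!isset($this->resolved[$sku])) {
            $id = ($this->loader)($sku);
            if ($id === null) {
                // Fail loudly instead of writing a row with a missing foreign key.
                throw new InvalidArgumentException(sprintf('No product found for SKU "%s".', $sku));
            }
            $this->resolved[$sku] = $id;
        }

        return $this->resolved[$sku];
    }
}

// Usage: an array stands in for the product table, and a counter tracks loader calls.
$fakeTable = ['001' => 1, '002' => 2];
$queryCount = 0;

$resolver = new CachedSkuResolver(function (string $sku) use ($fakeTable, &$queryCount): ?int {
    $queryCount++;

    return $fakeTable[$sku] ?? null;
});

$firstId = $resolver->getIdBySku('001');
$secondId = $resolver->getIdBySku('001'); // served from the cache, the loader is not called again
```

Throwing on an unknown SKU is a design choice of this sketch: it surfaces broken import rows early instead of silently skipping them.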
2. Find or create entities
We will now create the spy_product_image_set, spy_product_image, and spy_product_image_set_to_product_image entities.
With the first run of an importer, all entities are new and we need to do an insert. When the importer is executed more than once, it updates the existing entities. To implement this approach, we use Propel’s findOneOrCreate() method. Do the following:
- Find or create spy_product_image_set. Add the following code to the ProductImageWriterStep:

    /**
     * @param \Spryker\Zed\DataImport\Business\Model\DataSet\DataSetInterface $dataSet
     *
     * @return \Orm\Zed\ProductImage\Persistence\SpyProductImageSet
     */
    protected function findOrCreateImageSet(DataSetInterface $dataSet)
    {
        $idLocale = $this->getIdLocaleByLocale($dataSet);

        $query = SpyProductImageSetQuery::create()
            ->filterByName($dataSet[static::KEY_IMAGE_SET_NAME])
            ->filterByFkLocale($idLocale);

        if (!empty($dataSet[static::KEY_ABSTRACT_SKU])) {
            $idProductAbstract = $this->productRepository->getIdProductAbstractByAbstractSku($dataSet[static::KEY_ABSTRACT_SKU]);
            $query->filterByFkProductAbstract($idProductAbstract);
        }

        if (!empty($dataSet[static::KEY_CONCRETE_SKU])) {
            $idProduct = $this->productRepository->getIdProductByConcreteSku($dataSet[static::KEY_CONCRETE_SKU]);
            $query->filterByFkProduct($idProduct);
        }

        $productImageSetEntity = $query->findOneOrCreate();

        // Save only when something actually changed.
        if ($productImageSetEntity->isNew() || $productImageSetEntity->isModified()) {
            $productImageSetEntity->save();
        }

        return $productImageSetEntity;
    }

    /**
     * @param \Spryker\Zed\DataImport\Business\Model\DataSet\DataSetInterface $dataSet
     *
     * @return int|null
     */
    protected function getIdLocaleByLocale(DataSetInterface $dataSet)
    {
        $idLocale = null;

        if (!empty($dataSet[static::KEY_LOCALE])) {
            $idLocale = $this->localeRepository->getIdLocaleByLocale($dataSet[static::KEY_LOCALE]);
        }

        return $idLocale;
    }

We want to allow fk_locale to be null. Either fk_product or fk_product_abstract must be set. For performance reasons, we save the entity only when it’s new or modified.
- Find or create the spy_product_image by adding the following code to the ProductImageWriterStep:

    /**
     * We expect that the large URL is the unique identifier for an image.
     *
     * @param \Spryker\Zed\DataImport\Business\Model\DataSet\DataSetInterface $dataSet
     *
     * @return \Orm\Zed\ProductImage\Persistence\SpyProductImage
     */
    protected function findOrCreateImage(DataSetInterface $dataSet)
    {
        $productImageEntity = SpyProductImageQuery::create()
            ->filterByExternalUrlLarge($dataSet[static::KEY_EXTERNAL_URL_LARGE])
            ->findOneOrCreate();

        $productImageEntity
            ->setExternalUrlSmall($dataSet[static::KEY_EXTERNAL_URL_SMALL]);

        if ($productImageEntity->isNew() || $productImageEntity->isModified()) {
            $productImageEntity->save();
        }

        return $productImageEntity;
    }

- Add the relation spy_product_image_set_to_product_image by adding the following code to ProductImageWriterStep:

    /**
     * @param \Orm\Zed\ProductImage\Persistence\SpyProductImageSet $imageSetEntity
     * @param \Orm\Zed\ProductImage\Persistence\SpyProductImage $productImageEntity
     *
     * @return void
     */
    protected function updateOrCreateImageToImageSetRelation(SpyProductImageSet $imageSetEntity, SpyProductImage $productImageEntity)
    {
        $productImageSetToProductImageEntity = SpyProductImageSetToProductImageQuery::create()
            ->filterByFkProductImageSet($imageSetEntity->getIdProductImageSet())
            ->filterByFkProductImage($productImageEntity->getIdProductImage())
            ->findOneOrCreate();

        $productImageSetToProductImageEntity
            ->setSortOrder(0);

        if ($productImageSetToProductImageEntity->isNew() || $productImageSetToProductImageEntity->isModified()) {
            $productImageSetToProductImageEntity->save();
        }
    }

3. Fill the Execute Method:

    /**
     * @param \Spryker\Zed\DataImport\Business\Model\DataSet\DataSetInterface $dataSet
     *
     * @return void
     */
    public function execute(DataSetInterface $dataSet)
    {
        $imageSetEntity = $this->findOrCreateImageSet($dataSet);
        $productImageEntity = $this->findOrCreateImage($dataSet);

        $this->updateOrCreateImageToImageSetRelation($imageSetEntity, $productImageEntity);
    }

4. Run the importer
That’s it! Now when you run the console command vendor/bin/console data:import:product-image, you will see an output similar to this one: