Simplified Data Export
Overview
The new data export system simplifies exporting entities from Spryker to external systems. It combines a declarative YAML-based configuration with PHP plugins, which makes it easier to integrate with third-party systems, analytics platforms, and data warehouses. You can reuse existing repository methods and data transfer objects to export data from Spryker; you only need to define the mapping between the data entities and the fields in the export file.
Installation
This feature requires spryker/data-export version 0.1.6 or later. Install or update the package with Composer:
composer require spryker/data-export:"^0.1.6"
How to export an entity
Simplified Data Export reduces an export to a YAML configuration, a plugin, and a repository method. For example, creating an OrderDataExport requires only the following steps:
1. Add YML Configuration
version: 1

defaults:
  filter_criteria:
    created_at:
      from: "3 week ago 00:00:00"
      to: 'now'
  connection:
    # Connection plugin that implements
    # \Spryker\Service\DataExportExtension\Dependency\Plugin\DataExportConnectionPluginInterface.
    # It connects to the place where files are exported.
    type: file-system
    params:
      # check connection plugin documentation for params

actions:
  - data_entity: order
    destination: '{store}_order.{extension}'
    format:
      type: json
      object: 'order'
    fields:
      field_name_in_export_file: $.fieldNameInYourTransfer
      entity_id: $.entityId
      # Fields can be configured here or in a plugin.
      # YML config is merged with plugin config.
      # YML overrides plugin config fields if they are also defined in plugin.
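The from and to values in filter_criteria look like PHP relative date expressions. The following is a minimal sketch of how such expressions could be resolved into concrete timestamps. It assumes the values follow PHP's standard relative date formats; resolveCriteriaDate() is a hypothetical helper and not part of spryker/data-export, and the fixed "now" is used only to keep the example reproducible.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: resolves a relative date expression against a fixed
 * reference time. Not part of spryker/data-export.
 */
function resolveCriteriaDate(string $expression, DateTimeImmutable $now): DateTimeImmutable
{
    // Before PHP 8.3, modify() returns false for an unparsable expression;
    // from PHP 8.3 on it throws an exception instead.
    $resolved = $now->modify($expression);

    if ($resolved === false) {
        throw new InvalidArgumentException(sprintf('Cannot parse date expression "%s".', $expression));
    }

    return $resolved;
}

// A fixed reference time keeps the example deterministic.
$now = new DateTimeImmutable('2025-01-27 10:00:00', new DateTimeZone('UTC'));

$from = resolveCriteriaDate('3 week ago 00:00:00', $now);
$to = resolveCriteriaDate('now', $now);

echo $from->format('Y-m-d H:i:s'), ' .. ', $to->format('Y-m-d H:i:s'), PHP_EOL;
```

Resolving both bounds against the same reference time keeps the exported window consistent within one run.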
2. Implement a Plugin
class OrderDataEntityReaderPlugin extends AbstractPlugin implements DataEntityReaderPluginInterface, DataEntityFieldsConfigPluginInterface
{
    public function getDataEntity(): string
    {
        return 'order';
    }

    public function getDataBatch(DataExportConfigurationTransfer $dataExportConfigurationTransfer): DataExportBatchTransfer
    {
        $criteriaTransfer = (new SomeCriteriaTransfer())
            ->fromArray($dataExportConfigurationTransfer->getFilterCriteria());

        return $this->getRepository()->getOrderData(
            $criteriaTransfer,
            $dataExportConfigurationTransfer->getOffsetOrFail(),
            $dataExportConfigurationTransfer->getBatchSizeOrFail(),
        );
    }

    public function getFieldsConfig(): array
    {
        return [
            'field_name_in_config' => '$.fieldNameInYourDataTransfer',
        ];
    }
}
This plugin links the data entity (order) with its data retrieval logic and field mapping.
You can define the fields configuration either in the YAML file or in a plugin that implements DataEntityFieldsConfigPluginInterface.
Register the plugin
Add the plugin to your module’s dependency provider:
<?php

namespace Pyz\Zed\DataExport;

use Pyz\Zed\Sales\Communication\Plugin\DataExport\OrderDataEntityReaderPlugin;
use Spryker\Zed\DataExport\DataExportDependencyProvider as SprykerDataExportDependencyProvider;

class DataExportDependencyProvider extends SprykerDataExportDependencyProvider
{
    /**
     * @return array<\Spryker\Zed\DataExportExtension\Dependency\Plugin\DataEntityReaderPluginInterface>
     */
    protected function getDataEntityReaderPlugins(): array
    {
        return [
            new OrderDataEntityReaderPlugin(),
        ];
    }
}
3. Implement Repository Logic
Implement the repository method that retrieves and prepares data for export:
public function getOrderData(
    SomeCriteriaTransfer $criteriaTransfer,
    int $offset,
    int $limit
): DataExportBatchTransfer {
    $query = $this->getFactory()->getSpySalesOrderQuery();

    // The plugin passes the criteria transfer built from filter_criteria.
    $this->applyFilters($query, $criteriaTransfer);
    $query->setOffset($offset)->setLimit($limit);

    $orderEntities = $query->find();

    $dataExportBatchTransfer = (new DataExportBatchTransfer())
        ->setOffset($offset)
        ->setLimit($limit);

    $data = [];
    foreach ($orderEntities as $orderEntity) {
        $data[] = $this->mapExportDataItem($orderEntity);
    }

    return $dataExportBatchTransfer->setData($data);
}

protected function applyFilters(
    SpySalesOrderQuery $query,
    SomeCriteriaTransfer $criteriaTransfer
): void {
    if ($criteriaTransfer->getCreatedAtFrom()) {
        $query->filterByCreatedAt(
            $criteriaTransfer->getCreatedAtFrom(),
            Criteria::GREATER_EQUAL
        );
    }

    if ($criteriaTransfer->getCreatedAtTo()) {
        $query->filterByCreatedAt(
            $criteriaTransfer->getCreatedAtTo(),
            Criteria::LESS_EQUAL
        );
    }
}

protected function mapExportDataItem(SpySalesOrder $orderEntity): OrderTransfer
{
    // Minimal mapping to OrderTransfer; extend it with addresses, items, and totals as needed.
    return (new OrderTransfer())->fromArray($orderEntity->toArray(), true);
}
The repository builds the query (including any joins your export needs), applies the filters from the configuration, and maps the results into a DataExportBatchTransfer.
Field configuration explanation
The fields configuration defines how data from your Transfer objects is mapped to fields in the export output. Each entry specifies:
- Left side – the field name in your export file
- Right side – the field path in the Transfer object returned in the repository collection
For example, if you are exporting orders, the OrderTransfer contains:
/**
* @var \Generated\Shared\Transfer\AddressTransfer|null
*/
protected $billingAddress;
and the AddressTransfer includes:
/**
* @var string
*/
public const PHONE = 'phone';
You can then define in your configuration:
phone: $.billingAddress.phone
Here, $ represents the root OrderTransfer object.
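To illustrate how such a path can be read, here is a minimal sketch that resolves a dotted path against a nested array. It is an illustration only: resolvePath() is a hypothetical helper, the order is represented as a plain array rather than a real OrderTransfer, and the actual mapper in spryker/data-export may work differently.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: walks a "$.a.b.c" path through a nested array.
 * Returns null when any segment is missing.
 */
function resolvePath(array $root, string $path): mixed
{
    if ($path === '$') {
        return $root;
    }

    // Drop the leading "$." root marker and split the rest into segments.
    $segments = explode('.', ltrim(substr($path, 1), '.'));

    $current = $root;
    foreach ($segments as $segment) {
        if (!is_array($current) || !array_key_exists($segment, $current)) {
            return null;
        }
        $current = $current[$segment];
    }

    return $current;
}

// Array stand-in for an OrderTransfer (assumption for this example).
$order = [
    'orderReference' => 'DE--123',
    'billingAddress' => ['phone' => '+49123456789', 'city' => 'Berlin'],
];

echo resolvePath($order, '$.billingAddress.phone'), PHP_EOL;
```

Each segment after $ names one property on the way down, so $.billingAddress.phone reads billingAddress first and phone second.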
The fields configuration defined in the YAML file has higher priority than the configuration defined in the plugin. If the same field is defined in both the plugin and the YAML configuration, the value from the YAML file will be used.
Fields from both sources (plugin and YAML) are merged into the final configuration:
- Fields defined only in the plugin will be included by default
- Fields defined in the YAML file can override or extend those from the plugin
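The merge rules above can be sketched with plain PHP arrays. This is a simplified illustration, not the package's implementation: mergeFieldsConfig() is a hypothetical name, and both sources are assumed to be already normalized to field name => path pairs.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: YAML fields override plugin fields with the same name,
 * and fields that exist in only one source are kept.
 */
function mergeFieldsConfig(array $pluginFields, array $yamlFields): array
{
    // array_replace() keeps the plugin order, replaces values for matching
    // keys, and appends keys that exist only in the YAML configuration.
    return array_replace($pluginFields, $yamlFields);
}

$pluginFields = [
    'phone' => '$.billingAddress.phone',
    'city' => '$.billingAddress.city',
];

$yamlFields = [
    'phone' => '$.shippingAddress.phone',
    'currency' => '$.currencyIsoCode',
];

$merged = mergeFieldsConfig($pluginFields, $yamlFields);

print_r($merged);
```

In this sketch, phone takes the YAML value, city survives from the plugin, and currency is added from YAML.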
Example with OrderTransfer:
The following example demonstrates how to export order data with various mapping scenarios:
- Simple field mapping
- Nested object field mapping
- Array field mapping with wildcards
In plugin:
public function getFieldsConfig(): array
{
    return [
        // Simple field mapping (colon syntax)
        'order_reference:$.orderReference',
        'created_at:$.createdAt',

        // Nested object field mapping
        'billing_address_phone:$.billingAddress.phone',
        'billing_address_city:$.billingAddress.city',

        // Alternative key-value syntax
        'customer_email' => '$.customer.email',

        // Array mapping with wildcards - exports each order item
        // Results in: item_0_sku, item_1_sku, item_2_sku, etc.
        'item_*_sku:$.items.*.sku',
        'item_*_quantity:$.items.*.quantity',
    ];
}
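Because the plugin example mixes the colon syntax with the key-value syntax, it can help to see both reduced to one shape. The sketch below shows one possible normalization into field name => path pairs. normalizeFieldsConfig() is a hypothetical helper written for this illustration; how spryker/data-export parses these entries internally is not shown in this document.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: converts 'name:$.path' list entries and
 * 'name' => '$.path' pairs into a single name => path map.
 */
function normalizeFieldsConfig(array $fields): array
{
    $normalized = [];

    foreach ($fields as $key => $value) {
        if (!is_int($key)) {
            // Key-value syntax is already in the target shape.
            $normalized[$key] = $value;

            continue;
        }

        if (!str_contains($value, ':')) {
            throw new InvalidArgumentException(sprintf('Invalid field definition "%s".', $value));
        }

        // Colon syntax: split at the first colon only.
        [$name, $path] = explode(':', $value, 2);
        $normalized[trim($name)] = trim($path);
    }

    return $normalized;
}

$fields = normalizeFieldsConfig([
    'order_reference:$.orderReference',
    'customer_email' => '$.customer.email',
    'item_*_sku:$.items.*.sku',
]);

print_r($fields);
```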
In YAML configuration:
actions:
  - data_entity: order
    fields:
      # Override the billing phone from plugin
      billing_address_phone: $.billingAddress.phone

      # Add new fields not defined in plugin
      order_total: $.totals.grandTotal
      currency: $.currencyIsoCode

      # Add more nested fields
      shipping_address_city: $.shippingAddress.city
      shipping_address_country: $.shippingAddress.country.name

      # Extend the array mapping with item names and prices
      item_*_name: $.items.*.name
      item_*_price: $.items.*.unitPrice
Merged output:
If a field appears in both configurations, the YAML version replaces the plugin one. Fields from both sources are combined in the final configuration:
# From plugin (kept):
order_reference: $.orderReference
created_at: $.createdAt
billing_address_city: $.billingAddress.city
customer_email: $.customer.email
item_*_sku: $.items.*.sku
item_*_quantity: $.items.*.quantity
# From plugin (overridden by YAML):
billing_address_phone: $.billingAddress.phone
# From YAML (new fields):
order_total: $.totals.grandTotal
currency: $.currencyIsoCode
shipping_address_city: $.shippingAddress.city
shipping_address_country: $.shippingAddress.country.name
item_*_name: $.items.*.name
item_*_price: $.items.*.unitPrice
Resulting export data for an order with 2 items:
{
  "order_reference": "DE--123",
  "created_at": "2025-01-27 10:00:00",
  "billing_address_phone": "+49123456789",
  "billing_address_city": "Berlin",
  "customer_email": "[email protected]",
  "order_total": 15000,
  "currency": "EUR",
  "shipping_address_city": "Munich",
  "shipping_address_country": "Germany",
  "item_0_sku": "SKU-001",
  "item_0_quantity": 2,
  "item_0_name": "Product A",
  "item_0_price": 5000,
  "item_1_sku": "SKU-002",
  "item_1_quantity": 1,
  "item_1_name": "Product B",
  "item_1_price": 5000
}
Or in CSV format:
order_reference,created_at,billing_address_phone,billing_address_city,customer_email,order_total,currency,shipping_address_city,shipping_address_country,item_0_sku,item_0_quantity,item_0_name,item_0_price,item_1_sku,item_1_quantity,item_1_name,item_1_price
DE--123,2025-01-27 10:00:00,+49123456789,Berlin,[email protected],15000,EUR,Munich,Germany,SKU-001,2,Product A,5000,SKU-002,1,Product B,5000
How wildcard mapping works:
When you use * in the export key (for example item_*_sku), the mapper does the following:
- Evaluates the path $.items.*.sku against the OrderTransfer
- Iterates through each item in the items array
- Creates separate fields for each item: item_0_sku, item_1_sku, item_2_sku, etc.
- Replaces the * with the array index
This is useful when you need to export arrays as a flat structure for CSV files or when your destination system expects a fixed column structure.
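The wildcard steps described above can be sketched as follows. This is a deliberately simplified illustration: expandWildcardField() is a hypothetical helper, it handles only one wildcard with a single-level collection and a single-level item property, and it uses a plain array in place of an OrderTransfer.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: expands one wildcard mapping such as
 * 'item_*_sku' => '$.items.*.sku' into one flat field per array index.
 */
function expandWildcardField(array $root, string $exportKey, string $path): array
{
    // "$.items.*.sku" -> collection "items", item property "sku".
    [$collectionKey, $itemKey] = explode('.*.', substr($path, 2), 2);

    $result = [];
    foreach ($root[$collectionKey] ?? [] as $index => $item) {
        // Replace the * in the export key with the current array index.
        $fieldName = str_replace('*', (string)$index, $exportKey);
        $result[$fieldName] = $item[$itemKey] ?? null;
    }

    return $result;
}

// Array stand-in for an OrderTransfer with two items (assumption).
$order = [
    'items' => [
        ['sku' => 'SKU-001', 'quantity' => 2],
        ['sku' => 'SKU-002', 'quantity' => 1],
    ],
];

$skuFields = expandWildcardField($order, 'item_*_sku', '$.items.*.sku');

print_r($skuFields);
```

Note that the number of generated columns depends on the number of items, so orders with different item counts produce different field sets.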
Alternative: Custom streaming with Generator
By default, the data export system uses DataEntityReaderPluginInterface to retrieve data in batches. The system internally wraps your plugin’s getDataBatch() method with a generator that handles pagination automatically:
do {
    $dataExportConfigurationTransfer->setOffset($offset);
    // your plugin's getDataBatch() method
    $dataExportBatchTransfer = $dataEntityReaderPlugin->getDataBatch($dataExportConfigurationTransfer);
    $dataExportBatchTransfer->setOffset($offset)->setLimit($limit);

    yield $dataExportBatchTransfer;

    $offset += count($dataExportBatchTransfer->getData());
} while (count($dataExportBatchTransfer->getData()) === $limit);
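To see how this loop terminates, the same pattern can be run standalone with an in-memory data source. The sketch below is an illustration, not the package code: paginate() and the array-backed reader are hypothetical, and plain arrays stand in for DataExportBatchTransfer.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for the pagination wrapper: keeps requesting batches
 * until a batch comes back smaller than the limit.
 */
function paginate(callable $readBatch, int $limit): Generator
{
    $offset = 0;

    do {
        $batch = $readBatch($offset, $limit);

        yield $batch;

        $offset += count($batch);
    } while (count($batch) === $limit);
}

// In-memory rows play the role of the repository (assumption).
$rows = range(1, 5);
$reader = static fn (int $offset, int $limit): array => array_slice($rows, $offset, $limit);

$batches = iterator_to_array(paginate($reader, 2), false);

print_r($batches);
```

With this stop condition, a data set whose size is an exact multiple of the limit produces one final empty batch before the loop ends, so consumers should tolerate empty batches.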
When to implement a custom generator
If you need more control over the data streaming process (for example, custom pagination logic, direct database cursor streaming, or memory-optimized iteration), implement DataEntityGeneratorPluginInterface instead of DataEntityReaderPluginInterface.
With DataEntityGeneratorPluginInterface, you have full control over the generator implementation and can yield data items or batches directly without the default pagination wrapper.
// system will use your generator directly
return $this->dataExportPluginProvider
    ->getDataEntityPluginForInterface($dataEntityName, DataEntityGeneratorPluginInterface::class)
    ->getBatchGenerator($dataExportConfigurationTransfer);
<?php

namespace Pyz\Zed\Sales\Communication\Plugin\DataExport;

use Generated\Shared\Transfer\DataExportConfigurationTransfer;
use Generated\Shared\Transfer\DataExportResultTransfer;
// Placeholder criteria transfer; replace it with the transfer your repository expects.
use Generated\Shared\Transfer\SomeCriteriaTransfer;
use Spryker\Zed\DataExportExtension\Dependency\Plugin\DataEntityGeneratorPluginInterface;
use Spryker\Zed\Kernel\Communication\AbstractPlugin;

/**
 * @method \Pyz\Zed\Sales\Persistence\SalesRepositoryInterface getRepository()
 */
class OrderDataEntityGeneratorPlugin extends AbstractPlugin implements DataEntityGeneratorPluginInterface
{
    public function getDataEntity(): string
    {
        return 'order';
    }

    /**
     * @param \Generated\Shared\Transfer\DataExportConfigurationTransfer $dataExportConfigurationTransfer
     *
     * @return \Generator<\Generated\Shared\Transfer\DataExportResultTransfer>
     */
    public function getBatchGenerator(DataExportConfigurationTransfer $dataExportConfigurationTransfer): \Generator
    {
        $criteriaTransfer = (new SomeCriteriaTransfer())
            ->fromArray($dataExportConfigurationTransfer->getFilterCriteria());

        $query = $this->getRepository()->createOrderQuery($criteriaTransfer);

        // Yield one result per order instead of collecting a full batch first.
        foreach ($query->find() as $orderEntity) {
            yield (new DataExportResultTransfer())
                ->setData($orderEntity->toArray());
        }
    }
}
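The benefit of yielding items one by one is that the consumer can write each item as soon as it arrives. The following minimal sketch shows that consumption pattern with plain PHP. generateRows() and writeJsonLines() are hypothetical helpers, the JSON Lines output is only an assumption for the example, and an in-memory stream replaces the real export destination.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical generator: yields rows one at a time instead of
 * building the whole result set in memory.
 */
function generateRows(iterable $source): Generator
{
    foreach ($source as $row) {
        yield $row;
    }
}

/**
 * Hypothetical writer: serializes each row as one JSON line as soon as it
 * is yielded. Returns the number of rows written.
 *
 * @param resource $stream
 */
function writeJsonLines(iterable $rows, $stream): int
{
    $written = 0;

    foreach ($rows as $row) {
        fwrite($stream, json_encode($row, JSON_THROW_ON_ERROR) . "\n");
        $written++;
    }

    return $written;
}

// An in-memory stream stands in for the export destination (assumption).
$stream = fopen('php://memory', 'w+');

$count = writeJsonLines(
    generateRows([
        ['order_reference' => 'DE--123'],
        ['order_reference' => 'DE--124'],
    ]),
    $stream,
);

rewind($stream);
$output = stream_get_contents($stream);
fclose($stream);

echo $output;
```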