# Full Data Loader — Offline / Historical Purchases Onboarding

Created: 12.07.2023

Updated: 31.10.2023

Author: Alexander Pushkin

{% hint style="info" %}
For data onboarding, please contact the team working on your project.
{% endhint %}

## Data Onboarding Objectives

The recommendation platform relies heavily on transaction data as a primary source for a wide range of recommendation algorithms. Purchase events are an important part of the personalization platform implementation and supply this data in real time. In some cases, however, it makes sense to incorporate additional data. The most common cases are:

* providing the recommendation platform with historical data right after the implementation to avoid a "cold start",
* onboarding transaction data from offline stores to improve recommendation quality.

Data can be loaded with or without user identifiers. When loading data with user identifiers, events such as "Login" / "Signup" must be set up as part of the implementation, and the identifiers must match in both the events and the loaded data. Only products available in the feed at the time of data loading are used in calculations. Additionally, all rules applied to the original strategies are applied to the onboarded data (e.g., data range).

[Recommendation Strategies](https://app.gitbook.com/o/EEasDkRclT6ZeuxuncO3/s/gMtBULwktLmojWzNHUD1/platform-interface/recommendation-strategies)

Information about historical / offline purchases can be used in the following ways:

1. Refining product popularity — the onboarded data will participate in the popularity ranking.
2. Data for the "Purchased Together" algorithm — onboarded data is taken into account in calculating scores for products purchased together in the same transaction.
3. Personalized "Affinity" algorithm recommendations: onboarded data (user identifiers are required) is used along with all the other data in the platform.
4. (as of October 2023, feature in development) Excluding products purchased by the user in offline stores or before the platform implementation from recommendations (user identifiers are required).
5. Creating audiences based on historical / offline purchases (user identifiers are required).

## Data Onboarding Procedure

Before onboarding data, it is necessary to understand the objectives of data onboarding and whether user identifiers need to be provided. If you have questions, the team working on your project can assist you in resolving them. The procedure itself consists of the following steps:

1. Discuss use cases where onboarded data is planned to be used, with the team working on your project.
2. Determine the period of onboarded data (for historical data) or the frequency of data onboarding (for offline sales data).
3. Prepare the export in accordance with the format below and make the file available for onboarding via a direct link.
4. Provide the link to the Gravity Field team.
5. Allow up to a week for the initial data onboarding and user profile update.

## Data Structure

Each product is added in a separate row, and products purchased in the same transaction are grouped by the <mark style="color:red;">`transactionId`</mark> identifier.
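The row-per-product layout can be sketched as follows. This is a minimal illustration, not the platform's own export tooling: the nested `transaction` dictionary, the helper names, the column order, the comma delimiter, and the header row are all assumptions, and the second product is made up for the example. Only the field names come from the table in this section.

```python
import csv
import io

# Field names are taken from the table in this section; the order is an assumption.
FIELDS = [
    "cuidType", "cuid", "transactionId", "transactionSource",
    "transactionDatetime", "value", "currency",
    "productId", "quantity", "itemPrice", "transactionItemValue",
]


def transaction_to_rows(transaction):
    """Flatten one transaction (hypothetical dict shape) into one row per product."""
    shared = {key: val for key, val in transaction.items() if key != "items"}
    # Every row repeats the transaction-level fields, including transactionId.
    return [{**shared, **item} for item in transaction["items"]]


def rows_to_csv(rows):
    """Serialize rows to CSV text (comma-delimited with a header row: an assumption)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative transaction; numbers are kept as strings so that the period
# separator and trailing zeros are written exactly as given.
transaction = {
    "cuidType": "hashedEmail",
    "cuid": "b642b4217b34b1e8d3bd915fc65c4452",
    "transactionId": "344778231777174",
    "transactionSource": "Store 1521",
    "transactionDatetime": "2019-08-05T09:45:23+03:00",
    "value": "4583.18",
    "currency": "KZT",
    "items": [
        {"productId": "124350", "quantity": 2,
         "itemPrice": "548.20", "transactionItemValue": "877.12"},
        {"productId": "124351", "quantity": 1,  # made-up second product
         "itemPrice": "3706.06", "transactionItemValue": "3706.06"},
    ],
}

rows = transaction_to_rows(transaction)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Both rows carry the same `transactionId`, which is what ties the two products to one purchase.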

Decimal separator: period (for example, `4583.18`).

Parameters with an asterisk (\*) are mandatory.

<table><thead><tr><th width="187">Field</th><th>Type</th><th>Value Example</th><th>Description</th></tr></thead><tbody><tr><td>cuidType</td><td>string</td><td>hashedEmail</td><td>User identifier type. Must match the identifier passed in the website/app identification events</td></tr><tr><td>cuid</td><td>string</td><td>b642b4217b34b1e8d3bd915fc65c4452</td><td>User identifier</td></tr><tr><td>*transactionId</td><td>string</td><td>344778231777174</td><td>Transaction identifier</td></tr><tr><td>transactionSource</td><td>string</td><td>Store 1521</td><td>Channel identifier for offline purchases (optional)</td></tr><tr><td>*transactionDatetime</td><td>date, ISO 8601</td><td>2019-08-05T09:45:23+03:00</td><td>Transaction date and time in the dateTtime+offset format</td></tr><tr><td>*value</td><td>float</td><td>4583.18</td><td>Transaction value</td></tr><tr><td>*currency</td><td>string, ISO 4217</td><td>KZT</td><td>Currency letter code according to ISO 4217 standard</td></tr><tr><td>*productId</td><td>string</td><td>124350</td><td>Product identifier (must be the same as “sku” in the product feed)</td></tr><tr><td>*quantity</td><td>integer, positive only</td><td>2</td><td>Quantity purchased within the transaction</td></tr><tr><td>itemPrice</td><td>float</td><td>548.20</td><td>Value of one product unit before applying promotions and discounts</td></tr><tr><td>transactionItemValue</td><td>float</td><td>877.12</td><td>Product cart value. Calculated as the unit value multiplied by the number of units in the transaction, minus discounts</td></tr><tr><td>customFields</td><td></td><td></td><td>Additional fields for creating audiences based on onboarded data. You can onboard data that is not present in the product feed.</td></tr></tbody></table>
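Before handing over a file, the rules in the table can be checked locally. The sketch below is one possible pre-flight check, not the validation the platform itself performs: the function name and messages are hypothetical, rows are assumed to be dictionaries of strings (as produced by `csv.DictReader`), the currency check only tests the three-letter shape rather than membership in ISO 4217, and negative amounts are assumed not to occur.

```python
import re
from datetime import datetime

# Fields marked with an asterisk in the table above.
MANDATORY = ("transactionId", "transactionDatetime", "value",
             "currency", "productId", "quantity")

# Digits with an optional fractional part separated by a period.
DECIMAL = re.compile(r"[0-9]+(\.[0-9]+)?")


def validate_row(row):
    """Return a list of problems found in one row (column name -> string)."""
    problems = [
        f"missing mandatory field: {name}"
        for name in MANDATORY
        if not (row.get(name) or "").strip()
    ]
    if problems:
        return problems

    # ISO 8601 date and time with an offset, e.g. 2019-08-05T09:45:23+03:00.
    try:
        if datetime.fromisoformat(row["transactionDatetime"]).tzinfo is None:
            problems.append("transactionDatetime has no UTC offset")
    except ValueError:
        problems.append("transactionDatetime is not ISO 8601")

    # Float fields must use a period as the decimal separator.
    for name in ("value", "itemPrice", "transactionItemValue"):
        text = (row.get(name) or "").strip()
        if text and not DECIMAL.fullmatch(text):
            problems.append(f"{name} is not a period-separated number")

    # Shape check only: three uppercase letters.
    if not re.fullmatch(r"[A-Z]{3}", row["currency"].strip()):
        problems.append("currency is not a three-letter code")

    quantity = row["quantity"].strip()
    if not (re.fullmatch(r"[0-9]+", quantity) and int(quantity) > 0):
        problems.append("quantity is not a positive integer")

    return problems


example_row = {
    "transactionId": "344778231777174",
    "transactionDatetime": "2019-08-05T09:45:23+03:00",
    "value": "4583.18",
    "currency": "KZT",
    "productId": "124350",
    "quantity": "2",
}
print(validate_row(example_row))
```

Returning a list of messages instead of raising on the first error makes it easier to report every issue in a large export in one pass.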

## Checking the Correctness of Data Upload

1. Logs of each upload are available in a dedicated folder on our server with password access (access is provided upon request). The log file name format is <mark style="color:red;">`offline_purchase_upload_log_YYYY-MM-DD-HH-MM-SS.csv`</mark>.
2. After the initial data upload, it is recommended to check that the data appears in the audiences when selecting the "Products purchased offline" condition.
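When several logs accumulate in the folder, the timestamp in the file name can be used to find the most recent one. The following is a small sketch under assumptions: the helper names and the example file names are hypothetical, and the pattern is read as year, month, day, hour, minute, second. The time zone of the timestamp is not stated in this document, so the result is a naive datetime.

```python
from datetime import datetime

LOG_PREFIX = "offline_purchase_upload_log_"
LOG_SUFFIX = ".csv"


def log_timestamp(file_name):
    """Extract the timestamp from a log file name in the documented format."""
    if not (file_name.startswith(LOG_PREFIX) and file_name.endswith(LOG_SUFFIX)):
        raise ValueError(f"unexpected log file name: {file_name}")
    stamp = file_name[len(LOG_PREFIX):-len(LOG_SUFFIX)]
    # YYYY-MM-DD-HH-MM-SS, interpreted as year-month-day-hour-minute-second.
    return datetime.strptime(stamp, "%Y-%m-%d-%H-%M-%S")


def latest_log(file_names):
    """Return the file name with the most recent timestamp."""
    return max(file_names, key=log_timestamp)


# Hypothetical file names, for illustration only.
names = [
    "offline_purchase_upload_log_2023-10-30-09-15-00.csv",
    "offline_purchase_upload_log_2023-10-31-14-05-09.csv",
]
print(latest_log(names))
```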

## Sample Data File

{% file src="<https://3786223776-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FgMtBULwktLmojWzNHUD1%2Fuploads%2F10K2iLvCR15XzDgOyXOd%2Foffline_purchases_data_example.csv?alt=media&token=2eb9738b-35f2-4f2e-ba80-a1a99c4d1a37>" %}
