Full Data Loader — Offline / Historical Purchases Onboarding

Created: 12.07.2023

Updated: 31.10.2023

Author: Alexander Pushkin

For data onboarding, please contact the team working on your project.

Data Onboarding Objectives

The recommendation platform relies extensively on transaction data as a primary data source for a wide array of recommendation algorithms. Purchase events are an important part of the personalization platform implementation and provide a real-time data flow. However, in some cases it makes sense to incorporate additional data. The most common cases are:

  • providing the recommendation platform with historical data right after the implementation to avoid a "cold start",

  • onboarding transaction data from offline stores to improve recommendation quality.

Data can be loaded with or without user identifiers. When loading data with user identifiers, events such as "Login" / "Signup" must be set up as part of the implementation, and the identifiers in those events must match the identifiers in the loaded data. Only products available in the feed at the time of data loading are used in calculations. Additionally, all rules applied to the original strategies are applied to the onboarded data (e.g., data range).
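Because only products present in the feed at load time are used, it can be useful to check the export against the feed before sending it. The following is a minimal sketch, not part of the platform: the function name `split_by_feed` and the sample values are hypothetical, and it assumes the export rows and the set of feed "sku" values are already loaded into memory.

```python
def split_by_feed(rows, feed_skus):
    """Split export rows into those whose productId is in the feed and those that are not."""
    known, unknown = [], []
    for row in rows:
        # productId must equal the "sku" value used in the product feed.
        if row["productId"] in feed_skus:
            known.append(row)
        else:
            unknown.append(row)
    return known, unknown


# Hypothetical sample data: one product is in the feed, one is not.
rows = [
    {"transactionId": "344778231777174", "productId": "124350"},
    {"transactionId": "344778231777174", "productId": "999999"},
]
feed_skus = {"124350"}

known, unknown = split_by_feed(rows, feed_skus)
for row in unknown:
    print("Not in feed, will not be used in calculations:", row["productId"])
```

Rows in `unknown` are not necessarily errors, but they indicate purchases that would not contribute to calculations at load time.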

Recommendation Strategies

Let's have a look at how information about historical / offline purchases can be used.

  1. Refining product popularity: the onboarded data participates in the popularity ranking.

  2. Data for the "Purchased Together" algorithm: onboarded data is taken into account when calculating scores for products purchased together in the same transaction.

  3. Personalized "Affinity" algorithm recommendations: onboarded data (user identifiers are required) is used along with all the other data in the platform.

  4. (as of October 2023, feature in development) Excluding products purchased by the user in offline stores or before the platform implementation from recommendations (user identifiers are required).

  5. Creating audiences based on historical / offline purchases (user identifiers are required).
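To illustrate why rows are grouped by transaction for "Purchased Together", the sketch below counts how often two products appear in the same transaction. This is only an illustration of co-occurrence counting under simple assumptions; it is not the platform's actual scoring method, and the function name `co_purchase_counts` and the sample identifiers are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_purchase_counts(rows):
    """Count how many transactions contain each unordered pair of products."""
    baskets = defaultdict(set)
    for row in rows:
        # Rows that share a transactionId belong to the same purchase.
        baskets[row["transactionId"]].add(row["productId"])

    counts = Counter()
    for products in baskets.values():
        # sorted() makes each pair order-independent: ("A", "B"), never ("B", "A").
        for pair in combinations(sorted(products), 2):
            counts[pair] += 1
    return counts


# Hypothetical onboarded rows: two transactions.
rows = [
    {"transactionId": "t1", "productId": "A"},
    {"transactionId": "t1", "productId": "B"},
    {"transactionId": "t2", "productId": "A"},
    {"transactionId": "t2", "productId": "B"},
    {"transactionId": "t2", "productId": "C"},
]

counts = co_purchase_counts(rows)
print(counts.most_common(1))
```

Products A and B appear together in both transactions, so that pair has the highest count in this sample.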

Data Onboarding Procedure

Before onboarding data, it is necessary to understand the objectives of data onboarding and whether user identifiers need to be provided. If you have questions, the team working on your project can assist you in resolving them. The procedure itself consists of the following steps:

  1. Discuss the use cases in which the onboarded data will be used with the team working on your project.

  2. Determine the period of onboarded data (for historical data) or the frequency of data onboarding (for offline sales data).

  3. Prepare the export in accordance with the format below and make the file available via a direct link.

  4. Provide the link to the Gravity Field team.

  5. Wait for the initial data onboarding and user profile update to complete; this can take up to a week.

Data Structure

Each product is added in a separate row, and products purchased in the same transaction are grouped by the transactionId identifier.

Decimal separator — period

Parameters with an asterisk (*) are mandatory.
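As an illustration of the row layout, the sketch below turns one transaction with two products into two rows that share the same transactionId, using the field names from the table in this section. Several things here are assumptions rather than documented requirements: the CSV file format with a header row, the column order, the two-decimal number formatting, the in-memory `transaction` structure, the helper name `transaction_to_rows`, and the second product's values.

```python
import csv
import io

# Column names follow the field table; the order is an assumption.
FIELDS = [
    "cuidType", "cuid", "transactionId", "transactionSource",
    "transactionDatetime", "value", "currency", "productId",
    "quantity", "itemPrice", "transactionItemValue",
]


def transaction_to_rows(tx):
    """Return one row per purchased product; transaction-level fields repeat on each row."""
    rows = []
    for item in tx["items"]:
        rows.append({
            "cuidType": tx.get("cuidType", ""),
            "cuid": tx.get("cuid", ""),
            "transactionId": tx["transactionId"],
            "transactionSource": tx.get("transactionSource", ""),
            "transactionDatetime": tx["transactionDatetime"],
            # Format specifiers always use a period as the decimal separator.
            "value": f"{tx['value']:.2f}",
            "currency": tx["currency"],
            "productId": item["productId"],
            "quantity": str(item["quantity"]),
            "itemPrice": f"{item['itemPrice']:.2f}",
            "transactionItemValue": f"{item['transactionItemValue']:.2f}",
        })
    return rows


# Hypothetical transaction; the second product is invented for the example.
transaction = {
    "cuidType": "hashedEmail",
    "cuid": "b642b4217b34b1e8d3bd915fc65c4452",
    "transactionId": "344778231777174",
    "transactionSource": "Store 1521",
    "transactionDatetime": "2019-08-05T09:45:23+03:00",
    "value": 4583.18,
    "currency": "KZT",
    "items": [
        {"productId": "124350", "quantity": 2, "itemPrice": 548.20, "transactionItemValue": 877.12},
        {"productId": "124351", "quantity": 1, "itemPrice": 3706.06, "transactionItemValue": 3706.06},
    ],
}

rows = transaction_to_rows(transaction)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Please confirm the actual file format and column order with the team working on your project before preparing the export.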

| Field | Type | Value Example | Description |
| --- | --- | --- | --- |
| cuidType | string | hashedEmail | User identifier type. Must match the identifier passed in the website/app identification events |
| cuid | string | b642b4217b34b1e8d3bd915fc65c4452 | User identifier |
| *transactionId | string | 344778231777174 | Transaction identifier |
| transactionSource | string | Store 1521 | Channel identifier for offline purchases (optional) |
| *transactionDatetime | date, ISO 8601 | 2019-08-05T09:45:23+03:00 | Transaction date and time in the dateTime+offset format |
| *value | float | 4583.18 | Transaction value |
| *currency | string, ISO 4217 | KZT | Currency letter code according to the ISO 4217 standard |
| *productId | string | 124350 | Product identifier (must be the same as “sku” in the product feed) |
| *quantity | integer, positive only | 2 | Quantity purchased within the transaction |
| itemPrice | float | 548.20 | Value of one product unit before applying promotions and discounts |
| transactionItemValue | float | 877.12 | Product cart value. Calculated as the unit value multiplied by the number of units in the transaction, minus discounts |
| customFields | | | Additional fields for creating audiences based on onboarded data. You can onboard data that is not present in the product feed. |
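Before sending the file, the rules in the table can be checked row by row. The sketch below is one possible pre-flight check, not the validation the platform itself performs. It assumes each row has been read as a dictionary of strings (for example with `csv.DictReader`); the function name `validate_row` is hypothetical, and the currency check only tests the three-letter shape, not membership in ISO 4217.

```python
from datetime import datetime

# Fields marked with an asterisk in the table.
MANDATORY = ("transactionId", "transactionDatetime", "value",
             "currency", "productId", "quantity")


def validate_row(row):
    """Return a list of problems found in one export row (an empty list means none found)."""
    problems = [f"missing {name}" for name in MANDATORY if not row.get(name)]
    if problems:
        return problems

    try:
        parsed = datetime.fromisoformat(row["transactionDatetime"])
        if parsed.tzinfo is None:
            problems.append("transactionDatetime has no UTC offset")
    except ValueError:
        problems.append("transactionDatetime is not in ISO 8601 format")

    try:
        # float() rejects a comma as the decimal separator, e.g. "4583,18".
        float(row["value"])
    except ValueError:
        problems.append("value is not a number with a period as the decimal separator")

    try:
        if int(row["quantity"]) <= 0:
            problems.append("quantity must be positive")
    except ValueError:
        problems.append("quantity is not an integer")

    currency = row["currency"]
    if not (len(currency) == 3 and currency.isalpha() and currency.isupper()):
        problems.append("currency is not a three-letter uppercase code")

    return problems


good_row = {
    "transactionId": "344778231777174",
    "transactionDatetime": "2019-08-05T09:45:23+03:00",
    "value": "4583.18",
    "currency": "KZT",
    "productId": "124350",
    "quantity": "2",
}
bad_row = dict(good_row, value="4583,18", quantity="0")

print(validate_row(good_row))
print(validate_row(bad_row))
```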

Checking the Correctness of Data Upload

  1. Logs of each upload are available in a dedicated folder on our server with password access (access is provided upon request). The log file name format is offline_purchase_upload_log_YYYY-MM-DD-HH-MM-SS.csv.

  2. After the initial data upload, it is recommended to check that the data appears in the audiences when selecting the "Products purchased offline" condition.
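When several upload logs accumulate, the timestamp in the file name can be used to find the most recent one. The sketch below parses the documented name format offline_purchase_upload_log_YYYY-MM-DD-HH-MM-SS.csv; the helper names and the sample file names are hypothetical, and how the files are listed or downloaded from the server is outside its scope.

```python
from datetime import datetime

LOG_PREFIX = "offline_purchase_upload_log_"
LOG_SUFFIX = ".csv"


def log_timestamp(file_name):
    """Extract the upload timestamp from a log file name."""
    if not (file_name.startswith(LOG_PREFIX) and file_name.endswith(LOG_SUFFIX)):
        raise ValueError(f"unexpected log file name: {file_name}")
    stamp = file_name[len(LOG_PREFIX):-len(LOG_SUFFIX)]
    # YYYY-MM-DD-HH-MM-SS
    return datetime.strptime(stamp, "%Y-%m-%d-%H-%M-%S")


def latest_log(file_names):
    """Return the name of the most recent log file."""
    return max(file_names, key=log_timestamp)


# Hypothetical file names following the documented format.
names = [
    "offline_purchase_upload_log_2023-10-30-09-15-00.csv",
    "offline_purchase_upload_log_2023-10-31-14-05-09.csv",
]
print(latest_log(names))
```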

Sample Data File
