Full Data Loader — Offline / Historical Purchases Onboarding
Created: 12.07.2023
Updated: 31.10.2023
Author: Alexander Pushkin
The recommendation platform relies extensively on transaction data as a primary data source for a wide array of recommendation algorithms. Purchase events are an important part of the personalization platform implementation and provide a real-time data flow. However, in some situations it makes sense to load additional data. The most common cases are:
providing the recommendation platform with historical data right after the implementation to avoid a "cold start",
onboarding transaction data from offline stores to improve recommendation quality.
Data can be loaded with or without user identifiers. When loading data with user identifiers, events such as "Login" / "Signup" must be set up as part of the implementation, and the identifiers in those events must match the identifiers in the loaded data. Only products available in the feed at the time of data loading are used in calculations. Additionally, all rules applied to the original strategies are also applied to the onboarded data (e.g., date range).
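Because only products present in the feed are used, it can be worth checking an export against the feed before handing it over. The following is a minimal sketch of such a check, not part of the platform: the function name is hypothetical, and it assumes the export rows are already loaded as dictionaries with a productId key and that the feed "sku" values are available as a plain collection of strings.

```python
from collections import Counter


def products_missing_from_feed(rows, feed_skus):
    """Count export rows per productId that has no matching "sku" in the feed.

    rows      -- iterable of dicts with a "productId" key (assumed shape)
    feed_skus -- iterable of "sku" strings taken from the product feed
    """
    known = set(feed_skus)
    return Counter(
        row["productId"] for row in rows if row["productId"] not in known
    )


# Illustrative data only: "999" is a made-up identifier absent from the feed.
example_rows = [
    {"productId": "124350"},
    {"productId": "999"},
    {"productId": "999"},
]
missing = products_missing_from_feed(example_rows, ["124350"])
```

A non-empty result lists the identifiers whose rows would not take part in calculations, together with the number of affected rows.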
Let's have a look at how information about historical / offline purchases can be used.
Refining product popularity — the onboarded data will participate in the popularity ranking.
Data for the "Purchased Together" algorithm — onboarded data is taken into account in calculating scores for products purchased together in the same transaction.
Personalized "Affinity" algorithm recommendations — onboarded data (user identifiers are required) is used along with all the other data in the platform.
(as of October 2023, feature in development) Excluding products purchased by the user in offline stores or before the platform implementation from recommendations (user identifiers are required).
Creating audiences based on historical / offline purchases (user identifiers are required).
Before onboarding data, it is necessary to understand the objectives of data onboarding and whether user identifiers need to be provided. If you have questions, the team working on your project can assist you in resolving them. The procedure itself consists of the following steps:
Discuss use cases where onboarded data is planned to be used, with the team working on your project.
Determine the period of onboarded data (for historical data) or the frequency of data onboarding (for offline sales data).
Prepare the export in accordance with the format below and make the file available for download via a direct link.
Provide the link to the Gravity Field team.
Initial data onboarding and user profile update can take up to a week.
Each product is added in a separate row, and products purchased in the same transaction are grouped by the transactionId identifier.
Decimal separator — period
Parameters with an asterisk (*) are mandatory.
| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| cuidType | string | hashedEmail | User identifier type. Must match the identifier passed in the website/app identification events |
| cuid | string | b642b4217b34b1e8d3bd915fc65c4452 | User identifier |
| *transactionId | string | 344778231777174 | Transaction identifier |
| transactionSource | string | Store 1521 | Channel identifier for offline purchases (optional) |
| *transactionDatetime | date, ISO 8601 | 2019-08-05T09:45:23+03:00 | Transaction date and time in the dateTime+offset format |
| *value | float | 4583.18 | Transaction value |
| *currency | string, ISO 4217 | KZT | Currency letter code according to the ISO 4217 standard |
| *productId | string | 124350 | Product identifier (must be the same as "sku" in the product feed) |
| *quantity | integer, positive only | 2 | Quantity purchased within the transaction |
| itemPrice | float | 548.20 | Value of one product unit before applying promotions and discounts |
| transactionItemValue | float | 877.12 | Product cart value. Calculated as the unit value multiplied by the number of units in the transaction, minus discounts |
| customFields | | | Additional fields for creating audiences based on onboarded data. You can onboard data that is not present in the product feed. |
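The format can be illustrated with a short sketch that expands one transaction into one row per product, computes transactionItemValue, and runs basic checks on the mandatory parameters. Several things here are assumptions rather than documented behavior: the CSV container and column order, the shape of the input dictionary, the function names, the discount amount, and the second product, which is invented for illustration. The first product uses the example values from the parameter list.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

# Column names come from the parameter list; their order and the CSV
# container are assumptions made for this sketch.
COLUMNS = [
    "cuidType", "cuid", "transactionId", "transactionSource",
    "transactionDatetime", "value", "currency", "productId",
    "quantity", "itemPrice", "transactionItemValue",
]
MANDATORY = [
    "transactionId", "transactionDatetime", "value",
    "currency", "productId", "quantity",
]


def transaction_to_rows(transaction):
    """Expand one transaction (hypothetical dict shape) into one row per product."""
    shared = {k: v for k, v in transaction.items() if k != "items"}
    rows = []
    for item in transaction["items"]:
        price = Decimal(item["itemPrice"])
        quantity = int(item["quantity"])
        discount = Decimal(item.get("discount", "0"))
        rows.append({
            **shared,
            "productId": item["productId"],
            "quantity": str(quantity),
            "itemPrice": str(price),
            # Unit price times quantity, minus the discount on this line.
            "transactionItemValue": str(price * quantity - discount),
        })
    return rows


def validate_row(row):
    """Return a list of problems found in one row (empty list means it passed)."""
    problems = [f"missing {name}" for name in MANDATORY if not row.get(name)]
    try:
        parsed = datetime.fromisoformat(row.get("transactionDatetime", ""))
        if parsed.tzinfo is None:
            problems.append("transactionDatetime has no UTC offset")
    except ValueError:
        problems.append("transactionDatetime is not ISO 8601")
    quantity = row.get("quantity", "")
    if not quantity.isdecimal() or int(quantity) < 1:
        problems.append("quantity must be a positive integer")
    currency = row.get("currency", "")
    # Shape check only; this does not look the code up in ISO 4217.
    if not (len(currency) == 3 and currency.isalpha() and currency.isupper()):
        problems.append("currency must be a three-letter code")
    return problems


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line; absent optional fields stay empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = {
    "cuidType": "hashedEmail",
    "cuid": "b642b4217b34b1e8d3bd915fc65c4452",
    "transactionId": "344778231777174",
    "transactionSource": "Store 1521",
    "transactionDatetime": "2019-08-05T09:45:23+03:00",
    "value": "4583.18",
    "currency": "KZT",
    "items": [
        # 548.20 * 2 = 1096.40; the 219.28 discount is illustrative.
        {"productId": "124350", "quantity": 2, "itemPrice": "548.20", "discount": "219.28"},
        # Hypothetical second product in the same transaction.
        {"productId": "124351", "quantity": 1, "itemPrice": "3706.06"},
    ],
}
rows = transaction_to_rows(sample)
```

Both rows carry the same transactionId, which is what groups them into one transaction. Decimal is used instead of float so that amounts are written with a period as the decimal separator and without binary rounding noise.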
Logs of each upload are available in a dedicated folder on our server with password access (access is provided upon request). The log file name format is offline_purchase_upload_log_YYYY-MM-DD-HH-MM-SS.csv.
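When several uploads have been made, the timestamp in the log file name can be used to find the most recent log. The sketch below parses that timestamp from a name in the stated format. The function name is hypothetical, and since the time zone of the timestamp is not specified, the result is left as a naive datetime.

```python
import re
from datetime import datetime

# Matches offline_purchase_upload_log_YYYY-MM-DD-HH-MM-SS.csv
LOG_NAME = re.compile(
    r"^offline_purchase_upload_log_(\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})\.csv$"
)


def log_timestamp(file_name):
    """Return the upload time encoded in a log file name, or None if it does not match."""
    match = LOG_NAME.match(file_name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y-%m-%d-%H-%M-%S")
    except ValueError:
        # Digits in the right places but not a real date, e.g. month 13.
        return None


# Illustrative file name following the documented pattern.
stamp = log_timestamp("offline_purchase_upload_log_2023-10-31-14-05-09.csv")
```

Sorting file names by the value returned from log_timestamp gives the logs in upload order.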
After the initial data upload, it is recommended to check that the data appears in the audiences when selecting the "Products purchased offline" condition.