Unifier API Documentation

Welcome to the Unifier API documentation. This guide explains how to use the Unifier API to query time-series data with precise bi-temporal controls.

Overview

The Unifier API provides access to harmonized time-series datasets across multiple domains. All data is accessible programmatically through our API or client libraries, with a focus on consistent data models and reliable delivery.

Installation

pip install --upgrade unifier

Authentication

Unifier uses API tokens to authenticate all requests. You can view and manage your tokens through the Unifier user portal.

from unifier import unifier

# Set credentials directly
unifier.user = 'your_username'
unifier.token = 'your_api_token'
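
Hard-coding a token in source files makes it easy to leak. As a minimal sketch, the credentials could be read from environment variables instead. The variable names UNIFIER_USER and UNIFIER_TOKEN and the helper load_credentials are hypothetical, chosen for this illustration only; the Unifier library does not define them.

```python
import os


def load_credentials(env=None):
    """Return (user, token) read from environment variables.

    UNIFIER_USER and UNIFIER_TOKEN are hypothetical names used by this
    sketch; they are not defined by the Unifier library.
    """
    env = os.environ if env is None else env
    user = env.get("UNIFIER_USER")
    token = env.get("UNIFIER_TOKEN")
    pairs = (("UNIFIER_USER", user), ("UNIFIER_TOKEN", token))
    missing = [name for name, value in pairs if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return user, token


# Usage (requires the unifier package and both variables to be set):
# from unifier import unifier
# unifier.user, unifier.token = load_credentials()
```

Failing early with a clear message when a variable is missing tends to be easier to debug than a later authentication error.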

Point-in-Time Queries

Unifier's core feature is support for bi-temporal queries over time-series data. A query can retrieve data as it was known at a specific point in time (the as-of date) while also restricting results to the business time range you're interested in.

Key Concepts

asof_date: The point-in-time timestamp. The query returns data as it was known on that date. If set to None, the most recent asof_date based on your up_to parameter is used.
asof_back_to: Defines the start date for querying a range of as-of dates.
back_to: Defines the start of the business time range (inclusive).
up_to: Defines the end of the business time range (inclusive).

Bi-temporal Query Model

Unifier's bi-temporal query model allows you to:

Access historical snapshots of data (as it was known at a specific point in time)
Query data across a range of as-of dates for different identifiers
Filter data within specific business date ranges
Combine time-series data from multiple sources with consistent datetime handling
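
The model above can be illustrated locally without calling the API. The following is a conceptual sketch only: the record layout, the sample values, and the helper names snapshot and revisions are hypothetical, and treating both ends of the as-of range as inclusive is an assumption. It does not describe how the Unifier service is implemented.

```python
from datetime import date

# Hypothetical records: (key, business_date, asof_date, value).
# Each row is one version of an observation; a later asof_date is a revision.
RECORDS = [
    ("AAPL", date(2025, 4, 14), date(2025, 4, 15), 100.0),
    ("AAPL", date(2025, 4, 14), date(2025, 4, 28), 101.5),  # revised later
    ("AAPL", date(2025, 4, 16), date(2025, 4, 17), 102.0),
    ("AAPL", date(2025, 4, 21), date(2025, 4, 22), 103.0),
]


def snapshot(records, asof_date, back_to=None, up_to=None):
    """Return the latest version of each observation known on asof_date,
    restricted to business dates in [back_to, up_to] (both inclusive)."""
    latest = {}
    for key, business_date, known_on, value in records:
        if known_on > asof_date:
            continue  # not yet known at the point in time
        if back_to is not None and business_date < back_to:
            continue
        if up_to is not None and business_date > up_to:
            continue
        current = latest.get((key, business_date))
        if current is None or known_on > current[0]:
            latest[(key, business_date)] = (known_on, value)
    return {k: v[1] for k, v in sorted(latest.items())}


def revisions(records, asof_back_to, asof_date):
    """Return every version whose as-of date lies in [asof_back_to, asof_date].
    Inclusive bounds on the as-of range are an assumption of this sketch."""
    return [r for r in records if asof_back_to <= r[2] <= asof_date]


early = snapshot(RECORDS, date(2025, 4, 25), date(2025, 4, 12), date(2025, 4, 18))
late = snapshot(RECORDS, date(2025, 4, 30), date(2025, 4, 12), date(2025, 4, 18))
print(early)
print(late)
print(revisions(RECORDS, date(2025, 4, 20), date(2025, 4, 30)))
```

The two snapshots differ only for the 2025-04-14 observation: the revision recorded on 2025-04-28 is invisible to a query as of 2025-04-25 but visible as of 2025-04-30.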

Parameters

The following parameters can be used when querying the Unifier API:

Parameter	Type	Description	Required
`name`	string	Name of the dataset or view to query	Yes
`asof_date`	string (ISO date)	The point-in-time timestamp for querying data. If None, uses most recent asof_date based on up_to	No
`asof_back_to`	string (ISO date)	Start date for querying a range of as-of dates	No
`back_to`	string (ISO date)	Start date for the business time range filter (inclusive)	No
`up_to`	string (ISO date)	End date for the business time range filter (inclusive)	No
`key`	string	Single key value used to filter the dataset	No
`keys`	array	List of key values used to filter the dataset	No
`limit`	integer	Maximum number of records to return	No
`disable_view`	boolean	If true, disables Unifier's view transformation and returns raw data	No
`column_filters`	string	SQL-compatible boolean expression to filter columns/rows server-side	No
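
Catching malformed parameters before a request is sent can save a round trip. The helper below is a sketch under stated assumptions: build_query_kwargs is a hypothetical name, and its rules (ordered date ranges, key and keys not combined, positive limit) are conventions of this example rather than documented API behavior. It also assumes that omitting an optional parameter is equivalent to passing None.

```python
from datetime import date


def build_query_kwargs(asof_date=None, asof_back_to=None, back_to=None,
                       up_to=None, key=None, keys=None, limit=None):
    """Validate query parameters locally and return only those that are set.

    The checks are conventions of this sketch, not documented API behavior.
    """
    params = {"asof_date": asof_date, "asof_back_to": asof_back_to,
              "back_to": back_to, "up_to": up_to}
    parsed = {}
    for name, value in params.items():
        if value is not None:
            # date.fromisoformat raises ValueError for malformed dates
            parsed[name] = date.fromisoformat(value)
    if ("back_to" in parsed and "up_to" in parsed
            and parsed["back_to"] > parsed["up_to"]):
        raise ValueError("back_to must not be later than up_to")
    if ("asof_back_to" in parsed and "asof_date" in parsed
            and parsed["asof_back_to"] > parsed["asof_date"]):
        raise ValueError("asof_back_to must not be later than asof_date")
    if key is not None and keys is not None:
        raise ValueError("pass either key or keys, not both")
    if limit is not None and limit <= 0:
        raise ValueError("limit must be a positive integer")
    kwargs = dict(params, key=key, keys=keys, limit=limit)
    return {k: v for k, v in kwargs.items() if v is not None}


kwargs = build_query_kwargs(back_to="2025-04-12", up_to="2025-04-18", key="AAPL")
print(kwargs)

# Usage (requires the unifier package and credentials):
# df = unifier.get_dataframe(name="xtech_apollo_group_aggregates_green_buffalo", **kwargs)
```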

Unifier Methods

The Unifier Python library provides several methods for interacting with the API:

get_dataframe(name, **kwargs): Returns query results as a pandas DataFrame.
get_json(name, **kwargs): Returns query results as a list of JSON objects.
query(name, **kwargs): Returns the raw response from the API.
get_asof_dates(name): Returns available as-of dates for a dataset as a pandas DataFrame.

Data Replication

Overview

The replicate() method downloads full datasets directly to your local machine. It uses rclone for high-performance downloads if available and automatically falls back to a native Python implementation (boto3) if rclone is not installed; no configuration is needed.

Installation

pip install --upgrade unifier

For best performance, also install rclone from https://rclone.org/downloads/ (optional).

Method Signature

unifier.replicate(
    name: str,
    target_location: str,
    asof_date: str = None,
    back_to: str = None,
    up_to: str = None,
    bandwidth_limit: int = None,
    use_rclone: bool = True
)

Parameters

Parameter	Type	Required	Description
`name`	str	✅ Yes	The dataset name to replicate
`target_location`	str	✅ Yes	Local directory path to download files into
`asof_date`	str	No	Download data as of a specific date (format: YYYY-MM-DD)
`back_to`	str	No	Start of the business date range to replicate (format: YYYY-MM-DD)
`up_to`	str	No	End of the business date range to replicate (format: YYYY-MM-DD)
`bandwidth_limit`	int	No	Cap download speed in MB/s (rclone only)
`use_rclone`	bool	No	Set to False to force the native Python downloader. Default: True

Example 1 — Download a Date Range

from unifier import unifier
unifier.user = 'your_username'
unifier.token = 'your_api_token'

# Download data between two dates
unifier.replicate(
    name="xtech_macro_us_core_predictions",
    target_location="./data/downloads",
    back_to="2025-01-01",
    up_to="2025-06-30"
)

Example 2 — Download a Specific As-of Date Snapshot

# Download data as it existed on a specific point-in-time date
unifier.replicate(
    name="xtech_macro_us_core_predictions",
    target_location="./data/snapshots",
    asof_date="2025-04-04"
)

Example 3 — Limit Download Speed (rclone only)

# Limit bandwidth to 10 MB/s to avoid network saturation
unifier.replicate(
    name="xtech_macro_us_core_predictions",
    target_location="./data/downloads",
    back_to="2025-01-01",
    up_to="2025-12-31",
    bandwidth_limit=10
)

Example 4 — Force Native Python Downloader (No rclone required)

# Force the native Python implementation (useful in corporate environments
# where rclone cannot be installed)
unifier.replicate(
    name="xtech_macro_us_core_predictions",
    target_location="./data/downloads",
    back_to="2025-01-01",
    up_to="2025-06-30",
    use_rclone=False
)
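
After a replication run, it can be useful to check what actually landed in target_location. The sketch below only walks the directory with the standard library; summarize_download is a hypothetical helper, and nothing is assumed about the file names or layout that replicate() produces.

```python
from pathlib import Path


def summarize_download(target_location):
    """Return (file_count, total_bytes) for all files under target_location."""
    root = Path(target_location)
    files = [p for p in root.rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


# Usage, after unifier.replicate(..., target_location="./data/downloads"):
# count, size = summarize_download("./data/downloads")
# print(f"{count} files, {size / 1024 ** 2:.1f} MiB")
```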

How It Works

If rclone is installed and use_rclone=True (default): uses rclone for fast, resumable downloads.
If rclone is not installed, or use_rclone=False: uses the native Python downloader automatically, with no action required from the user.
The native Python downloader uses parallel downloads (up to 10 concurrent files) and multipart transfers for files larger than 25 MB.
Progress is printed to the console throughout the download.
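
The selection rules above can be sketched as follows. This illustrates the documented behavior and is not the library's implementation: choose_downloader and plan_native_transfers are hypothetical names, the file names are made up, and reading "25 MB" as 25 MiB is an assumption.

```python
import shutil

MAX_CONCURRENT_FILES = 10                   # documented upper bound
MULTIPART_THRESHOLD_BYTES = 25 * 1024 ** 2  # "25 MB", read here as MiB (assumption)


def choose_downloader(use_rclone=True, which=shutil.which):
    """Return "rclone" when it is requested and found on PATH, else "native"."""
    if use_rclone and which("rclone") is not None:
        return "rclone"
    return "native"


def plan_native_transfers(file_sizes):
    """Group files by transfer type and pick a worker count.

    file_sizes maps a file name to its size in bytes.
    """
    plan = {"single": [], "multipart": []}
    for name, size in sorted(file_sizes.items()):
        group = "multipart" if size > MULTIPART_THRESHOLD_BYTES else "single"
        plan[group].append(name)
    # Never start more workers than files, and never more than the cap.
    plan["workers"] = min(MAX_CONCURRENT_FILES, max(1, len(file_sizes)))
    return plan


print(choose_downloader(use_rclone=False))
print(plan_native_transfers({"a.bin": 1024, "b.bin": 30 * 1024 ** 2}))
```

Passing which as a parameter keeps the PATH lookup replaceable, which makes the fallback easy to exercise on machines with or without rclone.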

Sample Queries

Basic Query with Business Time Range

This example queries data for Apple (AAPL) within a specific date range:

df = unifier.get_dataframe(
    name="xtech_apollo_group_aggregates_green_buffalo",
    back_to="2025-04-12",
    up_to="2025-04-18",
    key="AAPL",
    asof_date=None  # Uses the latest available data
)

Example output:

#	trade_date	sector	exchange_country	reportable_product	product_name	xt_group	direction	long_dollar_value
614	20241217	EQY	USA	YM	CBOT MINI DOW $5 MULTIPLIER FU	Fund Managers	1	126566210.0
406	20241217	EQY	USA	ES	CME E-MINI S&P 500 FUTURE	Other	1	111025837.5
407	20241217	EQY	USA	ES	CME E-MINI S&P 500 FUTURE	Producers/Hedgers	1	182350100.0
408	20241217	INT	DEU	FBT	EUREX EURO-BTP FUTURES	Fund Managers	1	209344989.84
409	20241217	EQY	FRA	FCE	MONEP CAC 40 FUTURE	Fund Managers	1	13761173.70000000
410	20241217	MYS	MYS	FCP	MALAY PALM OIL	Fund Managers	1	139229565.91
411	20241217	DEU	DEU	FDA	EUREX DAX INDEX FUTURE	Fund Managers	1	203754518.76
412	20241217	NRG	GBR	FDK	IFEU FO OTRT MFFOR SGPLTSLM I	Broker-Dealers	-1	368403750.0
413	20241217	EQY	DEU	FDX	EUREX MINI-DAX INDEX FUTURE	Fund Managers	1	15498668.22
414	20241217	EQY	DEU	FES	EUREX DJ EURO STOXX 50 INDEX FU	Fund Managers	1	62982988.584

Query with As-of Date Range

This example queries data across a range of as-of dates for a specific identifier:

df = unifier.get_dataframe(
    name="lseg_us_reuters_polls",
    key="US&CPIM.Q",
    back_to="2025-01-01",
    up_to="2025-04-30",
    asof_back_to="2025-01-01",
    asof_date="2025-05-12"
)
df = df.sort_values('last_revision_date')
df

Point-in-Time Query

This example retrieves data as it was known on April 25, 2025:

df = unifier.get_dataframe(
    name="xtech_apollo_group_aggregates_green_buffalo",
    asof_date="2025-04-25"  # Point-in-time date
)

Combined Point-in-Time and Business Time Range

This example combines both bi-temporal filters:

df = unifier.get_dataframe(
    name="xtech_us_equity_options_flow_1min",
    back_to="2025-04-22",    # Business time range start
    up_to="2025-04-29",      # Business time range end
    asof_date="2025-04-30"   # Point-in-time date
)
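
Because back_to and up_to are both inclusive, splitting a long business range into several smaller queries requires windows that do not share a boundary date. A minimal sketch, assuming a hypothetical helper named date_windows (whether splitting a query is beneficial for a given dataset is also an assumption):

```python
from datetime import date, timedelta


def date_windows(back_to, up_to, days=1):
    """Yield (start, end) ISO date strings covering [back_to, up_to].

    Windows do not overlap: because both bounds are inclusive, each window
    starts the day after the previous one ends.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    start = date.fromisoformat(back_to)
    last = date.fromisoformat(up_to)
    while start <= last:
        end = min(start + timedelta(days=days - 1), last)
        yield start.isoformat(), end.isoformat()
        start = end + timedelta(days=1)


windows = list(date_windows("2025-04-22", "2025-04-29", days=3))
print(windows)

# Usage (requires the unifier package and credentials):
# for start, end in windows:
#     part = unifier.get_dataframe(
#         name="xtech_us_equity_options_flow_1min",
#         back_to=start, up_to=end, asof_date="2025-04-30",
#     )
```

Keeping asof_date fixed across all windows means every part is read from the same point-in-time snapshot.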

Multiple Keys with Time Range

This example queries data for multiple securities:

df = unifier.get_dataframe(
    name="xtech_apollo_group_aggregates_green_buffalo",
    back_to="2025-04-01",
    up_to="2025-04-15",
    keys=["AAPL", "MSFT", "GOOGL"],
    asof_date="2025-04-20"
)

Errors

The Unifier API uses HTTP response codes to indicate the success or failure of requests:

Code	Message	Description
200	OK	The request was successful.
400	Bad Request	The request was invalid or improperly formatted.
401	Unauthorized	Authentication failed or credentials were missing.
403	Forbidden	The provided credentials don't have access to the requested resource.
404	Not Found	The requested resource or dataset was not found.
422	Unprocessable Entity	The request was well-formed but contained invalid parameters.
429	Too Many Requests	Rate limit exceeded.
500	Internal Server Error	An unexpected error occurred on the server.
503	Service Unavailable	The service is temporarily unavailable.

Error Handling

It's recommended to implement proper error handling in your code:

try:
    df = unifier.get_dataframe(
        name="xtech_apollo_group_aggregates_green_buffalo",
        back_to="2025-04-12",
        up_to="2025-04-18",
        key="AAPL"
    )
except Exception as e:
    print(f"Error querying data: {e}")
    # Handle the error appropriately
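
Some of the codes listed above, such as 429 and 503, describe transient conditions where retrying after a pause may succeed. The sketch below shows one way to retry with exponential backoff. How the Unifier library surfaces HTTP status codes is not specified here, so ApiError is a hypothetical exception type, and call_with_retry and the choice of retryable codes are assumptions of this example.

```python
import time

RETRYABLE_STATUS = {429, 503}  # treated as transient in this sketch


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status_code, message=""):
        super().__init__(f"{status_code} {message}".strip())
        self.status_code = status_code


def call_with_retry(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); retry retryable ApiErrors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except ApiError as exc:
            last_try = attempt == attempts - 1
            if exc.status_code not in RETRYABLE_STATUS or last_try:
                raise  # permanent failure, or retries exhausted
            sleep(base_delay * 2 ** attempt)  # waits 1x, 2x, 4x, ... base_delay


# Usage: wrap the query in a function that raises ApiError on failure.
# result = call_with_retry(lambda: run_query())  # run_query is hypothetical
```

Errors such as 400, 401, 403, 404, and 422 are re-raised immediately, since repeating an invalid or unauthorized request is unlikely to help.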