API Reference
TrackingDataset
Container for tracking data and associated metadata.
Supports multiple DataFrame backends:
- Polars (default): pl.DataFrame
- PySpark: pyspark.sql.DataFrame
Attributes:
| Name | Type | Description |
|---|---|---|
| tracking | pl.DataFrame or pyspark.sql.DataFrame | Tracking data. DataFrame type depends on engine parameter. |
| metadata | pl.DataFrame or pyspark.sql.DataFrame | Single-row DataFrame with match-level metadata. |
| teams | pl.DataFrame or pyspark.sql.DataFrame | Team information (2 rows: home and away). |
| players | pl.DataFrame or pyspark.sql.DataFrame | Player information with team associations. |
| periods | pl.DataFrame or pyspark.sql.DataFrame | Period information with period_id, start_frame_id, end_frame_id. |
| engine | str | The DataFrame engine being used ('polars' or 'pyspark'). |
Examples:
>>> from fastforward import secondspectrum
>>> dataset = secondspectrum.load_tracking("tracking.jsonl", "meta.json")
>>> dataset.tracking # pl.DataFrame
>>> dataset.metadata # pl.DataFrame (1 row)
>>> dataset.periods # pl.DataFrame (2+ rows)
pitch_dimensions (property)
Get current pitch dimensions (length, width) in meters.
to_polars
Convert all DataFrames to Polars.
If already using Polars engine, returns self unchanged. For PySpark DataFrames, converts via Arrow/pandas interchange.
Returns:
| Type | Description |
|---|---|
| TrackingDataset | New TrackingDataset with all Polars DataFrames, or self if already Polars. |
Examples:
>>> polars_dataset = dataset.to_polars()
to_pyspark
Convert all DataFrames to PySpark.
If already using PySpark engine, returns self unchanged. For Polars DataFrames, converts via Arrow interchange.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| spark | SparkSession | SparkSession to use. If None, gets or creates one. | None |
Returns:
| Type | Description |
|---|---|
| TrackingDataset | New TrackingDataset with all PySpark DataFrames, or self if already PySpark. |
Examples:
>>> spark_dataset = dataset.to_pyspark()
transform
Transform tracking data to different orientation, dimensions, and/or coordinates.
Transformations are applied in the correct order internally:
1. Orientation (flip), while in CDF/meters
2. Dimensions (zone-based scaling), while in CDF/meters
3. Coordinates (unit/origin conversion), as the last step
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| to_orientation | str | Target orientation. Options include: "static_home_away": home team attacks left-to-right in both halves; "static_away_home": away team attacks left-to-right in both halves. Note: orientation transforms flip x and y around the center. | None |
| to_dimensions | tuple of (float, float) | Target pitch dimensions (length, width) in meters. Uses zone-based scaling to preserve IFAB pitch feature proportions. | None |
| to_coordinates | str | Target coordinate system. Options include: "cdf": center origin, meters (default); "tracab": center origin, centimeters; "opta": bottom-left origin, 0-100 scale; "kloppy": top-left origin, 0-1 scale; "sportvu": top-left origin, meters. | None |
Returns:
| Type | Description |
|---|---|
| TrackingDataset | New dataset with transformed data, or self if no changes needed. |
Examples:
>>> dataset = secondspectrum.load_tracking("tracking.jsonl", "meta.json")
>>> # Single transformation
>>> tracab = dataset.transform(to_coordinates="tracab")
>>> # Multiple transformations (order handled internally)
>>> result = dataset.transform(
... to_orientation="static_away_home",
... to_dimensions=(105.0, 68.0),
... to_coordinates="tracab",
... )
Providers
Each provider module exposes a load_tracking() function that returns a TrackingDataset.
CDF
Load CDF (Common Data Format) tracking data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_data | FileLike | Path to JSONL tracking file, or bytes, or file-like object. Supports: file paths (str/Path), bytes, file objects, URLs, S3 paths, zip files. | required |
| meta_data | FileLike | Path to JSON metadata file, or bytes, or file-like object. Supports: file paths (str/Path), bytes, file objects, URLs, S3 paths, zip files. | required |
| layout | (long, long_ball, wide) | DataFrame layout: "long": ball as a row with team_id="ball", player_id="ball"; "long_ball": ball in separate columns, only player rows; "wide": one row per frame, player_id in column names. | "long" |
| coordinates | cdf | Coordinate system: "cdf": Common Data Format (origin at center). | "cdf" |
| orientation | str | Coordinate orientation: "static_home_away": home attacks right (+x) entire match; "static_away_home": away attacks right (+x) entire match; "home_away": home attacks right 1st half, left 2nd half; "away_home": away attacks right 1st half, left 2nd half; "attack_right": attacking team always attacks right; "attack_left": attacking team always attacks left. | "static_home_away" |
| only_alive | bool | If True, only include frames where ball is in play (ball_state == "alive"). | True |
| exclude_missing_ball_frames | bool | If True, exclude frames where ball coordinates are missing (null). | True |
| include_game_id | bool or str | If True, add game_id column to tracking_df, team_df, and player_df from metadata. If False, no game_id column is added. If str, use the provided string as the game_id value. | True |
| engine | (polars, pyspark) | DataFrame engine to use: "polars": return Polars DataFrames (default); "pyspark": return PySpark DataFrames. | "polars" |
| spark_session | SparkSession | PySpark SparkSession to use. If None and engine="pyspark", will get or create a session automatically. | None |
Returns:
| Type | Description |
|---|---|
| TrackingDataset | Object with .tracking, .metadata, .teams, .players, .periods properties. With engine="polars", these are pl.DataFrame objects; with engine="pyspark", they are PySpark DataFrames. |
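The three layout values differ in where the ball lives and how players are spread over rows and columns. The following is a stdlib-only sketch of that idea, not the library's implementation: the input frame structure and the helper names (to_long, to_long_ball, to_wide) are hypothetical, and the ball_x / {player_id}_x column names follow the naming given in the Tracab section. Whether the wide layout also carries ball_* columns is an assumption.

```python
# Hypothetical single frame: ball position plus two players (x, y, z in meters).
FRAME = {
    "frame_id": 1,
    "ball": (0.0, 0.0, 0.2),
    "players": {
        "p1": ("home", 10.0, 5.0, 0.0),
        "p2": ("away", -8.0, 2.0, 0.0),
    },
}


def to_long(frame):
    """One row per object; the ball is a row with team_id/player_id "ball"."""
    rows = [
        {"frame_id": frame["frame_id"], "team_id": team, "player_id": pid,
         "x": x, "y": y, "z": z}
        for pid, (team, x, y, z) in frame["players"].items()
    ]
    bx, by, bz = frame["ball"]
    rows.append({"frame_id": frame["frame_id"], "team_id": "ball",
                 "player_id": "ball", "x": bx, "y": by, "z": bz})
    return rows


def to_long_ball(frame):
    """Player rows only; ball coordinates are repeated in ball_* columns."""
    bx, by, bz = frame["ball"]
    return [
        {**row, "ball_x": bx, "ball_y": by, "ball_z": bz}
        for row in to_long(frame)
        if row["player_id"] != "ball"
    ]


def to_wide(frame):
    """One row per frame; player_id becomes part of the column name."""
    bx, by, bz = frame["ball"]
    # Assumption: the ball keeps ball_* columns in the wide layout as well.
    row = {"frame_id": frame["frame_id"], "ball_x": bx, "ball_y": by, "ball_z": bz}
    for pid, (_team, x, y, z) in frame["players"].items():
        row[f"{pid}_x"], row[f"{pid}_y"], row[f"{pid}_z"] = x, y, z
    return row
```

For a frame with N players, the long layout yields N + 1 rows, long_ball yields N rows, and wide yields a single row.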
SecondSpectrum
Load SecondSpectrum tracking data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_data | FileLike | Path to JSONL tracking file, or bytes, or file-like object. Supports: file paths (str/Path), bytes, file objects, URLs, S3 paths, zip files. | required |
| meta_data | FileLike | Path to JSON metadata file, or bytes, or file-like object. Supports: file paths (str/Path), bytes, file objects, URLs, S3 paths, zip files. | required |
| layout | (long, long_ball, wide) | DataFrame layout: "long": ball as a row with team_id="ball", player_id="ball"; "long_ball": ball in separate columns, only player rows; "wide": one row per frame, player_id in column names. | "long" |
| coordinates | cdf | Coordinate system: "cdf": Common Data Format (origin at center). | "cdf" |
| orientation | str | Coordinate orientation: "static_home_away": home attacks right (+x) entire match; "static_away_home": away attacks right (+x) entire match; "home_away": home attacks right 1st half, left 2nd half; "away_home": away attacks right 1st half, left 2nd half; "attack_right": attacking team always attacks right; "attack_left": attacking team always attacks left. | "static_home_away" |
| only_alive | bool | If True, only include frames where ball is in play (ball_state == "alive"). | True |
| exclude_missing_ball_frames | bool | If True, exclude frames where ball coordinates are missing (ball_z == -10). SecondSpectrum uses ball_z = -10 as a sentinel value for failed ball tracking. | True |
| include_game_id | bool or str | If True, add game_id column to tracking_df, team_df, and player_df from metadata. If False, no game_id column is added. If str, use the provided string as the game_id value. | True |
| engine | (polars, pyspark) | DataFrame engine to use: "polars": return Polars DataFrames (default); "pyspark": return PySpark DataFrames. | "polars" |
| spark_session | SparkSession | PySpark SparkSession to use. If None and engine="pyspark", will get or create a session automatically. | None |
Returns:
| Type | Description |
|---|---|
| TrackingDataset | Object with .tracking, .metadata, .teams, .players, .periods properties. With engine="polars", these are pl.DataFrame objects; with engine="pyspark", they are PySpark DataFrames. |
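The only_alive and exclude_missing_ball_frames flags both act as frame filters. A minimal sketch of how the two conditions might combine, assuming each frame is a plain dict with ball_state and ball_z keys (the dict shape and the keep_frame helper are hypothetical; the "alive" state and the -10 sentinel come from the parameter descriptions above):

```python
BALL_Z_SENTINEL = -10  # documented SecondSpectrum marker for failed ball tracking


def keep_frame(frame, only_alive=True, exclude_missing_ball_frames=True):
    """Return True if the frame survives both optional filters."""
    if only_alive and frame["ball_state"] != "alive":
        return False
    if exclude_missing_ball_frames and frame["ball_z"] == BALL_Z_SENTINEL:
        return False
    return True


# Hypothetical frames for illustration only.
frames = [
    {"frame_id": 1, "ball_state": "alive", "ball_z": 0.3},
    {"frame_id": 2, "ball_state": "dead", "ball_z": 0.1},
    {"frame_id": 3, "ball_state": "alive", "ball_z": -10},
]

kept_default = [f["frame_id"] for f in frames if keep_frame(f)]
kept_all_states = [f["frame_id"] for f in frames if keep_frame(f, only_alive=False)]
kept_unfiltered = [
    f["frame_id"]
    for f in frames
    if keep_frame(f, only_alive=False, exclude_missing_ball_frames=False)
]
```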
SkillCorner
Load SkillCorner tracking data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_data | FileLike | Path to JSONL tracking file (e.g., tracking_extrapolated.jsonl), or bytes, or file-like object. Supports: file paths (str/Path), bytes, file objects, URLs, S3 paths, zip files. | required |
| meta_data | FileLike | Path to JSON match file (e.g., match.json), or bytes, or file-like object. Supports: file paths (str/Path), bytes, file objects, URLs, S3 paths, zip files. | required |
| layout | (long, long_ball, wide) | DataFrame layout: "long": ball as a row with team_id="ball", player_id="ball"; "long_ball": ball in separate columns, only player rows; "wide": one row per frame, player_id in column names. | "long" |
| coordinates | cdf | Coordinate system: "cdf": Common Data Format (origin at center). | "cdf" |
| orientation | str | Coordinate orientation: "static_home_away": home attacks right (+x) entire match; "static_away_home": away attacks right (+x) entire match; "home_away": home attacks right 1st half, left 2nd half; "away_home": away attacks right 1st half, left 2nd half; "attack_right": attacking team always attacks right; "attack_left": attacking team always attacks left. | "static_home_away" |
| only_alive | bool | If True, only include frames where ball is in play (matches kloppy default). | True |
| include_empty_frames | bool | If True, include frames with no detected players. | False |
| include_game_id | bool or str | If True, add game_id column to tracking_df, team_df, and player_df from metadata. If False, no game_id column is added. If str, use the provided string as the game_id value. | True |
| include_ball_owning_player | bool | If True, attach a … | False (will become True in fastforward 0.2.0) |
| include_is_detected | bool | If True, attach an … | False (will become True in fastforward 0.2.0) |
| engine | (polars, pyspark) | DataFrame engine to use: "polars": return Polars DataFrames (default); "pyspark": return PySpark DataFrames. | "polars" |
| spark_session | SparkSession | PySpark SparkSession to use. If None and engine="pyspark", will get or create a session automatically. | None |
Returns:
| Type | Description |
|---|---|
| TrackingDataset | Object with .tracking, .metadata, .teams, .players, .periods properties. With engine="polars", these are pl.DataFrame objects; with engine="pyspark", they are PySpark DataFrames. |
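The include_game_id parameter accepts either a bool or a str, which gives it three distinct behaviours. The sketch below spells out that decision as a small function; resolve_game_id and its metadata_game_id argument are hypothetical names used for illustration, not part of the library.

```python
def resolve_game_id(include_game_id, metadata_game_id):
    """Return the game_id value to attach, or None when no column is added."""
    if isinstance(include_game_id, str):
        return include_game_id      # caller-supplied value overrides the metadata
    if include_game_id is True:
        return metadata_game_id     # value taken from the match metadata
    if include_game_id is False:
        return None                 # no game_id column is added
    raise TypeError("include_game_id must be a bool or a str")
```

Checking for str first keeps the three cases unambiguous, since a non-empty string would otherwise also pass a plain truthiness test.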
Sportec
Load Sportec tracking data from XML files.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_data | FileLike | Path to tracking XML file (e.g., *_tracking.xml), or bytes, or file-like object. Supports: file paths (str/Path), bytes, file objects, URLs, S3 paths, zip files. | required |
| meta_data | FileLike | Path to match info XML file (e.g., *_match_info.xml), or bytes, or file-like object. Supports: file paths (str/Path), bytes, file objects, URLs, S3 paths, zip files. | required |
| layout | (long, long_ball, wide) | DataFrame layout: "long": ball as a row with team_id="ball", player_id="ball"; "long_ball": ball in separate columns, only player rows; "wide": one row per frame, player_id in column names. | "long" |
| coordinates | cdf | Coordinate system: "cdf": Common Data Format (origin at center). | "cdf" |
| orientation | str | Coordinate orientation: "static_home_away": home attacks right (+x) entire match; "static_away_home": away attacks right (+x) entire match; "home_away": home attacks right 1st half, left 2nd half; "away_home": away attacks right 1st half, left 2nd half; "attack_right": attacking team always attacks right; "attack_left": attacking team always attacks left. | "static_home_away" |
| only_alive | bool | If True, only include frames where ball is in play (matches kloppy default). | True |
| include_game_id | bool or str | If True, add game_id column to tracking_df, team_df, and player_df from metadata. If False, no game_id column is added. If str, use the provided string as the game_id value. | True |
| include_officials | bool | If True, include officials in player_df with team_id="officials" and position codes: REF (Main Referee), AREF (Assistant Referee), VAR (Video Assistant Referee), AVAR (Assistant VAR), 4TH (Fourth Official). | False |
| engine | (polars, pyspark) | DataFrame engine to use: "polars": return Polars DataFrames (default); "pyspark": return PySpark DataFrames. | "polars" |
| spark_session | SparkSession | PySpark SparkSession to use. If None and engine="pyspark", will get or create a session automatically. | None |
Returns:
| Type | Description |
|---|---|
| TrackingDataset | Object with .tracking, .metadata, .teams, .players, .periods properties. With engine="polars", these are pl.DataFrame objects; with engine="pyspark", they are PySpark DataFrames. |
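With include_officials=True, officials share the players table and are distinguished by team_id="officials". A small sketch of separating them again downstream, assuming rows are plain dicts; the "position" key, the sample rows, and the split_officials helper are hypothetical, while the position codes are the ones listed for the include_officials parameter.

```python
# Position codes documented for Sportec officials.
OFFICIAL_POSITIONS = {
    "REF": "Main Referee",
    "AREF": "Assistant Referee",
    "VAR": "Video Assistant Referee",
    "AVAR": "Assistant VAR",
    "4TH": "Fourth Official",
}


def split_officials(player_rows):
    """Split rows into (players, officials) based on team_id."""
    players, officials = [], []
    for row in player_rows:
        (officials if row["team_id"] == "officials" else players).append(row)
    return players, officials


# Hypothetical rows; the "position" column name is an assumption.
rows = [
    {"player_id": "h9", "team_id": "home", "position": "ST"},
    {"player_id": "ref1", "team_id": "officials", "position": "REF"},
    {"player_id": "var1", "team_id": "officials", "position": "VAR"},
]
players, officials = split_officials(rows)
roles = [OFFICIAL_POSITIONS[o["position"]] for o in officials]
```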
Tracab
Load Tracab tracking data.
Supports multiple file formats:
- Metadata: XML (hierarchical or flat format), JSON
- Raw data: DAT (text/binary), JSON
The native Tracab coordinate system uses centimeters with origin at center. Coordinates are automatically converted to CDF (meters) internally and then transformed to the target coordinate system.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_data | FileLike | Path to tracking data file (.dat or .json), bytes, or file-like object. | required |
| meta_data | FileLike | Path to metadata file (.xml or .json), bytes, or file-like object. | required |
| layout | (long, long_ball, wide) | DataFrame layout: "long": ball as separate rows with team_id="ball"; "long_ball": ball in separate columns (ball_x, ball_y, ball_z); "wide": one row per frame, player columns as {player_id}_x, _y, _z. | "long" |
| coordinates | str | Target coordinate system. | "cdf" |
| orientation | str | Target orientation. | "static_home_away" |
| only_alive | bool | If True, only include frames where ball is in play. | True |
| include_game_id | bool or str | If True, add game_id column from metadata. If False, no game_id column is added. If str, use the provided string as the game_id value. | True |
| engine | (polars, pyspark) | DataFrame engine to use: "polars": return Polars DataFrames (default); "pyspark": return PySpark DataFrames. | "polars" |
| spark_session | SparkSession | PySpark SparkSession to use. If None and engine="pyspark", will get or create a session automatically. | None |
Returns:
| Type | Description |
|---|---|
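| TrackingDataset | Object with .tracking, .metadata, .teams, .players, .periods properties. With engine="polars", these are pl.DataFrame objects; with engine="pyspark", they are PySpark DataFrames. |
Examples:
>>> # File names are illustrative
>>> dataset = load_tracking("tracab.dat", "tracab_meta.xml")
>>> tracking_df = dataset.tracking
HawkEye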
Load HawkEye tracking data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ball_data | FileLike or List[FileLike] | Ball tracking file(s). Can be a single FileLike (path, bytes, or file-like object) or a List[FileLike] (multiple ball files, one per minute). Supports: file paths (str/Path), bytes, file objects, URLs, S3 paths, zip files. | required |
| player_data | FileLike or List[FileLike] | Player tracking file(s). Can be a single FileLike (path, bytes, or file-like object) or a List[FileLike] (multiple player files, one per minute). Supports: file paths (str/Path), bytes, file objects, URLs, S3 paths, zip files. | required |
| meta_data | FileLike | Path to metadata file (JSON or XML), or bytes, or file-like object. Supports: file paths (str/Path), bytes, file objects, URLs, S3 paths, zip files. | required |
| layout | long | DataFrame layout. Currently only "long" is supported: ball as a row with team_id="ball", player_id="ball". (TODO: "long_ball" and "wide" layouts) | "long" |
| coordinates | cdf | Coordinate system. Currently only "cdf" is supported: Common Data Format (origin at center, meters). (TODO: other coordinate systems) | "cdf" |
| orientation | static_home_away | Coordinate orientation. Currently only "static_home_away" is supported: home attacks right (+x) entire match. (TODO: other orientations) | "static_home_away" |
| only_alive | bool | If True, only include frames where ball is in play (play field == "In"). Uses HawkEye's "play" field instead of the typical "live" field. | True |
| pitch_length | float | Pitch length in meters. Used as fallback if not in metadata. Metadata values take precedence if present. | 105.0 |
| pitch_width | float | Pitch width in meters. Used as fallback if not in metadata. Metadata values take precedence if present. | 68.0 |
| object_id | (fifa, uefa, he, auto) | Object ID preference for team and player identification: "fifa": use FIFA IDs (error if not present); "uefa": use UEFA IDs (error if not present); "he": use HawkEye IDs; "auto": prefer FIFA > UEFA > HawkEye (automatic fallback); custom string: use custom ID field (error if not found). | "fifa" |
| include_game_id | bool or str | If True, add game_id column to tracking_df, team_df, and player_df from metadata. If False, no game_id column is added. If str, use the provided string as the game_id value. | True |
| include_officials | bool | If True, include officials in player_df and tracking data with team_id="officials" and position codes: REF (Main Referee), AREF (Assistant Referee). | False |
| engine | (polars, pyspark) | DataFrame engine to use: "polars": return Polars DataFrames (default); "pyspark": return PySpark DataFrames. | "polars" |
| spark_session | SparkSession | PySpark SparkSession to use. If None and engine="pyspark", will get or create a session automatically. | None |
Returns:
| Type | Description |
|---|---|
| TrackingDataset | Object with .tracking, .metadata, .teams, .players, .periods properties. With engine="polars", these are pl.DataFrame objects; with engine="pyspark", they are PySpark DataFrames. |
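The object_id parameter selects which identifier scheme is used, with "auto" falling back in the order FIFA > UEFA > HawkEye. A rough sketch of that selection logic for a single player or team follows; the ids mapping shape and the pick_object_id helper are hypothetical and only mirror the documented behaviour.

```python
PREFERENCE_ORDER = ("fifa", "uefa", "he")  # documented priority for "auto"


def pick_object_id(ids, object_id="fifa"):
    """Pick one identifier from `ids`, a mapping of scheme name -> ID value."""
    if object_id == "auto":
        for scheme in PREFERENCE_ORDER:
            if ids.get(scheme) is not None:
                return ids[scheme]
        raise KeyError("no usable ID found")
    # Named or custom scheme: fail loudly when the field is absent.
    if ids.get(object_id) is None:
        raise KeyError(f"ID field {object_id!r} not present")
    return ids[object_id]
```

Because the default is "fifa" rather than "auto", data without FIFA IDs needs an explicit object_id choice.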
Notes
- Officials are excluded by default; set include_officials=True to include them
- Period and minute are extracted from filename patterns like hawkeye_1_1.ball
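Since period and minute come from the file name, per-minute files can be ordered before loading. The sketch below shows one way to parse names like hawkeye_1_1.ball with a regular expression; the exact pattern the library uses is not documented here, so the regex, the assumption that period precedes minute, and the parse_period_minute helper are all illustrative.

```python
import re

# Assumed pattern: ..._<period>_<minute>.ball or ..._<period>_<minute>.centroids
_PERIOD_MINUTE = re.compile(r"_(\d+)_(\d+)\.(?:ball|centroids)$")


def parse_period_minute(filename):
    """Return (period, minute) parsed from a HawkEye-style file name."""
    match = _PERIOD_MINUTE.search(filename)
    if match is None:
        raise ValueError(f"unrecognised file name: {filename!r}")
    return int(match.group(1)), int(match.group(2))


files = ["hawkeye_2_1.ball", "hawkeye_1_10.ball", "hawkeye_1_2.ball"]
# Sorting on the parsed integers avoids the lexicographic trap ("10" < "2").
ordered = sorted(files, key=parse_period_minute)
```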
Examples:
Load from file paths:
>>> ball_files = ["hawkeye_1_1.ball", "hawkeye_1_2.ball"]
>>> player_files = ["hawkeye_1_1.centroids", "hawkeye_1_2.centroids"]
>>> dataset = load_tracking(ball_files, player_files, "hawkeye_meta.json")
>>> tracking_df = dataset.tracking
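Load with specific object ID preference:
>>> dataset = load_tracking(ball_files, player_files, "hawkeye_meta.json", object_id="auto")
PySpark engine:
>>> dataset = load_tracking(ball_files, player_files, "hawkeye_meta.json", engine="pyspark")
Signality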
Load Signality tracking data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| meta_data | FileLike | Path to metadata file (JSON), or bytes, or file-like object. Contains team names, player info, lineups, and match timestamp. | required |
| raw_data_feeds | FileLike or List[FileLike] | Raw tracking data file(s). Can be a single FileLike (path, bytes, or file-like object) or a List[FileLike] (multiple files, one per period). Supports: file paths (str/Path), bytes, file objects, URLs, S3 paths. | required |
| venue_information | FileLike | Path to venue information file (JSON) containing pitch dimensions. | required |
| layout | (long, long_ball, wide) | DataFrame layout: "long": ball as a row with team_id="ball", player_id="ball"; "long_ball": ball in separate columns, only player rows; "wide": one row per frame, player_id in column names. | "long" |
| coordinates | str | Coordinate system. Options: "cdf": Common Data Format (origin at center, meters); "signality": native coordinates (same as CDF); other provider coordinate systems. | "cdf" |
| orientation | str | Coordinate orientation: "static_home_away": home attacks right (+x) entire match; other orientations available. | "static_home_away" |
| only_alive | bool | If True, only include frames where ball is in play ("running" state). | True |
| include_game_id | bool or str | If True, add game_id column from metadata. If False, no game_id column is added. If str, use the provided string as the game_id value. | True |
| include_officials | bool | If True, include officials in player_df and tracking data with team_id="officials" and position codes: REF (Main Referee), AREF (Assistant Referee), FOURTH (4th Official). | False |
| parallel | bool | If True, process multiple files in parallel using rayon. | True |
| engine | (polars, pyspark) | DataFrame engine to use: "polars": return Polars DataFrames (default); "pyspark": return PySpark DataFrames. | "polars" |
| spark_session | SparkSession | PySpark SparkSession to use. If None and engine="pyspark", will get or create a session automatically. | None |
Returns:
| Type | Description |
|---|---|
| TrackingDataset | Object with .tracking, .metadata, .teams, .players, .periods properties. |
Notes
- Signality uses center-origin coordinates in meters (same as CDF)
- Period is extracted from filename patterns like signality_p1_raw_data.json
- Frame rate is typically 25 Hz (40ms between frames)
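Two of the notes above lend themselves to a quick worked example: reading the period out of a file name and turning a frame index into a time offset at 25 Hz. This is a stdlib-only sketch; the regex and both helper names are assumptions for illustration, not the library's parsing rules.

```python
import re


def period_from_filename(filename):
    """Extract the period number from names like signality_p1_raw_data.json."""
    match = re.search(r"_p(\d+)_", filename)
    if match is None:
        raise ValueError(f"no period marker in {filename!r}")
    return int(match.group(1))


def frame_offset_ms(frame_index, frame_rate_hz=25):
    """Milliseconds elapsed at a given frame index, counting from frame 0."""
    return frame_index * 1000 / frame_rate_hz
```

At 25 Hz, consecutive frames are 40 ms apart, so frame 25 sits exactly one second after frame 0.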
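Examples:
Load from file paths:
>>> # File names are illustrative
>>> dataset = load_tracking(
...     "signality_meta.json",
...     ["signality_p1_raw_data.json", "signality_p2_raw_data.json"],
...     "signality_venue_information.json",
... )
>>> tracking_df = dataset.tracking
StatsPerform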
Load StatsPerform tracking data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ma25_data | FileLike | Path to MA25 tracking data file (text format). | required |
| ma1_data | FileLike | Path to MA1 metadata file (JSON or XML format, auto-detected). | required |
| pitch_length | float | Length of the pitch in meters. StatsPerform data does not include pitch dimensions, so supply the value when it is known. If None, 105.0 m is used. | None |
| pitch_width | float | Width of the pitch in meters. StatsPerform data does not include pitch dimensions, so supply the value when it is known. If None, 68.0 m is used. | None |
| layout | (long, long_ball, wide) | DataFrame layout: "long": ball as a row with team_id="ball", player_id="ball"; "long_ball": ball in separate columns, only player rows; "wide": one row per frame, player_id in column names. | "long" |
| coordinates | str | Coordinate system for output. Options: "cdf": center origin, meters (default); "statsperform" / "sportvu": native top-left origin, y-down, meters; other provider coordinate systems. | "cdf" |
| orientation | str | Coordinate orientation. | "static_home_away" |
| only_alive | bool | If True, only include frames where ball is in play. | True |
| include_game_id | Union[bool, str] | If True, add game_id column from metadata. If False, no game_id column is added. If str, use the provided string as the game_id value. | True |
| include_officials | bool | If True, include match officials (referees) in the players DataFrame with team_id="officials" and appropriate position codes (REF, AREF, 4TH). | False |
Returns:
| Type | Description |
|---|---|
| TrackingDataset | Object with .tracking, .metadata, .teams, .players, .periods properties. |
Notes
StatsPerform uses the SportVU coordinate system:
- Origin at top-left corner of the pitch
- X increases left to right (0 to ~105m)
- Y increases top to bottom (0 to ~68m), inverted from standard
- Units are meters
- Frame rate is typically 10 Hz (100ms between frames)
The MA1 metadata format is auto-detected (JSON or XML) based on content.
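To make the relationship between the native SportVU system and CDF concrete, here is a per-point sketch of the conversion implied by the notes above: shift the origin from the top-left corner to the pitch center and invert the y axis. It assumes CDF y increases from bottom to top, and the function names are hypothetical; the library's own transform may differ in detail.

```python
def sportvu_to_cdf(x, y, pitch_length=105.0, pitch_width=68.0):
    """Top-left origin, y down (SportVU) -> center origin, y up (assumed CDF)."""
    return x - pitch_length / 2.0, pitch_width / 2.0 - y


def cdf_to_sportvu(x, y, pitch_length=105.0, pitch_width=68.0):
    """Inverse of sportvu_to_cdf for the same pitch dimensions."""
    return x + pitch_length / 2.0, pitch_width / 2.0 - y
```

The top-left corner (0, 0) lands at (-52.5, 34.0) on a 105 x 68 m pitch, and the center spot (52.5, 34.0) maps to the CDF origin. Both directions depend on the pitch dimensions, which is why pitch_length and pitch_width matter for this provider.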
GradientSports
Load GradientSports (PFF) tracking data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_data | FileLike | Path to JSONL tracking file. | required |
| meta_data | FileLike | Path to JSON metadata file. | required |
| roster_data | FileLike | Path to JSON roster file. | required |
| layout | (long, long_ball, wide) | DataFrame layout: "long": ball as a row with team_id="ball", player_id="ball"; "long_ball": ball in separate columns, only player rows; "wide": one row per frame, player_id in column names. | "long" |
| coordinates | str | Coordinate system (gradientsports uses CDF format natively). | "gradientsports" |
| orientation | str | Coordinate orientation. | "static_home_away" |
| only_alive | bool | If True, only include frames where ball is in play. | True |
| include_incomplete_frames | bool | If True, include frames with null ball coordinates or null player arrays. If False (default), only include frames with complete data. | False |
| include_game_id | Union[bool, str] | If True, add game_id column from metadata. If False, no game_id column is added. If str, use the provided string as the game_id value. | True |
Returns:
| Type | Description |
|---|---|
| TrackingDataset | Object with .tracking, .metadata, .teams, .players, .periods properties. |
Respovision
Load Respovision tracking data.
Respovision data comes in a single JSONL file containing all tracking frames with embedded metadata. Team names are extracted from the filename pattern YYYYMMDD-HomeTeam-AwayTeam-*.jsonl.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| raw_data | FileLike | Path to JSONL tracking file, or bytes, or file-like object. Filename pattern: YYYYMMDD-HomeTeam-AwayTeam-*.jsonl. Supports: file paths (str/Path), bytes, file objects, URLs, S3 paths. | required |
| layout | (long, long_ball, wide) | DataFrame layout: "long": ball as a row with team_id="ball", player_id="ball"; "long_ball": ball in separate columns, only player rows; "wide": one row per frame, player_id in column names. Note: wide layout does not include joint angles. | "long" |
| coordinates | str | Coordinate system. Options: "cdf": Common Data Format (origin at center, meters); "respovision": native coordinates (origin at bottom-left corner, meters); other provider coordinate systems. | "cdf" |
| orientation | str | Coordinate orientation: "static_home_away": home attacks right (+x) entire match; other orientations available. | "static_home_away" |
| only_alive | bool | If True, only include frames where ball_possession is not null. | True |
| exclude_missing_ball_frames | bool | If True, exclude frames where ball coordinates are missing (null). Respovision data may have frames where ball tracking failed. | True |
| pitch_length | float | Pitch length in meters. Used for coordinate transformation. | 105.0 |
| pitch_width | float | Pitch width in meters. Used for coordinate transformation. | 68.0 |
| include_game_id | bool or str | If True, add game_id column (auto-generated from filename). If False, no game_id column is added. If str, use the provided string as the game_id value. | True |
| include_joint_angles | bool | If True, include head_angle, shoulders_angle, hips_angle columns. Only applies to long and long_ball layouts. | True |
| include_officials | bool | If True, include referees in tracking data with team_id="officials". | False |
| engine | (polars, pyspark) | DataFrame engine to use: "polars": return Polars DataFrames (default); "pyspark": return PySpark DataFrames. | "polars" |
| spark_session | SparkSession | PySpark SparkSession to use. If None and engine="pyspark", will get or create a session automatically. | None |
Returns:
| Type | Description |
|---|---|
| TrackingDataset | Object with .tracking, .metadata, .teams, .players, .periods properties. |
Notes
- Native coordinate system (respovision): origin at bottom-left corner, meters; X in [0, pitch_length], Y in [0, pitch_width]
- Home/away team designation is extracted from filename
- Player IDs are formatted as {team_name_lower}_{jersey_number}
- Team IDs are lowercase team names with spaces replaced by underscores
- Game ID default format: YYYYMMDD-{home_prefix}-{away_prefix}
- Frame rate is typically 25 Hz
- Ball state: alive if ball_possession is not null, dead otherwise
- Joint angles may contain null values (especially for goalkeepers)
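Several of these notes describe identifiers derived from the file name. A stdlib-only sketch of that derivation is shown below, assuming team names contain no hyphens; the helper names are hypothetical and the library's actual parsing may handle more cases. The game ID prefixes are not specified above, so they are left out.

```python
from pathlib import PurePath


def parse_respovision_filename(path):
    """Split YYYYMMDD-HomeTeam-AwayTeam-*.jsonl into date, home and away."""
    name = PurePath(path).name
    # Assumes neither team name contains a hyphen.
    date, home, away, _rest = name.split("-", 3)
    if len(date) != 8 or not date.isdigit():
        raise ValueError(f"unexpected date prefix in {name!r}")
    return {"date": date, "home": home, "away": away}


def team_id(team_name):
    """Lowercase team name with spaces replaced by underscores."""
    return team_name.lower().replace(" ", "_")


def player_id(team_name, jersey_number):
    """Format documented as {team_name_lower}_{jersey_number}."""
    return f"{team_name.lower()}_{jersey_number}"
```

For the file used in the example below, this yields home "Argentina", away "Colombia", and player IDs such as "argentina_10".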
Examples:
Load from file path:
>>> from fastforward import respovision
>>> dataset = respovision.load_tracking(
... "20240714-Argentina-Colombia-2d_tracking-tactical.jsonl",
... pitch_length=105.0,
... pitch_width=68.0,
... )
>>> tracking_df = dataset.tracking
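Load without joint angles:
>>> dataset = respovision.load_tracking(
...     "20240714-Argentina-Colombia-2d_tracking-tactical.jsonl",
...     include_joint_angles=False,
... )
Transforms
transform_coordinates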
Transform DataFrame coordinates between coordinate systems.
Uses CDF as intermediate format: source -> CDF -> target.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with x, y columns (and optionally z). | required |
| from_system | str | Source coordinate system (e.g., "cdf", "tracab", "opta"). | required |
| to_system | str | Target coordinate system. | required |
| pitch_length | float | Pitch length in meters. | required |
| pitch_width | float | Pitch width in meters. | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with transformed x, y, z columns. |
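The source -> CDF -> target routing means each coordinate system only needs a conversion to and from CDF, rather than one per pair of systems. The sketch below illustrates that design for single points, using the system descriptions listed under transform (tracab: center origin, centimeters; opta: bottom-left origin, 0-100 scale; kloppy: top-left origin, 0-1 scale). It is a stdlib-only illustration with hypothetical function names, it assumes CDF y increases upward and kloppy y increases downward, and it is not the library's implementation.

```python
def to_cdf(x, y, system, pitch_length, pitch_width):
    """Convert one point from `system` into CDF (center origin, meters)."""
    if system == "cdf":
        return x, y
    if system == "tracab":  # center origin, centimeters
        return x / 100.0, y / 100.0
    if system == "opta":  # bottom-left origin, 0-100 scale
        return (x / 100.0 - 0.5) * pitch_length, (y / 100.0 - 0.5) * pitch_width
    if system == "kloppy":  # top-left origin, 0-1 scale, y assumed to grow downward
        return (x - 0.5) * pitch_length, (0.5 - y) * pitch_width
    raise ValueError(f"unsupported coordinate system: {system!r}")


def from_cdf(x, y, system, pitch_length, pitch_width):
    """Convert one CDF point into `system`."""
    if system == "cdf":
        return x, y
    if system == "tracab":
        return x * 100.0, y * 100.0
    if system == "opta":
        return (x / pitch_length + 0.5) * 100.0, (y / pitch_width + 0.5) * 100.0
    if system == "kloppy":
        return x / pitch_length + 0.5, 0.5 - y / pitch_width
    raise ValueError(f"unsupported coordinate system: {system!r}")


def convert_point(x, y, from_system, to_system, pitch_length, pitch_width):
    """Route every conversion through CDF: source -> CDF -> target."""
    cdf_x, cdf_y = to_cdf(x, y, from_system, pitch_length, pitch_width)
    return from_cdf(cdf_x, cdf_y, to_system, pitch_length, pitch_width)
```

Adding a new system under this scheme takes two branches (one in each direction) instead of a new function for every existing system.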
transform_dimensions
Transform DataFrame to different pitch dimensions using zone-based scaling.
Uses IFAB standard zone boundaries to preserve pitch feature proportions (penalty area, six-yard box, center circle, etc.).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with x, y columns (must be in CDF format: center origin, meters). | required |
| from_length | float | Source pitch length in meters. | required |
| from_width | float | Source pitch width in meters. | required |
| to_length | float | Target pitch length in meters. | required |
| to_width | float | Target pitch width in meters. | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with zone-scaled x, y coordinates. |
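One way to read "zone-based scaling" is as a piecewise-linear map: fixed-size pitch features keep their size in meters, and only the open zones between them stretch or shrink. The sketch below applies that idea to the x axis only. It is an assumption about the approach, not the library's algorithm: the choice of zone boundaries, the scale_x and _x_breakpoints names, and the handling of points beyond the goal line are all illustrative. The feature sizes (9.15 m center circle radius, 16.5 m penalty area depth, 5.5 m goal area depth) are the standard IFAB measurements.

```python
import math

# IFAB feature sizes in meters, measured along the pitch length.
CENTER_CIRCLE_RADIUS = 9.15
PENALTY_AREA_DEPTH = 16.5
GOAL_AREA_DEPTH = 5.5


def _x_breakpoints(length):
    """Zone boundaries from the center line out to the goal line."""
    half = length / 2.0
    return [0.0, CENTER_CIRCLE_RADIUS, half - PENALTY_AREA_DEPTH,
            half - GOAL_AREA_DEPTH, half]


def scale_x(x, from_length, to_length):
    """Piecewise-linear rescale of a CDF x coordinate between pitch lengths."""
    src = _x_breakpoints(from_length)
    dst = _x_breakpoints(to_length)
    distance = abs(x)
    if distance >= src[-1]:
        # Beyond the goal line: keep the overshoot unchanged (assumption).
        scaled = dst[-1] + (distance - src[-1])
    else:
        for lo_s, hi_s, lo_d, hi_d in zip(src, src[1:], dst, dst[1:]):
            if distance <= hi_s:
                t = (distance - lo_s) / (hi_s - lo_s)
                scaled = lo_d + t * (hi_d - lo_d)
                break
    return math.copysign(scaled, x)
```

Going from a 100 m to a 105 m pitch under this sketch, the edge of the penalty area moves from x = 33.5 to x = 36.0 while the center circle edge stays at 9.15, so a point on a pitch marking remains on that marking after scaling.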
transform_orientation
Transform DataFrame orientation by flipping coordinates.
Orientation flipping negates x and y coordinates around the center (0, 0). This is used to ensure consistent attacking direction.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame with x, y columns (must be in CDF format: center origin). | required |
| flip | bool | If True, flip the coordinates (negate x and y). | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with flipped x, y coordinates (if flip=True). |
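Negating both x and y around (0, 0) is a 180-degree rotation of the pitch, which is why the input must already use a center origin. A minimal sketch on plain dict rows (the row shape and the flip_rows name are hypothetical; the library operates on DataFrames):

```python
def flip_rows(rows, flip):
    """Negate x and y for every row when flip is True; z is left untouched."""
    if not flip:
        return rows
    return [{**row, "x": -row["x"], "y": -row["y"]} for row in rows]


rows = [{"x": 10.0, "y": -3.5, "z": 0.2}]
flipped = flip_rows(rows, flip=True)
restored = flip_rows(flipped, flip=True)  # flipping twice restores the input
```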