Data Processing
The data processing module provides utilities for loading and preprocessing point cloud data, with a particular focus on the KITTI format.
PointCloudLoader
Main class for loading and preprocessing point cloud data from various formats.
Methods
load_and_preprocess(file_path, format="auto")
Loads point cloud data and applies preprocessing steps.
Parameters:
file_path: Path to the point cloud file
format: File format ("auto", "kitti", "ply", "pcd")
Returns:
coordinates: Point coordinates (N, 3)
reflectivity: Reflectivity values (N,)
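How format="auto" is resolved is not specified here. One plausible approach, sketched below, is to infer the format from the file extension. The helper name detect_format and the extension table are assumptions made for illustration and are not part of the documented API.

```python
from pathlib import Path

# Hypothetical mapping from file extension to format name (an assumption,
# chosen to match the formats listed for the `format` parameter).
_EXTENSION_TO_FORMAT = {".bin": "kitti", ".ply": "ply", ".pcd": "pcd"}


def detect_format(file_path, format="auto"):
    """Resolve "auto" to a concrete format name using the file extension."""
    if format != "auto":
        # An explicit format always wins over the extension.
        return format
    suffix = Path(file_path).suffix.lower()
    if suffix not in _EXTENSION_TO_FORMAT:
        raise ValueError(f"Cannot infer point cloud format from {file_path!r}")
    return _EXTENSION_TO_FORMAT[suffix]
```

Passing an explicit format remains the safer choice when file extensions are unreliable, since .bin in particular is used by many unrelated binary layouts.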
load_kitti_format(file_path)
Loads point cloud data in KITTI format (.bin files).
Parameters:
file_path: Path to KITTI .bin file
Returns:
coordinates: Point coordinates (N, 3)
reflectivity: Reflectivity values (N,)
preprocess_point_cloud(coordinates, reflectivity, options=None)
Applies preprocessing operations to loaded point cloud data.
Parameters:
coordinates: Raw point coordinates
reflectivity: Raw reflectivity values
options: Preprocessing options dictionary
Returns:
Preprocessed coordinates and reflectivity
Example
from rapid_seg.data import PointCloudLoader
# Initialize loader
loader = PointCloudLoader()
# Load KITTI format data
coordinates, reflectivity = loader.load_and_preprocess("data.bin", "kitti")
# Apply preprocessing
preprocessed_coords, preprocessed_reflectivity = loader.preprocess_point_cloud(
    coordinates,
    reflectivity,
    options={
        "normalize": True,
        "remove_outliers": True,
        "downsample": 0.1
    }
)
Supported Formats
KITTI Format (.bin)
Binary format commonly used in autonomous driving datasets:
# Load KITTI data
coordinates, reflectivity = loader.load_kitti_format("velodyne/000001.bin")
# Data structure: [x, y, z, intensity]
# coordinates: (N, 3) - x, y, z coordinates
# reflectivity: (N,) - intensity/reflectivity values
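For readers who want to see what the [x, y, z, intensity] layout means on disk, here is a minimal NumPy sketch. KITTI Velodyne scans store each point as four consecutive float32 values. The function name read_kitti_bin is hypothetical, and this is not necessarily how load_kitti_format is implemented.

```python
import numpy as np


def read_kitti_bin(file_path):
    """Read a KITTI Velodyne .bin file into (coordinates, reflectivity)."""
    # The file is a flat stream of float32 values: x, y, z, intensity, ...
    raw = np.fromfile(file_path, dtype=np.float32)
    if raw.size % 4 != 0:
        raise ValueError("File does not contain a whole number of 4-value points")
    points = raw.reshape(-1, 4)
    coordinates = points[:, :3]   # shape (N, 3)
    reflectivity = points[:, 3]   # shape (N,)
    return coordinates, reflectivity
```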
PLY Format (.ply)
ASCII/binary format for 3D point clouds:
# Load PLY data
coordinates, reflectivity = loader.load_and_preprocess("model.ply", "ply")
PCD Format (.pcd)
Point Cloud Data format:
# Load PCD data
coordinates, reflectivity = loader.load_and_preprocess("cloud.pcd", "pcd")
Preprocessing Options
Normalization
# Normalize coordinates to unit sphere
options = {"normalize": True, "normalization_type": "sphere"}
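As an illustration of what unit-sphere normalization usually means, the sketch below centers the cloud on its centroid and scales it so the farthest point lies at distance 1. Whether the library centers on the centroid or on a bounding-box midpoint is not stated, so treat that choice, and the function name, as assumptions.

```python
import numpy as np


def normalize_to_unit_sphere(coordinates):
    """Center points on their centroid and scale them into the unit sphere.

    Assumes a non-empty (N, 3) array.
    """
    coordinates = np.asarray(coordinates, dtype=np.float64)
    centered = coordinates - coordinates.mean(axis=0)
    # Radius of the smallest centroid-centered sphere containing all points.
    radius = np.linalg.norm(centered, axis=1).max()
    if radius == 0:
        # All points coincide; nothing to scale.
        return centered
    return centered / radius
```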
Outlier Removal
# Remove statistical outliers
options = {"remove_outliers": True, "outlier_threshold": 3.0}
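The meaning of outlier_threshold is not defined here. A common reading, assumed in this sketch, is a standard-deviation multiplier: a point is dropped when its mean distance to its k nearest neighbors exceeds the global mean by more than outlier_threshold standard deviations. The function name and the k parameter are hypothetical, and the brute-force distance matrix is only suitable for small clouds.

```python
import numpy as np


def remove_statistical_outliers(coordinates, reflectivity, k=8, outlier_threshold=3.0):
    """Drop points whose mean k-nearest-neighbor distance is unusually large.

    Assumes at least two points. Uses an O(N^2) distance matrix.
    """
    coordinates = np.asarray(coordinates, dtype=np.float64)
    reflectivity = np.asarray(reflectivity)
    # Pairwise Euclidean distances between all points.
    diffs = coordinates[:, None, :] - coordinates[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    # After sorting each row, column 0 is the zero distance of a point to itself.
    k = min(k, len(coordinates) - 1)
    knn = np.sort(dists, axis=1)[:, 1:k + 1]
    mean_knn = knn.mean(axis=1)
    # Keep points within `outlier_threshold` standard deviations of the mean.
    cutoff = mean_knn.mean() + outlier_threshold * mean_knn.std()
    keep = mean_knn <= cutoff
    return coordinates[keep], reflectivity[keep]
```

For full LiDAR scans, a spatial index such as a k-d tree would replace the dense distance matrix.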
Downsampling
# Voxel-based downsampling
options = {"downsample": 0.1, "voxel_size": 0.1}
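To make voxel-based downsampling concrete, the sketch below assigns each point to a cubic cell of side voxel_size and replaces the points in each cell with their average. Averaging (rather than picking one representative point) and the function name are assumptions, and the role of the separate "downsample" value is not covered here.

```python
import numpy as np


def voxel_downsample(coordinates, reflectivity, voxel_size=0.1):
    """Average all points (and reflectivities) that fall into the same voxel."""
    coordinates = np.asarray(coordinates, dtype=np.float64)
    reflectivity = np.asarray(reflectivity, dtype=np.float64)
    # Integer voxel index of every point along each axis.
    voxel_idx = np.floor(coordinates / voxel_size).astype(np.int64)
    # `inverse` maps each point to the row of its voxel in the unique list.
    _, inverse, counts = np.unique(
        voxel_idx, axis=0, return_inverse=True, return_counts=True
    )
    inverse = inverse.reshape(-1)  # guard against shape differences across NumPy versions
    n_voxels = counts.shape[0]
    # Sum coordinates per voxel, then divide by the point count to get centroids.
    coord_sums = np.zeros((n_voxels, 3))
    np.add.at(coord_sums, inverse, coordinates)
    refl_sums = np.bincount(inverse, weights=reflectivity, minlength=n_voxels)
    return coord_sums / counts[:, None], refl_sums / counts
```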
Filtering
# Range filtering
options = {
    "range_filter": True,
    "min_range": 0.0,
    "max_range": 100.0
}
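A range filter of this kind can be sketched as a simple distance mask. The example assumes that range means Euclidean distance from the sensor origin and that both bounds are inclusive; neither detail is confirmed here, and the function name is hypothetical.

```python
import numpy as np


def range_filter(coordinates, reflectivity, min_range=0.0, max_range=100.0):
    """Keep points whose distance from the origin lies in [min_range, max_range]."""
    coordinates = np.asarray(coordinates, dtype=np.float64)
    reflectivity = np.asarray(reflectivity)
    # Distance of every point from the sensor origin (0, 0, 0).
    distances = np.linalg.norm(coordinates, axis=1)
    keep = (distances >= min_range) & (distances <= max_range)
    return coordinates[keep], reflectivity[keep]
```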
Data Validation
The loader includes built-in validation:
# Validate data integrity
is_valid = loader.validate_point_cloud(coordinates, reflectivity)
if not is_valid:
    print("Invalid point cloud data detected")
    # Handle invalid data
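Which checks validate_point_cloud performs is not listed. The sketch below shows the kind of checks such a validator typically makes: matching shapes, a non-empty cloud, and no NaN or infinite values. It is an illustrative stand-in under those assumptions, with a hypothetical name, and it assumes numeric input arrays.

```python
import numpy as np


def check_point_cloud(coordinates, reflectivity):
    """Return True if the arrays look like a consistent, finite point cloud."""
    coordinates = np.asarray(coordinates)
    reflectivity = np.asarray(reflectivity)
    # Coordinates must be (N, 3).
    if coordinates.ndim != 2 or coordinates.shape[1] != 3:
        return False
    # Exactly one reflectivity value per point.
    if reflectivity.shape != (coordinates.shape[0],):
        return False
    # An empty cloud is treated as invalid in this sketch.
    if coordinates.shape[0] == 0:
        return False
    # Reject NaN and infinite values.
    return bool(np.isfinite(coordinates).all() and np.isfinite(reflectivity).all())
```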
Performance Features
Lazy Loading: Load data only when needed
Memory Mapping: Efficient handling of large files
Batch Processing: Process multiple files simultaneously
Progress Tracking: Monitor loading progress for large datasets
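How these features are exposed by the loader is not described here. As a rough illustration of the first two ideas, the sketch below memory-maps a KITTI .bin file with NumPy and yields scans lazily from a generator, so data is only read when a consumer asks for it. Both function names are hypothetical and are not part of the documented API.

```python
import numpy as np


def open_kitti_memmap(file_path):
    """Memory-map a KITTI .bin file as a read-only (N, 4) float32 array."""
    # Pages are read from disk only when the corresponding rows are accessed.
    flat = np.memmap(file_path, dtype=np.float32, mode="r")
    return flat.reshape(-1, 4)


def iter_point_clouds(file_paths):
    """Lazily yield (path, coordinates, reflectivity), one file at a time."""
    for path in file_paths:
        points = open_kitti_memmap(path)
        yield path, points[:, :3], points[:, 3]
```

Because iter_point_clouds is a generator, iterating over a long list of paths keeps only the current scan mapped at any time.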