User Guide
Getting Started
Installation
xorq can be installed using pip (pip install xorq).
Or using nix to drop into an IPython shell:
Quick Start
Let’s build a simple data pipeline using xorq to analyze the iris dataset:
First Pipeline Walk-through
Let’s build a more complex pipeline that demonstrates key xorq features. We’ll:
- Cache intermediate results from one engine into another
- Use a machine learning model for predictions
- Filter the results
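The three steps above can be sketched with the standard library alone, to show the flow of data rather than xorq's own API. In this minimal sketch, two in-memory SQLite connections stand in for two engines, a JSON file keyed by a hash of the query text stands in for a cache, and a toy scoring function stands in for a trained model. Every name here (source_engine, target_engine, cached_query, predict, the measurements table and its rows) is hypothetical and chosen only for illustration.

```python
import hashlib
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical stand-ins: two in-memory SQLite connections play the role of
# two separate engines. Neither is xorq, DataFusion, or DuckDB.
source_engine = sqlite3.connect(":memory:")
target_engine = sqlite3.connect(":memory:")

# Made-up illustrative rows, not real data.
source_engine.execute(
    "CREATE TABLE measurements (id INTEGER, width REAL, length REAL)"
)
source_engine.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [(1, 1.0, 2.0), (2, 3.0, 4.0), (3, 5.0, 6.0)],
)

CACHE_DIR = Path(tempfile.mkdtemp())


def cached_query(engine, sql):
    """Run sql on engine, storing rows in a file keyed by a hash of the SQL text."""
    key = hashlib.sha256(sql.encode()).hexdigest()
    path = CACHE_DIR / f"{key}.json"
    if path.exists():
        # Cache hit: read the stored rows instead of touching the engine.
        return [tuple(row) for row in json.loads(path.read_text())]
    rows = engine.execute(sql).fetchall()
    path.write_text(json.dumps(rows))
    return rows


def predict(width, length):
    """Toy scoring rule standing in for a trained model."""
    return 0.5 * width + 0.25 * length


# Step 1: compute an intermediate result on the first engine and cache it.
rows = cached_query(
    source_engine, "SELECT id, width, length FROM measurements ORDER BY id"
)

# Step 2: load the cached rows into the second engine and expose the
# stand-in model there as a SQL function.
target_engine.execute("CREATE TABLE cached (id INTEGER, width REAL, length REAL)")
target_engine.executemany("INSERT INTO cached VALUES (?, ?, ?)", rows)
target_engine.create_function("predict", 2, predict)

# Step 3: score each row, then filter on the prediction.
result = target_engine.execute(
    "SELECT id, predict(width, length) AS score FROM cached "
    "WHERE predict(width, length) > 1.5 ORDER BY id"
).fetchall()
print(result)
```

Keying the cache on the query text alone keeps the sketch short, but it will not notice when the underlying table changes. A real caching layer would also need some way to invalidate stale entries.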
This example demonstrates several key xorq capabilities:
- Multi-engine Support: Using both DataFusion and DuckDB
- Caching: Persisting intermediate results with ParquetCacheStorage
- ML Integration: Loading and using XGBoost models
- Expressive API: Chaining operations with an ibis-like interface
Next Steps
- Explore multi-engine workflows using into_backend
- Learn about different caching strategies
- Create custom UDFs for data transformations
- Check out sample scripts in the examples directory