Data Functions
param
Create a deferred parameter of a given type.
Parameters
Name | Type | Description | Default |
---|---|---|---|
type | Union[dt.DataType, str] | The type of the unbound parameter, e.g., double, int64, date, etc. | required |
Returns
Name | Type | Description |
---|---|---|
Scalar | A scalar expression backend by a parameter |
Examples
schema
Validate and return a Schema
object.
Parameters
Name | Type | Description | Default |
---|---|---|---|
pairs | SchemaLike | None | List or dictionary of name, type pairs. Mutually exclusive with names and types arguments. | None |
names | Iterable[str] | None | Field names. Mutually exclusive with pairs . | None |
types | Iterable[str | dt.DataType] | None | Field types. Mutually exclusive with pairs . | None |
Returns
Name | Type | Description |
---|---|---|
Schema | An ibis schema |
Examples
table
Create a table literal or an abstract table without data.
Ibis uses the word database to refer to a collection of tables, and the
word catalog to refer to a collection of databases. You can use a
combination of catalog
and database
to specify a hierarchical
location for table.
Parameters
Name | Type | Description | Default |
---|---|---|---|
schema | SchemaLike | None | A schema for the table | None |
name | str | None | Name for the table. One is generated if this value is None . | None |
catalog | str | None | A collection of database. | None |
database | str | None | A collection of tables. Required if catalog is not None . | None |
Returns
Name | Type | Description |
---|---|---|
Table | A table expression |
Examples
Create a table with no data backing it
Create a table with no data backing it in a specific location
memtable
Construct an ibis table expression from in-memory data.
Parameters
Name | Type | Description | Default |
---|---|---|---|
data | A table-like object (pandas.DataFrame , pyarrow.Table , or polars.DataFrame ), or any data accepted by the pandas.DataFrame constructor (e.g. a list of dicts). Note that ibis objects (e.g. MapValue ) may not be passed in as part of data and will result in an error. Do not depend on the underlying storage type (e.g., pyarrow.Table), it’s subject to change across non-major releases. | required | |
columns | Iterable[str] | None | Optional of column names. If provided, must match the number of columns in data . | None |
schema | SchemaLike | None | Optional Schema . The functions use data to infer a schema if not passed. | None |
name | str | None | Optional name of the table. | None |
Returns
Name | Type | Description |
---|---|---|
Table | A table expression backed by in-memory data. |
Examples
Create a table literal without column names embedded in the data and
pass columns
Create a table literal without column names embedded in the data. Ibis generates column names if none are provided.
desc
Create a descending sort key from expr
or column name.
Parameters
Name | Type | Description | Default |
---|---|---|---|
expr | ir.Column | str | The expression or column name to use for sorting | required |
See Also
Examples
Returns
Name | Type | Description |
---|---|---|
ir.ValueExpr | An expression |
asc
Create an ascending sort key from asc
or column name.
Parameters
Name | Type | Description | Default |
---|---|---|---|
expr | ir.Column | str | The expression or column name to use for sorting | required |
See Also
Examples
Returns
Name | Type | Description |
---|---|---|
ir.ValueExpr | An expression |
preceding
following
and_
Combine multiple predicates using &
.
Parameters
Name | Type | Description | Default |
---|---|---|---|
predicates | ir.BooleanValue | Boolean value expressions | () |
Returns
Name | Type | Description |
---|---|---|
BooleanValue | A new predicate that evaluates to True if all composing predicates are True. If no predicates were provided, returns True. |
or_
Combine multiple predicates using |
.
Parameters
Name | Type | Description | Default |
---|---|---|---|
predicates | ir.BooleanValue | Boolean value expressions | () |
Returns
Name | Type | Description |
---|---|---|
BooleanValue | A new predicate that evaluates to True if any composing predicates are True. If no predicates were provided, returns False. |
random
Return a random floating point number in the range [0.0, 1.0).
Similar to in the Python standard library.
ibis.random()
will generate a column of distinct random numbers even
if the same instance of ibis.random()
is reused.
When Ibis compiles an expression to SQL, each place where random
is
used will render as a separate call to the given backend’s random number
generator.
import letsql r_a = letsql.random() # doctest: +SKIP
Returns
Name | Type | Description |
---|---|---|
FloatingScalar | Random float value expression |
uuid
Return a random UUID version 4 value.
Similar to [(’uuid.uuid4`) in the Python standard library.
Examples
Returns
Name | Type | Description |
---|---|---|
UUIDScalar | Random UUID value expression |
case
Begin constructing a case expression.
Use the .when
method on the resulting object followed by .end
to
create a complete case expression.
Returns
Name | Type | Description |
---|---|---|
SearchedCaseBuilder | A builder object to use for constructing a case expression. |
See Also
Examples
now
Return an expression that will compute the current timestamp.
Returns
Name | Type | Description |
---|---|---|
TimestampScalar | An expression representing the current timestamp. |
today
Return an expression that will compute the current date.
Returns
Name | Type | Description |
---|---|---|
DateScalar | An expression representing the current date. |
rank
Compute position of first element within each equal-value group in sorted order.
Equivalent to SQL’s RANK()
window function.
Returns
Name | Type | Description |
---|---|---|
Int64Column | The min rank |
Examples
dense_rank
Position of first element within each group of equal values.
Values are returned in sorted order and duplicate values are ignored.
Equivalent to SQL’s DENSE_RANK()
.
Returns
Name | Type | Description |
---|---|---|
IntegerColumn | The rank |
Examples
percent_rank
Return the relative rank of the values in the column.
Returns
Name | Type | Description |
---|---|---|
FloatingColumn | The percent rank |
Examples
cume_dist
Return the cumulative distribution over a window.
Returns
Name | Type | Description |
---|---|---|
FloatingColumn | The cumulative distribution |
Examples
ntile
Return the integer number of a partitioning of the column values.
Parameters
Name | Type | Description | Default |
---|---|---|---|
buckets | int | ir.IntegerValue | Number of buckets to partition into | required |
Examples
row_number
Return an analytic function expression for the current row number.
row_number
is normalized across backends to start at 0
Returns
Name | Type | Description |
---|---|---|
IntegerColumn | A column expression enumerating rows |
Examples
read_csv
Lazily load a CSV or set of CSVs.
This function delegates to the read_csv
method on the current default
backend (DuckDB or ibis.config.default_backend
).
Parameters
Name | Type | Description | Default |
---|---|---|---|
sources | str | Path | Sequence[str | Path] | A filesystem path or URL or list of same. Supports CSV and TSV files. | required |
table_name | str | None | A name to refer to the table. If not provided, a name will be generated. | None |
kwargs | Any | Backend-specific keyword arguments for the file type. For the DuckDB backend used by default, please refer to: * CSV/TSV: https://duckdb.org/docs/data/csv/overview.html#parameters. | {} |
Returns
Name | Type | Description |
---|---|---|
ir.Table | Table expression representing a file |
Examples
read_parquet
Lazily load a parquet file or set of parquet files.
This function delegates to the read_parquet
method on the current
default backend (DuckDB or ibis.config.default_backend
).
Parameters
Name | Type | Description | Default |
---|---|---|---|
sources | str | Path | Sequence[str | Path] | A filesystem path or URL or list of same. | required |
table_name | str | None | A name to refer to the table. If not provided, a name will be generated. | None |
kwargs | Any | Backend-specific keyword arguments for the file type. For the DuckDB backend used by default, please refer to: * Parquet: https://duckdb.org/docs/data/parquet | {} |
Returns
Name | Type | Description |
---|---|---|
ir.Table | Table expression representing a file |
Examples
register
read_postgres
read_sqlite
union
Compute the set union of multiple table expressions.
The input tables must have identical schemas.
Parameters
Name | Type | Description | Default |
---|---|---|---|
table | ir.Table | A table expression | required |
*rest | ir.Table | Additional table expressions | () |
distinct | bool | Only return distinct rows | False |
Returns
Name | Type | Description |
---|---|---|
Table | A new table containing the union of all input tables. |
Examples
intersect
Compute the set intersection of multiple table expressions.
The input tables must have identical schemas.
Parameters
Name | Type | Description | Default |
---|---|---|---|
table | ir.Table | A table expression | required |
*rest | ir.Table | Additional table expressions | () |
distinct | bool | Only return distinct rows | True |
Returns
Name | Type | Description |
---|---|---|
Table | A new table containing the intersection of all input tables. |
Examples
difference
Compute the set difference of multiple table expressions.
The input tables must have identical schemas.
Parameters
Name | Type | Description | Default |
---|---|---|---|
table | ir.Table | A table expression | required |
*rest | ir.Table | Additional table expressions | () |
distinct | bool | Only diff distinct rows not occurring in the calling table | True |
Returns
Name | Type | Description |
---|---|---|
Table | The rows present in self that are not present in tables . |
Examples
ifelse
Construct a ternary conditional expression.
Parameters
Name | Type | Description | Default |
---|---|---|---|
condition | Any | A boolean expression | required |
true_expr | Any | Expression to return if condition evaluates to True | required |
false_expr | Any | Expression to return if condition evaluates to False or NULL | required |
Returns
Name | Type | Description |
---|---|---|
Value | ir.Value | The value of true_expr if condition is True else false_expr |
See Also
Examples
coalesce
Return the first non-null value from args
.
Parameters
Name | Type | Description | Default |
---|---|---|---|
args | Any | Arguments from which to choose the first non-null value | () |
Returns
Name | Type | Description |
---|---|---|
Value | Coalesced expression |
See Also
Value.coalesce()
Value.fill_null()
Examples
greatest
Compute the largest value among the supplied arguments.
Parameters
Name | Type | Description | Default |
---|---|---|---|
args | Any | Arguments to choose from | () |
Returns
Name | Type | Description |
---|---|---|
Value | Maximum of the passed arguments |
Examples
least
Compute the smallest value among the supplied arguments.
Parameters
Name | Type | Description | Default |
---|---|---|---|
args | Any | Arguments to choose from | () |
Returns
Name | Type | Description |
---|---|---|
Value | Minimum of the passed arguments |
Examples
range
Generate a range of values.
Integer ranges are supported, as well as timestamp ranges.
start
is inclusive and stop
is exclusive, just like Python’s builtin
range.
When step
equals 0, however, this function will return an empty array.
Python’s range
will raise an exception when step
is zero.
Parameters
Name | Type | Description | Default |
---|---|---|---|
start | Lower bound of the range, inclusive. | required | |
stop | Upper bound of the range, exclusive. | required | |
step | Step value. Optional, defaults to 1. | required |
Returns
Name | Type | Description |
---|---|---|
ArrayValue | An array of values |
Examples
Range using only a stop argument
Simple range using start and stop
Generate an empty range
Negative step values are supported
ibis.range
behaves the same as Python’s range …
… except when the step is zero, in which case ibis.range
returns an
empty array
Because the resulting expression is array, you can unnest the values
timestamp
Construct a timestamp scalar or column.
Parameters
Name | Type | Description | Default |
---|---|---|---|
value_or_year | Either a string value or datetime.datetime to coerce to a timestamp, or an integral value representing the timestamp year component. | required | |
month | The timestamp month component; required if value_or_year is a year. | None | |
day | The timestamp day component; required if value_or_year is a year. | None | |
hour | The timestamp hour component; required if value_or_year is a year. | None | |
minute | The timestamp minute component; required if value_or_year is a year. | None | |
second | The timestamp second component; required if value_or_year is a year. | None | |
timezone | The timezone name, or none for a timezone-naive timestamp. | None |
Returns
Name | Type | Description |
---|---|---|
TimestampValue | A timestamp expression |
Examples
Create a timestamp scalar from a string
Create a timestamp scalar from components
Create a timestamp column from components
date
time
Return a time literal if value
is coercible to a time.
Parameters
Name | Type | Description | Default |
---|---|---|---|
value_or_hour | Either a string value or datetime.time to coerce to a time, or an integral value representing the time hour component. | required | |
minute | The time minute component; required if value_or_hour is an hour. | None | |
second | The time second component; required if value_or_hour is an hour. | None |
Returns
Name | Type | Description |
---|---|---|
TimeValue | A time expression |
Examples
Create a time scalar from a string
Create a time scalar from hour, minute, and second
Create a time column from hour, minute, and second
interval
Return an interval literal expression.
Parameters
Name | Type | Description | Default |
---|---|---|---|
value | int | datetime.timedelta | None | Interval value. | None |
unit | str | Unit of value | 's' |
years | int | None | Number of years | None |
quarters | int | None | Number of quarters | None |
months | int | None | Number of months | None |
weeks | int | None | Number of weeks | None |
days | int | None | Number of days | None |
hours | int | None | Number of hours | None |
minutes | int | None | Number of minutes | None |
seconds | int | None | Number of seconds | None |
milliseconds | int | None | Number of milliseconds | None |
microseconds | int | None | Number of microseconds | None |
nanoseconds | int | None | Number of nanoseconds | None |
Returns
Name | Type | Description |
---|---|---|
IntervalScalar | An interval expression |
to_sql
Return the formatted SQL string for an expression.
Parameters
Name | Type | Description | Default |
---|---|---|---|
expr | ir.Expr | Ibis expression. | required |
pretty | bool | Whether to use pretty formatting. | True |
Returns
Name | Type | Description |
---|---|---|
str | Formatted SQL string |