sparrow 0.3.0
sparrow::record_batch Class Reference

Table-like data structure.

#include <record_batch.hpp>

Public Types

using name_type = std::string
 
using size_type = std::size_t
 
using initializer_type = std::initializer_list<std::pair<name_type, array>>
 
using name_range = std::ranges::ref_view<const std::vector<name_type>>
 
using column_range = std::ranges::ref_view<const std::vector<array>>
 

Public Member Functions

template<std::ranges::input_range NR, std::ranges::input_range CR>
requires (std::convertible_to<std::ranges::range_value_t<NR>, std::string> and std::same_as<std::ranges::range_value_t<CR>, array>)
 record_batch (NR &&names, CR &&columns)
 Constructs a record_batch from a range of names and a range of arrays.
 
template<std::ranges::input_range CR>
requires std::same_as<std::ranges::range_value_t<CR>, array>
 record_batch (CR &&columns)
 
SPARROW_API record_batch (initializer_type init)
 Constructs a record_batch from a list of std::pair<name_type, array>.
 
SPARROW_API record_batch (struct_array &&ar)
 Construct a record batch from the given struct array.
 
SPARROW_API record_batch (const record_batch &)
 
SPARROW_API record_batch & operator= (const record_batch &)
 
 record_batch (record_batch &&)=default
 
record_batch & operator= (record_batch &&)=default
 
SPARROW_API size_type nb_columns () const
 
SPARROW_API size_type nb_rows () const
 
SPARROW_API bool contains_column (const name_type &key) const
 Checks if the record_batch contains a column mapped to the specified name.
 
SPARROW_API const name_type & get_column_name (size_type index) const
 
SPARROW_API const array & get_column (const name_type &key) const
 
SPARROW_API const array & get_column (size_type index) const
 
SPARROW_API name_range names () const
 
SPARROW_API column_range columns () const
 
SPARROW_API struct_array extract_struct_array ()
 Moves the internal columns of the record batch into a struct_array object.
 
SPARROW_API void add_column (name_type name, array column)
 Appends the array column to the record batch, and maps it with name.
 
SPARROW_API void add_column (array column)
 Appends the array column to the record batch, and maps it to its internal name.
 

Detailed Description

Table-like data structure.

A record batch is a collection of equal-length arrays mapped to names. Each array represents a column of the table. record_batch is provided as a convenient unit of work for various serialization and computation functions.
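
The description above can be pictured with a small toy model. This is only a sketch in standard C++: `toy_column` and `toy_batch` are hypothetical stand-ins invented for illustration, not sparrow types, and the real class may store its data differently. It shows the two properties stated above: each column is paired with a name, and all columns have the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for sparrow::array: a plain column of integers.
using toy_column = std::vector<int>;

// Toy model of a record batch: names[i] labels columns[i],
// and every column has the same number of elements.
class toy_batch
{
public:
    toy_batch(std::vector<std::string> names, std::vector<toy_column> columns)
        : m_names(std::move(names))
        , m_columns(std::move(columns))
    {
        // One name per column.
        assert(m_names.size() == m_columns.size());
        // All columns share the length of the first one.
        for (const auto& c : m_columns)
        {
            assert(c.size() == m_columns.front().size());
        }
    }

    std::size_t nb_columns() const { return m_columns.size(); }

    // The row count is the common column length (0 for an empty batch).
    std::size_t nb_rows() const
    {
        return m_columns.empty() ? 0 : m_columns.front().size();
    }

    const std::string& get_column_name(std::size_t i) const { return m_names[i]; }
    const toy_column& get_column(std::size_t i) const { return m_columns[i]; }

private:
    std::vector<std::string> m_names;
    std::vector<toy_column> m_columns;
};
```

With two columns of three values each, `nb_columns()` is 2 and `nb_rows()` is 3 in this model.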

Example of usage:

const std::vector<std::string> name_list = {"first", "second", "third"};
constexpr std::size_t data_size = 10;
const std::vector<sparrow::array> array_list = make_array_list(data_size);
const sparrow::record_batch record{name_list, array_list};
assert(record.nb_columns() == array_list.size());
assert(record.nb_rows() == data_size);
assert(record.contains_column(name_list[0]));
assert(record.get_column_name(0) == name_list[0]);
assert(record.get_column(0) == array_list[0]);
assert(std::ranges::equal(record.names(), name_list));
assert(std::ranges::equal(record.columns(), array_list));
Examples
record_batch_example.cpp.

Definition at line 46 of file record_batch.hpp.

Member Typedef Documentation

◆ column_range

using sparrow::record_batch::column_range = std::ranges::ref_view<const std::vector<array>>

Definition at line 55 of file record_batch.hpp.

◆ initializer_type

using sparrow::record_batch::initializer_type = std::initializer_list<std::pair<name_type, array>>

Definition at line 52 of file record_batch.hpp.

◆ name_range

using sparrow::record_batch::name_range = std::ranges::ref_view<const std::vector<name_type>>

Definition at line 54 of file record_batch.hpp.

◆ name_type

using sparrow::record_batch::name_type = std::string

Definition at line 50 of file record_batch.hpp.

◆ size_type

using sparrow::record_batch::size_type = std::size_t

Definition at line 51 of file record_batch.hpp.

Constructor & Destructor Documentation

◆ record_batch() [1/6]

template<std::ranges::input_range NR, std::ranges::input_range CR>
requires (std::convertible_to<std::ranges::range_value_t<NR>, std::string> and std::same_as<std::ranges::range_value_t<CR>, array>)
sparrow::record_batch::record_batch ( NR && names,
CR && columns )

Constructs a record_batch from a range of names and a range of arrays.

Each array is internally mapped to the name at the same position in the names range.

Parameters
names    An input range of names. The names must be unique.
columns  An input range of arrays.
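
The position-based mapping and the uniqueness requirement can be sketched in standard C++. `index_names` is a hypothetical helper written for this illustration and is not part of sparrow. Rejecting duplicates with an exception is an assumption of the sketch: the documentation states only that names must be unique, not what happens otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Builds a name -> position index, mirroring "the array at position i is
// mapped to the name at position i".
// Assumption: duplicates are reported with std::invalid_argument.
inline std::unordered_map<std::string, std::size_t>
index_names(const std::vector<std::string>& names)
{
    std::unordered_map<std::string, std::size_t> index;
    for (std::size_t i = 0; i < names.size(); ++i)
    {
        // emplace does not overwrite: .second is false for a repeated key.
        const bool inserted = index.emplace(names[i], i).second;
        if (!inserted)
        {
            throw std::invalid_argument("duplicate column name: " + names[i]);
        }
    }
    // Every name produced exactly one entry.
    assert(index.size() == names.size());
    return index;
}
```

Checking names up front like this, before calling the constructor, is one way for calling code to make sure the precondition holds.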

Definition at line 209 of file record_batch.hpp.


◆ record_batch() [2/6]

template<std::ranges::input_range CR>
requires std::same_as<std::ranges::range_value_t<CR>, array>
sparrow::record_batch::record_batch ( CR && columns)

Definition at line 233 of file record_batch.hpp.


◆ record_batch() [3/6]

SPARROW_API sparrow::record_batch::record_batch ( initializer_type init)

Constructs a record_batch from a list of std::pair<name_type, array>.

Parameters
init  A list of (name, array) pairs.

◆ record_batch() [4/6]

SPARROW_API sparrow::record_batch::record_batch ( struct_array && ar)

Construct a record batch from the given struct array.

The array must own its internal Arrow structures.

Parameters
ar  An input struct array.

◆ record_batch() [5/6]

SPARROW_API sparrow::record_batch::record_batch ( const record_batch & )

◆ record_batch() [6/6]

sparrow::record_batch::record_batch ( record_batch && ) = default

Member Function Documentation

◆ add_column() [1/2]

SPARROW_API void sparrow::record_batch::add_column ( array column)

Appends the array column to the record batch, and maps it to its internal name.

column must have a name.

Parameters
column  The array to append.

◆ add_column() [2/2]

SPARROW_API void sparrow::record_batch::add_column ( name_type name,
array column )

Appends the array column to the record batch, and maps it with name.

Parameters
name    The name of the column to append.
column  The array to append.
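
The difference between the two add_column overloads can be sketched with a toy model in standard C++. `toy_column` and `toy_batch` are hypothetical stand-ins, not sparrow types. Throwing when the column has no name is an assumption of the sketch: the documentation states only that the column must have a name.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for sparrow::array: data plus an optional name.
struct toy_column
{
    std::optional<std::string> name;
    std::vector<int> values;
};

struct toy_batch
{
    std::vector<std::string> names;
    std::vector<toy_column> columns;

    // Explicit-name overload: the given name labels the appended column.
    void add_column(std::string name, toy_column column)
    {
        names.push_back(std::move(name));
        columns.push_back(std::move(column));
        assert(names.size() == columns.size());
    }

    // Name-less overload: falls back to the name carried by the column.
    // Assumption: a missing name is reported with std::invalid_argument.
    void add_column(toy_column column)
    {
        if (!column.name.has_value())
        {
            throw std::invalid_argument("column must have a name");
        }
        std::string name = *column.name;  // copy before moving the column
        add_column(std::move(name), std::move(column));
    }
};
```

In this model, a failed call leaves the batch unchanged because the name is checked before anything is appended.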

◆ columns()

SPARROW_API column_range sparrow::record_batch::columns ( ) const
Returns
a range of the columns (i.e. arrays) held in this record_batch.

◆ contains_column()

SPARROW_API bool sparrow::record_batch::contains_column ( const name_type & key) const

Checks if the record_batch contains a column mapped to the specified name.

Parameters
key  The name of the column.
Returns
true if the record_batch contains the mapping, false otherwise.

◆ extract_struct_array()

SPARROW_API struct_array sparrow::record_batch::extract_struct_array ( )

Moves the internal columns of the record batch into a struct_array object.

The record batch is empty after calling this method.
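
The "move out and leave empty" behaviour can be sketched in standard C++. `toy_column`, `toy_struct` and `toy_batch` are hypothetical stand-ins for `array`, `struct_array` and `record_batch`. The sketch shows only the ownership transfer, not how sparrow builds the struct array.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using toy_column = std::vector<int>;  // hypothetical stand-in for sparrow::array

// Hypothetical stand-in for sparrow::struct_array: it simply owns the fields.
struct toy_struct
{
    std::vector<std::string> names;
    std::vector<toy_column> children;
};

struct toy_batch
{
    std::vector<std::string> names;
    std::vector<toy_column> columns;

    std::size_t nb_columns() const { return columns.size(); }

    // Transfers ownership of the columns instead of copying them.
    // std::exchange returns the old value and leaves an empty vector behind.
    toy_struct extract()
    {
        toy_struct out{std::exchange(names, {}), std::exchange(columns, {})};
        assert(names.empty() && columns.empty());
        return out;
    }
};
```

After `extract()` the toy batch reports zero columns, and the returned object owns the data that was previously held by the batch.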

◆ get_column() [1/2]

SPARROW_API const array & sparrow::record_batch::get_column ( const name_type & key) const
Returns
the column mapped to the specified name in the record_batch.
Parameters
key  The name of the column to search for.
Exceptions
std::out_of_range  if the column is not found.
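
The lookup-by-name contract, including the std::out_of_range failure mode, can be modelled in standard C++. `toy_batch` and `toy_column` are hypothetical stand-ins. Using a hash map as the name index is an assumption of this sketch and says nothing about how sparrow implements the lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using toy_column = std::vector<int>;  // hypothetical stand-in for sparrow::array

struct toy_batch
{
    std::vector<toy_column> columns;
    std::unordered_map<std::string, std::size_t> index;  // name -> position

    bool contains_column(const std::string& key) const
    {
        return index.find(key) != index.end();
    }

    // This sketch assumes the name is not already present.
    void add_column(const std::string& name, toy_column column)
    {
        assert(!contains_column(name));
        index.emplace(name, columns.size());
        columns.push_back(std::move(column));
    }

    // std::unordered_map::at throws std::out_of_range for a missing key,
    // which matches the documented exception.
    const toy_column& get_column(const std::string& key) const
    {
        return columns[index.at(key)];
    }
};
```

Calling code that cannot guarantee the name exists can either test `contains_column(key)` first or catch `std::out_of_range` around the lookup.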

◆ get_column() [2/2]

SPARROW_API const array & sparrow::record_batch::get_column ( size_type index) const
Returns
the column at the specified index in the record_batch.
Parameters
index  The index of the column. The index must be less than the number of columns.

◆ get_column_name()

SPARROW_API const name_type & sparrow::record_batch::get_column_name ( size_type index) const
Returns
the name mapped to the column at the given index.
Parameters
index  The index of the column in the record_batch. The index must be less than the number of columns.

◆ names()

SPARROW_API name_range sparrow::record_batch::names ( ) const
Returns
a range of the names in the record_batch.

◆ nb_columns()

SPARROW_API size_type sparrow::record_batch::nb_columns ( ) const
Returns
the number of columns (i.e. arrays) in the record_batch.

◆ nb_rows()

SPARROW_API size_type sparrow::record_batch::nb_rows ( ) const
Returns
the number of rows (i.e. the size of each array) in the record_batch.

◆ operator=() [1/2]

SPARROW_API record_batch & sparrow::record_batch::operator= ( const record_batch & )

◆ operator=() [2/2]

record_batch & sparrow::record_batch::operator= ( record_batch && ) = default

The documentation for this class was generated from the following file: