sparrow 0.3.0
Array

The array class is a dynamically typed array that can be built in two ways:

  • either from an existing typed array, which it type-erases
  • or from the Arrow C data interface structures. The array can either simply reference these structures or take ownership of them.

Example:

#include "third-party-lib.hpp"
namespace sp = sparrow;
namespace tpl = third_party_library;
ArrowArray arr, arr2;
ArrowSchema schema, schema2;
tpl::read_arrow_structures(&arr, &schema);
tpl::read_arrow_structures(&arr2, &schema2);
sp::array ar(&arr, &schema);
sp::array ar2(std::move(arr2), std::move(schema2));
// ...
arr.release(&arr);
schema.release(&schema);
// We release neither arr2 nor schema2: ar2 will do it for us.
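The ownership rules in the example above rest on the release callback of the Arrow C data interface. The following is a minimal sketch of that contract, not sparrow code: the ArrowArray layout follows the Arrow C data interface specification, while make_int32_array, release_int32_array and release_if_needed are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Layout of ArrowArray as given by the Arrow C data interface specification.
// Real code would get this definition from an Arrow or sparrow header.
struct ArrowArray
{
    int64_t length;
    int64_t null_count;
    int64_t offset;
    int64_t n_buffers;
    int64_t n_children;
    const void** buffers;
    ArrowArray** children;
    ArrowArray* dictionary;
    void (*release)(ArrowArray*);
    void* private_data;
};

// Hypothetical producer-side callback: frees what the producer allocated,
// then marks the structure as released by resetting the callback.
void release_int32_array(ArrowArray* array)
{
    std::free(const_cast<void*>(array->buffers[1]));  // data buffer
    std::free(array->buffers);                        // buffer table
    array->release = nullptr;                         // "released" marker
}

// Hypothetical producer: fills `out` with the n values 0, 1, 2, ...
void make_int32_array(ArrowArray* out, int64_t n)
{
    const auto count = static_cast<std::size_t>(n);
    auto* values = static_cast<int32_t*>(std::malloc(sizeof(int32_t) * count));
    for (std::size_t i = 0; i < count; ++i)
    {
        values[i] = static_cast<int32_t>(i);
    }
    auto** buffers = static_cast<const void**>(std::malloc(sizeof(void*) * 2));
    buffers[0] = nullptr;  // no validity bitmap, since null_count is 0
    buffers[1] = values;
    *out = ArrowArray{n, 0, 0, 2, 0, buffers, nullptr, nullptr, &release_int32_array, nullptr};
}

// What the owner of a structure does once it is done with it.
// Returns false if the structure was already released or moved from.
bool release_if_needed(ArrowArray* array)
{
    if (array->release == nullptr)
    {
        return false;
    }
    array->release(array);
    return true;
}
```

In the example above, arr and schema stay under the caller's responsibility, so the caller invokes release on them. arr2 and schema2 were moved into ar2, which is then expected to invoke the callback itself.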

The array class provides an API similar to that of the typed arrays, with one limitation: iterators are not provided, for performance reasons. Instead, a method is provided for visiting the array and applying an algorithm to the underlying typed array.

Deep copy

Copying an array always performs a deep copy, regardless of whether the source array owns its internal data. This reduces the complexity of the memory model when mixing views and arrays within layouts that have children.
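The copy rule can be illustrated with a small stand-alone sketch. int_storage is a hypothetical class written for this illustration and does not reflect sparrow's implementation. It only shows the behaviour described above: a copy owns its data even when the source is a view.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a container that can be either a view or an owner.
class int_storage
{
public:

    // View: references data owned by someone else.
    int_storage(const int* data, std::size_t size)
        : m_data(data), m_size(size), m_owns(false)
    {
    }

    // Owner: takes the values and keeps them alive itself.
    explicit int_storage(std::vector<int> values)
        : m_owned(std::move(values)), m_data(m_owned.data()), m_size(m_owned.size()), m_owns(true)
    {
    }

    // Copy: always deep, whether `other` is a view or an owner.
    int_storage(const int_storage& other)
        : m_owned(other.m_data, other.m_data + other.m_size)
        , m_data(m_owned.data())
        , m_size(m_owned.size())
        , m_owns(true)
    {
    }

    int_storage& operator=(const int_storage&) = delete;  // left out of the sketch

    bool owns_data() const { return m_owns; }
    std::size_t size() const { return m_size; }
    int operator[](std::size_t i) const { return m_data[i]; }

private:

    std::vector<int> m_owned;  // empty for a view
    const int* m_data;
    std::size_t m_size;
    bool m_owns;
};
```

With this design, a copy never dangles when the original external buffer goes away, at the price of an allocation on every copy.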

Array API

Capacity

Like typed arrays, array provides the following methods regarding its capacity:

Method    Description
empty     Checks whether the container is empty
size      Returns the number of elements
namespace sp = sparrow;
sp::primitive_array<int> pa = { 1, 2, 3, 4};
sp::array a(std::move(pa));
std::cout << a.size() << std::endl; // Prints 4
std::cout << std::boolalpha << a.empty() << std::endl; // Prints false

Element access

Accessing an element in an array yields a std::variant of nullable objects. The variant can hold any data type used by the typed array classes. A method is provided so that the user can retrieve the dynamic data_type of the array.

Method        Description
data_type     Returns the dynamic type of the data
at            Access specified element with bounds checking
operator[]    Access specified element
front         Access the first element
back          Access the last element

Example:

namespace sp = sparrow;
sp::primitive_array<int> pa = { 1, 2, 3, 4};
sp::array ar(std::move(pa));
std::visit([](const auto& arg) { std::cout << arg << '\n'; }, ar.front());
std::visit([](const auto& arg) { std::cout << arg << '\n'; }, ar.back());
std::visit([](const auto& arg) { std::cout << arg << '\n'; }, ar[1]);
std::cout << sp::data_type_to_format(ar.data_type()) << std::endl;
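Handling a variant of nullable values means dealing with two things in the visitor: the alternative actually held, and the possibility that the value is null. The sketch below shows that pattern with standard types only. As an assumption for illustration, std::optional stands in for sparrow's nullable type, and the element alias and describe function are hypothetical names.

```cpp
#include <optional>
#include <sstream>
#include <string>
#include <variant>

// Stand-in for the element type: a variant whose alternatives are all
// "nullable" values (here modelled with std::optional).
using element = std::variant<std::optional<int>, std::optional<double>, std::optional<std::string>>;

// Formats an element, whatever alternative it holds.
std::string describe(const element& e)
{
    return std::visit(
        [](const auto& value) -> std::string
        {
            if (!value.has_value())
            {
                return "null";  // the slot exists but holds no value
            }
            std::ostringstream out;
            out << *value;
            return out.str();
        },
        e
    );
}
```

The generic lambda is instantiated once per alternative, so its body must compile for every type the variant can hold.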

Visit

The visit function applies a functor to the underlying typed array, not to each element individually. Since the concrete type is only known at run time, the functor must accept any kind of typed array.

Example:

namespace sp = sparrow;
sp::primitive_array<int> pa = { 1, 2, 3, 4};
sp::array ar(std::move(pa));
ar.visit([](const auto& arr)
{
    std::for_each(arr.begin(), arr.end(), [](const auto& val)
    {
        // Do whatever you need here.
        // Keep in mind val can be a primitive type,
        // a string, or even a nested array.
    });
});
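The requirement that the functor accept any kind of typed array can be seen in a stand-alone sketch of type-erased dispatch. any_array and count_elements are hypothetical names. The sketch uses a std::variant of vectors purely for illustration and does not claim to mirror how sparrow stores or dispatches its typed arrays.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical type-erased array: the concrete element type is chosen at run time.
class any_array
{
public:

    template <class T>
    explicit any_array(std::vector<T> values)
        : m_storage(std::move(values))
    {
    }

    // Calls func with the concrete typed container currently held.
    template <class F>
    decltype(auto) visit(F&& func) const
    {
        return std::visit(std::forward<F>(func), m_storage);
    }

private:

    std::variant<std::vector<int>, std::vector<double>, std::vector<std::string>> m_storage;
};

// The lambda is generic: it must compile for every container type above.
std::size_t count_elements(const any_array& a)
{
    return a.visit([](const auto& typed) { return typed.size(); });
}
```

Dispatching once per array, rather than once per element, is what makes this approach cheaper than iterating over variant-typed elements.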

Conversion to Arrow C data

sparrow provides free functions to either read the Arrow C data structures held by a sparrow array, or to extract them from it.

Checking ownership

Function             Description
owns_arrow_array     Checks for internal ArrowArray ownership
owns_arrow_schema    Checks for internal ArrowSchema ownership

Example:

namespace sp = sparrow;
sp::primitive_array<int> pa = { 1, 2, 3, 4};
sp::array ar(std::move(pa));
std::cout << owns_arrow_array(ar) << std::endl;
std::cout << owns_arrow_schema(ar) << std::endl;

Reading

These functions return pointers to the internal Arrow structures. One must NOT call the release method on these structures after use. The sparrow array object will release them upon destruction.

Function                Description
get_arrow_array         Returns a pointer to the internal ArrowArray
get_arrow_schema        Returns a pointer to the internal ArrowSchema
get_arrow_structures    Returns a pair of pointers to the internal Arrow structures

Example:

namespace sp = sparrow;
sp::primitive_array<int> pa = { 1, 2, 3, 4};
sp::array ar(std::move(pa));
ArrowArray* arr = get_arrow_array(ar);
ArrowSchema* sch = get_arrow_schema(ar);
// OR, to get both pointers at once:
// auto [arr, sch] = get_arrow_structures(ar);

Extracting

These functions move the internal Arrow structures out of the array. The user is responsible for calling the release method of these structures after use.

Function                    Description
extract_arrow_array         Extracts the internal ArrowArray
extract_arrow_schema        Extracts the internal ArrowSchema
extract_arrow_structures    Extracts the internal Arrow structures

Example:

namespace sp = sparrow;
sp::primitive_array<int> pa = { 1, 2, 3, 4};
sp::array ar(std::move(pa));
ArrowArray arr = extract_arrow_array(std::move(ar));
ArrowSchema sch = extract_arrow_schema(std::move(ar));
// OR, instead of the two calls above:
// auto [arr, sch] = extract_arrow_structures(std::move(ar));
// ...
arr.release(&arr);
sch.release(&sch);