.. _enumerators:

Enumerators
===========

Although they are not used that frequently, Ruby supports enumerators that enable both internal and external iteration. The easiest way to create an enumerator is to call an enumerable method without passing a block. For example:

.. code-block:: ruby

  a = [1, 2, 3, 4, 5]

  # Common way to iterate
  a.each do |i|
    puts i
  end

  # Get an enumerator instead
  enumerator = a.each

  # Now use it
  enumerator.map do |i|
    i * 2
  end

Rice has built-in support for returning enumerators for STL containers such as :ref:`std_vector`, :ref:`std_map` and :ref:`std_unordered_map`.

Implementing enumerators is tricky and in fact requires a number of Rice features. So let's take a look at how enumerator support is implemented for ``std::vector``.

Implementation
--------------

Let's start by looking at the code:

.. code-block:: cpp

  define_method("each", [](T& vector) -> const std::variant<std::reference_wrapper<T>, Object>
  {
    if (!rb_block_given_p())
    {
      auto rb_size_function = [](VALUE recv, VALUE argv, VALUE eobj) -> VALUE
      {
        // Since we can't capture the vector from above (because then we can't send
        // this lambda to rb_enumeratorize_with_size), extract it from recv
        T* receiver = Data_Object<T>::from_ruby(recv);
        return detail::To_Ruby<size_t>().convert(receiver->size());
      };

      return rb_enumeratorize_with_size(detail::selfThread, Identifier("each").to_sym(), 0, nullptr, rb_size_function);
    }

    for (Value_T& item : vector)
    {
      VALUE element = detail::To_Ruby<Value_T>().convert(item);
      detail::protect(rb_yield, element);
    }

    return std::ref(vector);
  });

We will go through each section in more detail below.

Method Signature
----------------

First, Rice defines an ``each`` method to support Ruby's ``Enumerable`` module. Its signature looks like this:

.. code-block:: cpp

  define_method("each", [](T& vector) -> const std::variant<std::reference_wrapper<T>, Object>

Since ``std::vector`` doesn't have an ``each`` method, Rice instead creates a lambda function that interacts with the vector. The vector is passed in by reference (``T&``) to avoid a copy.
Even more interestingly, notice the return type is a ``std::variant``. This is needed because the method can return either the vector or a Ruby enumerator.

In the first case, returning the vector is the same as returning ``this`` from a C++ member function or ``self`` from a Ruby function. This allows methods to be chained together - for example ``vector.a.b``. We have to return a reference to the vector and not a copy. Besides being potentially wasteful, a copy would result in creating a new Ruby object. Thus ``self`` would no longer be self, which would be quite unexpected. However, a ``std::variant`` cannot contain references, and thus what we need to return is a ``std::reference_wrapper``.

In the second case, we want to return a new Ruby enumerator, which has a type of ``VALUE``. However, we can't directly return a ``VALUE`` because Rice will interpret it as an unsigned long long (which in fact is what it is). Instead, we return a ``Rice::Object``. For more information, refer to the :ref:`return` class.

Creating an Enumerator
----------------------

Next, let's look at the code that returns an enumerator:

.. code-block:: cpp

  if (!rb_block_given_p())
  {
    auto rb_size_function = [](VALUE recv, VALUE argv, VALUE eobj) -> VALUE
    {
      // Since we can't capture the vector from above (because then we can't send
      // this lambda to rb_enumeratorize_with_size), extract it from recv
      T* receiver = Data_Object<T>::from_ruby(recv);
      return detail::To_Ruby<size_t>().convert(receiver->size());
    };

    return rb_enumeratorize_with_size(detail::selfThread, Identifier("each").to_sym(), 0, nullptr, rb_size_function);
  }

If the user does not provide a block, then the method should return an enumerator. The enumerator is created like this:

.. code-block:: cpp

  return rb_enumeratorize_with_size(detail::selfThread, Identifier("each").to_sym(), 0, nullptr, rb_size_function);

Notice the first parameter of ``rb_enumeratorize_with_size`` requires a Ruby instance and not a C++ instance.
The correct Ruby instance is the one that wraps the C++ instance, and it is stored in a thread-local variable called ``selfThread``.

Supporting Enumerator Size
--------------------------

The ``rb_enumeratorize_with_size`` call includes an optional pointer to a function that can return the size of the enumerated object, in this case the vector. That is implemented as another lambda function:

.. code-block:: cpp

  auto rb_size_function = [](VALUE recv, VALUE argv, VALUE eobj) -> VALUE
  {
    // Since we can't capture the vector from above (because then we can't send
    // this lambda to rb_enumeratorize_with_size), extract it from recv
    T* receiver = Data_Object<T>::from_ruby(recv);
    return detail::To_Ruby<size_t>().convert(receiver->size());
  };

Since this lambda is being sent to C code, it cannot capture any local variables, because only a lambda without captures can be converted to a plain function pointer. Thus it does not have direct access to the ``T& vector`` parameter. Instead, it needs to extract the vector from the Ruby object wrapping the vector:

.. code-block:: cpp

  T* receiver = Data_Object<T>::from_ruby(recv);

It then needs to determine the vector size and return it as a Ruby object:

.. code-block:: cpp

  return detail::To_Ruby<size_t>().convert(receiver->size());

Yielding to a Block
-------------------

Finally we get to the most common use case by far - yielding values to a passed-in block:

.. code-block:: cpp

  for (Value_T& item : vector)
  {
    VALUE element = detail::To_Ruby<Value_T>().convert(item);
    detail::protect(rb_yield, element);
  }

The code is fairly simple. Iterate over each item in the vector by reference (no copies!), wrap it in a Ruby object, and yield it to the block. Note that the call to ``rb_yield`` is made via ``detail::protect`` in case Ruby raises an exception.

Returning Self
--------------

Last, we return ``self``, which is a common practice in Ruby to enable method chaining. Self in this case is the Ruby object wrapping the vector. By returning a reference to the vector, Rice is smart enough to map it back to the original Ruby object.

.. code-block:: cpp

  return std::ref(vector);

As explained above, we need to put the vector inside a ``std::reference_wrapper`` to include it in the returned variant.
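
The two plain C++ techniques this walkthrough relies on can be tried outside of Rice. The following is a minimal sketch and not Rice's implementation: ``Result``, ``each_like``, ``SizeCallback``, ``call_size`` and ``vector_size`` are hypothetical names, a ``std::string`` stands in for the Ruby enumerator, and a ``void*`` stands in for the ``VALUE`` receiver. It assumes a C++17 compiler.

.. code-block:: cpp

  #include <cassert>
  #include <cstddef>
  #include <functional>
  #include <string>
  #include <variant>
  #include <vector>

  // Hypothetical result type: either the container itself (held by reference
  // through std::reference_wrapper) or a placeholder for an enumerator.
  using Result = std::variant<std::reference_wrapper<std::vector<int>>, std::string>;

  // Mimics the shape of "each": without a block, return the placeholder;
  // with a block, return a reference to the same vector (no copy is made).
  inline Result each_like(std::vector<int>& vector, bool block_given)
  {
    if (!block_given)
    {
      return std::string("enumerator");
    }
    return std::ref(vector);
  }

  // C-style callback type, similar in spirit to a size function handed to a C API.
  using SizeCallback = std::size_t (*)(void* receiver);

  // Hypothetical C-style consumer of the callback.
  inline std::size_t call_size(SizeCallback callback, void* receiver)
  {
    assert(callback != nullptr);
    return callback(receiver);
  }

  // A lambda without captures converts to a plain function pointer. It cannot
  // see any local vector, so it recovers the vector from the receiver argument.
  const SizeCallback vector_size = [](void* receiver) -> std::size_t
  {
    return static_cast<std::vector<int>*>(receiver)->size();
  };

Calling ``each_like(values, true)`` produces a variant whose ``std::reference_wrapper`` alternative refers to ``values`` itself rather than to a copy, and ``call_size(vector_size, &values)`` returns the element count without the lambda capturing anything.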