This is the third article on C++ coroutines that we have translated. The original author is Lewis Baker. For the original text, please refer to: https://lewissbaker.github.io/2018/09/05/understanding-the-promise-type. The translated text follows.
This article is the third in a series of articles related to the C++ coroutine technical specification (N4736).
The previous articles in this series include:
- Coroutine Theory
- Understanding operator co_await
In this article, I will discuss the mechanism by which the compiler translates the coroutine code you write into compiled code, and how you can customize the behavior of a coroutine by defining your own Promise type.
Coroutine concepts
The coroutine technical specification adds three new keywords: co_await, co_yield and co_return. Whenever you use one of these keywords in the body of a function, the compiler compiles that function as a coroutine rather than as an ordinary function. [Annotation: As we understand it, the word "body" in the original text has two meanings. One is the function body made up of the statements we wrote (including the compiler's translation of co_await and so on). The other is the coroutine body formed by the code the compiler generates according to the promise control logic. For clarity, we call them the function body and the coroutine body respectively.]
The compiler converts the code you write into a state machine in a fairly mechanical way, which allows the function to suspend execution at particular points and resume execution later.
In the previous article, I described the first of the two new interfaces introduced by the technical specification: the Awaitable interface. The second interface introduced by the technical specification is the Promise interface, which is essential for the above code conversion.
The methods of the Promise interface are used to customize the behavior of the coroutine itself. Through this interface, library writers can customize what happens when the coroutine is called and what happens when the coroutine returns (whether normally or through an unhandled exception). They can also customize the behavior of every co_await or co_yield expression within the coroutine.
Promise object
By implementing a set of methods that are called at specific points during the execution of the coroutine, the Promise object can define and control the behavior of the coroutine itself.
Before we continue, I would like you to set aside any preconceived notions of what a "promise" is. In some use cases the coroutine promise object does act in a role similar to the std::promise half of a std::future pair, but in other use cases the analogy is somewhat stretched. It may be easier to think of the coroutine's promise object as a "coroutine state controller": an object that controls the behavior of the coroutine and can be used to track its state.
Each call to a coroutine function will construct a promise object instance in the coroutine frame.
The compiler will be responsible for generating code to call specific methods of the promise object at key points during the execution of the coroutine.
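In the following examples, assume that the promise object created in the coroutine frame for a particular coroutine call is named promise.
When you write a coroutine function whose body contains one of the coroutine keywords (co_return, co_await or co_yield), the compiler wraps the statements you wrote roughly as follows: the coroutine body first evaluates co_await promise.initial_suspend(), then runs your statements inside a try block whose catch (...) handler calls promise.unhandled_exception(), and finally reaches a FinalSuspend label where it evaluates co_await promise.final_suspend().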
When a coroutine function is called, a few extra steps will be executed before the body code of the function is officially executed, which is slightly different from regular functions.
The following is an overview of these steps (I will explain each step in detail below).
- Allocate a coroutine frame using operator new (optional step).
- Copy all function parameters to the coroutine frame.
- Call the constructor of the promise object, whose type is P.
- Call the promise.get_return_object() method to obtain the result that is returned to the caller when the coroutine first suspends. Save this result in a local variable.
- Call the promise.initial_suspend() method and co_await its result.
- When the co_await promise.initial_suspend() expression resumes (either immediately or asynchronously), the coroutine starts executing the function body statements you wrote.
During the execution of your statement, if a co_return is encountered, the following additional steps will be performed:
- Call promise.return_void() or promise.return_value(expr), where expr is the operand of co_return. [Annotation: Which one is called depends on the type of the value your coroutine finally returns to the caller and on how the promise is implemented. For example, an asynchronous network send may return no value, while an asynchronous network receive may return the number of bytes read to the caller.]
- Destroy all variables with automatic storage duration in the reverse order of their construction.
- Call the promise.final_suspend() method and co_await its result.
If instead execution leaves your statements because of an unhandled exception, then:
- Catch the exception and call promise.unhandled_exception() from within the catch block.
- Call the promise.final_suspend() method and co_await its result.
Once execution propagates outside of the coroutine body, the coroutine frame is destroyed. Destroying the coroutine frame involves the following steps:
- Call the destructor of the promise object.
- Call the destructors of the function parameter copies.
- Call operator delete to free the memory used by the coroutine frame (optional step).
- Transfer execution back to the caller or resumer.
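When execution first reaches a point inside a co_await expression at which control returns to the caller or resumer, or when the coroutine runs to completion without reaching such a point, the coroutine is either suspended or destroyed, and the return object previously obtained from promise.get_return_object() is returned to the caller of the coroutine.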
Allocating a coroutine frame
First, the compiler generates a call to operator new to allocate memory for the coroutine frame.
If the promise type P defines a custom operator new, that overload is called; otherwise the global operator new is called.
There are a few important things to note here:
The size passed to operator new is not sizeof(P) but the size of the entire coroutine frame. The compiler determines this size automatically from the number and sizes of the parameters, the size of the promise object, the number and sizes of the local variables, and any other compiler-specific storage needed to manage the coroutine state.
As an optimization, the compiler is free to elide the call to operator new when both of the following conditions are met:
- it can determine that the lifetime of the coroutine frame is strictly nested within the lifetime of the caller; and
- it can see the required size of the coroutine frame at the call site.
When these conditions are met, the compiler can allocate storage for the coroutine frame in the caller's activation frame (either in its stack frame or in its coroutine frame).
The technical specification does not yet specify any situations in which the allocation is guaranteed to be elided, so you still need to write code as if allocating the coroutine frame may fail with std::bad_alloc. This also means that you usually should not declare a coroutine function noexcept unless you are happy for std::terminate() to be called when the coroutine fails to allocate memory for its frame. [Annotation: If a coroutine function is declared noexcept and allocating the coroutine frame fails, the exception comes from compiler-generated code that is outside the control of our own code, so the noexcept guarantee is violated and the C++ runtime calls std::terminate().]
There is, however, a fallback that can be used in place of exceptions to handle a failure to allocate the coroutine frame. This can be necessary in environments where exceptions are not allowed, such as embedded environments, or in high-performance environments where the overhead of exceptions cannot be tolerated.
If the promise type provides a static P::get_return_object_on_allocation_failure() member function, the compiler generates a call to the operator new(size_t, nothrow_t) overload instead of the default operator new. If that call returns nullptr, the coroutine immediately calls P::get_return_object_on_allocation_failure() and returns the result to the caller of the coroutine instead of throwing an exception.
Customize the memory allocation of the coroutine frame
Your promise type can define an overload of operator new(). When the compiler needs to allocate memory for a coroutine frame that uses your promise type, it calls your overload instead of the global operator new.
For example:
"But, can't a custom memory allocator work?" I heard you ask.
You can also provide an overload of P::operator new() that takes additional arguments. If a matching overload is found, it is called with lvalue references to the coroutine function's parameters. This makes it possible to pass an allocator to the coroutine function as an argument and have operator new call the allocator's allocate() method.
However, you need to do some extra work to store a copy of the allocator inside the allocated memory so that the corresponding operator delete can refer to it, because the parameters of the coroutine function are not passed to operator delete. The parameters are stored in the coroutine frame, and they have already been destroyed by the time operator delete is called. [Annotation: The copies of the function parameters are destroyed before the coroutine frame is released (deleted).]
For example, you can implement operator new so that it allocates extra space after the coroutine frame and uses that space to hold a copy of the allocator, which is later used to free the coroutine frame's memory.
An example is as follows:
To make the custom my_promise_type apply to coroutines whose first parameter is std::allocator_arg, you need to specialize the coroutine_traits class (see the section on coroutine_traits below for details).
It looks like this:
Note that even if you customize the memory allocation strategy for the coroutine, the compiler is still allowed to elide the call to your memory allocator.
Copy parameters to the coroutine frame
The coroutine needs to copy every parameter passed by the original caller to the coroutine function to the coroutine frame so that they remain valid after the coroutine is suspended.
If a parameter is passed to the coroutine by value, it is copied to the coroutine frame by calling the move constructor of its type.
If a parameter is passed to the coroutine by reference (either lvalue or rvalue reference), only the reference is copied to the coroutine frame, not the value it refers to.
Note that for types with a trivial destructor, the compiler is free to elide the copy of a parameter if the parameter is never referenced after a reachable point at which the coroutine returns to its caller or resumer. [Annotation: "Trivial destructor" is a C++ term for a destructor that performs no operations. For convenience, we borrowed the mathematical sense of "trivial" when translating it.]
There are many pitfalls when passing parameters to the coroutine by reference, because you may not be able to expect these references to remain valid during the life cycle of the coroutine. Techniques commonly used in ordinary functions, such as perfect-forwarding and universal-references, may cause undefined behavior in the code if they are used in coroutines. If you want to know more, please refer to Toby Allsopp's great article about this [Annotation: https://toby-allsopp.github.io/2017/04/22/coroutines-reference-params.html ].
If any parameter copy or move constructor throws an exception, then all the constructed parameters will be destroyed, the coroutine frame will be released, and the exception will be propagated to the caller.
Constructing the promise object
Once all the parameters are copied into the coroutine frame, the coroutine will then construct the promise object.
The reason for copying parameters before the construction of the promise object is to allow the promise object to access these copied parameters in its constructor.
First, the compiler checks whether the promise type has a constructor overload that can accept lvalue references to the copied parameters. If it finds one, it generates a call to that overload. If it does not, it falls back to generating a call to the promise type's default constructor.
Note that the ability of the promise constructor to "peek" at the parameters was a relatively recent change to the coroutines technical specification. It was adopted into the N4723 working draft at the Jacksonville 2018 meeting; see P0914R1 for the proposal. Older versions of Clang or MSVC may therefore not support it.
If the promise constructor throws an exception, before the exception is propagated to the caller, during stack unwinding, the parameter copy will be destroyed and the coroutine frame will be released.
Get the return object
The first thing a coroutine does with the promise object is obtain the return-object by calling promise.get_return_object(). [Annotation: for example, a task object.]
The return-object is the value that is returned to the caller of the coroutine function when the coroutine first suspends, or after it runs to completion, and execution returns to the caller.
You can think of the control flow as roughly the following: the coroutine function allocates the coroutine frame, calls promise.get_return_object() and keeps the result in a local variable, starts the coroutine body by resuming it, and, once that call returns (at the first suspend point or when the coroutine runs to completion), returns the saved object to the caller.
Note that the return object has to be obtained before the coroutine body starts, because the coroutine frame (and with it the promise object) may be destroyed before the call to coroutine_handle::resume() returns, either on this thread or possibly on another thread. It is therefore unsafe to call promise.get_return_object() after the coroutine body has started executing.
The initial-suspend point
Once the coroutine frame is initialized and the return object has been obtained, the next thing the coroutine has to do is to execute the statement co_await promise.initial_suspend();.
This statement allows developers of the promise type to control whether the coroutine should be suspended before the function body is executed, or the function body should be executed immediately.
If the coroutine suspends at the initial suspend point, it can later be resumed or destroyed at a time of your choosing by calling resume() or destroy() on its coroutine_handle.
The result of the co_await promise.initial_suspend() expression is discarded, so implementations should generally return void from the await_resume() method of the awaiter. [Annotation: Because the result of promise.initial_suspend() can be co_awaited, it is an awaiter object. By the definition of the Awaiter interface, that object has an await_resume() method.]
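It is important to note that this statement sits outside the try/catch block that guards the rest of the coroutine (scroll back up to the definition of the coroutine body if you have forgotten what it looks like). This means that any exception thrown while evaluating co_await promise.initial_suspend(), before the coroutine first returns to its caller or resumer, is thrown back to the caller of the coroutine after the coroutine frame and the return object have been destroyed.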
If your return object has RAII semantics, and the coroutine frame is destroyed during its destruction, you need to be extra careful. You must ensure that co_await promise.initial_suspend() is noexcept to avoid repeated release of the coroutine frame.
Please note that there are proposals to adjust the semantics so that all or part of the co_await promise.initial_suspend() expression can be included in the try/catch block of the coroutine body. Therefore, the exact semantics may change before the coroutine is officially completed.
For many types of coroutine, the initial_suspend() method returns either std::experimental::suspend_always (if the operation is lazily started) or std::experimental::suspend_never (if the operation is eagerly started). Both are noexcept awaitables [Annotation: the await_ready, await_suspend and await_resume methods of both types are declared noexcept], so this is usually not a problem.
Returning to the caller
When the coroutine function reaches its first point of return to the caller or resumer (or, if it reaches no such point, when the coroutine runs to completion), the return-object obtained from the get_return_object() call is returned to the caller of the coroutine.
Note that the type of the return-object does not need to be the same as the return type of the coroutine function. If necessary, the return-object is implicitly converted to the return type of the coroutine function.
Note that Clang's coroutine implementation (as of 5.0) defers this conversion until the return-object is returned from the coroutine call, whereas MSVC's implementation, as of 2017 Update 3, performs the conversion immediately after calling get_return_object(). Although the technical specification is not explicit about the intended behavior, I believe MSVC plans to change its implementation to behave more like Clang's, because that enables some interesting use cases. [Annotation: please refer to https://github.com/toby-allsopp/coroutine_monad.]
Returning from the coroutine using co_return
When the coroutine reaches a co_return statement, the statement is translated into a call to either promise.return_void() or promise.return_value(expr), followed by goto FinalSuspend;.
The translation rules for co_return are as follows:
- co_return; is translated to promise.return_void();
- co_return expr; is translated to expr; promise.return_void(); if expr has type void, and to promise.return_value(expr); if expr does not have type void.
The subsequent goto FinalSuspend; causes all local variables with automatic storage duration to be destroyed in the reverse order of their construction, after which co_await promise.final_suspend(); is evaluated.
Please note that if the coroutine does not contain any co_return statement, if the execution leaves the end of the coroutine, it is equivalent to including a co_return; at the end of the function body. In this case, if the promise type does not provide a return_void() method, then the behavior is undefined.
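If evaluating the co_return operand, or the call to promise.return_void() or promise.return_value(), throws an exception, that exception still propagates to promise.unhandled_exception() (see below).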
Handling exceptions that propagate out of the coroutine body
If an exception propagates out of the function body, it is caught and the promise.unhandled_exception() method is called from inside the catch block.
Implementations of this method typically call std::current_exception() to capture a copy of the exception and store it so that it can later be rethrown in a different context.
Alternatively, the implementation can rethrow the exception immediately by executing a throw; statement. For an example, see folly::Optional [Annotation: Please refer to https://github.com/facebook/folly/blob/4af3040b4c2192818a413bad35f7a6cc5846ed0b/folly/Optional.h#L587]. However, doing so will (or possibly will, see below) cause the coroutine frame to be destroyed immediately and the exception to propagate out to the caller or resumer. This can cause problems for abstractions that assume or require the call to coroutine_handle::resume() to be noexcept, so you should generally use this approach only when you have full control over who or what calls resume().
Note that the current wording of the technical specification is not entirely clear about the intended behavior if the call to unhandled_exception() rethrows the exception, or if any of the logic outside the try block throws [Annotation: see https://github.com/GorNishanov/coroutines-ts/issues/17].
My current interpretation of the wording is that if control exits the coroutine body, either because an exception propagates out of co_await promise.initial_suspend(), promise.unhandled_exception() or co_await promise.final_suspend(), or because the coroutine runs to completion with co_await promise.final_suspend() completing synchronously, then the coroutine frame is destroyed automatically before execution returns to the caller or resumer. However, this interpretation has its own issues.
A future version of the coroutine specification will hopefully clarify the situation. Until then, I would avoid throwing exceptions out of initial_suspend(), final_suspend() and unhandled_exception(). Stay tuned!
The final-suspend point
Once execution exits the user-defined part of the coroutine body, the result has been captured through a call to return_void(), return_value() or unhandled_exception(), and all local variables have been destroyed, the coroutine has an opportunity to execute some additional logic before execution returns to the caller or resumer.
At this time, the coroutine will execute the co_await promise.final_suspend(); statement.
This allows the coroutine to execute some logic, such as publishing a result, signaling completion or resuming a continuation [Annotation: see the explanation of continuations below]. It also allows the coroutine to optionally suspend immediately before it runs to completion and the coroutine frame is destroyed.
Note that it is undefined behavior to resume() a coroutine that is suspended at the final_suspend point. The only thing you can do with a coroutine suspended here is destroy() it.
According to Gor Nishanov, the rationale for this restriction is that it reduces the number of suspend states the coroutine needs to represent and potentially the number of branches required, which gives the compiler several optimization opportunities.
Note that while a coroutine is allowed not to suspend at the final_suspend point, it is recommended that you structure your coroutines so that they do suspend at final_suspend where possible. This forces you to call .destroy() on the coroutine from outside the coroutine (typically from the destructor of some RAII object), which makes it much easier for the compiler to determine when the lifetime of the coroutine frame is nested within the lifetime of the caller. That in turn makes it much more likely that the compiler can elide the memory allocation for the coroutine frame.
How does the compiler choose the promise type
Now, let us look at how the compiler determines which type of promise object to use for a given coroutine.
The compiler uses the std::experimental::coroutine_traits class to determine the promise type of a coroutine from its signature.
If you have a coroutine function with the following signature:
Then, the compiler will pass the return type and parameter type list as template parameters to coroutine_traits to infer the promise type of the coroutine.
If the coroutine function is a non-static member function of a class, the class type is passed to coroutine_traits as the second template parameter. Note that if the method is overloaded for rvalue references (that is, it is &&-qualified), the second template parameter is an rvalue reference.
For example, if you have the following two methods:
then the compiler will use the following promise types respectively:
The default definition of the coroutine_traits template determines the promise type by looking for a nested promise_type type alias (typedef) on the return type. In other words, it is similar to a definition whose promise_type member is an alias for RET::promise_type, with some additional SFINAE magic so that coroutine_traits::promise_type is not defined when RET::promise_type is not found. [Annotation: SFINAE stands for Substitution Failure Is Not An Error.]
So, for coroutine return types that you control, you only need to define a nested promise_type in your class, and the compiler will use it as the type of the promise object for coroutines that return your class. [Annotation: promise.get_return_object() then returns an instance of your class, for example task<>.]
For example:
For coroutine return types that you do not control, you can specialize coroutine_traits to specify the promise type to use, without having to modify the return type.
For example, you could specialize coroutine_traits to specify the promise type for coroutines that return std::optional.
Identifying a specific coroutine activation frame
When you call a coroutine function, a coroutine frame is created. To resume the associated coroutine or destroy the coroutine frame, you need some way to identify or refer to that particular coroutine frame.
The mechanism the technical specification provides for this is the coroutine_handle type.
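The simplified interface of this type is as follows: the type-erased coroutine_handle<void> specialization provides explicit operator bool(), done(), resume(), destroy(), address() and the static from_address(); coroutine_handle<Promise> is convertible to it and adds promise() and the static from_promise().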
There are two ways to obtain the coroutine handle:
- It is passed to the await_suspend() method during the evaluation of a co_await expression.
- If you have a reference to the coroutine's promise object, you can reconstruct the handle of the coroutine with coroutine_handle<Promise>::from_promise().
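When the coroutine suspends at the suspend point of a co_await expression, the coroutine_handle of the awaiting coroutine is passed to the await_suspend() method of the awaiter. You can think of this handle as representing the continuation of the coroutine.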
Note that coroutine_handle is not an RAII object. You must manually call .destroy() to destroy the coroutine frame and free its resources. Think of it as the equivalent of a void* used to manage memory. This is for performance reasons: making it an RAII object would add overhead to coroutines, such as the need for reference counting.
You should generally try to use higher-level types that provide RAII semantics for coroutines, such as those provided by the cppcoro library (shameless plug) [Annotation: the author of this article is also the author of cppcoro, so he is jokingly taking the opportunity to promote it], or write your own higher-level types that encapsulate the lifetime of the coroutine frame for your coroutine type.
Customize the behavior of co_await
The promise type can optionally customize the behavior of every co_await expression that appears in the body of the coroutine.
If the promise type defines a method named await_transform(), the compiler converts every co_await expr in the coroutine body into co_await promise.await_transform(expr).
This has a number of important and powerful uses:
It lets you enable awaiting types that would not normally be awaitable.
For example, the promise type of a coroutine whose return type is std::optional could provide an await_transform() overload that makes std::optional values awaitable.
It lets you disallow awaiting on certain types by declaring await_transform() overloads as deleted.
For example, the promise type of a coroutine whose return type is std::generator might declare a deleted await_transform() template member function so that co_await cannot be used within the coroutine.
It lets you adapt and change the behavior of normally awaitable values.
For example, you could define a type of coroutine that always resumes on a particular executor by wrapping the awaitable in a resume_on() operator (see cppcoro::resume_on()), so that the coroutine resumes on its associated executor after every co_await expression.
As a final word on await_transform(), it is important to note that if the promise type defines any await_transform() members, the compiler transforms every co_await expression into a call to promise.await_transform(). This means that if you want to customize the behavior of co_await for only some types, you also need to provide a fallback overload of await_transform() that just forwards its argument.
Customize the behavior of co_yield
The last thing you can customize through the promise type is the behavior of the co_yield keyword.
If the co_yield keyword appears in a coroutine, the compiler converts the expression co_yield expr into co_await promise.yield_value(expr). The promise type can therefore customize the behavior of co_yield by defining one or more yield_value() methods.
Note that, unlike await_transform(), co_yield has no default behavior if the promise type does not define a yield_value() method. So while a promise type has to explicitly opt out of supporting co_await by declaring a deleted await_transform(), it has to explicitly opt in to supporting co_yield by defining yield_value().
A typical example of a promise type with a yield_value() method is that of a generator type.
Summary
In this article I have covered the individual conversions that the compiler applies to a function when compiling it as a coroutine.
I hope this article can help you understand how to customize the behavior of different types of coroutines by defining different types of your own promises. The coroutine mechanism provides many flexible components through which you can customize the behavior of the coroutine in many different ways.
There is, however, one more important conversion that the compiler performs that I have not yet covered: the conversion of the coroutine body into a state machine. Since this article is already long enough, I will defer that explanation to the next article. Stay tuned!