Ksh93 does a lot to avoid forks. I have no idea how it knows how to handle the first case, but truss shows that it issues only a single write(2) call with the final result.
It may be that David scans the command in macro.c and knows that he can handle "echo" internally.
What I can say is that I rewrote the parser and the interpreter of the "Bourne Shell" last year. The main results were fewer forks overall and many of the remaining forks replaced by vfork() calls. This currently makes the Bourne Shell the second-fastest shell after ksh93. You may like to run your tests with bosh as well.
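To illustrate why replacing fork() with vfork() helps, here is a minimal sketch of the usual vfork-then-exec pattern. It is not code from bosh; the function name run_cmd is made up for this example. The point is that vfork() does not copy the parent's address space, so the child must do nothing except exec or _exit().

```c
#define _DEFAULT_SOURCE   /* makes glibc declare vfork() in strict modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from any real shell:
 * run argv[0] with arguments argv and return its exit status,
 * or -1 if the process could not be created or did not exit normally.
 */
static int run_cmd(char *const argv[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;              /* could not create a process */

    if (pid == 0) {
        /* Child: shares memory with the parent until exec, so touch nothing. */
        execvp(argv[0], argv);
        _exit(127);             /* exec failed; never return from here */
    }

    /* Parent: resumes once the child has called exec or _exit. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_cmd() with an argv of { "true", NULL } should give 0, and with { "false", NULL } it should give 1, assuming both commands are in PATH.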
BTW: ksh93 avoids forks in general. It implements a structure that holds everything that used to be a global variable. This makes the shell code reentrant, as long as each call is given a pointer to a different instance of that "global" variable structure.
This method is used by ksh93 whenever there is a (cmd) subshell.
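The idea can be sketched in a few lines of C. This is only an illustration of the technique, not ksh93 source: struct shell_state, assign_x, run_subshell and demo_body are all invented names, and a real shell would have to deal with much more state (open files, the current directory, traps and so on).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for "all former global variables". */
struct shell_state {
    char x[64];         /* plays the role of a shell variable $x */
    int  last_status;   /* plays the role of $? */
};

/* Every interpreter function takes the state pointer instead of using globals. */
static void assign_x(struct shell_state *sh, const char *value)
{
    snprintf(sh->x, sizeof sh->x, "%s", value);
}

typedef void (*body_fn)(struct shell_state *);

/*
 * Run a "(cmd)" subshell body without fork(): give it a private copy of
 * the state, then throw the copy away. Only the exit status gets out.
 */
static int run_subshell(const struct shell_state *parent, body_fn body)
{
    struct shell_state child = *parent;   /* private instance for the subshell */

    body(&child);
    return child.last_status;
}

/* Example body, roughly "( x=changed; exit 3 )". */
static void demo_body(struct shell_state *sh)
{
    assign_x(sh, "changed");
    sh->last_status = 3;
}
```

If the parent state has x set to "outer", then run_subshell(&parent, demo_body) returns 3 while parent.x is still "outer", which is the behaviour you expect from a (cmd) subshell, just without a new process.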
The reason for this rewrite is that David is using Win-DOS on his laptop and he did not like the slow Cygwin, so he wrote UWIN and uses ksh93 on Win-DOS directly. As there is no fork() on Win-DOS, he needed to find a new solution...