The question is mainly about mawk, since it uses a bytecode interpreter, and the question mentions its command-line option for showing the bytecodes in an assembly-like output. While a recent change to gawk (2010) added an analogous feature, the mnemonics (and command-line options) differ. With gawk, you can use its debugger to get a (different) listing.
No, there is no written reference aside from the source code (which generally doesn't have comments explaining the mnemonics for the bytecode operations).
For just a list of the operations, you could read code.h, while da.c (the disassembler) is helpful for seeing how the listing is produced. The action itself takes place in execute.c, but there's no tutorial.
Some recent changes have been made to make the output more understandable. For example, here's mawk 1.3.3 on the ct_length.awk file:
MAIN
000 omain
001 jmp 084
003 pushs "%s"
005 pushi $0
007 pushd 1
009 pushi i
011 pushd 5
013 add
014 pushint 3
016 substr
018 pushint 2
020 printf
022 f_pusha $0
024 pushi $0
026 pushi i
028 pushd 6
030 add
031 pushint 2
033 substr
035 f_assign
036 pop
037 pushi $0
039 pushc 0xfc2810 /^[ \t]*\(/
041 match
043 jz 078
045 pushs "%s"
047 pushi $0
049 pushd 1
051 pushi RLENGTH
053 pushint 3
055 substr
057 pushint 2
059 printf
061 f_pusha $0
063 pushi $0
065 pushi RLENGTH
067 pushd 1
069 add
070 pushint 2
072 substr
074 f_assign
075 pop
076 jmp 084
078 pushs "($0)"
080 pushint 1
082 printf
084 pusha i
086 pushi $0
088 pushs "length"
090 index
092 assign
093 jnz 003
095 pushint 0
097 print
099 ol_gl
while the 20161120 snapshot shows the finite state machine for the regular expression (near the end):
MAIN
000 . omain
001 . jmp 084
003 . pushs "%s"
005 . pushi $0
007 . pushd 1
009 . pushi i
011 . pushd 5
013 . add
014 . pushint 3
016 . substr
018 . pushint 2
020 . printf
022 . f_pusha $0
024 . pushi $0
026 . pushi i
028 . pushd 6
030 . add
031 . pushint 2
033 . substr
035 . f_assign
036 . pop
037 . pushi $0
039 . pushc 0xbb7998 /^[ \t]*\(/
041 . match
043 . jz 078
045 . pushs "%s"
047 . pushi $0
049 . pushd 1
051 . pushi RLENGTH
053 . pushint 3
055 . substr
057 . pushint 2
059 . printf
061 . f_pusha $0
063 . pushi $0
065 . pushi RLENGTH
067 . pushd 1
069 . add
070 . pushint 2
072 . substr
074 . f_assign
075 . pop
076 . jmp 084
078 . pushs "($0)"
080 . pushint 1
082 . printf
084 . pusha i
086 . pushi $0
088 . pushs "length"
090 . index
092 . assign
093 . jnz 003
095 . pushint 0
097 . print
099 . ol_gl
# regex 0xbb7998
000 . M_START
001 . M_2JA 4
002 . M_SAVE_POS
003 . M_CLASS [\011 ]
004 . M_2JC -2
005 . M_STR "("
006 . M_ACCEPT
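The listings are easier to follow with the awk source next to them. The script below is only a reconstruction inferred from the opcodes above (the loop on index, the two substr calls, the match against the regular expression), so treat it as an assumption rather than the ct_length.awk that was actually dumped. The temporary directory and the sample input are likewise just for illustration. The comments map each statement to the mnemonics it appears to produce, and -W dump is the mawk option that prints the listing.

```shell
# Hypothetical reconstruction of ct_length.awk, inferred from the dump above.
workdir=$(mktemp -d)
cat > "$workdir/ct_length.awk" <<'EOF'
{
    while (i = index($0, "length")) {       # pusha i; index; assign; jnz 003
        printf "%s", substr($0, 1, i + 5)   # pushd 1; pushi i; pushd 5; add; substr
        $0 = substr($0, i + 6)              # f_pusha $0 ... substr; f_assign; pop
        if (match($0, /^[ \t]*\(/)) {       # pushc /regex/; match; jz 078
            printf "%s", substr($0, 1, RLENGTH)
            $0 = substr($0, RLENGTH + 1)
        } else
            printf "($0)"                   # pushs "($0)"; pushint 1; printf
    }
    print                                   # pushint 0; print
}
EOF

# Run it with whatever awk is available, on a made-up sample line.
result=$(printf 'x = length\n' | awk -f "$workdir/ct_length.awk")
echo "$result"

# If mawk is installed, this prints the bytecode listing for the script.
if command -v mawk >/dev/null 2>&1; then
    mawk -W dump -f "$workdir/ct_length.awk"
fi
```

If the reconstruction is right, the script appends ($0) to any length that is not already followed by a parenthesis, which is why the dump shows one branch for the match and another that prints the literal "($0)".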
With gawk 4.x, you would get the analogous information by starting its debugger, e.g.,
gawk -D -f ct_length.awk
and telling it to dump at the debugger's prompt. Here's the result:
gawk> dump
[ :0x55977de761e0] Op_newfile : [target_jmp = 0x55977de751d0] [target_endfile = 0x55977de751f0]
[target_get_record = 0x55977de75230]
[ :0x55977de75210] Op_no_op :
[ :0x55977de75970] Op_after_beginfile :
[ :0x55977de75230] Op_get_record : [target_newfile = 0x55977de761e0]
# Rule
[ 10:0x55977de77b50] Op_rule : [in_rule = Rule] [source_file = ct_length.awk]
[ 12:0x55977de75310] Op_push_i : 0 [MALLOC|NUMCUR|NUMBER|NUMINT]
[ 12:0x55977de752f0] Op_field_spec :
[ 12:0x55977de75370] Op_push_i : "length" [MALLOC|STRING|STRCUR]
[ 12:0x55977de752d0] Op_builtin : index [arg_count = 2]
[ 12:0x55977de75270] Op_push_lhs : i [do_reference = false]
[ 12:0x55977de75290] Op_assign :
[ :0x55977de75650] Op_lint : [lint_type = LINT_assign_in_cond]
[ :0x55977de75450] Op_jmp_false : [target_jmp = 0x55977de75810]
[ 14:0x55977de75350] Op_push_i : "%s" [MALLOC|STRING|STRCUR]
[ 14:0x55977de75410] Op_push_i : 0 [MALLOC|NUMCUR|NUMBER|NUMINT]
[ 14:0x55977de753f0] Op_field_spec :
[ 14:0x55977de75470] Op_push_i : 1 [MALLOC|NUMCUR|NUMBER|NUMINT]
[ 14:0x55977de754b0] Op_push : i
[ 14:0x55977de754d0] Op_plus_i : 5 [MALLOC|NUMCUR|NUMBER|NUMINT]
[ 14:0x55977de753d0] Op_builtin : substr [arg_count = 3]
[ 14:0x55977de752b0] Op_K_printf : [expr_count = 2] [redir_type = ""]
[ 15:0x55977de75510] Op_push_i : 0 [MALLOC|NUMCUR|NUMBER|NUMINT]
[ 15:0x55977de75530] Op_field_spec :
[ 15:0x55977de75590] Op_push : i
[ 15:0x55977de755b0] Op_plus_i : 6 [MALLOC|NUMCUR|NUMBER|NUMINT]
[ 15:0x55977de75490] Op_builtin : substr [arg_count = 2]
[ 15:0x55977de75430] Op_push_i : 0 [MALLOC|NUMCUR|NUMBER|NUMINT]
[ 15:0x55977de753b0] Op_store_field :
[ 17:0x55977de75570] Op_push_i : 0 [MALLOC|NUMCUR|NUMBER|NUMINT]
[ 17:0x55977de754f0] Op_field_spec :
[ 17:0x55977de755f0] Op_push_re : /^[ \t]*\(/
------[Enter] to continue or q [Enter] to quit------
[ 17:0x55977de75550] Op_builtin : match [arg_count = 2]
[ :0x55977de758b0] Op_jmp_false : [target_jmp = 0x55977de75790]
[ 20:0x55977de75630] Op_push_i : "%s" [MALLOC|STRING|STRCUR]
[ 20:0x55977de756d0] Op_push_i : 0 [MALLOC|NUMCUR|NUMBER|NUMINT]
[ 20:0x55977de756b0] Op_field_spec :
[ 20:0x55977de75730] Op_push_i : 1 [MALLOC|NUMCUR|NUMBER|NUMINT]
[ 20:0x55977de75770] Op_push : RLENGTH
[ 20:0x55977de75690] Op_builtin : substr [arg_count = 3]
[ 20:0x55977de75610] Op_K_printf : [expr_count = 2] [redir_type = ""]
[ 21:0x55977de757d0] Op_push_i : 0 [MALLOC|NUMCUR|NUMBER|NUMINT]
[ 21:0x55977de757b0] Op_field_spec :
[ 21:0x55977de75830] Op_push : RLENGTH
[ 21:0x55977de75850] Op_plus_i : 1 [MALLOC|NUMCUR|NUMBER|NUMINT]
[ 21:0x55977de75750] Op_builtin : substr [arg_count = 2]
[ 21:0x55977de756f0] Op_push_i : 0 [MALLOC|NUMCUR|NUMBER|NUMINT]
[ 21:0x55977de75670] Op_store_field :
[ :0x55977de757f0] Op_jmp : [target_jmp = 0x55977de75870]
[ 24:0x55977de75790] Op_push_i : "($0)" [MALLOC|STRING|STRCUR]
[ 24:0x55977de75710] Op_K_printf : [expr_count = 1] [redir_type = ""]
[ :0x55977de75870] Op_no_op :
[ :0x55977de75390] Op_jmp : [target_jmp = 0x55977de75310]
[ :0x55977de75810] Op_no_op :
[ 27:0x55977de755d0] Op_K_print_rec : [redir_type = ""]
[ :0x55977de75890] Op_no_op :
[ :0x55977de75950] Op_jmp : [target_jmp = 0x55977de75230]
[ :0x55977de751f0] Op_no_op :
[ :0x55977de75930] Op_after_endfile :
[ :0x55977de751d0] Op_no_op :
[ :0x55977de75250] Op_atexit :
[ :0x55977de758d0] Op_stop :
gawk> q
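The same listing can probably be captured without an interactive session. This is a sketch under assumptions: it relies on the gawk manual's description of -D accepting a file of debugger commands (with no space between -D and the file name), and the file names and the one-line stand-in program are hypothetical. Substitute your own script for prog.awk.

```shell
# Hypothetical file names; the one-line program is only a stand-in.
dbgdir=$(mktemp -d)
printf '{ print length($0) }\n' > "$dbgdir/prog.awk"
printf 'dump\nquit\n' > "$dbgdir/cmds"    # debugger commands, one per line

if command -v gawk >/dev/null 2>&1; then
    # No space is allowed between -D and the command-file name.
    listing=$(gawk -D"$dbgdir/cmds" -f "$dbgdir/prog.awk" </dev/null 2>&1) || true
else
    listing="gawk is not installed"
fi
printf '%s\n' "$listing"
```

Because the output is not going to a terminal here, the listing should arrive without the "[Enter] to continue" pager prompt seen in the interactive session.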
If you were to put the two side-by-side, you'd see pushes and pops (since both use a similar approach to handling the stack), but little similarity beyond that.