Perl_newSLICEOP: Optimise '(caller)[0]' into 'scalar caller' #23369

Status: Open (1 commit, to be merged into base branch blead)

Conversation

richardleach
Contributor
@richardleach richardleach commented Jun 12, 2025

A subroutine can obtain just the package of its caller in a couple of ways. Both seem somewhat common.

  • caller - in scalar context, as in my $x = caller;
  • (caller)[0], as in my $x = (caller)[0];

In the first, caller finds the package name, sticks it in a new SV, and puts that (or undef) on the stack:

<0> caller[t2] s

In the second, caller (a) finds the package name, filename, and line; (b) creates three new SVs to hold them; and (c) puts those SVs on the stack. Then (d) lslice does a list slice to leave just the package SV on the stack.

7        <2> lslice sK/2 ->8
-           <1> ex-list lK ->5
3              <0> pushmark s ->4
4              <$> const[IV 0] s ->5
-           <1> ex-list lK ->7
5              <0> pushmark s ->6
6              <0> caller[t2] l ->7

This commit checks for the second case inside Perl_newSLICEOP and, instead of constructing an lslice OP, returns just the caller OP with scalar context applied.


  • This set of changes does not require a perldelta entry.

richardleach added the defer-next-dev label ("This PR should not be merged yet, but await the next development cycle") on Jun 12, 2025
@bulk88
Contributor
bulk88 commented Jun 12, 2025

https://grep.metacpan.org/search?size=20&_bb=86025425&q=%5C%28caller%5C%29%5C%5B&qft=*.pm%2C+*.t&qd=&qifl=

Should've been done 30 years ago. Most PP devs think (caller())[\w] is constant-folded, as if Perl were identical to C++. The truth is that list-context PP caller() is like writing a 100MB core dump to an SSD every time you execute it. caller() should never have become a Perl best practice or been cargo-culted. There is nothing wrong with caller's public PP API IMO, but because of its horrible internal runtime implementation, it should've been sent to the landfill by use strict; on day 1 of use strict;, just like this Perl 5 code was sent to the landfill by use strict;

C:\Users\Owner>perl -e" push( @a, THISS); push( @a, ISS); push( @a, PERL); $, =' '; print @a;"
THISS ISS PERL
C:\Users\Owner>

Rich, would you pretty please be able to tackle adding OP tree compile time next gen G_VOID propagation to the other 12-15 list context retval indexes/retval SV*s created by pp_caller?

richardleach added the do not merge label ("Don't merge this PR, at least for now") on Jun 13, 2025
@richardleach
Contributor Author

Hmmm, if I can co-opt op_private then it's likely possible to cover individual elements 1,2,3 and in-order slices like [1,2], which seem to make up the majority of usage on CPAN. I'll have a go.

@bulk88
Contributor
bulk88 commented Jun 13, 2025

> Hmmm, if I can co-opt op_private then it's likely possible to cover individual elements 1,2,3 and in-order slices like [1,2], which seem to make up the majority of usage on CPAN. I'll have a go.

pp_caller() has 3 return prototypes:

  • 1 SV*
  • 3 SV*s
  • 10 SV*s

All that is needed is room to put a U16 variable in OP_CALLER's OP struct, or 10 unused bits somewhere. 10 bits will allow a fast and easy pattern of

if(op->opaque & 0x4) {
    PUSH(sumsv);
}
if(op->opaque & 0x8) {
    PUSH(sumsv);
}
if(op->opaque & 0x10) {
    PUSH(sumsv);
}

inside pp_caller.

My thought 5 years ago was to add a 2nd integer argument, allowing the end user to pick 1 of 10 elements to return, but I didn't like my proposal, since it would only help new code, and heavily policed and evangelized new code at that.
And "heavily policed and evangelized new code" really means @bulk88 is making PRs to various CPAN modules and demanding those authors make a new CPAN release tar.gz for @bulk88's brand new perl 5 grammar enhancement authored by @bulk88.

My wishlist quickly changed: the existing perl code deployed in production needs to be left untouched, and the correct fix would come from the P5P side, from the 'yylex'/'ck_op_()' side, by somehow analyzing the scalar- or list-context = operator and its target lvalue, and eventually seeing whether the result was assigned to a $/@ lvalue or to an anonymous array, array ref, or whatever this is: my $x = (caller)[0];.

7        <2> lslice sK/2 ->8
-           <1> ex-list lK ->5
3              <0> pushmark s ->4
4              <$> const[IV 0] s ->5
-           <1> ex-list lK ->7

I don't know how the ex-list and lslice pp_foo() funcs work off the top of my head; I very rarely or never see them flash by while holding F11 or breakpointing each cycle of runops_std(). But back to my point: the wishlist is to extract that array-dereference const literal integer from the [] operator's OP and stick it into the caller() operator's OP.

@richardleach
Contributor Author

> -10 SV*s

I didn't do an exhaustive grep of CPAN on the train this morning, but it did seem like handling only the first 4 SVs would cover the vast majority of actually-encountered slice cases. Hence abuse of op_private seemed like it might be enough.

@bulk88
Contributor
bulk88 commented Jun 16, 2025

> -10 SV*s

> I didn't do an exhaustive grep of CPAN on the train this morning, but it did seem like handling only the first 4 SVs would cover the vast majority of actually-encountered slice cases. Hence abuse of op_private seemed like it might be enough.

Values 0-9, aka 10 SVs, fit in 4 bits. Picking 1, 2, 3 or more non-linear SVs out of 10 requires 10 bits of space.

IDK how to read all the lines below, but my eyes see worst case ever 3 bits free, at minimum case a U8 available. Getting creative: steal some bits from U8 op_flags; after that, add another type code inside PERL_BITFIELD16 op_type:9;; after that, steal PADOFFSET op_targ;. I'm not sure a caller in list context can have a TARG, since isn't TARG only for caller() in scalar context, where TARG is the lvalue on the left side of caller()? I'm guessing, without checking the source code to verify, that caller() always pulls its one and only incoming arg, an IV/U32 of how many PP frames to go backwards, from the PL stack and not from the caller() OP's TARG. Correct me if I am wrong.

caller		caller			ck_fun		t%	S?
# baseop/unop - %
#define BASEOP				\
    OP*		op_next;		\
    OP*		op_sibparent;		\
    OP*		(*op_ppaddr)(pTHX);	\
    PADOFFSET	op_targ;		\
    PERL_BITFIELD16 op_type:9;		\
    PERL_BITFIELD16 op_opt:1;		\
    PERL_BITFIELD16 op_slabbed:1;	\
    PERL_BITFIELD16 op_savefree:1;	\
    PERL_BITFIELD16 op_static:1;	\
    PERL_BITFIELD16 op_folded:1;	\
    PERL_BITFIELD16 op_moresib:1;       \
    PERL_BITFIELD16 op_spare:1;		\
    U8		op_flags;		\
    U8		op_private;
#endif
    /* CALLER     */ (OPpARG4_MASK|OPpOFFBYONE),
#define OPpARG4_MASK            0x0f
#define OPpOFFBYONE             0x80

There is also the sneaky solution of pp_caller() at runloop time doing a sneaky deref into a const folded/disabled but not defragmented OP*, and learning the const literal integer from a different OP* struct than its own OP* struct. Certain other pp_*() funcs do this design pattern already.

@richardleach
Contributor Author

> values 0-9 aka 10 SVs fit in 4 bits. Picking 1, 2 , 3 or more non linear SVs out of 10, requires 10 bits of space.

Yeah, but (caller)[1,2] seems to crop up quite a lot, whereas I couldn't spot a (caller)[8] for example. IIRC, these were the cases I mostly saw from a CPAN grep:

  • (caller)[0]
  • (caller)[1]
  • (caller \d?)[2]
  • (caller \d)[3]
  • (caller \d?)[1,2]
  • (caller)[0,1]
  • (caller)[0,2]

We could have a new unop_aux OP that supports arbitrary element emitting in arbitrary orders, but that feels excessive.
