Print args on separate lines more explicitly #345

jlongster · 2016-11-29T02:24:23Z

I have a few tweaks incoming to the generic pretty printer. I hope these won't be too controversial because the focus of recast's printer seems to mostly be to keep the original formatting in tact. These changes shouldn't affect the non-generic printer, and I'm interested in just the generic printer.

This is a change to the generic pretty printer that I think improve how arguments are printed.

The way it worked before was if the width of the line surpasses the options.wrapColumn threshold, it would print the args on separate lines. For example:

foo(really_long_expression() * 100 / 5, another_long_call() * 12345);

The pretty printer doesn't do anything here because it doesn't surpass 72 chars (the default width threshold. However, if we add another argument:

foo(really_long_expression() * 100 / 5, another_long_call() * 12345, another_one(10000));

Now the pretty-printer does this:

foo(
  really_long_expression() * 100 / 5,
  another_long_call() * 12345,
  another_one(10000)
);

This is really nice. However, the problem I've seen is when arguments already break themselves across lines, the whole expression never gets wide enough for the arguments themselves to be clearly separated. For example, take the following code:

arr.something(y => { return 4 }, x => { return x + 1 });

The previous behavior of the printer was to show this:

arr.something(y => {
  return 4;
}, x => {
  return x + 1;
});

This happens because the function bodies break themselves across multiple lines, which is fine, but it makes the function call very hard to read. With this PR, the printer now shows this:

arr.something(
  y => {
    return 4;
  },
  x => {
    return x + 1;
  }
);

The reason is because arguments are forced on there own line if any of the arguments except the last one span multiple lines. Why the exception of the last one? We want code this like to work:

arr.something(10, x => { return x + 1 });

In this case the printer won't force all args to be on separate lines, and it's OK that the last one spans multiple lines:

arr.something(10, x => {
  return x + 1;
});

jlongster · 2016-12-01T16:50:14Z

@benjamn Any chance you could check out these PRs? This is a friendly ping and there's no rush at all (I know how managing OSS projects go).

I also noticed that the printer doesn't accurately measure when to break things on new lines. This is because in the places where it checks to break, it only measures the child expression's length, but doesn't take into account the current column position of the expression. So it will only break a function call's arguments across lines if the call itself is greater than 72 chars, but not if it's any less, even if it's indented 30 spaces and should be broken across lines.

I've been reading this paper and I plan on refactoring how the printer handled breaking things across lines. I don't think it should be a huge refactor, but the basic idea is everywhere we inject \n we should instead use a newline() combinator and after the printing is done we do a second pass to actually generate the string and breaking things across lines appropriately.

Do you want to talk about this more, have any hesitations, etc?

benjamn · 2016-12-01T17:00:04Z

One hesitation: when the printer prints an AST node, the resulting Lines object is always indented flush with the left margin (i.e. the first line not indented at all), and then later the Lines object may get reindented*, according to where that node ends up in the output. So I'm not sure you'll have enough information while printing each AST node to make line breaking decisions that take the final indentation into account.

Do you really think a deeply indented node should have to fit itself into fewer columns? I know people like to keep their line lengths below ~80, but that seems like a rule of thumb justified more by ease of enforcement than by sensibility.

Another hesitation: making multiple passes sounds like it might have performance implications.

–––––
* Lines objects support efficient reindentation, since the indentation for each line is stored as a number internally (and can even be negative!).

jlongster · 2016-12-01T18:17:00Z

So I'm not sure you'll have enough information while printing each AST node to make line breaking decisions that take the final indentation into account.

Yep, that's what I struggled with, and why we need combinators and a second pass to make those decisions.

Do you really think a deeply indented node should have to fit itself into fewer columns?

I'm looking for a printer that always fits code within a certain width if possible (there are times, of course, where you can't). It's not just indented code that's the problem; the way it currently measures code doesn't accurately reflect if it crosses the width threshold in many circumstances.

For example, here is a deeply nested function call. The comment marks the column boundaries.

console.log(foosdfsdf(sdfdsfsdfd(sldkfjsdlkfsdf(sdfkldsjflkdjx(sdklfjsdlkfj(sldkfsdlkfj(sdfsdf(sdfsf(sdfsdf(skldfsldkjf(lsdkfjlskdfjl())))))))))));
//       |10       |20       |30       |40       |50       |60       |70       |80       |90       |100

Here's the output if I pretty-print with a width of 80:

console.log(foosdfsdf(sdfdsfsdfd(sldkfjsdlkfsdf(
  sdfkldsjflkdjx(sdklfjsdlkfj(sldkfsdlkfj(sdfsdf(sdfsf(sdfsdf(skldfsldkjf(lsdkfjlskdfjl())))))))
))));
//       |10       |20       |30       |40       |50       |60       |70       |80       |90       |100

Clearly it could have nested more of the function calls easily to get it within 80. The reason it didn't is because it doesn't take the function's name into account; the arguments to the call on line 2 is within 80 so it passes. Now, for CallExpression this could be fixed by also including the function name's length into the equation, but there are other scenarios that are more difficult.

For example:

let x = { result: foosdfsdf(sdfdsfsdfd()) };
//       |10       |20       |30       |40       |50

That is the result of pretty-printing the code with a width of 35. It doesn't do anything. This is because it doesn't take into account it's actual column position. If you run this on real code you see a lot of cases where it doesn't properly fit into the width threshold.

You're goal may be different than mine. I'm actually experimenting with using this printer as a formatter to format all of my code. I want to use this to enforce consistent style.

I will implement an experimental change and see what you think. If you don't want to move forward with it, I can just maintain my own printer and use it instead of using recast.prettyPrint. :)

Another hesitation: making multiple passes sounds like it might have performance implications.

The second pass should be very light-weight: it's just a list of lines and linebreaks and it iterates through them and generates strings. On average it should only be a few thousand items long, I believe. I don't think the overhead should be much, and there are places it would simplify and possible remove overhead because you don't need to create temporary strings to see if should break or not (all of the breaking logic is done in the second pass).

benjamn

The single test failure makes sense given your changes, so please just update that test.

I was skeptical of this change at first, but it's growing on me. In particular, I think this style should make it easier to print leading comments in the right places for multiline arguments, since each argument starts on a new line, and the comment can go on the previous line, rather than having to get inserted somewhere between arguments.

jlongster · 2016-12-02T04:38:23Z

@benjamn After working through this some is does seem like it will be a slightly significant rewrite. I think I can maintain most of the code in printer.js but the algorithm will be fundamentally different, so it's probably best if I just fork it maintain it as my own printer. As I develop this further I'm happy to talk more with you about it and see if there's any interest merging back.

jlongster · 2016-12-02T04:40:03Z

I was skeptical of this change at first, but it's growing on me.

Oops, didn't see this reply before my above comment. Also, forgot that this specific PR was just about a small style change!

Sounds good. Since you'll accept it, I'll fix up the test.

keyz · 2017-01-13T17:08:53Z

Would love to see this and #347 merged! The way prettier handles parameter destructuring is highly desirable for codemods. :)

Print args on separate lines more explicitly

336c94a

jlongster force-pushed the print-args branch from 8c83d6c to 336c94a Compare November 29, 2016 02:48

benjamn requested changes Dec 1, 2016

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Print args on separate lines more explicitly #345

Print args on separate lines more explicitly #345

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Print args on separate lines more explicitly #345

Are you sure you want to change the base?

Print args on separate lines more explicitly #345

Uh oh!

Conversation

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!