Prettier のコードを読む Part 2

2024/09/02 14:42:40

続き。

Prettier のコードを読む Part 1 | Memory ice cubes https://leaysgur.github.io/posts/2024/09/02/103846/

今回は、整形処理のメインであるcoreFormat(text, options)の詳細を、いけるところまで。

整形前文字列からAST
ASTからDoc
Docから整形後文字列

という流れだった。

`parse(text, options)`

https://github.com/prettier/prettier/blob/3.3.3/src/main/parse.js

指定されたパーサーの実体をロードして、ASTに変換する。定義されてれば、パース前にparser.preprocess(text, options)してから。

（initParser()って、optionsをこねてるときにもやってなかった・・・？）

options.originalTextをこっそり足してるのは気になるが、それ以外に特別なことはやってなさそう。

`printAstToDoc(ast, options)`

https://github.com/prettier/prettier/blob/3.3.3/src/main/ast-to-doc.js

ここがPrettierの中心っぽい！

prepareToPrint(ast, options)でASTを下処理
- ASTのcommentsやtokensを、またもoptionsにコピー
- attachComments(ast, options)
  - main/comments/attach.jsは後述
- printer.preprocessがあれば、実行したASTを返す
ASTをAstPathというクラスに変換
callPluginPrintFunction(astPath, options, mainPrint)
- これが再帰でDocを作る
ensureAllCommentsPrinted(options)
- 事前にコピーしておいたコメントを、print済のコメントと見比べていく
- attachComments()時に生やしておいたフラグをdeleteしながら・・・
Docを返す

どれも粒ぞろいである・・・。順に見ていく。

`attachComments(ast, options)`

https://github.com/prettier/prettier/blob/3.3.3/src/main/comments/attach.js#L136

attachComments()という重厚な関数がいて、JSDocを彷彿とさせる嫌な予感がする。（蘇る悪夢・・・）

これは、ASTのcommentsそれぞれに対して、decorateComment(node, comment, options, enclosingNode)を実行し、

enclosingNode
precedingNode
followingNode

といった所在をマーキングしていくらしい。これらは必ずすべて生えるわけではなく、undefinedなままになることもある。

https://github.com/prettier/prettier/blob/3.3.3/src/main/comments/attach.js#L59

そのコメントの最寄りのASTノードのことを調べてるって感じかな？

以前に調べたTypeScriptとも、ESLintのJSDocを探すやつとも、また違うロジックに見える。（まぁ・・・そうなるか・・・コメントの用途をパーサーが類推するなが教訓だったわ）

https://github.com/prettier/prettier/blob/3.3.3/src/main/comments/utils.js

最終的に、渡したASTの各ノードに、comments: Comment[]を生やすのが主な目的。もちろん、Comment型はもはやESTreeのそれではなく、拡張していろいろ生やしたやつ。

ここを理解するだけですごい時間かかりそうね。

`new AstPath(ast)`

https://github.com/prettier/prettier/blob/main/src/common/ast-path.js#L1

stack: Ast[]だけを持つクラス。

Docを生成するために使われるデータ構造で、コメントにもそれらしいことが書いてあった。

* All the while, these functions pass a "path" variable around, which
* is a stack-like data structure (AstPath) that maintains the current
* state of the recursion. It is called "path", because it represents
* the path to the current node through the Abstract Syntax Tree.

スタックになってるのは、ASTのノードを降りていきつつ、それぞれ一部分ごとにプリンターに渡すためかな？

役割はわかりやすいけど、その用途がまだうまくイメージできない。

`callPluginPrintFunction(astPath, options, mainPrint)`

https://github.com/prettier/prettier/blob/3.3.3/src/main/ast-to-doc.js#L94

commentsプロパティを足したASTを元に初期化したAstPathを受け取って、ノードごとにprinter.print()をひたすら呼んでいく。

それぞれprinter.print()は、各language-*/index.jsでexportされてたやつ。

JSの場合、そのフロントマンであるESTreeのプラグインがその実体。その裏側では、

Angular
JSX
Flow
TypeScript
Estree

という順に呼んでみて、それぞれ独自のノード名に合致したら処理し、Docが返ってきたらそれを使うようになってた。

というかestreeのロジック、自称ESTree互換を謳う各種パーサー全部に対応できるようになっててすごいことになってた。

引数で渡してるmainPrint()ことmainPrintInternal()が、callPluginPrintFunction()を呼び直していることから、mainPrint()とprinter.print()がキャッチボールしながら再帰でどんどんDocを育てていく感じらしい。

* This is done by descending down the AST recursively. The recursion
* involves two functions that call each other:
*
* 1. mainPrint(), which is defined as an inner function here.
*    It basically takes care of node caching.
* 2. callPluginPrintFunction(), which checks for some options, and
*    ultimately calls the print() function provided by the plugin.
*
* The plugin function will call mainPrint() again for child nodes
* of the current node. mainPrint() will do its housekeeping, then call
* the plugin function again, and so on.

mainPrintInternal()では、Docのキャッシュがあったらそれを返し、なければcallPluginPrintFunction()を呼ぶ。

つまりDocは、「コードを最終的にこういう風に整形せよ」というコマンドを積み重ねたもの。

printComments(path, doc, options)というコメント用のやつもある。既存Docの前後にコメントがあればそれを出すようにする的な。

https://github.com/prettier/prettier/blob/3.3.3/src/main/comments/print.js#L198

ここが一番大事なところであることはわかりつつ、これ以上をちゃんと理解するためには、実際に具体的なユースケースを洗っていくしかなさそうか。

とりあえず次へ進む。

`printDocToString(doc, options)`

https://github.com/prettier/prettier/blob/3.3.3/src/document/printer.js#L302

Docをスタックに積みながら、ひたすら文字列化していくフェーズ。 Docは配列になってることもあるらしく、その都度スタックに展開していく。

type DocCommand =
  | Align
  | BreakParent
  | Cursor
  | Fill
  | Group
  | IfBreak
  | Indent
  | IndentIfBreak
  | Label
  | Line
  | LineSuffix
  | LineSuffixBoundary
  | Trim;
type Doc = string | Doc[] | DocCommand;

リポジトリのルートにドキュメントもあった。

https://github.com/prettier/prettier/blob/3.3.3/commands.md

コードと照らしてみるに、これらDocを生み出すためのDocビルダー？は別の概念として存在していて、それらはもう少し種類がある。これらはJavaScriptで実行できて、返り値としてこれらDocを返すようになってる。

ここに載ってないので言うと、たとえばdedent(): Docというやつは、AlignのDocを返すやつらしい。

https://github.com/prettier/prettier/blob/3.3.3/src/document/builders.js

コメントもstringなコマンドで既に積まれてるので、最後に.join("")でまとめておしまい。

巨大なswitch-caseであり、ここはもう単純作業フェーズって感じだろうか。オプションもタブとか幅とかそういうのしか考慮してないし。

追うとしてもまた今度で。

ここまでのまとめ

整形前コード > AST
AST > Doc
Doc > 整形後コード

この3段階処理の詳細および成果物としては、

コメントが各ノードに埋め込まれたAST
そのASTと整形オプションから生成されたDoc
そのDocといくつかのプション（タブとか最大幅とか）から出力される整形後コード

この3つのチェックポイントがあると。

まぁやっぱコメントまわりが鬼門だろうな〜。プラグイン側からその挙動を制御することもできるらしい。

https://prettier.io/docs/en/plugins#handling-comments-in-a-printer

だいたいなんとなくはわかったけど、神は細部に宿るということで、結局language-js/print配下がすべてなんやろうな。

この先は、具体的なコードが整形されていく様を追いかけるしかなさそう。