Friday, August 31, 2007

Pretty printing comments

(Note: there's a new demo implementing what's described below. Sadly, snapdrive.net seems to have a lot of downtime; if the link doesn't work, please try again later.)

Pretty printing is simple, right? Just render the syntax tree as text in a pretty way and that's it.

Hmmm. What about comments? Some pretty printers (earlier versions of Synpl included) just throw comments away, since they don't appear in the syntax tree. This is very wrong, as it loses precious information.

If we pretty print comments, where do we place them? The shape of the program may have changed drastically by reformatting it.

The solution I adopted is pretty naive, but it works rather well. This is what I do:
  • when tokenizing the source, collect comments in a separate list; record the content and the starting and ending positions for each comment;
  • allow some syntax tree nodes to have "after" and "before" comments; a good list of nodes to allow comments is: statements, item declarations in "DECLARE" sections of blocks, functions, procedures, triggers, packages (both spec and body) and types (also both spec and body);
  • associate each comment with the closest node that accepts comments and doesn't overlap the comment;
  • when pretty printing the node, print the comments at the same level of indentation as the node (for both the "before" and the "after" list of comments).
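The association step can be sketched roughly as follows. This is a minimal Python illustration, not Synpl's actual Scala code: the `Comment` and `Node` classes, the character-offset distance measure and the tie-breaking rule (the first node in the list wins) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    text: str
    start: int  # offset of the first character of the comment
    end: int    # offset just past the last character


@dataclass
class Node:
    name: str   # hypothetical label, only used for illustration
    start: int
    end: int
    before: list = field(default_factory=list)
    after: list = field(default_factory=list)


def overlaps(node, comment):
    """True if the comment lies (partly) inside the node's source span."""
    return node.start < comment.end and comment.start < node.end


def attach_comments(nodes, comments):
    """Attach each comment to the closest node that does not overlap it.

    `nodes` is assumed to contain only nodes that accept comments.
    Distance is the gap, in characters, between the comment and the node.
    """
    for c in comments:
        candidates = [n for n in nodes if not overlaps(n, c)]
        if not candidates:
            continue  # nothing to attach to; a real tool must not drop this

        def gap(n):
            return c.start - n.end if n.end <= c.start else n.start - c.end

        nearest = min(candidates, key=gap)
        if nearest.end <= c.start:
            nearest.after.append(c)   # comment follows the node
        else:
            nearest.before.append(c)  # comment precedes the node
```

A comment that sits right after one statement and far from the next ends up in the first statement's "after" list, and the other way around for the "before" list.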
This works well, but... what about cases like:

begin
  if 1 < a then
    c := 1;
  else
    null;
    -- uncomment next PL/SQL line to remove warnings about variable c
    -- sometimes being used before initialization

    --c := 2;
  end if;
  null;
end;
There are three comments here. They should all be associated with the null statement after the else. This is what happens for the first two comments. But the third is "closer" to the second null statement, so it is inserted in that statement's "before" list and shows up between end if and null in the pretty-printed output. This is clearly wrong. Also, the blank line between the comments is lost in translation.

The solution? Group consecutive comments together into a larger comment, and preserve the blank lines in between.
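Here is one way the grouping could look, again as a hedged Python sketch rather than the Scala code in Synpl. It assumes two comments are "consecutive" when only whitespace separates them in the source. The `find_line_comments` helper is a deliberately naive stand-in for the tokenizer: it would also match `--` inside string literals and ignores `/* */` comments.

```python
import re
from dataclasses import dataclass


@dataclass
class Comment:
    text: str
    start: int  # offset of the first character of the comment
    end: int    # offset just past the last character


def find_line_comments(source):
    """Naive stand-in for the tokenizer: collect `--` comments with positions."""
    return [Comment(m.group(0), m.start(), m.end())
            for m in re.finditer(r"--[^\n]*", source)]


def group_comments(source, comments):
    """Merge comments separated only by whitespace into one larger comment.

    The newlines found between two merged comments are copied into the
    merged text, so blank lines inside a group survive.
    """
    groups = []
    for c in sorted(comments, key=lambda c: c.start):
        if groups and source[groups[-1].end:c.start].strip() == "":
            prev = groups[-1]
            newlines = source[prev.end:c.start].count("\n")
            separator = "\n" * newlines if newlines else " "
            groups[-1] = Comment(prev.text + separator + c.text,
                                 prev.start, c.end)
        else:
            groups.append(Comment(c.text, c.start, c.end))
    return groups
```

With this in place, the three comments in the example above become a single comment that starts right after the first null statement, so the "closest node" rule attaches all of it to that statement.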

After implementing comment grouping, things work as expected. Plenty of things are still left to chance, such as:
  • what to do with code inside comments
  • what about broken code inside comments
  • what about formatting paragraphs in comments so they fit in the required number of columns?
Turns out pretty printing is an art in itself :)

Friday, August 17, 2007

New Synpl demo

Here's the first demo since switching languages to Scala.

What's in there:
  • a parser for many SQL and PL/SQL language elements (blocks, basic variable definitions, most types of statements, most DML stuff, procedures, functions, packages)
  • a basic analyzer (can look at a PL/SQL block and report variables that are used before being initialized, shadowed variables (same names used in inner blocks), and variables that are defined and/or initialized but never used; usage is not detected in all cases, and initialization by INTO clauses is not recognized)
  • a pretty printer (adds missing elements - for instance, if the user forgot a 'THEN', the parser will report an error but recover, and the pretty printer will show the source with all such errors corrected)
  • a small GUI to help with testing
What's missing:
  • many SQL and PL/SQL language elements are not identified by the parser (explicit join syntax for SQL, variable initialization at definition, PL/SQL objects and arrays etc.)
  • most of the analyzer features
  • advanced pretty printing (such as: detect total length of a SELECT query, and if it doesn't overflow the current line, print it on a single line)
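To make the last missing item concrete, here is a rough Python sketch of the "print on one line if it fits" decision. It is not part of the demo: the `render_select` function, its parameters and the multi-line layout it falls back to are invented for illustration, and a real query has far more parts than a column list and one table.

```python
def render_select(columns, table, indent, max_width=80):
    """Render a simple SELECT on one line if it fits, else one column per line."""
    pad = " " * indent
    one_line = "SELECT " + ", ".join(columns) + " FROM " + table + ";"
    if indent + len(one_line) <= max_width:
        return pad + one_line

    # Fallback layout: keyword lines at the current indentation, columns
    # indented two more spaces, a comma after every column but the last.
    lines = [pad + "SELECT"]
    last = len(columns) - 1
    for i, col in enumerate(columns):
        lines.append(pad + "  " + col + ("," if i < last else ""))
    lines.append(pad + "FROM " + table + ";")
    return "\n".join(lines)
```

The point is only that the printer has to measure the candidate one-line form against the space left on the current line before committing to a layout.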
Use 'run.bat' to launch the GUI. Use "File|Open" within the GUI to reach a sample PL/SQL source that demonstrates the error-recovery features of the parser, the current analyzer features and the pretty printer.


The GUI was tested with Java 1.6.

Thursday, August 16, 2007

Synpl in Scala

I've decided to rewrite Synpl in Scala. It's much better than C# for this particular task, for several reasons:
  • it compiles to Java bytecode, which means I can build a JAR file and then use it to build JDeveloper or SQL Developer extensions
  • Scala has pattern matching and "case classes", which make it very easy to build ASTs and to walk them to do static analysis
So far, so good. These blog entries (1) (2) proved very useful as a Scala cheat sheet and saved me a lot of time.

Right now a significant chunk of the PL/SQL grammar is implemented and some static analysis also works (detecting variable use before initialization). There's plenty of work to be done, but it looks like Scala is a great language choice.
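To give an idea of what the use-before-initialization check involves, here is a small sketch. It is written in Python with dataclasses and isinstance tests, not in Scala, where case classes and pattern matching would express the same thing more compactly. The node types and the `check` function are made up for the example and cover only assignments and IF statements.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str
    left: object
    right: object


@dataclass
class Assign:
    target: str
    value: object


@dataclass
class If:
    cond: object
    then_body: list
    else_body: list


def used_vars(expr):
    """List the variable names read by an expression."""
    if isinstance(expr, Var):
        return [expr.name]
    if isinstance(expr, BinOp):
        return used_vars(expr.left) + used_vars(expr.right)
    return []


def check(statements, initialized):
    """Return (warnings, variables definitely initialized afterwards)."""
    warnings = []
    initialized = set(initialized)
    for s in statements:
        if isinstance(s, Assign):
            warnings += [v for v in used_vars(s.value) if v not in initialized]
            initialized.add(s.target)
        elif isinstance(s, If):
            warnings += [v for v in used_vars(s.cond) if v not in initialized]
            w_then, init_then = check(s.then_body, initialized)
            w_else, init_else = check(s.else_body, initialized)
            warnings += w_then + w_else
            # Only variables set on both branches are safe after the IF.
            initialized = init_then & init_else
    return warnings, initialized
```

A variable assigned in only one branch of an IF is treated as possibly uninitialized afterwards, which is the "sometimes being used before initialization" kind of warning.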

Probably tomorrow or the day after I'll put together a new demo.