Saturday, May 30, 2009

Synpl broken code uploaded on GitHub

Yes, I know, broken code is not very useful to other people. I just wanted to have some kind of backup :-)

Today I worked pretty hard at finding a way to store text changes between the last successful parse and the current moment. Turns out that storing information about white space and comments in the parsed information is not very useful. I've given up on moving comments along with the item they're referring to. For now.

The most interesting result of today is that a very neat way of storing the text changes is a kind of a diff from the last successful parse. This can also be stored on disk very efficiently (before that I was thinking about storing the parse tree representation, which is much larger).

The only file that works right now is TextWithChanges.py in the root of the repository. The unit tests show what it can do. I'm pretty happy with the abstraction (I'm certain I'm rediscovering something classic here, just not realizing it clearly yet).

Basically, a TextWithChanges (TWC) stores the characters that were added or deleted since the last successful (local) parse. It can provide the 'old' version (that parses) and the current version, using current coordinates (which are actual cursor positions from the editor).

An old version of a slice of a TWC can look quite weird, as deleted characters share the same position (in terms of actual cursor positions).

"An example looks like this."

The red characters were deleted, the green ones added. The old version for a slice will not include the green characters, the current version will not include the red ones. Once a slice parses successfully, the changes are forgotten (the characters become black again).
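To make the idea a bit more concrete, here is a minimal sketch of such a structure. It is not the code from TextWithChanges.py: the class name `TinyTWC` and all of its methods are made up for illustration, and it ignores slices entirely. Each character carries a state (unchanged, added or deleted), and positions passed to `insert` and `delete` are current cursor offsets.

```python
class TinyTWC(object):
    """Toy text-with-changes: a list of [char, state] cells.

    state is ' ' (unchanged), '+' (added) or '-' (deleted), relative to
    the last text that parsed. Hypothetical sketch, not TextWithChanges.py.
    """

    def __init__(self, text):
        self.cells = [[c, ' '] for c in text]

    def _index(self, pos):
        # Map an offset in the current text to an index into self.cells.
        # Deleted cells are invisible, so they do not advance the offset.
        seen = 0
        for i, (_, state) in enumerate(self.cells):
            if state != '-':
                if seen == pos:
                    return i
                seen += 1
        return len(self.cells)

    def insert(self, pos, text):
        i = self._index(pos)
        self.cells[i:i] = [[c, '+'] for c in text]

    def delete(self, pos, count):
        i = self._index(pos)
        while count > 0 and i < len(self.cells):
            state = self.cells[i][1]
            if state == '+':
                del self.cells[i]        # never part of the old text
                count -= 1
            elif state == ' ':
                self.cells[i][1] = '-'   # keep it so old() can show it
                i += 1
                count -= 1
            else:
                i += 1                   # already deleted, not visible

    def current(self):
        return ''.join(c for c, s in self.cells if s != '-')

    def old(self):
        return ''.join(c for c, s in self.cells if s != '+')

    def commit(self):
        # The current text parsed: forget all recorded changes.
        self.cells = [[c, ' '] for c, s in self.cells if s != '-']


twc = TinyTWC("(a b c)")
twc.delete(3, 1)       # delete "b"
twc.insert(3, "xy")    # type "xy" where "b" used to be
```

After these two edits, `twc.current()` gives the text as the editor shows it and `twc.old()` still gives the text that parsed; `twc.commit()` makes the two identical again.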

Storing the version with changes on disk allows the editor to have access to a previous valid parse. Since it's possible to store only the changes and refer to the file content by a hash, this can be very space-efficient.
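A rough sketch of what "changes plus a hash" could look like on disk, assuming a JSON format I made up for this example (the function names `pack_changes` and `unpack_old` are hypothetical too, and the diff comes from the standard difflib module rather than from TextWithChanges.py):

```python
import difflib
import hashlib
import json


def pack_changes(old, current):
    """Describe how to get back from `current` to `old`, plus a hash of `current`."""
    ops = []
    matcher = difflib.SequenceMatcher(None, current, old, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != 'equal':
            # Replace current[i1:i2] with this piece of the old text.
            ops.append([i1, i2, old[j1:j2]])
    digest = hashlib.sha1(current.encode("utf-8")).hexdigest()
    return json.dumps({"sha1": digest, "ops": ops})


def unpack_old(current, packed):
    """Rebuild the old text from the file content and the packed changes."""
    data = json.loads(packed)
    digest = hashlib.sha1(current.encode("utf-8")).hexdigest()
    if digest != data["sha1"]:
        raise ValueError("file content does not match the stored hash")
    pieces, pos = [], 0
    for i1, i2, old_text in data["ops"]:
        pieces.append(current[pos:i1])   # unchanged stretch
        pieces.append(old_text)          # what used to be here
        pos = i2
    pieces.append(current[pos:])
    return "".join(pieces)


packed = pack_changes("(a b c)", "(a xy c)")
restored = unpack_old("(a xy c)", packed)
```

Only the changed stretches are stored, so the packed form stays small when the edits since the last good parse are small. The hash is there to detect that the file was modified behind the editor's back.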

Friday, May 29, 2009

Clipboard managers for Gnome

On Windows I use clipx to be more efficient when copying and pasting (no, it's not always code).

I'm playing again with Ubuntu and I miss clipx. Fortunately, there is parcellite which is a worthy replacement (Ubuntu packages).

With a bit of tweaking, I got it to behave almost like clipx. Copying and pasting is comfortable again!

Wednesday, May 27, 2009

Sexp structural editing

Yesterday I managed to write some code that should be very useful in creating a structured sexp editor plugin for Gedit. I need to clean it up a little bit, then post it to GitHub.

It's interesting how parsing for an editor needs to take into account detailed position information, as well as whitespace and comment information (one of the tasks I want to accomplish is switching items in a sexp list without disturbing the existing layout, if possible, and moving comments attached to an item along with the item). I previously thought that moving text chunks like that would require a very smart pretty printer, but it turns out that may not be required.
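As an illustration of the pretty printer-less approach, here is a toy sketch that swaps two items of a sexp list by splicing text spans, so the whitespace between items stays where it was. The functions `item_spans` and `swap_items` are invented for this example; the sketch ignores strings and comments, and it assumes items are separated by whitespace.

```python
def item_spans(text):
    """Return (start, end) offsets of the direct children of the outermost list."""
    spans, depth, start = [], 0, None
    for i, ch in enumerate(text):
        if ch == '(':
            depth += 1
            if depth == 2 and start is None:
                start = i                      # a nested list begins
        elif ch == ')':
            if depth == 1 and start is not None:
                spans.append((start, i))       # atom right before the closing paren
                start = None
            depth -= 1
            if depth == 1 and start is not None:
                spans.append((start, i + 1))   # a nested list ends
                start = None
        elif ch.isspace():
            if depth == 1 and start is not None:
                spans.append((start, i))       # atom ends at whitespace
                start = None
        elif depth == 1 and start is None:
            start = i                          # an atom begins
    return spans


def swap_items(text, i, j):
    """Swap children i and j of the outermost list, keeping the layout between them."""
    spans = item_spans(text)
    (a0, a1), (b0, b1) = sorted([spans[i], spans[j]])
    return text[:a0] + text[b0:b1] + text[a1:b0] + text[a0:a1] + text[b1:]


source = "(define  (f x)\n  y)"
swapped = swap_items(source, 1, 2)
```

The double space after `define` and the line break with its indentation survive the swap, because only the item spans move.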

However, there are still problems with languages with a more complicated syntax. I need to implement a parser for a simplified Basic to test the pretty printer-less approach in a more realistic setting.

Structured editing of sexp has long been accomplished in Emacs and lisp modes. I think it would be really interesting to see that extended to other languages, based on a lisp-like parse tree of the code.

Useful trick for interactive Gedit Python API exploration

I mentioned the usefulness of 'dir' in trying to find out how to perform different tasks using the Gedit API. I should have thought about this earlier, but here is a dir+grep function that should help with the large number of methods associated with most Gedit API objects.


>>> def dir2(o, s):
...     return [k for k in dir(o) if s in k]
...
>>> v = window.get_active_view()
>>> dir2(v, "selection")


This is very neat - it was stupid of me to browse those long method name lists without some automated help, especially since it was so easy to implement.
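A small variation that can be tried outside Gedit as well: the same idea with a case-insensitive regular expression instead of a plain substring. The name `dir_grep` is just something I picked for this sketch.

```python
import re


def dir_grep(obj, pattern):
    """Return the attribute names of obj that match pattern, ignoring case."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [name for name in dir(obj) if rx.search(name)]


matches = dir_grep(str, "STRIP")
```

On a plain Python string type this finds the `strip` family of methods; in the Gedit console the first argument would be the view or the buffer.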

Monday, May 25, 2009

Final basic tasks for Gedit plugin API research

I need to:
  1. be able to add menus and keyboard shortcuts
  2. be able to use dialogs and auto-completion-like drop-downs
  3. write an EAL (editor abstraction layer) so the Python code I write can be ported with reasonable ease to other editors that support Python scripting/modules
For task number 1, I will look into:
  1. http://live.gnome.org/Gedit/PythonPluginHowTo#head-5d5e6827eac2ca14b5bd4d0fc7d88c28e646865c (again)
  2. http://www.russellbeattie.com/blog/my-first-gedit-plugin
For task number 2:
  1. http://users.tkk.fi/~otsaloma/gedit/completion.py
  2. http://linil.wordpress.com/2008/05/31/using-gedit-to-auto-complete-python-code/
Task number 3 needs more thought; hopefully the design will be somewhat clearer once tasks 1 and 2 are reasonably well researched and understood.
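Just to have something to argue with later, one possible shape for the EAL from task 3 is a small interface with one implementation per editor. Everything below is hypothetical (the names `EditorAdapter` and `InMemoryEditor` and the choice of methods are mine), and the in-memory version exists only so that code written against the interface can be tested without any editor running.

```python
class EditorAdapter(object):
    """Hypothetical editor abstraction: positions are character offsets."""

    def get_text(self):
        raise NotImplementedError

    def get_cursor(self):
        raise NotImplementedError

    def set_cursor(self, offset):
        raise NotImplementedError

    def get_selection(self):
        raise NotImplementedError

    def set_selection(self, start, end):
        raise NotImplementedError

    def replace(self, start, end, text):
        raise NotImplementedError


class InMemoryEditor(EditorAdapter):
    """Editor-free implementation, useful for unit tests."""

    def __init__(self, text=""):
        self._text = text
        self._cursor = 0
        self._selection = None      # None or a (start, end) pair

    def get_text(self):
        return self._text

    def get_cursor(self):
        return self._cursor

    def set_cursor(self, offset):
        self._cursor = offset
        self._selection = None

    def get_selection(self):
        return self._selection

    def set_selection(self, start, end):
        self._selection = (start, end)
        self._cursor = end

    def replace(self, start, end, text):
        self._text = self._text[:start] + text + self._text[end:]
        self.set_cursor(start + len(text))


ed = InMemoryEditor("hello world")
ed.set_selection(0, 5)
start, end = ed.get_selection()
ed.replace(start, end, "HELLO")
```

A Gedit implementation would wrap a view and its buffer behind the same methods, and a port to another editor would only need a new subclass.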

Gedit API - how to replace selected text

Today's post is rather short. It is also a bit incomplete, as I don't understand one of the functions I use (even though the others seem very simple and obvious).


>>> v = window.get_active_view()
>>> b = v.get_buffer()
>>> b.insert_at_cursor("This is some sample text.\nHello, world!\n")


This is basic code to populate the editing buffer with some text to experiment on.


>>> sel0 = b.get_iter_at_offset(8)
>>> sel1 = b.get_iter_at_offset(12)
>>> b.select_range(sel0, sel1)


Selects the word "some" on the first line.


>>> b.delete_selection(0, 1)
>>> b.insert_at_cursor("emos")


The first line deletes the selected text. I don't know the meaning of the parameters - I tried several combinations and they delete the entire selection or seem to have no effect. So I just use (0, 1) as parameters without understanding.

The insert_at_cursor() call is already known and it inserts the string provided as parameter... at the current cursor position, obviously. Which is right where the selected text used to be, because we didn't move it.
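For reference, the GTK C documentation for gtk_text_buffer_delete_selection names the two parameters `interactive` and `default_editable`: the first says whether the deletion comes from a user action (in which case non-editable text is left alone), the second is the editability to assume where no tag says otherwise. So (0, 1) would mean "not interactive, editable by default". The helper below wraps the two calls from this post; `replace_selection` and `FakeBuffer` are names made up here, and the fake buffer is only a stand-in that lets the helper run outside Gedit.

```python
def replace_selection(buf, text):
    """Replace the selected text of a gtk.TextBuffer-like object with text."""
    # interactive=False: not a user action, so editability is not checked.
    # default_editable=True: only consulted when interactive is True.
    buf.delete_selection(False, True)
    buf.insert_at_cursor(text)


class FakeBuffer(object):
    """Hypothetical stand-in for a text buffer, tracking text and selection offsets."""

    def __init__(self, text, sel_start, sel_end):
        self.text = text
        self.sel = (sel_start, sel_end)

    def delete_selection(self, interactive, default_editable):
        start, end = self.sel
        self.text = self.text[:start] + self.text[end:]
        self.sel = (start, start)       # cursor stays where the selection was
        return end > start

    def insert_at_cursor(self, text):
        pos = self.sel[0]
        self.text = self.text[:pos] + text + self.text[pos:]
        self.sel = (pos + len(text), pos + len(text))


fake = FakeBuffer("This is some sample text.", 8, 12)
replace_selection(fake, "emos")
```

In the Gedit console the same helper would be called as `replace_selection(b, "emos")` with the real buffer.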

Wednesday, May 20, 2009

Gedit API Text Coordinates

I need to be able to convert between positions expressed as (line, column) pairs or (offset-in-characters-from-the-beginning-of-the-file). I also need to be able to get and set the cursor position and get and set the current selection. Also get the bounds of the document (first and last position).

The content of this post (and its "sister posts" about the Gedit API) is in an "exploratory notes" format. It doesn't provide a full API description; it is just meant to help me rediscover quickly how to perform certain tasks. I expected I would get this information from some blog somewhere, but that doesn't seem to be the case. I'm making these notes public because I suspect someone may find them useful at some point. If you're that someone, don't forget to use dir(object) in the Python console to explore further, and to browse the C APIs for Gtk/Gdk and Gedit if you need to dig deeper. I used to have some experience with Gtk, but that was a long time ago and I used it from C, not Python. So I'm a kind of newbie to Gtk/Gedit programming. I never used the Gnome APIs ;-)

Gedit uses the concepts of iterators and ranges to define positions in text and regions of text. An iterator can be used to inspect the text it points to, it can be moved forward and backward, it can be inspected to see if it is placed on the end of a line or word (this last feature is not that useful to me, as I'll have my own definitions for word/paragraph/block).

How do we get an iterator? Easy:

>>> v = window.get_active_view()
>>> b = v.get_buffer()
>>> b.insert_at_cursor("First line.\nSecond line.\n")
>>> start, end = b.get_bounds()
>>> end.get_line()
2
>>> end.get_line_offset()
0
>>> end.get_offset()
25


So, get_bounds() returns a pair of iterators, one pointing to the beginning of the document and one to the end. (The first two lines are pretty self-explanatory: a "buffer" is a "document", the model displayed in the view.)

"end" has a more interesting position ("start" is on offset 0, line 0, column 0, obviously). get_line() returns the line (counting from 0 as the first line), get_line_offset() returns the column (also counting from 0 as the first column) and get_offset() returns the number of characters since the beginning of the file (I suspect this will be the information I will use, as it is easily converted to line/column formats).

Next: look at the text pointed to by an iterator, move the iterator.

>>> start.get_chars_in_line()
12
>>> start.get_char()
u'F'
>>> start.forward_char()
True
>>> start.get_char()
u'i'
>>> start.forward_to_line_end()
True
>>> start.get_char()
u'\n'
>>> start.get_offset()
11
>>> start.get_line()
0
>>> start.get_line_offset()
11
>>> start.forward_char()
True
>>> start.get_offset()
12
>>> start.get_line()
1
>>> start.get_line_offset()
0


So I've looked at the char pointed to by the iterator, moved the iterator and played with its position to make sure my assumptions about get_line(), get_line_offset() and get_offset() are correct.

It's a bit weird to use an iterator named "start" to do all this, but since iterators are mutable objects, it worked. BTW, we didn't affect the beginning of the file in any way. "start" just happened to begin its life by pointing at the beginning of the file, that's it.

How to set the position of an iterator if we know the "offset", for instance?

>>> start.get_offset()
12
>>> end.set_offset(start.get_offset())
>>> end.get_offset()
12
>>> end.get_char()
u'S'
>>> start.get_char()
u'S'
>>> end.get_line()
1
>>> end.get_line_offset()
0


So end and start point to the same position, and I can use an iterator to convert between the offset-from-BOF and (line, column) representations of a position. I can also set the line and column via set_line() and set_line_offset() then get the offset-from-BOF using get_offset(), so the conversion works both ways.
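The same conversion can be written in plain Python, which may be handy for testing code that has no buffer at hand. This is only a sketch over a string; the function names are mine, and it assumes "\n" line endings.

```python
def offset_to_line_col(text, offset):
    """Convert an offset from the beginning of text to a (line, column) pair."""
    line = text.count("\n", 0, offset)
    last_newline = text.rfind("\n", 0, offset)   # -1 when on the first line
    return line, offset - (last_newline + 1)


def line_col_to_offset(text, line, col):
    """Convert a (line, column) pair to an offset from the beginning of text."""
    lines = text.split("\n")
    return sum(len(l) + 1 for l in lines[:line]) + col


sample = "First line.\nSecond line.\n"
end_pos = offset_to_line_col(sample, 25)
second_line_start = line_col_to_offset(sample, 1, 0)
```

With the sample text from this post, the results agree with what the iterators reported: offset 25 is line 2, column 0, and line 1, column 0 is offset 12.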

How do I get the iterator at cursor? Place the cursor at the end of the word "Second" on the second line of text using the mouse.

>>> b.get_insert()

>>> b.get_iter_at_mark(b.get_insert())

>>> at_cursor = b.get_iter_at_mark(b.get_insert())
>>> at_cursor.get_line()
1
>>> at_cursor.get_line_offset()
6
>>> at_cursor.get_char()
u' '


I don't know what a mark is yet, and I don't know whether these iterators and marks need to be released somehow to avoid memory leaks. However, I'm not too worried about that now. If it becomes a problem, I'll fix it.
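Going by the GTK text widget documentation, a mark is a named position that the buffer itself keeps up to date while text is inserted or deleted around it, whereas an iterator is only valid until the next modification of the buffer. The toy model below shows the bookkeeping involved. It is plain Python with made-up names (`ToyBuffer`, `create_mark`, `insert`) and says nothing about how gtk.TextBuffer is actually implemented.

```python
class ToyBuffer(object):
    """Toy buffer whose marks follow the text they were placed at."""

    def __init__(self, text=""):
        self.text = text
        self.marks = {}                 # name -> [offset, left_gravity]

    def create_mark(self, name, offset, left_gravity=False):
        self.marks[name] = [offset, left_gravity]

    def get_mark_offset(self, name):
        return self.marks[name][0]

    def insert(self, offset, s):
        self.text = self.text[:offset] + s + self.text[offset:]
        for mark in self.marks.values():
            pos, left_gravity = mark
            # Marks after the insertion point shift right. A mark exactly at
            # the insertion point shifts only if it has right gravity.
            if pos > offset or (pos == offset and not left_gravity):
                mark[0] = pos + len(s)


buf = ToyBuffer("Second line")
buf.create_mark("after_second", 6)      # just after the word "Second"
buf.insert(0, "The ")                   # edit earlier in the text
```

After the insertion the mark has moved from offset 6 to offset 10 and still sits right after "Second". A plain saved offset would now point into the middle of the word.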

The "at_cursor" iterator seems to be positioned properly, though.

How to position the cursor using an iterator (the reverse of getting the position of the cursor as an iterator)? I'll just try to move the cursor over the space between "Second" and "line", then get the position of the cursor again and check that it moved.


>>> at_cursor.forward_char()
True
>>> b.place_cursor(at_cursor)
>>> at_cursor_new = b.get_iter_at_mark(b.get_insert())
>>> at_cursor_new.get_offset()
19
>>> at_cursor.get_offset()
19


So the relevant call is "b.place_cursor()".

What if I want to find out the coordinates of the current selection? Make sure no text is selected.

>>> b.get_selection_bounds()
()


Now select "ond" in "Second" on the second line.

>>> b.get_selection_bounds()
(<GtkTextIter at 0x...>, <GtkTextIter at 0x...>)
>>> sel0, sel1 = b.get_selection_bounds()
>>> sel0.get_offset()
15
>>> sel1.get_offset()
18
>>> sel1.get_char()
u' '


So "get_selection_bounds()" returns two iterators, one for the beginning and the other for the end of the selection. "sel1" points to the first character after the selection.

How do I set the selection? Deselect the text.

>>> b.select_range(sel0, sel1)


The text is selected again.

How do I get the text between two iterators (using a loop and get_char() seems wasteful, especially since Python strings are immutable)?

>>> b.get_text(sel0, sel1)
'ond'


That concludes the things I wanted to explore in this post. Again, these are just a newbie's personal notes and I will use them to draft a plugin. There are probably better ways to accomplish the tasks described in this post. I was just happy to find one way to perform them, given the lack of documentation (this perception of the documentation is very subjective, some APIs have far worse documentation, some far better; however, since I use the Python "dir()" function and a lot of trial and error, I dare say "lack of documentation").

Monday, May 18, 2009

Test for the syntax highlighter

The point of this post is to test the syntax highlighter.

>>> a = ['Some', 'test']
>>> ', '.join(a)
'Some, test'
Seems I managed to convince it to work.

Gedit colorization

So, here's a gedit text colorization snippet.

Start with an empty document and open the bottom pane, with the Python console activated (View|Bottom Pane, Edit|Preferences|Plugins).

Enter this code:

>>> v = window.get_active_view()
>>> b = v.get_buffer()
>>> t = b.create_tag("simple_tag")
>>> b.insert(b.get_start_iter(), "hello")
>>> b.apply_tag(t, b.get_start_iter(), b.get_end_iter())


Nothing happens yet: we need to change the colors associated with the tag. It looks like I need to install some kind of support for inserting source code in blog posts. Tomorrow I will try these suggestions.


>>> import gtk
>>> t.props.foreground_gdk = gtk.gdk.Color("#fa0")


Tada! The text is now colorized. Since I'm selfish and sleepy, I won't go into details about the functions that were used (this post is mainly a self-note of how to perform text colorization). If anyone reads this and you feel like you could use some advice, try:


>>> dir(b)


'dir'ring python objects is helpful at times. Still, it would be better if the gedit/gtksourceview guys provided more documentation.

Good night to myself.

Resurrection... again

So I tried working on this project last year... and I failed. Or at least I didn't get what I wanted, just a better understanding of monads and, with that, the conclusion that I don't like working with them.

It seems the old synpl project is like a fungus, it never dies if left alone in dark corners with high humidity.

I'm going to try again. This time, no super PL/SQL editor, no C compiler... but a structured editor.

Why? I'm very impressed with the usability of org-mode in Emacs and I want that kind of functionality (moving around content as if it was a tree) in other editing tasks.

I guess this experiment will die an unspectacular death like the previous ones. Well, if that's what it takes to keep the project alive...

Enough rambling. Time to actually use this blog space for something practical (and maybe useful to others).

Since I am not completely crazy, I don't want to implement an editor from scratch. I also don't want to use emacs (somehow elisp seems hard to use for large projects). I will use... gedit.

Yes, the puny Notepad-like editor distributed with Gnome.

Actually, it turns out that it is a very nice editor. For my purposes, at least. Because it has all the features I want (stylable editor, an API - in Python and explorable with a REPL, no less).

So my learning tasks are now these:
  1. Learn the Gedit API.
  2. Write an editor wrapper lib for the functionality I need (so I can switch to IronPython and ScintillaNET if I want to port my little scripts to Windows).
  3. Write a Scheme/Lisp parser (or use an existing one) and test the usability of the structured approach.
  4. Keep going with a JavaScript parser and structured editor.
  5. Extend the JavaScript parser to support QooxDoo and project files (discovered automatically from exploring the file system neighborhood of the current file).
  6. ... hmmm, profit? (mentally, at least)
Step number 1 is going to take a few blogposts. I intend to write small tutorials on programming the building blocks of my structured editor with the Gedit API. A preliminary list of these is:
  1. Get the cursor position.
  2. Set the cursor position.
  3. Get the current selection.
  4. Set the current selection.
  5. Change text.
  6. Colorize text.
  7. Handle keyboard shortcuts.
Now that I know what I'll do with my free time, it's time to go to bed.