Sunday, 25 September 2011

libass 0.10.0 released

I just released libass 0.10.0, which finally wraps up bidirectional layout and shaping support and makes it available to many users. Get it from the Google Code download page.

Monday, 22 August 2011

GSoC 2011 Wrap-up

Complex text layout

With this year's GSoC, libass gained solid complex text layout support with the help of the FriBidi and HarfBuzz-ng libraries. In practice that means Arabic, Hebrew, Devanagari and many other scripts that are non-trivial to render now display correctly (and without side effects or bugs). Additionally, for Latin text, ligatures, combining marks and language-specific features should work as expected. Vertical CJK text layout is also improved: it can now use proper vertical glyph variants, if available. The work on complex text layout progressed quickly, and shortly after the midterms most text rendered without major problems.

A real-life example of Arabic subtitles.

New font handling code (without fontconfig)

Since there was quite a lot of time left in GSoC, I started focusing on a completely different problem: font matching. libass used fontconfig for that, but fontconfig is a pain on Windows. Moreover, the font sorter/matcher of fontconfig isn't very suitable for libass, since it does not match names the way Windows, and therefore VSFilter, does.

I completely ripped out the hard fontconfig dependency and replaced it with my own font sorting/matching code. Various backends can provide font meta information to the font sorter. Currently, three such backends exist:

  • FreeType backend, mostly useful for embedded fonts
  • fontconfig backend, to access system fonts on Unix
  • Windows GDI backend, to access system fonts on Windows

In practice, the most important advantage is that there's no hard dependency on fontconfig anymore. Even without any additional platform-specific backend, libass can now render embedded Matroska subtitles correctly, as long as the required fonts are attached. The Windows GDI backend gets rid of the fontconfig cache building, which bothers many users.

What's cooking

Performance improvements to the Windows GDI font backend.

The GDI interface is very bad and awkward to use for collecting font metadata. The new DirectWrite API is much saner and I'll likely implement a backend for it. It is supported on Windows Vista and up.

I have half-finished SSE2-accelerated versions of the blur filters (\be, \blur) lying around; these should be completed.

Rendering quality will be improved with an internal compositor. Especially if alpha transparency is in use, results are less than optimal at the moment.

Sunday, 21 August 2011

System fonts without fontconfig

I've uploaded a new version of my win32 VLC test build. This new download includes an experimental Windows GDI font collector (patch), so system fonts can be used now! The interface for that (FontProvider) is now publicly exposed in the libass library as well.

Not actually that horrible.

GDI is a bad API for what I need to do, so there are some limitations. I can't get all "full names" for a font, only the localized or often English name. More importantly, fonts are accessed with the GetFontData call that buffers them into memory. This can be quite slow, especially for big CJK fonts or Unicode fonts like "Arial Unicode MS".

Still, for most purposes, this should work fine.

Download the test build

Friday, 19 August 2011

VLC win32 test build

I've built VLC for Windows against the latest code of the fonts branch, and of course with the FriBidi and HarfBuzz support from mainline. This means:

  • Arabic and Hebrew text works correctly (but no HarfBuzz support)
  • OpenType shaping works
  • The annoying "Building font cache" message is gone
However, currently only embedded fonts are supported. Usually most fonts are embedded in Matroska files, so this often is not a big issue.

Download the test build

Update: Download replaced by a new build with HarfBuzz support.

Thursday, 18 August 2011


With the latest commits to the fonts branch, I can say with some excitement that libass now works reasonably well without fontconfig. Embedded fonts work perfectly, and so does the fallback font (if specified).

Next stop: a public interface for providing information about system fonts.

Monday, 8 August 2011

Font handling: it's hard!

So far libass uses fontconfig for collecting font metadata and matching fonts. However, fontconfig isn't ideal for matching fonts in the way the ASS/SSA formats need it. These formats primarily use the "full name" for matching the font, that is, a name including style, such as "Arial Bold Italic". fontconfig does not match against this name at all, and there are various hacks in libass to work around that (badly).

If you dig deeper into it, you'll find out that font naming is a big and complicated mess. Let's sum up the facts:
  • Various different names exist, such as family name, full name, PostScript name, subfamily name, etc.
  • Most of these can be localized
  • Different platforms (such as Windows or Mac) can have individual names
So there's no such thing as a single name for a font at all! Luckily, we don't really need to support all of these names (family name and full name seem to be good enough).

As a first step to get away from the fontconfig dependency, I've implemented my own font sorter and matcher that is optimized for the needs of ASS/SSA. I've also begun to work on an interface (FontProvider) that allows various font sources (such as Windows DirectWrite or container-embedded fonts) to plug into libass. At the moment, this already allows libass to work without fontconfig when only embedded fonts are needed.
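To illustrate the kind of matching described above, here is a minimal sketch in Python. It is not the libass matcher: the FontInfo record, the find_font helper, the case-insensitive comparison and the "full name first, then family name" order are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FontInfo:
    """Hypothetical metadata record that a font source could hand over."""
    path: str
    families: List[str]   # family names, possibly localized
    fullnames: List[str]  # full names including style, e.g. "Arial Bold Italic"


def find_font(fonts: List[FontInfo], requested: str) -> Optional[FontInfo]:
    """Return the first font whose full name or family name matches."""
    wanted = requested.casefold()
    # First pass: exact match against any full name.
    for font in fonts:
        if any(name.casefold() == wanted for name in font.fullnames):
            return font
    # Second pass: fall back to the family name.
    for font in fonts:
        if any(name.casefold() == wanted for name in font.families):
            return font
    return None


fonts = [
    FontInfo("arial.ttf", ["Arial"], ["Arial"]),
    FontInfo("arialbi.ttf", ["Arial"], ["Arial Bold Italic"]),
]
match = find_font(fonts, "arial bold italic")
```

A real matcher would also have to weigh bold and italic requests against the available faces; this sketch only shows why having the full names available matters.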

There are still some problems to solve with this new code, but in the meantime, it will live in the fonts branch, until it is ready for general consumption.

Sunday, 31 July 2011

libass 0.9.13 released

FreeType 2.4.6 was released a few days ago. It contains the new stroker code, which unfortunately makes libass 0.9.12 crash under some circumstances. I've released a new bugfix release, 0.9.13, to remedy this. Get it at Google Code!

Note: this release does not contain the recent complex text layout work. It's 0.9.12 plus bugfixes.

Thursday, 28 July 2011

libass git repo on Google Code

Google Code recently introduced Git support, and at the same time suffered from some reliability problems. Thus I've mirrored the repository at Google Code now. I will keep both repositories up to date.

Sunday, 17 July 2011

New outline stroker from FreeType

FreeType recently received an update to its stroker, which supposedly fixes many rendering issues. And indeed it does! However, first I needed to fix the outline preprocessing in libass, since it wasn't very cleanly handling outline modifications. Now that this is fixed, here's a quick comparison of old vs. new stroker.

One font that has been very problematic for FreeType's stroker is Comiquita Sans. Previously, the outlines generated by the stroker were broken in pretty funny ways and this was clearly visible in the rendering:

The new stroker fixes these issues completely:

The new stroker is available in FreeType git master and will appear in the next FreeType release (2.4.6).

Tuesday, 12 July 2011

Vertical writing

VSFilter, pretty much the reference renderer for SSA/ASS, can make use of an obscure Windows feature, often called "@font". When a font name is prefixed with an "@" symbol, Windows switches to a pseudo-vertical writing mode for CJK. Latin characters are written as usual, but CJK characters are rotated and substituted by their vertical forms, if possible. Until now, libass wasn't able to do these substitutions, which means that punctuation, brackets, parentheses and so on in particular were wrong and/or wrongly positioned.

With HarfBuzz shaping in place, it is easy to do these substitutions. The OpenType features vert and vkna are responsible for them. vert enables support for vertical writing in general (substitution of brackets, punctuation, et cetera) and vkna enables alternate forms for Kana, if available.
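As a rough sketch of the idea (in Python, not the actual libass code), the "@" prefix can be split off the font name and turned into a list of OpenType feature tags to request from the shaper. The helper names parse_font_name and shaping_features are made up for this example; only the "@" convention and the vert/vkna tags come from the text above.

```python
def parse_font_name(name):
    """Split an ASS font name into (family, vertical).

    A leading "@" selects the pseudo-vertical writing mode.
    """
    vertical = name.startswith("@")
    family = name[1:] if vertical else name
    return family, vertical


def shaping_features(vertical):
    """Return the OpenType feature tags to enable for a run."""
    # vert: vertical forms of brackets, punctuation and so on
    # vkna: alternate Kana forms for vertical writing, if the font has them
    return ["vert", "vkna"] if vertical else []


family, vertical = parse_font_name("@Meiryo")
features = shaping_features(vertical)
```

The rotation of the CJK glyphs themselves is a separate step and is not covered here.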

Here's a side-by-side comparison of some random Japanese text. On the left a rendering without substitutions, on the right with both features enabled. The Meiryo font was used.

Monday, 11 July 2011

HarfBuzz shaping support

Yeah, it's pretty much working now... after hunting down a few stupid bugs. libass can now render text with all kinds of OpenType features. Here are a few samples.

Arabic text with diacritics (buggy with FriBidi's shaper).

Connected handwriting (using the Zaner-Bloser Schoolhouse font)

Automatic fractions (using the Calluna font)

Saturday, 9 July 2011

Simple Arabic shaping

FriBidi contains a simple Arabic shaper. This shaper is based on the fact that Unicode contains codepoints for the presentation forms of many Arabic characters, for legacy reasons. This can be great, as it allows very easy and simple shaping by analyzing the text alone, without doing any font-specific lookups. However, more advanced shaping features are not possible.
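To show how far presentation forms alone can get you, here is a toy shaper in Python. It is not FriBidi's implementation: it derives the form tables from the compatibility decompositions that Python's unicodedata module exposes for the Arabic Presentation Forms-B block, and it ignores ligatures and diacritics entirely. The names build_forms and shape are invented for this sketch.

```python
import unicodedata

FORM_TAGS = ("isolated", "final", "initial", "medial")


def build_forms():
    """Map base letter -> {form tag: presentation-form character}."""
    forms = {}
    for cp in range(0xFE70, 0xFF00):  # Arabic Presentation Forms-B
        parts = unicodedata.decomposition(chr(cp)).split()
        # Wanted entries look like ['<initial>', '0628'];
        # ligatures and spacing marks have more parts and are skipped.
        if len(parts) != 2:
            continue
        tag = parts[0].strip("<>")
        if tag in FORM_TAGS:
            forms.setdefault(chr(int(parts[1], 16)), {})[tag] = chr(cp)
    return forms


FORMS = build_forms()


def joins_next(ch):
    # Letters with an initial form can connect to the following letter.
    return "initial" in FORMS.get(ch, {})


def joins_prev(ch):
    # Letters with a final form can connect to the preceding letter.
    return "final" in FORMS.get(ch, {})


def shape(text):
    """Replace Arabic letters by contextual presentation forms."""
    out = []
    for i, ch in enumerate(text):
        if ch not in FORMS:
            out.append(ch)
            continue
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        to_prev = joins_prev(ch) and joins_next(prev)
        to_next = joins_next(ch) and joins_prev(nxt)
        if to_prev and to_next:
            tag = "medial"
        elif to_prev:
            tag = "final"
        elif to_next:
            tag = "initial"
        else:
            tag = "isolated"
        out.append(FORMS[ch].get(tag, ch))
    return "".join(out)


shaped = shape("\u0628\u0627\u0628")  # beh, alef, beh
```

Because everything is decided from the neighbouring codepoints, no font is consulted at any point, which is exactly the limitation mentioned above.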

Just now I added support for FriBidi's simple Arabic shaper to libass. The bug about Arabic support contains a test case, and this renders great now.

Work to use a "real" shaper, i.e. HarfBuzz, is already under way, of course. :)

Wednesday, 6 July 2011

libass now supports bidi

After a rather long phase of refactoring and cleaning up libass, I finally started with the BiDi implementation, using FriBidi. This turned out to be easier than expected! Well, at least a buggy first implementation was easy. Let's see where we can go from here.

Here's a sample. Latin text with a bit of Hebrew in it that is rendered right-to-left (RTL).

for example, the Hebrew name Sarah (שרה) is spelled\Nshin (ש) resh (ר) heh (ה) from right to left.

Here's a more complex one. Hebrew text (RTL) with numbers in it (LTR) and brackets, which are mirrored by FriBidi's rather simple shaper (later, HarfBuzz is going to do mirroring, and a lot more). Note that the dots are incorrectly positioned; that's because I'm forcing the paragraph text direction to LTR at the moment.

דייטשלאנד געהערט צו דער שענגען זאנע (אן גרענעצן) און האט אדאפטירט די איירא (די בשותפותדיקע אייראפעישע וואלוטע) אום 1999...
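What happens to a single right-to-left run can be sketched in a few lines of Python. This is only a simplified illustration, not what FriBidi does internally: the function name visual_rtl_run and the small MIRROR table are assumptions for the example, numbers with separators are not handled, and combining marks would end up on the wrong side of their base character.

```python
import re

# Small illustrative subset of mirrored pairs; Unicode defines many more.
MIRROR = {
    "(": ")", ")": "(",
    "[": "]", "]": "[",
    "{": "}", "}": "{",
    "<": ">", ">": "<",
}


def visual_rtl_run(logical):
    """Turn one right-to-left run from logical into visual order."""
    # Reverse the characters and swap mirrored brackets on the way.
    visual = "".join(MIRROR.get(ch, ch) for ch in reversed(logical))
    # Digit sequences still read left to right, so reverse them back.
    return re.sub(r"\d+", lambda m: m.group()[::-1], visual)


# Logical order: "(", alef, bet, ")", space, "1", "2"
visual = visual_rtl_run("(\u05d0\u05d1) 12")
```

In the result the number sits at the left end, the Hebrew letters are reversed, and the brackets still enclose them the right way round.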

Directional overrides using special Unicode characters are supported, too.

Hallo wie geht?\N‮Hallo wie geht?

The code will be available in the libass repo soon. There's still some cleanup to do.

Monday, 20 June 2011

Sunday, 5 June 2011

Standalone renderer

I just uploaded sources of the standalone renderer with some additional fixes. The archive contains the Git history as well as some sample files.

Friday, 3 June 2011

Finally: Bidi, shaping and line wrapping

It turned out to be trickier than I imagined, but now it works: a simple but complete text layout engine that supports bidirectional text with full shaping (where needed) and is capable of wrapping bidirectional text correctly. The code is probably horrible and very inefficient, though.

At the bottom you can see the reference rendering (rendered by a GTK app, i.e. Pango); at the top is my rendering. Note that I have no idea what the Arabic text actually means, as I don't know Arabic. I can read the script a bit, but that's all. The text was copied and pasted together from somewhere.

Now it's time to start digging into libass, cleaning up some of the mess and preparing it for inclusion of this functionality.

Monday, 30 May 2011

libass 0.9.12 released

I released libass 0.9.12 just now. There are no surprises, this is just a bugfix release that further improves compatibility with VSFilter. Tarballs are available from the project page.

Thursday, 26 May 2011

Vertical shaper in HarfBuzz

Looks like support for vertical layouts and the vert and vrt2 features has become usable in HarfBuzz now, just in time! This will definitely make proper vertical CJK layout easier in libass. I still need to test it, though...

After a little bit of trouble, this worked just fine.

Tuesday, 24 May 2011

Getting complex text layout into libass

Unfortunately, libass's rendering model doesn't make it easy to plug in the contextual transforms that are needed for complex text layout. Currently, for every subtitle event, a lot of processing is done per glyph before line breaking, positioning, etc. take place. However, the complex text layout engine needs runs of text. Let's look again at the text layout pipeline:

  1. Split up text into runs according to style (font, size, decoration).
  2. Split up runs further according to text direction (depending on script and language).
  3. Shape runs that need it.
  4. Break lines.
  5. Reorder lines into visual order.
Step 1 is currently not done with runs; style is strictly applied per glyph. This is not without problems: for example, it makes text decorations (underline, strike-through) hard to implement correctly, and positioning after certain style changes (from italic to non-italic style) is hard to get right. Moreover, this requires inter-glyph blending later on in the rendering pipeline.

My plan is to completely refactor the main rendering loop from individual glyphs to runs to get rid of these problems. Obviously, the other advantage of it is that it makes plugging complex text layout into rendering much easier.
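The difference between per-glyph and per-run processing can be sketched like this (Python for brevity, libass itself is C). The Glyph record and the style_runs helper are hypothetical and only meant to show the grouping step, not the planned libass data structures.

```python
from itertools import groupby
from typing import NamedTuple


class Glyph(NamedTuple):
    """Hypothetical per-glyph record carrying its own style."""
    char: str
    font: str
    size: int
    italic: bool


def style_key(glyph):
    # Everything that must be identical within one run.
    return (glyph.font, glyph.size, glyph.italic)


def style_runs(glyphs):
    """Group consecutive glyphs that share a style into runs."""
    return [
        (key, "".join(g.char for g in group))
        for key, group in groupby(glyphs, key=style_key)
    ]


glyphs = [Glyph(c, "Arial", 20, False) for c in "plain "]
glyphs += [Glyph(c, "Arial", 20, True) for c in "italic"]
runs = style_runs(glyphs)
```

Once text is available as runs like these, a decoration or a shaper can be applied to a whole run at once instead of glyph by glyph.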

What about the next steps? Steps 2 and 5, the BiDi transformation, will be handled by FriBidi. Step 3, text shaping, will be handled by the new HarfBuzz-ng library. Step 4 is going to be handled by liblinebreak plus support code in libass.

As a first step, I will implement a simple standalone renderer for steps 2-5. I'm using the hb-view program from harfbuzz-ng as the base.

Update: added step 5, reordering.

GSoC welcome package

It just arrived. I like the glowing sticker!

Monday, 23 May 2011

Introduction to complex text layout

Complex text layout is, as the name says, a pretty complicated process. The term stands for various text transformations that need to be done to render scripts that require more than trivial codepoint-to-glyph mapping.

Generally, complex text layout comprises the following transformations:

  • BiDi. Many languages do not write from left to right, but from right to left (e.g. Hebrew). Usually, numbers are still written from left to right, though. Sometimes you need to mix right to left text into left to right text. When there's any mix between directions, and that can happen quite quickly, the text needs to be split up into so-called runs with the same direction, and rendered accordingly. Unicode specifies such an algorithm, the Unicode Bidirectional Algorithm.
  • Text shaping. Many scripts, especially cursive scripts (most importantly Arabic and derivatives) require contextual glyph substitutions. Depending on the position of a glyph inside a word, a certain variant needs to be used. Moreover, as it is a cursive script, the letters need to be repositioned so that they connect cleanly to each other. There are a lot more features referred to as shaping. Shaping requires runs of text with the same direction and script.
  • Line breaking. Mixing text direction, language and script complicates line breaking. Unicode specifies the Unicode Line Breaking Algorithm to deal with that.
Accordingly, a full complex text layout engine needs to do a lot of work:
  1. Split up text into runs according to style (font, size, decoration).
  2. Split up runs further according to text direction (depending on script and language).
  3. Shape runs that need it.
  4. Break lines.
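To give an idea of what step 2 involves, here is a deliberately naive sketch in Python. It uses the bidirectional class that Python's unicodedata module reports for each character and is far simpler than the Unicode Bidirectional Algorithm: neutral characters (spaces, punctuation, digits) just inherit the direction of the run before them. The function name direction_runs is made up for this example.

```python
import unicodedata

RTL_CLASSES = {"R", "AL"}  # strong right-to-left (Hebrew, Arabic letters)
LTR_CLASSES = {"L"}        # strong left-to-right


def direction_runs(text, base="L"):
    """Split text into (direction, substring) runs.

    Characters without a strong direction stay in the current run.
    The real algorithm resolves them with many more rules.
    """
    runs = []
    current = base
    start = 0
    for i, ch in enumerate(text):
        cls = unicodedata.bidirectional(ch)
        if cls in RTL_CLASSES:
            direction = "R"
        elif cls in LTR_CLASSES:
            direction = "L"
        else:
            direction = current
        if direction != current:
            if i > start:
                runs.append((current, text[start:i]))
            start = i
            current = direction
    if start < len(text):
        runs.append((current, text[start:]))
    return runs


# "Sarah (" + shin, resh, he + ") is"
runs = direction_runs("Sarah (\u05e9\u05e8\u05d4) is")
```

Note how the brackets end up in different runs here; resolving neutrals like these properly is one of the things the real algorithm takes care of.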
Complete and easy to use cross-platform engines already exist and work well. One example is the popular Pango library. However, Pango only offers a very high-level API for the complete engine. It's not flexible enough for libass, which does a lot of rather low-level font manipulations, and it's said to be slow, while performance is critical for libass.

So there's no way around doing all steps yourself. Fortunately, stable libraries for all of the critical steps are available. I'm going to describe the plans for the libass implementation in the next posts.

Friday, 20 May 2011

Let's get this started

This blog is mostly dedicated to Google Summer of Code 2011, where I am going to implement complex text layout support for the libass subtitle rendering library, under the mentorship of the VideoLAN project.

I'll probably also blog about some topics only mildly related to GSoC. Anyway, let's get this started!