Roadmap

- Move all the fetchers etc. into pixman-image to make pixman-compose.c
  less intimidating.

  DONE

- Make combiners for unified alpha take a mask argument. That way
  we won't need two separate paths for unified vs. component in the
  general compositing code.

  DONE, except that the Altivec code needs to be updated. Luca is
  looking into that.

- Delete separate 'unified alpha' path

  DONE

- Split images into their own files

  DONE

- Split the gradient walker code out into its own file

  DONE

- Add scanline getters per image

  DONE

- Generic 64 bit fetcher

  DONE

- Split fast path tables into their respective architecture dependent
  files.

See "Render Algorithm" below for rationale.

Images will eventually have these virtual functions:

       get_scanline()
       get_scanline_wide()
       get_pixel()
       get_pixel_wide()
       get_untransformed_pixel()
       get_untransformed_pixel_wide()
       get_unfiltered_pixel()
       get_unfiltered_pixel_wide()

       store_scanline()
       store_scanline_wide()
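
To make the virtual function idea concrete, here is a minimal sketch of
how such a table could hang off the image struct. Everything here
(image_t, image_funcs_t, the field layout) is hypothetical, not
pixman's actual types:

    #include <stdint.h>

    typedef struct image image_t;

    /* Sketch: per-image table of fetchers/storers, kept up to
     * date by reinit() whenever an image property changes. */
    typedef struct
    {
        uint32_t (* get_pixel)         (image_t *image, int x, int y);
        uint64_t (* get_pixel_wide)    (image_t *image, int x, int y);
        void     (* get_scanline)      (image_t *image, int x, int y,
                                        int width, uint32_t *buffer);
        void     (* get_scanline_wide) (image_t *image, int x, int y,
                                        int width, uint64_t *buffer);
        void     (* store_scanline)    (image_t *image, int x, int y,
                                        int width, const uint32_t *values);
    } image_funcs_t;

    struct image
    {
        image_funcs_t funcs;    /* re-derived by reinit() */
        /* ... format, transform, filter, repeat, clip, ... */
    };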

1.

Initially we will just have get_scanline() and get_scanline_wide();
these will be based on the ones in pixman-compose. Hopefully this will
reduce the complexity in pixman_composite_rect_general().

Note that there are access considerations - the compose function is
being compiled twice.

2.

Split image types into their own source files. Export a no-op virtual
reinit() call. Call this whenever a property of the image changes.

3.

Split the get_scanline() call into smaller functions that are
initialized by the reinit() call.

The Render Algorithm:
        (first repeat, then filter, then transform, then clip)

Starting from a destination pixel (x, y), do

        1 x = x - xDst + xSrc
          y = y - yDst + ySrc

        2 reject pixel that is outside the clip

        This treats clipping as something that happens after
        transformation, which I think is correct for client clips. For
        hierarchy clips it is wrong, but who really cares? Without
        GraphicsExposes hierarchy clips are basically irrelevant. Yes,
        you could imagine cases where the pixels of a subwindow of a
        redirected, transformed window should be treated as
        transparent. I don't really care.

        Basically, I think the render spec should say that pixels that
        are unavailable due to the hierarchy have undefined content,
        and that GraphicsExposes are not generated. Ie., basically
        that using non-redirected windows as sources is unsupported.
        This is at least consistent with the current implementation
        and we can update the spec later if someone makes it work.

        The implication for render is that it should stop passing the
        hierarchy clip to pixman. In pixman, if a source image has a
        clip it should be used in computing the composite region and
        nowhere else, regardless of what "has_client_clip" says. The
        default should be for there not to be any clip.

        I would really like to get rid of the client clip as well for
        source images, but unfortunately there is at least one
        application in the wild that uses them.

        3 Transform pixel: (x, y) = T(x, y)

        4 Call p = GetUntransformedPixel (x, y)

        5 If the image has an alpha map, then

                Call GetUntransformedPixel (x, y) on the alpha map

                add resulting alpha channel to p

           return p

        Where GetUntransformedPixel is:

        6 switch (filter)
          {
          case NEAREST:
                return GetUnfilteredPixel (x, y);

          case BILINEAR:
                return GetUnfilteredPixel (...); // 4 times

          case CONVOLUTION:
                return GetUnfilteredPixel (...); // as many times as necessary
          }

        Where GetUnfilteredPixel (x, y) is

        7 switch (repeat)
           {
           case REPEAT_NORMAL:
           case REPEAT_PAD:
           case REPEAT_REFLECT:
                // adjust x, y as appropriate; see the sketch below
                break;

           case REPEAT_NONE:
                if (x, y) is outside image bounds
                     return 0;
                break;
           }

           return GetRawPixel (x, y);

        Where GetRawPixel (x, y) is

        8 Compute the pixel in question, depending on image type.
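
The coordinate adjustment in step 7 could look like the following
minimal sketch. The repeat_t enum and repeat_coord() helper are
hypothetical; size is the image extent along the axis being adjusted:

    /* Hypothetical enum mirroring the repeat modes above. */
    typedef enum
    {
        REPEAT_NONE, REPEAT_NORMAL, REPEAT_PAD, REPEAT_REFLECT
    } repeat_t;

    /* Map an out-of-range coordinate back into [0, size). */
    static int
    repeat_coord (repeat_t repeat, int coord, int size)
    {
        switch (repeat)
        {
        case REPEAT_NORMAL:
            coord %= size;                   /* tile */
            if (coord < 0)
                coord += size;
            return coord;

        case REPEAT_PAD:
            if (coord < 0)                   /* clamp to the nearest edge */
                return 0;
            if (coord >= size)
                return size - 1;
            return coord;

        case REPEAT_REFLECT:
            coord %= 2 * size;               /* mirror every other tile */
            if (coord < 0)
                coord += 2 * size;
            return coord < size ? coord : 2 * size - 1 - coord;

        default:                             /* REPEAT_NONE: caller returns 0 */
            return coord;
        }
    }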

For gradients, repeat has a totally different meaning, so
UnfilteredPixel() and RawPixel() must be the same function so that
gradients can do their own repeat algorithm.

So, the GetRawPixel

        for bits must deal with repeats
        for gradients must deal with repeats (differently; see the
        sketch after this list)
        for solids should ignore repeats
        for polygons, when we add them, either ignore repeats or do
        something similar to bits (in which case, we may want an extra
        layer of indirection to modify the coordinates).
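
For a gradient, that "different" repeat applies to the gradient
parameter t rather than to pixel coordinates. A minimal sketch in
floating point, reusing the hypothetical repeat_t from the previous
sketch (the real code would presumably work in fixed point):

    #include <math.h>

    /* Sketch: repeat the gradient parameter t onto [0, 1]. */
    static double
    gradient_repeat (repeat_t repeat, double t)
    {
        switch (repeat)
        {
        case REPEAT_NORMAL:                  /* wrap */
            return t - floor (t);

        case REPEAT_PAD:                     /* clamp */
            return t < 0.0 ? 0.0 : (t > 1.0 ? 1.0 : t);

        case REPEAT_REFLECT:                 /* triangle wave with period 2 */
        {
            double s = t - 2.0 * floor (t / 2.0);    /* t mod 2 */
            return 1.0 - fabs (s - 1.0);
        }

        default:        /* REPEAT_NONE: out-of-range t is transparent */
            return t;
        }
    }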

It is then possible to build things like "get scanline" or "get tile" on
top of this. In the simplest case, just repeatedly calling GetPixel()
would work, but specialized get_scanline()s or get_tile()s could be
plugged in for common cases.
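
The simplest fallback in a sketch, reusing the hypothetical
image_funcs_t table from above:

    /* Sketch: generic scanline fetch built on top of get_pixel().
     * Specialized get_scanline()s can replace this per image type. */
    static void
    generic_get_scanline (image_t *image, int x, int y,
                          int width, uint32_t *buffer)
    {
        int i;

        for (i = 0; i < width; ++i)
            buffer[i] = image->funcs.get_pixel (image, x + i, y);
    }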

By not plugging anything in for images with access functions, we only
have to compile the pixel functions twice, not the scanline functions.

And we can get rid of fetchers for the bizarre formats that no one
uses. Such as b2g3r3, etc. r1g2b1? Seriously? It is also worth
considering a generic format based pixel fetcher for these edge cases.

Since the actual routines depend on the image attributes, the images
must be notified when those change and update their function pointers
appropriately. So there should probably be a virtual function called
(* reinit) or something like that.
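
As a sketch, with hypothetical lookup helpers; the point is only that
every property setter funnels through reinit(), which re-derives the
function pointers:

    /* Sketch: re-derive function pointers after a property change.
     * The lookup helpers are hypothetical. */
    static void
    image_reinit (image_t *image)
    {
        image->funcs.get_pixel    = lookup_pixel_fetcher (image);
        image->funcs.get_scanline = lookup_scanline_fetcher (image);

        if (image->funcs.get_scanline == NULL)
            image->funcs.get_scanline = generic_get_scanline;
    }

    void
    image_set_repeat (image_t *image, repeat_t repeat)
    {
        image->repeat = repeat;
        image_reinit (image);    /* property changed; repick fetchers */
    }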

There will also be wide fetchers for both pixels and lines. The line
fetcher will just call the wide pixel fetcher. The wide pixel fetcher
will just call expand, except for 10 bit formats.
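
The usual expand trick is to widen each 8-bit channel to 16 bits by
replicating the byte, so 0x00 maps to 0x0000 and 0xff to 0xffff
exactly; 10 bit channels need a real conversion instead. A sketch for
a8r8g8b8:

    #include <stdint.h>

    /* Widen one 8-bit channel to 16 bits by bit replication. */
    static inline uint16_t
    expand_channel (uint8_t v)
    {
        return (uint16_t) ((v << 8) | v);
    }

    /* Sketch: expand a packed a8r8g8b8 pixel to 16 bits per channel. */
    static uint64_t
    expand_8888 (uint32_t p)
    {
        uint64_t a = expand_channel ((p >> 24) & 0xff);
        uint64_t r = expand_channel ((p >> 16) & 0xff);
        uint64_t g = expand_channel ((p >>  8) & 0xff);
        uint64_t b = expand_channel (p & 0xff);

        return (a << 48) | (r << 32) | (g << 16) | b;
    }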

Rendering pipeline:

Drawable:
        0. if (picture has alpha map)
                0.1. Position alpha map according to the alpha_x/alpha_y
                0.2. Replace the alpha channel of the source with the
                     one from the alpha map. Replacement only takes
                     place in the intersection of the two drawables'
                     geometries.
        1. Repeat the drawable according to the repeat attribute
        2. Reconstruct a continuous image according to the filter
        3. Transform according to the transform attribute
        4. Position image such that src_x, src_y is over dst_x, dst_y
        5. Sample once per destination pixel
        6. Clip. If a pixel is not within the source clip, then no
           compositing takes place at that pixel. (Ie., it's *not*
           treated as 0).

        Sampling a drawable:

        - If the image does not have an alpha channel, the pixels in it
          are treated as opaque.

        Note on reconstruction:

        - The top left pixel has coordinates (0.5, 0.5) and pixels are
          spaced 1 apart.
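
In other words, sampling destination pixel (i, j) starts from the pixel
center, along the lines of this fragment:

    /* Sketch: map a destination pixel index to its sample point. */
    static void
    pixel_center (int i, int j, double *sx, double *sy)
    {
        *sx = i + 0.5;    /* top left pixel samples at (0.5, 0.5) */
        *sy = j + 0.5;
    }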

Gradient:
        1. Unless gradient type is conical, repeat the underlying (0, 1)
           gradient according to the repeat attribute
        2. Integrate the gradient across the plane according to type.
        3. Transform according to transform attribute
        4. Position gradient
        5. Sample once per destination pixel.
        6. Clip

Solid Fill:
        1. Repeat has no effect
        2. Image is already continuous and defined for the entire plane
        3. Transform has no effect
        4. Positioning has no effect
        5. Sample once per destination pixel.
        6. Clip

Polygon:
        1. Repeat has no effect
        2. Image is already continuous and defined on the whole plane
        3. Transform according to transform attribute
        4. Position image
        5. Supersample 15x17 per destination pixel.
        6. Clip

Possibly interesting additions:
        - More general transformations, such as warping, or general
          shading.

        - Shader image where a function is called to generate the
          pixel (ie., uploading assembly code).

        - Resampling kernels

          In principle the polygon image uses a 15x17 box filter for
          resampling. If we allow general resampling filters, then we
          get all the various antialiasing types for free.

          Bilinear downsampling looks terrible and could be much
          improved by a resampling filter. NEAREST reconstruction
          combined with a box resampling filter is what GdkPixbuf
          does, I believe.

          Useful for high frequency gradients as well.

          (Note that the difference between a reconstruction and a
          resampling filter is mainly where in the pipeline they
          occur. High quality resampling should use a correctly
          oriented kernel, so it should happen after transformation.

          An implementation can transform the resampling kernel and
          convolve it with the reconstruction if it so desires, but it
          will need to deal with the fact that the resampling kernel
          will not necessarily be pixel aligned.)

          "Output kernels"

          One could imagine doing the resampling after compositing,
          ie., for each destination pixel sample each source image 16
          times, then composite those subpixels individually, then
          finally apply a kernel.

          However, this is effectively the same as full screen
          antialiasing, which is a simpler way to think about it. So
          resampling kernels may make sense for individual images, but
          not as a post-compositing step.

          Fullscreen AA is inefficient without chained compositing
          though. Consider an image scaled up to oversample size in
          some polygon, then scaled down to screen size. With the
          current implementation, there will be a huge temporary. With
          chained compositing, the whole thing ends up being equivalent
          to the output kernel from above.

        - Color space conversion

          The complete model here is that each surface has a color
          space associated with it and that the compositing operation
          also has one associated with it. Note also that gradients
          should have associated colorspaces.

        - Dithering

          If people dither something that is already dithered, it will
          look terrible, but don't do that, then. (Dithering happens
          after resampling, if at all - what is the relationship
          with color spaces? Presumably dithering should happen in
          linear intensity space.)

        - Floating point surfaces, 16, 32 and possibly 64 bit per
          channel.

        Maybe crack:

        - Glyph polygons

          If glyphs could be given as polygons, they could be
          positioned and rasterized more accurately. The glyph
          structure would need subpixel positioning though.

        - Luminance vs. coverage for the alpha channel

          Whether the alpha channel should be interpreted as luminance
          modulation or as coverage (intensity modulation). This is a
          bit of a departure from the rendering model though. It could
          also be considered whether it should be possible to have
          both channels in the same drawable.

        - Alternative for component alpha

          - Set component-alpha on the output image.

            - This means each of the components is sampled
              independently and composited in the corresponding
              channel only.

          - Have 3 x oversampled mask

          - Scale it down by 3 horizontally, with [ 1/3, 1/3, 1/3 ]
            resampling filter.

            Is this equivalent to just using a component alpha mask?
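
          The horizontal downscale in a sketch: a plain
          [ 1/3, 1/3, 1/3 ] box filter over an a8 mask (the names and
          layout here are assumptions for illustration):

              /* Sketch: collapse a 3x horizontally oversampled a8 mask
               * into one coverage value per output pixel. */
              static void
              downsample_3x (const uint8_t *mask_3x, uint8_t *mask,
                             int width)
              {
                  int i;

                  for (i = 0; i < width; ++i)
                  {
                      int sum = mask_3x[3 * i]
                              + mask_3x[3 * i + 1]
                              + mask_3x[3 * i + 2];

                      mask[i] = (uint8_t) (sum / 3);
                  }
              }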

        Incompatible changes:

        - Gradients could be specified with premultiplied colors. (You
          can use a mask to get things like gradients from solid red to
          transparent red.)

Refactoring pixman

The pixman code is not particularly nice, to put it mildly. Among the
issues are:

- inconsistent naming style (fb vs Fb, camelCase vs
  underscore_naming). Sometimes there is even inconsistency *within*
  one name.

      fetchProc32 ACCESS(pixman_fetchProcForPicture32)

  may be one of the ugliest names ever created.

  Coding style:
         use the one from cairo, except that pixman uses this brace style:

                while (blah)
                {
                }

        Format do/while like this:

               do
               {

               }
               while (...);

- PIXMAN_COMPOSITE_RECT_GENERAL() is horribly complex

- switch/case logic in pixman-access.c

  Instead it would be better to just store function pointers in the
  image objects themselves:

        get_pixel()
        get_scanline()

- Much of the scanline fetching code is for formats that no one
  ever uses. a2r2g2b2, anyone?

  It would probably be worthwhile having a generic fetcher for any
  pixman format whatsoever.
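
  A sketch of such a generic fetcher, driven by a small per-format
  bit-layout description instead of per-format code (format_info_t and
  all of its fields are hypothetical):

      #include <stdint.h>

      typedef struct
      {
          int a_shift, a_bits;    /* bit position / width per channel; */
          int r_shift, r_bits;    /* 0 bits means the channel is absent */
          int g_shift, g_bits;
          int b_shift, b_bits;
      } format_info_t;

      /* Extract one channel and scale it to 8 bits by replication. */
      static uint32_t
      extract_channel (uint32_t p, int shift, int bits, uint32_t missing)
      {
          uint32_t v;

          if (bits == 0)
              return missing;     /* 0xff for alpha, 0 for color */

          v = (p >> shift) & ((1u << bits) - 1);

          while (bits < 8)        /* replicate bits up to >= 8 */
          {
              v = (v << bits) | v;
              bits *= 2;
          }
          return v >> (bits - 8);
      }

      static uint32_t
      generic_fetch_pixel (const format_info_t *f, uint32_t p)
      {
          return (extract_channel (p, f->a_shift, f->a_bits, 0xff) << 24)
               | (extract_channel (p, f->r_shift, f->r_bits, 0x00) << 16)
               | (extract_channel (p, f->g_shift, f->g_bits, 0x00) <<  8)
               |  extract_channel (p, f->b_shift, f->b_bits, 0x00);
      }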

- Code related to particular image types should be split into individual
  files.

        pixman-bits-image.c
        pixman-linear-gradient-image.c
        pixman-radial-gradient-image.c
        pixman-solid-image.c

- Fast path code should be split into files based on architecture:

       pixman-mmx-fastpath.c
       pixman-sse2-fastpath.c
       pixman-c-fastpath.c

       etc.

  Each of these files should then export a fastpath table, which would
  be declared in pixman-private.h. This should allow us to get rid
  of the pixman-mmx.h files.

  The fast path table should describe each fast path. Ie., there should
  be bitfields indicating what things the fast path can handle, rather
  than like now where it is only allowed to take one format per
  src/mask/dest. Ie.,

  {
    FAST_a8r8g8b8 | FAST_x8r8g8b8,
    FAST_null,
    FAST_x8r8g8b8,
    FAST_repeat_normal | FAST_repeat_none,
    the_fast_path
  }
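
  In a sketch, the entry type and the matching logic could look like
  this (the flag style follows the example above; composite_func_t and
  everything else here is hypothetical):

      /* Sketch: bitfields describe everything a fast path accepts. */
      typedef void (* composite_func_t) (void);   /* stand-in signature */

      typedef struct
      {
          uint32_t src_formats;     /* FAST_* format bits accepted */
          uint32_t mask_formats;    /* FAST_null for "no mask" */
          uint32_t dest_formats;
          uint32_t repeat_flags;    /* FAST_repeat_* bits accepted */
          composite_func_t func;
      } fast_path_t;

      static const fast_path_t *
      lookup_fast_path (const fast_path_t *table, int n_entries,
                        uint32_t src, uint32_t mask, uint32_t dest,
                        uint32_t repeat)
      {
          int i;

          for (i = 0; i < n_entries; ++i)
          {
              const fast_path_t *p = &table[i];

              if ((p->src_formats  & src)  &&
                  (p->mask_formats & mask) &&
                  (p->dest_formats & dest) &&
                  (p->repeat_flags & repeat))
              {
                  return p;
              }
          }
          return NULL;
      }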

There should then be *one* file that implements pixman_image_composite().
This should do this:

     optimize_operator();

     convert 1x1 repeat to solid (actually this should be done at
     image creation time).

     is there a useful fastpath?
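
As an outline (a sketch only; the parameter list is simplified and is
not pixman's actual pixman_image_composite() prototype, and
optimize_operator(), composite_general(), the table variables and the
*_flags() helpers are all hypothetical):

    /* Sketch: the single composite entry point. */
    void
    pixman_image_composite (op_t op, image_t *src, image_t *mask,
                            image_t *dest, int width, int height)
    {
        const fast_path_t *path;

        op = optimize_operator (op, src, mask, dest);

        /* A 1x1 repeating bits image acts as a solid; ideally that
         * conversion already happened at image creation time. */

        path = lookup_fast_path (fast_path_table, n_fast_paths,
                                 src_flags (src), mask_flags (mask),
                                 dest_flags (dest), repeat_flags (src));
        if (path != NULL)
        {
            path->func ();   /* stand-in call; the real one takes the images */
            return;
        }

        composite_general (op, src, mask, dest, width, height);
    }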

There should be a file called pixman-cpu.c that contains all the
architecture specific stuff to detect what CPU features we have.
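
One possible mechanism on x86 with GCC is __builtin_cpu_supports();
real code would need cpuid/getauxval equivalents per platform. A
sketch:

    #include <stdint.h>

    #define CPU_FEATURE_MMX  (1u << 0)
    #define CPU_FEATURE_SSE2 (1u << 1)

    /* Sketch: detect CPU features once; callers use the result to
     * decide which fast path tables to register. */
    static uint32_t
    detect_cpu_features (void)
    {
        uint32_t features = 0;

    #if defined (__GNUC__) && (defined (__i386__) || defined (__x86_64__))
        __builtin_cpu_init ();

        if (__builtin_cpu_supports ("mmx"))
            features |= CPU_FEATURE_MMX;
        if (__builtin_cpu_supports ("sse2"))
            features |= CPU_FEATURE_SSE2;
    #endif

        return features;
    }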

Issues that must be kept in mind:

        - we need accessor code to be preserved

        - maybe there should be a "store_scanline" too?

          Is this sufficient?

          We should preserve the optimization where the
          compositing happens directly in the destination
          whenever possible.

        - It should be possible to create GPU samplers from the
          images.

The "horizontal" classification should be a bit in the image, the
"vertical" classification should just happen inside the gradient
file. Note though that

        (a) these will change if the transformation/repeat changes.

        (b) at the moment the optimization for linear gradients
            takes the source rectangle into account. Presumably
            this is to also optimize the case where the gradient
            is close enough to horizontal?

Who is responsible for repeats? In principle it should be the scanline
fetch. Right now NORMAL repeats are handled by walk_composite_region()
while other repeats are handled by the scanline code.

(Random note on filtering: do you filter before or after
transformation? Hardware is going to filter after transformation;
this is also what pixman does currently.) It's not completely clear
what filtering *after* transformation means. One thing that might look
good would be to do *supersampling*, ie., compute multiple subpixels
per destination pixel, then average them together.
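
Supersampling in a sketch: sample an n x n grid of subpixel positions
inside each destination pixel and average the results (get_sample() is
a hypothetical transformed-and-filtered source lookup):

    /* Sketch: n x n supersampling of one destination pixel. */
    static uint32_t
    supersample_pixel (image_t *image, int x, int y, int n)
    {
        uint32_t a = 0, r = 0, g = 0, b = 0;
        int sx, sy;

        for (sy = 0; sy < n; sy++)
        {
            for (sx = 0; sx < n; sx++)
            {
                /* subpixel centers inside the destination pixel */
                uint32_t p = get_sample (image,
                                         x + (sx + 0.5) / n,
                                         y + (sy + 0.5) / n);

                a += (p >> 24) & 0xff;
                r += (p >> 16) & 0xff;
                g += (p >>  8) & 0xff;
                b += p & 0xff;
            }
        }

        n *= n;
        return ((a / n) << 24) | ((r / n) << 16) | ((g / n) << 8) | (b / n);
    }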
