Roadmap

- Move all the fetchers etc. into pixman-image to make pixman-compose.c
  less intimidating.

  DONE

- Make combiners for unified alpha take a mask argument. That way
  we won't need two separate paths for unified vs. component in the
  general compositing code.

  DONE, except that the Altivec code needs to be updated. Luca is
  looking into that.

- Delete separate 'unified alpha' path

  DONE

- Split images into their own files

  DONE

- Split the gradient walker code out into its own file

  DONE

- Add scanline getters per image

  DONE

- Generic 64 bit fetcher

  DONE

- Split fast path tables into their respective architecture dependent
  files.

See "Render Algorithm" below for rationale.

Images will eventually have these virtual functions:

       get_scanline()
       get_scanline_wide()
       get_pixel()
       get_pixel_wide()
       get_untransformed_pixel()
       get_untransformed_pixel_wide()
       get_unfiltered_pixel()
       get_unfiltered_pixel_wide()

       store_scanline()
       store_scanline_wide()
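
To make the virtual function idea concrete, here is a minimal sketch of
how such a table could hang off the image struct. Everything here
(image_t, image_funcs_t, the field layout) is hypothetical, not
pixman's actual types:

    #include <stdint.h>

    typedef struct image image_t;

    /* Sketch: per-image table of fetchers/storers, kept up to
     * date by reinit() whenever an image property changes. */
    typedef struct
    {
        uint32_t (* get_pixel)         (image_t *image, int x, int y);
        uint64_t (* get_pixel_wide)    (image_t *image, int x, int y);
        void     (* get_scanline)      (image_t *image, int x, int y,
                                        int width, uint32_t *buffer);
        void     (* get_scanline_wide) (image_t *image, int x, int y,
                                        int width, uint64_t *buffer);
        void     (* store_scanline)    (image_t *image, int x, int y,
                                        int width, const uint32_t *values);
    } image_funcs_t;

    struct image
    {
        image_funcs_t funcs;    /* re-derived by reinit() */
        /* ... format, transform, filter, repeat, clip, ... */
    };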

1.

Initially we will just have get_scanline() and get_scanline_wide();
these will be based on the ones in pixman-compose. Hopefully this will
reduce the complexity in pixman_composite_rect_general().

Note that there are access considerations - the compose function is
being compiled twice.

2.

Split image types into their own source files. Export a no-op virtual
reinit() call. Call this whenever a property of the image changes.

3.

Split the get_scanline() call into smaller functions that are
initialized by the reinit() call.

The Render Algorithm:
        (first repeat, then filter, then transform, then clip)

Starting from a destination pixel (x, y), do

        1 x = x - xDst + xSrc
          y = y - yDst + ySrc

        2 reject pixel that is outside the clip

        This treats clipping as something that happens after
        transformation, which I think is correct for client clips. For
        hierarchy clips it is wrong, but who really cares? Without
        GraphicsExposes hierarchy clips are basically irrelevant. Yes,
        you could imagine cases where the pixels of a subwindow of a
        redirected, transformed window should be treated as
        transparent. I don't really care.

        Basically, I think the render spec should say that pixels that
        are unavailable due to the hierarchy have undefined content,
        and that GraphicsExposes are not generated. Ie., basically
        that using non-redirected windows as sources is unsupported.
        This is at least consistent with the current implementation
        and we can update the spec later if someone makes it work.

        The implication for render is that it should stop passing the
        hierarchy clip to pixman. In pixman, if a source image has a
        clip it should be used in computing the composite region and
        nowhere else, regardless of what "has_client_clip" says. The
        default should be for there not to be any clip.

        I would really like to get rid of the client clip as well for
        source images, but unfortunately there is at least one
        application in the wild that uses them.

        3 Transform pixel: (x, y) = T(x, y)

        4 Call p = GetUntransformedPixel (x, y)

        5 If the image has an alpha map, then

                Call GetUntransformedPixel (x, y) on the alpha map

                add resulting alpha channel to p

           return p

        Where GetUntransformedPixel is:

        6 switch (filter)
          {
          case NEAREST:
                return GetUnfilteredPixel (x, y);

          case BILINEAR:
                return GetUnfilteredPixel (...); // 4 times

          case CONVOLUTION:
                return GetUnfilteredPixel (...); // as many times as necessary
          }

        Where GetUnfilteredPixel (x, y) is

        7 switch (repeat)
           {
           case REPEAT_NORMAL:
           case REPEAT_PAD:
           case REPEAT_REFLECT:
                // adjust x, y as appropriate; see the sketch below
                break;

           case REPEAT_NONE:
                if (x, y) is outside image bounds
                     return 0;
                break;
           }

           return GetRawPixel (x, y);

        Where GetRawPixel (x, y) is

        8 Compute the pixel in question, depending on image type.
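
The coordinate adjustment in step 7 could look like the following
minimal sketch. The repeat_t enum and repeat_coord() helper are
hypothetical; size is the image extent along the axis being adjusted:

    /* Hypothetical enum mirroring the repeat modes above. */
    typedef enum
    {
        REPEAT_NONE, REPEAT_NORMAL, REPEAT_PAD, REPEAT_REFLECT
    } repeat_t;

    /* Map an out-of-range coordinate back into [0, size). */
    static int
    repeat_coord (repeat_t repeat, int coord, int size)
    {
        switch (repeat)
        {
        case REPEAT_NORMAL:
            coord %= size;                   /* tile */
            if (coord < 0)
                coord += size;
            return coord;

        case REPEAT_PAD:
            if (coord < 0)                   /* clamp to the nearest edge */
                return 0;
            if (coord >= size)
                return size - 1;
            return coord;

        case REPEAT_REFLECT:
            coord %= 2 * size;               /* mirror every other tile */
            if (coord < 0)
                coord += 2 * size;
            return coord < size ? coord : 2 * size - 1 - coord;

        default:                             /* REPEAT_NONE: caller returns 0 */
            return coord;
        }
    }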

For gradients, repeat has a totally different meaning, so
UnfilteredPixel() and RawPixel() must be the same function so that
gradients can do their own repeat algorithm.

So, the GetRawPixel

        for bits must deal with repeats
        for gradients must deal with repeats (differently; see the
        sketch after this list)
        for solids should ignore repeats
        for polygons, when we add them, either ignore repeats or do
        something similar to bits (in which case, we may want an extra
        layer of indirection to modify the coordinates).
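
For a gradient, that "different" repeat applies to the gradient
parameter t rather than to pixel coordinates. A minimal sketch in
floating point, reusing the hypothetical repeat_t from the previous
sketch (the real code would presumably work in fixed point):

    #include <math.h>

    /* Sketch: repeat the gradient parameter t onto [0, 1]. */
    static double
    gradient_repeat (repeat_t repeat, double t)
    {
        switch (repeat)
        {
        case REPEAT_NORMAL:                  /* wrap */
            return t - floor (t);

        case REPEAT_PAD:                     /* clamp */
            return t < 0.0 ? 0.0 : (t > 1.0 ? 1.0 : t);

        case REPEAT_REFLECT:                 /* triangle wave with period 2 */
        {
            double s = t - 2.0 * floor (t / 2.0);    /* t mod 2 */
            return 1.0 - fabs (s - 1.0);
        }

        default:        /* REPEAT_NONE: out-of-range t is transparent */
            return t;
        }
    }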

It is then possible to build things like "get scanline" or "get tile" on
top of this. In the simplest case, just repeatedly calling GetPixel()
would work, but specialized get_scanline()s or get_tile()s could be
plugged in for common cases.
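
The simplest fallback in a sketch, reusing the hypothetical
image_funcs_t table from above:

    /* Sketch: generic scanline fetch built on top of get_pixel().
     * Specialized get_scanline()s can replace this per image type. */
    static void
    generic_get_scanline (image_t *image, int x, int y,
                          int width, uint32_t *buffer)
    {
        int i;

        for (i = 0; i < width; ++i)
            buffer[i] = image->funcs.get_pixel (image, x + i, y);
    }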

By not plugging anything in for images with access functions, we only
have to compile the pixel functions twice, not the scanline functions.

And we can get rid of fetchers for the bizarre formats that no one
uses. Such as b2g3r3, etc. r1g2b1? Seriously? It is also worth
considering a generic format based pixel fetcher for these edge cases.

Since the actual routines depend on the image attributes, the images
must be notified when those change and update their function pointers
appropriately. So there should probably be a virtual function called
(* reinit) or something like that.
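
As a sketch, with hypothetical lookup helpers; the point is only that
every property setter funnels through reinit(), which re-derives the
function pointers:

    /* Sketch: re-derive function pointers after a property change.
     * The lookup helpers are hypothetical. */
    static void
    image_reinit (image_t *image)
    {
        image->funcs.get_pixel    = lookup_pixel_fetcher (image);
        image->funcs.get_scanline = lookup_scanline_fetcher (image);

        if (image->funcs.get_scanline == NULL)
            image->funcs.get_scanline = generic_get_scanline;
    }

    void
    image_set_repeat (image_t *image, repeat_t repeat)
    {
        image->repeat = repeat;
        image_reinit (image);    /* property changed; repick fetchers */
    }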

There will also be wide fetchers for both pixels and lines. The line
fetcher will just call the wide pixel fetcher. The wide pixel fetcher
will just call expand, except for 10 bit formats.
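
The usual expand trick is to widen each 8-bit channel to 16 bits by
replicating the byte, so 0x00 maps to 0x0000 and 0xff to 0xffff
exactly; 10 bit channels need a real conversion instead. A sketch for
a8r8g8b8:

    #include <stdint.h>

    /* Widen one 8-bit channel to 16 bits by bit replication. */
    static inline uint16_t
    expand_channel (uint8_t v)
    {
        return (uint16_t) ((v << 8) | v);
    }

    /* Sketch: expand a packed a8r8g8b8 pixel to 16 bits per channel. */
    static uint64_t
    expand_8888 (uint32_t p)
    {
        uint64_t a = expand_channel ((p >> 24) & 0xff);
        uint64_t r = expand_channel ((p >> 16) & 0xff);
        uint64_t g = expand_channel ((p >>  8) & 0xff);
        uint64_t b = expand_channel (p & 0xff);

        return (a << 48) | (r << 32) | (g << 16) | b;
    }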

Rendering pipeline:

Drawable:
        0. if (picture has alpha map)
                0.1. Position alpha map according to the alpha_x/alpha_y
                0.2. Replace the alpha channel of the source with the
                     one from the alpha map. Replacement only takes
                     place in the intersection of the two drawables'
                     geometries.
        1. Repeat the drawable according to the repeat attribute
        2. Reconstruct a continuous image according to the filter
        3. Transform according to the transform attribute
        4. Position image such that src_x, src_y is over dst_x, dst_y
        5. Sample once per destination pixel
        6. Clip. If a pixel is not within the source clip, then no
           compositing takes place at that pixel. (Ie., it's *not*
           treated as 0).

        Sampling a drawable:

        - If the image does not have an alpha channel, the pixels in it
          are treated as opaque.

        Note on reconstruction:

        - The top left pixel has coordinates (0.5, 0.5) and pixels are
          spaced 1 apart.
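
In other words, sampling destination pixel (i, j) starts from the pixel
center, along the lines of this fragment:

    /* Sketch: map a destination pixel index to its sample point. */
    static void
    pixel_center (int i, int j, double *sx, double *sy)
    {
        *sx = i + 0.5;    /* top left pixel samples at (0.5, 0.5) */
        *sy = j + 0.5;
    }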

Gradient:
        1. Unless gradient type is conical, repeat the underlying (0, 1)
           gradient according to the repeat attribute
        2. Integrate the gradient across the plane according to type.
        3. Transform according to transform attribute
        4. Position gradient
        5. Sample once per destination pixel.
        6. Clip

Solid Fill:
        1. Repeat has no effect
        2. Image is already continuous and defined for the entire plane
        3. Transform has no effect
        4. Positioning has no effect
        5. Sample once per destination pixel.
        6. Clip

Polygon:
        1. Repeat has no effect
        2. Image is already continuous and defined on the whole plane
        3. Transform according to transform attribute
        4. Position image
        5. Supersample 15x17 per destination pixel.
        6. Clip

Possibly interesting additions:
        - More general transformations, such as warping, or general
          shading.

        - Shader image where a function is called to generate the
          pixel (ie., uploading assembly code).

        - Resampling kernels

          In principle the polygon image uses a 15x17 box filter for
          resampling. If we allow general resampling filters, then we
          get all the various antialiasing types for free.

          Bilinear downsampling looks terrible and could be much
          improved by a resampling filter. NEAREST reconstruction
          combined with a box resampling filter is what GdkPixbuf
          does, I believe.

          Useful for high frequency gradients as well.

          (Note that the difference between a reconstruction and a
          resampling filter is mainly where in the pipeline they
          occur. High quality resampling should use a correctly
          oriented kernel, so it should happen after transformation.

          An implementation can transform the resampling kernel and
          convolve it with the reconstruction if it so desires, but it
          will need to deal with the fact that the resampling kernel
          will not necessarily be pixel aligned.)

          "Output kernels"

          One could imagine doing the resampling after compositing,
          ie., for each destination pixel sample each source image 16
          times, then composite those subpixels individually, then
          finally apply a kernel.

          However, this is effectively the same as full screen
          antialiasing, which is a simpler way to think about it. So
          resampling kernels may make sense for individual images, but
          not as a post-compositing step.

          Fullscreen AA is inefficient without chained compositing
          though. Consider an image scaled up to oversample size in
          some polygon, then scaled down to screen size. With the
          current implementation, there will be a huge temporary. With
          chained compositing, the whole thing ends up being equivalent
          to the output kernel from above.

        - Color space conversion

          The complete model here is that each surface has a color
          space associated with it and that the compositing operation
          also has one associated with it. Note also that gradients
          should have associated colorspaces.

        - Dithering

          If people dither something that is already dithered, it will
          look terrible, but don't do that, then. (Dithering happens
          after resampling, if at all - what is the relationship
          with color spaces? Presumably dithering should happen in
          linear intensity space.)

        - Floating point surfaces, 16, 32 and possibly 64 bit per
          channel.

        Maybe crack:

        - Glyph polygons

          If glyphs could be given as polygons, they could be
          positioned and rasterized more accurately. The glyph
          structure would need subpixel positioning though.

        - Luminance vs. coverage for the alpha channel

          Whether the alpha channel should be interpreted as luminance
          modulation or as coverage (intensity modulation). This is a
          bit of a departure from the rendering model though. It could
          also be considered whether it should be possible to have
          both channels in the same drawable.

        - Alternative for component alpha

          - Set component-alpha on the output image.

            - This means each of the components is sampled
              independently and composited in the corresponding
              channel only.

          - Have 3 x oversampled mask

          - Scale it down by 3 horizontally, with [ 1/3, 1/3, 1/3 ]
            resampling filter.

            Is this equivalent to just using a component alpha mask?
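
          The horizontal downscale in a sketch: a plain
          [ 1/3, 1/3, 1/3 ] box filter over an a8 mask (the names and
          layout here are assumptions for illustration):

              /* Sketch: collapse a 3x horizontally oversampled a8 mask
               * into one coverage value per output pixel. */
              static void
              downsample_3x (const uint8_t *mask_3x, uint8_t *mask,
                             int width)
              {
                  int i;

                  for (i = 0; i < width; ++i)
                  {
                      int sum = mask_3x[3 * i]
                              + mask_3x[3 * i + 1]
                              + mask_3x[3 * i + 2];

                      mask[i] = (uint8_t) (sum / 3);
                  }
              }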

        Incompatible changes:

        - Gradients could be specified with premultiplied colors. (You
          can use a mask to get things like gradients from solid red to
          transparent red.)

Refactoring pixman

The pixman code is not particularly nice, to put it mildly. Among the
issues are:

- inconsistent naming style (fb vs Fb, camelCase vs
  underscore_naming). Sometimes there is even inconsistency *within*
  one name.

      fetchProc32 ACCESS(pixman_fetchProcForPicture32)

  may be one of the ugliest names ever created.

  Coding style:
         use the one from cairo, except that pixman uses this brace style:

                while (blah)
                {
                }

        Format do/while like this:

               do
               {

               }
               while (...);

- PIXMAN_COMPOSITE_RECT_GENERAL() is horribly complex

- switch/case logic in pixman-access.c

  Instead it would be better to just store function pointers in the
  image objects themselves:

        get_pixel()
        get_scanline()

- Much of the scanline fetching code is for formats that no one
  ever uses. a2r2g2b2, anyone?

  It would probably be worthwhile having a generic fetcher for any
  pixman format whatsoever.
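
  A sketch of such a generic fetcher, driven by a small per-format
  bit-layout description instead of per-format code (format_info_t and
  all of its fields are hypothetical):

      #include <stdint.h>

      typedef struct
      {
          int a_shift, a_bits;    /* bit position / width per channel; */
          int r_shift, r_bits;    /* 0 bits means the channel is absent */
          int g_shift, g_bits;
          int b_shift, b_bits;
      } format_info_t;

      /* Extract one channel and scale it to 8 bits by replication. */
      static uint32_t
      extract_channel (uint32_t p, int shift, int bits, uint32_t missing)
      {
          uint32_t v;

          if (bits == 0)
              return missing;     /* 0xff for alpha, 0 for color */

          v = (p >> shift) & ((1u << bits) - 1);

          while (bits < 8)        /* replicate bits up to >= 8 */
          {
              v = (v << bits) | v;
              bits *= 2;
          }
          return v >> (bits - 8);
      }

      static uint32_t
      generic_fetch_pixel (const format_info_t *f, uint32_t p)
      {
          return (extract_channel (p, f->a_shift, f->a_bits, 0xff) << 24)
               | (extract_channel (p, f->r_shift, f->r_bits, 0x00) << 16)
               | (extract_channel (p, f->g_shift, f->g_bits, 0x00) <<  8)
               |  extract_channel (p, f->b_shift, f->b_bits, 0x00);
      }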

- Code related to particular image types should be split into individual
  files.

        pixman-bits-image.c
        pixman-linear-gradient-image.c
        pixman-radial-gradient-image.c
        pixman-solid-image.c

- Fast path code should be split into files based on architecture:

       pixman-mmx-fastpath.c
       pixman-sse2-fastpath.c
       pixman-c-fastpath.c

       etc.

  Each of these files should then export a fastpath table, which would
  be declared in pixman-private.h. This should allow us to get rid
  of the pixman-mmx.h files.

  The fast path table should describe each fast path. Ie., there should
  be bitfields indicating what things the fast path can handle, rather
  than like now where it is only allowed to take one format per
  src/mask/dest. Ie.,

  {
    FAST_a8r8g8b8 | FAST_x8r8g8b8,
    FAST_null,
    FAST_x8r8g8b8,
    FAST_repeat_normal | FAST_repeat_none,
    the_fast_path
  }
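
  In a sketch, the entry type and the matching logic could look like
  this (the flag style follows the example above; composite_func_t and
  everything else here is hypothetical):

      /* Sketch: bitfields describe everything a fast path accepts. */
      typedef void (* composite_func_t) (void);   /* stand-in signature */

      typedef struct
      {
          uint32_t src_formats;     /* FAST_* format bits accepted */
          uint32_t mask_formats;    /* FAST_null for "no mask" */
          uint32_t dest_formats;
          uint32_t repeat_flags;    /* FAST_repeat_* bits accepted */
          composite_func_t func;
      } fast_path_t;

      static const fast_path_t *
      lookup_fast_path (const fast_path_t *table, int n_entries,
                        uint32_t src, uint32_t mask, uint32_t dest,
                        uint32_t repeat)
      {
          int i;

          for (i = 0; i < n_entries; ++i)
          {
              const fast_path_t *p = &table[i];

              if ((p->src_formats  & src)  &&
                  (p->mask_formats & mask) &&
                  (p->dest_formats & dest) &&
                  (p->repeat_flags & repeat))
              {
                  return p;
              }
          }
          return NULL;
      }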

There should then be *one* file that implements pixman_image_composite().
This should do this:

     optimize_operator();

     convert 1x1 repeat to solid (actually this should be done at
     image creation time).

     is there a useful fastpath?
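
As an outline (a sketch only; the parameter list is simplified and is
not pixman's actual pixman_image_composite() prototype, and
optimize_operator(), composite_general(), the table variables and the
*_flags() helpers are all hypothetical):

    /* Sketch: the single composite entry point. */
    void
    pixman_image_composite (op_t op, image_t *src, image_t *mask,
                            image_t *dest, int width, int height)
    {
        const fast_path_t *path;

        op = optimize_operator (op, src, mask, dest);

        /* A 1x1 repeating bits image acts as a solid; ideally that
         * conversion already happened at image creation time. */

        path = lookup_fast_path (fast_path_table, n_fast_paths,
                                 src_flags (src), mask_flags (mask),
                                 dest_flags (dest), repeat_flags (src));
        if (path != NULL)
        {
            path->func ();   /* stand-in call; the real one takes the images */
            return;
        }

        composite_general (op, src, mask, dest, width, height);
    }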

There should be a file called pixman-cpu.c that contains all the
architecture specific stuff to detect what CPU features we have.
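
One possible mechanism on x86 with GCC is __builtin_cpu_supports();
real code would need cpuid/getauxval equivalents per platform. A
sketch:

    #include <stdint.h>

    #define CPU_FEATURE_MMX  (1u << 0)
    #define CPU_FEATURE_SSE2 (1u << 1)

    /* Sketch: detect CPU features once; callers use the result to
     * decide which fast path tables to register. */
    static uint32_t
    detect_cpu_features (void)
    {
        uint32_t features = 0;

    #if defined (__GNUC__) && (defined (__i386__) || defined (__x86_64__))
        __builtin_cpu_init ();

        if (__builtin_cpu_supports ("mmx"))
            features |= CPU_FEATURE_MMX;
        if (__builtin_cpu_supports ("sse2"))
            features |= CPU_FEATURE_SSE2;
    #endif

        return features;
    }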

Issues that must be kept in mind:

        - we need accessor code to be preserved

        - maybe there should be a "store_scanline" too?

          Is this sufficient?

          We should preserve the optimization where the
          compositing happens directly in the destination
          whenever possible.

        - It should be possible to create GPU samplers from the
          images.

The "horizontal" classification should be a bit in the image, the
"vertical" classification should just happen inside the gradient
file. Note though that

        (a) these will change if the transformation/repeat changes.

        (b) at the moment the optimization for linear gradients
            takes the source rectangle into account. Presumably
            this is to also optimize the case where the gradient
            is close enough to horizontal?

Who is responsible for repeats? In principle it should be the scanline
fetch. Right now NORMAL repeats are handled by walk_composite_region()
while other repeats are handled by the scanline code.

(Random note on filtering: do you filter before or after
transformation? Hardware is going to filter after transformation;
this is also what pixman does currently.) It's not completely clear
what filtering *after* transformation means. One thing that might look
good would be to do *supersampling*, ie., compute multiple subpixels
per destination pixel, then average them together.
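
Supersampling in a sketch: sample an n x n grid of subpixel positions
inside each destination pixel and average the results (get_sample() is
a hypothetical transformed-and-filtered source lookup):

    /* Sketch: n x n supersampling of one destination pixel. */
    static uint32_t
    supersample_pixel (image_t *image, int x, int y, int n)
    {
        uint32_t a = 0, r = 0, g = 0, b = 0;
        int sx, sy;

        for (sy = 0; sy < n; sy++)
        {
            for (sx = 0; sx < n; sx++)
            {
                /* subpixel centers inside the destination pixel */
                uint32_t p = get_sample (image,
                                         x + (sx + 0.5) / n,
                                         y + (sy + 0.5) / n);

                a += (p >> 24) & 0xff;
                r += (p >> 16) & 0xff;
                g += (p >>  8) & 0xff;
                b += p & 0xff;
            }
        }

        n *= n;
        return ((a / n) << 24) | ((r / n) << 16) | ((g / n) << 8) | (b / n);
    }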
