gfx/cairo/libpixman/src/refactor
changeset 6474c204b198
Roadmap

- Move all the fetchers etc. into pixman-image to make pixman-compose.c
  less intimidating.

  DONE

- Make combiners for unified alpha take a mask argument. That way
  we won't need two separate paths for unified vs component in the
  general compositing code.

  DONE, except that the Altivec code needs to be updated. Luca is
  looking into that.

- Delete separate 'unified alpha' path

  DONE

- Split images into their own files

  DONE

- Split the gradient walker code out into its own file

  DONE

- Add scanline getters per image

  DONE

- Generic 64 bit fetcher

  DONE

- Split fast path tables into their respective architecture dependent
  files.

See "Render Algorithm" below for rationale

Images will eventually have these virtual functions:

       get_scanline()
       get_scanline_wide()
       get_pixel()
       get_pixel_wide()
       get_untransformed_pixel()
       get_untransformed_pixel_wide()
       get_unfiltered_pixel()
       get_unfiltered_pixel_wide()

       store_scanline()
       store_scanline_wide()

1.

Initially we will just have get_scanline() and get_scanline_wide();
these will be based on the ones in pixman-compose. Hopefully this will
reduce the complexity in pixman_composite_rect_general().

Note that there are access considerations - the compose function is
being compiled twice, once with and once without accessors.


2.

Split image types into their own source files. Export a no-op virtual
reinit() call.  Call this whenever a property of the image changes.


3.

Split the get_scanline() call into smaller functions that are
initialized by the reinit() call.

The Render Algorithm:
	(first repeat, then filter, then transform, then clip)

Starting from a destination pixel (x, y), do

	1 x = x - xDst + xSrc
	  y = y - yDst + ySrc

	2 reject pixel that is outside the clip

	This treats clipping as something that happens after
	transformation, which I think is correct for client clips. For
	hierarchy clips it is wrong, but who really cares? Without
	GraphicsExposes hierarchy clips are basically irrelevant. Yes,
	you could imagine cases where the pixels of a subwindow of a
	redirected, transformed window should be treated as
	transparent. I don't really care.

	Basically, I think the render spec should say that pixels that
	are unavailable due to the hierarchy have undefined content,
	and that GraphicsExposes are not generated. Ie., basically
	that using non-redirected windows as sources is a failure. This
	is at least consistent with the current implementation and we
	can update the spec later if someone makes it work.

	The implication for render is that it should stop passing the
	hierarchy clip to pixman. In pixman, if a source image has a
	clip it should be used in computing the composite region and
	nowhere else, regardless of what "has_client_clip" says. The
	default should be for there to not be any clip.

	I would really like to get rid of the client clip as well for
	source images, but unfortunately there is at least one
	application in the wild that uses them.

	3 Transform pixel: (x, y) = T(x, y)

	4 Call p = GetUntransformedPixel (x, y)

	5 If the image has an alpha map, then

		Call GetUntransformedPixel (x, y) on the alpha map

		add resulting alpha channel to p

	   return p

	Where GetUntransformedPixel is:

	6 switch (filter)
	  {
	  case NEAREST:
		return GetUnfilteredPixel (x, y);
		break;

	  case BILINEAR:
		return GetUnfilteredPixel (...) // 4 times
		break;

	  case CONVOLUTION:
		return GetUnfilteredPixel (...) // as many times as necessary.
		break;
	  }

	Where GetUnfilteredPixel (x, y) is

	7 switch (repeat)
	   {
	   case REPEAT_NORMAL:
	   case REPEAT_PAD:
	   case REPEAT_REFLECT:
		// adjust x, y as appropriate
		break;

	   case REPEAT_NONE:
		if (x, y) is outside image bounds
		     return 0;
		break;
	   }

	   return GetRawPixel(x, y)

	Where GetRawPixel (x, y) is

	8 Compute the pixel in question, depending on image type.

For gradients, repeat has a totally different meaning, so
UnfilteredPixel() and RawPixel() must be the same function so that
gradients can do their own repeat algorithm.

So, the GetRawPixel

	for bits must deal with repeats
	for gradients must deal with repeats (differently)
	for solids, should ignore repeats.

	for polygons, when we add them, either ignore repeats or do
	something similar to bits (in which case, we may want an extra
	layer of indirection to modify the coordinates).

It is then possible to build things like "get scanline" or "get tile" on
top of this. In the simplest case, just repeatedly calling GetPixel()
would work, but specialized get_scanline()s or get_tile()s could be
plugged in for common cases.

By not plugging anything in for images with access functions, we only
have to compile the pixel functions twice, not the scanline functions.

And we can get rid of fetchers for the bizarre formats that no one
uses. Such as b2g3r3 etc. r1g2b1? Seriously? It is also worth
considering a generic format based pixel fetcher for these edge cases.

Since the actual routines depend on the image attributes, the images
must be notified when those change and update their function pointers
appropriately. So there should probably be a virtual function called
(* reinit) or something like that.

There will also be wide fetchers for both pixels and lines. The line
fetcher will just call the wide pixel fetcher. The wide pixel fetcher
will just call expand, except for 10 bit formats.

Rendering pipeline:

Drawable:
	0. if (picture has alpha map)
		0.1. Position alpha map according to the alpha_x/alpha_y
		0.2. Replace the alpha channel of the source with the one
		     from the alpha map. Replacement only takes place
		     in the intersection of the two drawables' geometries.
	1. Repeat the drawable according to the repeat attribute
	2. Reconstruct a continuous image according to the filter
	3. Transform according to the transform attribute
	4. Position image such that src_x, src_y is over dst_x, dst_y
	5. Sample once per destination pixel
	6. Clip. If a pixel is not within the source clip, then no
	   compositing takes place at that pixel. (Ie., it's *not*
	   treated as 0).

	Sampling a drawable:

	- If the image does not have an alpha channel, the pixels in it
	  are treated as opaque.

	Note on reconstruction:

	- The top left pixel has coordinates (0.5, 0.5) and pixels are
	  spaced 1 apart.

Gradient:
	1. Unless gradient type is conical, repeat the underlying (0, 1)
	   gradient according to the repeat attribute
	2. Integrate the gradient across the plane according to type.
	3. Transform according to transform attribute
	4. Position gradient
	5. Sample once per destination pixel.
	6. Clip

Solid Fill:
	1. Repeat has no effect
	2. Image is already continuous and defined for the entire plane
	3. Transform has no effect
	4. Positioning has no effect
	5. Sample once per destination pixel.
	6. Clip

Polygon:
	1. Repeat has no effect
	2. Image is already continuous and defined on the whole plane
	3. Transform according to transform attribute
	4. Position image
	5. Supersample 15x17 per destination pixel.
	6. Clip

Possibly interesting additions:
	- More general transformations, such as warping, or general
	  shading.

	- Shader image where a function is called to generate the
	  pixel (ie., uploading assembly code).

	- Resampling kernels

	  In principle the polygon image uses a 15x17 box filter for
	  resampling. If we allow general resampling filters, then we
	  get all the various antialiasing types for free.

	  Bilinear downsampling looks terrible and could be much
	  improved by a resampling filter. NEAREST reconstruction
	  combined with a box resampling filter is what GdkPixbuf
	  does, I believe.

	  Useful for high frequency gradients as well.

	  (Note that the difference between a reconstruction and a
	  resampling filter is mainly where in the pipeline they
	  occur. High quality resampling should use a correctly
	  oriented kernel, so it should happen after transformation.)

	  An implementation can transform the resampling kernel and
	  convolve it with the reconstruction if it so desires, but it
	  will need to deal with the fact that the resampling kernel
	  will not necessarily be pixel aligned.

	  "Output kernels"

	  One could imagine doing the resampling after compositing,
	  ie., for each destination pixel sample each source image 16
	  times, then composite those subpixels individually, then
	  finally apply a kernel.

	  However, this is effectively the same as full screen
	  antialiasing, which is a simpler way to think about it. So
	  resampling kernels may make sense for individual images, but
	  not as a post-compositing step.

	  Fullscreen AA is inefficient without chained compositing
	  though. Consider an (image scaled up to oversample size IN
	  some polygon) scaled down to screen size. With the current
	  implementation, there will be a huge temporary. With chained
	  compositing, the whole thing ends up being equivalent to the
	  output kernel from above.

	- Color space conversion

	  The complete model here is that each surface has a color
	  space associated with it and that the compositing operation
	  also has one associated with it. Note also that gradients
	  should have associated colorspaces.

	- Dithering

	  If people dither something that is already dithered, it will
	  look terrible, but don't do that, then. (Dithering happens
	  after resampling, if at all - what is the relationship
	  with color spaces? Presumably dithering should happen in linear
	  intensity space).

	- Floating point surfaces, 16, 32 and possibly 64 bit per
	  channel.

	Maybe crack:

	- Glyph polygons

	  If glyphs could be given as polygons, they could be
	  positioned and rasterized more accurately. The glyph
	  structure would need subpixel positioning though.

	- Luminance vs. coverage for the alpha channel

	  Whether the alpha channel should be interpreted as luminance
	  modulation or as coverage (intensity modulation). This is a
	  bit of a departure from the rendering model though. It could
	  also be considered whether it should be possible to have
	  both channels in the same drawable.

	- Alternative for component alpha

	  - Set component-alpha on the output image.

	    - This means each of the components are sampled
	      independently and composited in the corresponding
	      channel only.

	  - Have 3 x oversampled mask

	  - Scale it down by 3 horizontally, with a [ 1/3, 1/3, 1/3 ]
	    resampling filter.

	    Is this equivalent to just using a component alpha mask?

	Incompatible changes:

	- Gradients could be specified with premultiplied colors. (You
	  can use a mask to get things like gradients from solid red to
	  transparent red.)

Refactoring pixman

The pixman code is not particularly nice, to put it mildly. Among the
issues are

- inconsistent naming style (fb vs Fb, camelCase vs
  underscore_naming). Sometimes there is even inconsistency *within*
  one name.

      fetchProc32 ACCESS(pixman_fetchProcForPicture32)

  may be one of the ugliest names ever created.

  coding style:
	use the one from cairo except that pixman uses this brace style:

		while (blah)
		{
		}

	Format do while like this:

		do
		{

		}
		while (...);

- PIXMAN_COMPOSITE_RECT_GENERAL() is horribly complex

- switch case logic in pixman-access.c

  Instead it would be better to just store function pointers in the
  image objects themselves,

	get_pixel()
	get_scanline()

- Much of the scanline fetching code is for formats that no one
  ever uses. a2r2g2b2 anyone?

  It would probably be worthwhile having a generic fetcher for any
  pixman format whatsoever.

- Code related to particular image types should be split into individual
  files.

	pixman-bits-image.c
	pixman-linear-gradient-image.c
	pixman-radial-gradient-image.c
	pixman-solid-image.c

- Fast path code should be split into files based on architecture:

       pixman-mmx-fastpath.c
       pixman-sse2-fastpath.c
       pixman-c-fastpath.c

       etc.

  Each of these files should then export a fastpath table, which would
  be declared in pixman-private.h. This should allow us to get rid
  of the pixman-mmx.h files.

  The fast path table should describe each fast path. Ie., there should
  be bitfields indicating what the fast path can handle, rather than the
  current scheme where each entry is only allowed one format per
  src/mask/dest. Ie.,

  {
    FAST_a8r8g8b8 | FAST_x8r8g8b8,
    FAST_null,
    FAST_x8r8g8b8,
    FAST_repeat_normal | FAST_repeat_none,
    the_fast_path
  }

There should then be *one* file that implements pixman_image_composite().
This should do this:

     optimize_operator();

     convert 1x1 repeat to solid (actually this should be done at
     image creation time).

     is there a useful fastpath?

There should be a file called pixman-cpu.c that contains all the
architecture specific stuff to detect what CPU features we have.

Issues that must be kept in mind:

       - we need accessor code to be preserved

       - maybe there should be a "store_scanline" too?

         Is this sufficient?

	 We should preserve the optimization where the
	 compositing happens directly in the destination
	 whenever possible.

	- It should be possible to create GPU samplers from the
	  images.

The "horizontal" classification should be a bit in the image, the
"vertical" classification should just happen inside the gradient
file. Note though that

      (a) these will change if the transformation/repeat changes.

      (b) at the moment the optimization for linear gradients
          takes the source rectangle into account. Presumably
	  this is to also optimize the case where the gradient
	  is close enough to horizontal?

Who is responsible for repeats? In principle it should be the scanline
fetch. Right now NORMAL repeats are handled by walk_composite_region()
while other repeats are handled by the scanline code.


(Random note on filtering: do you filter before or after
transformation?  Hardware is going to filter after transformation;
this is also what pixman does currently.) It's not completely clear
what filtering *after* transformation means. One thing that might look
good would be to do *supersampling*, ie., compute multiple subpixels
per destination pixel, then average them together.
